
v4.12.0 has unusable performance #3280

Closed
pnorman opened this issue Jun 26, 2018 · 39 comments


@pnorman
Collaborator

pnorman commented Jun 26, 2018

From openstreetmap/chef#168

We now have confirmation from @woodpeck that at z10 average render times have jumped from 1.5s with 4.11.0 to 33.7s with the 4.12.0 stylesheet.

This is slow to the point of being unusable, and right now I'm recommending that people do not upgrade to v4.12.0.

The issue is with the surface changes. If we can't figure out what the problem is reasonably quickly, we should revert the PR and release v4.12.1.

I will not have time to step through the SQL to identify problems in the near future.

@pnorman
Collaborator Author

pnorman commented Jun 26, 2018

#2640 is the PR that appears to have introduced the problems.

@kocio-pl kocio-pl added this to the Bugs and improvements milestone Jun 26, 2018
@kocio-pl
Collaborator

The interesting thing is that on z16+ (where I usually test things) I have not seen such visible problems - it looked a bit suspicious, but nothing clear enough to claim a performance problem.

@imagico
Collaborator

imagico commented Jun 26, 2018

If reverting #2640 indeed fixes the performance issues, i would suggest doing that and then reviewing the change for what causes this without time pressure. There is also #3279, which i would assume to be caused by the same change, so a more thorough overall review might not be such a bad idea. The documentation now written by @sommerluk (https://github.com/sommerluk/roadpatternrendering) should help with that.

If a 20x performance loss was not recognized during the ~6 weeks this change was already in master, that should also be an indicator that some changes in procedure for better change evaluation and pre-release checks might not be such a bad idea.

@kocio-pl
Collaborator

kocio-pl commented Jun 26, 2018

Sounds OK to me. Could you take care of the revert and the v4.12.1 release?

As for performance: we have a dedicated ticket for this, yet nobody has come up with a performance framework solution (#1287). When you have 20 zoom levels and the problem is visible only on some of them, manual checking is not an option IMO. Do you have a suggestion or solution for this?

@sommerluk
Collaborator

Bad thing…

Up to z9, the layer roads-low-zoom is used. Starting from z10, we have roads-fill and bridges. Following the information at openstreetmap/chef#168, there is no problem at z0-z9, so roads-low-zoom should be okay. Finally, roads-fill is not so different from roads-low-zoom, so I suppose that bridges is the real problem; it is the one that has a SELECT DISTINCT in the SQL code. And in openstreetmap/chef#168 @woodpeck said that this is the query that runs really often. We could try to revert only the bridges SQL. (Whether support for “surface” on bridges is useful at all was a subject of discussion in the PR anyway.)…

@matthijsmelissen
Collaborator

We could try to revert only the bridges SQL.

Our first priority now is to get master stable again. We can debug later, but for now we should revert the entire #2640.

@imagico
Collaborator

imagico commented Jun 26, 2018

@sommerluk - you could try the version i showed in #2640 (comment) or the CTE variant i sketched earlier in #2640 (comment). For just testing whether the problem is indeed in that part, you can also just test a static list of layer values.

But you'd need to reliably check the performance in some way. And yes, i think it is better to do that without the hurry of having to prepare a bugfix release.

@kocio-pl - No, i have not contributed or merged any changes for quite some time, so i am not familiar with the more recent changes and would not be comfortable doing a revert & release.

@mboeringa

mboeringa commented Jun 26, 2018

One thing that does stand out is that this issue seems to occur mainly at medium/low zoom. The more zoomed out you are, the less effective any spatial index is, and the more PostgreSQL needs to rely on its attribute indexes to avoid scanning large tables with many records.

Looking at the list of indexes that carto suggests by default, it strikes me that the current list is really conservative, with only an absolute minimum of indexes defined considering the number of keys used in a style like this. That saves a lot of disk space, but might affect low-zoom rendering performance.

Maybe indexing a few more of the attributes/keys involved in the SQL could really help boost performance. Currently, much of the rendering performance seems to rely on the spatial index doing the main job, which is logical at a zoomed-in extent, but less so when zoomed out.

E.g. in a really bad hypothetical situation, zoomed out to Z0-Z2, you might end up doing a full table scan on 300M records just to fetch 10 or so..., because there is no attribute index defined that could help select the few records required by the SQL.

@woodpeck
Contributor

On a general note, doing a full table scan on z0-z2 would be a negligible performance hit since these zoom levels only contain one single meta tile. On the other hand, having to keep a spatial index up to date affects every single database update, so that has to be kept in mind.

In the concrete case at hand, as I wrote in openstreetmap/chef#168, there appears to be a full table scan on planet_osm_line going on at z10. Looking further into it, it becomes obvious that the SELECT DISTINCT layernotnull subquery is at fault. I can repair the problem by adding an ... AND "way" && ST_SetSRID(...) condition to that subquery (the sample query then takes only a few milliseconds as opposed to ~40 seconds). This means you can likely fix the issue by adding a && !bbox! at that location; however, I'm not sure at the moment whether having one of these means that Mapnik will cease adding the spatial condition at the end, which would mean you need a second && !bbox! at the end of the query.
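
To illustrate the fix described above, here is a minimal sketch of that subquery with the bounding-box condition added. The column names and bridge value list are assumptions loosely based on the query plans posted later in this thread, not the exact stylesheet SQL:

-- Hedged sketch only: the SELECT DISTINCT layernotnull subquery constrained
-- to the rendered extent, so Postgres can use the spatial index on "way"
-- instead of scanning all of planet_osm_line. !bbox! is Mapnik's placeholder
-- for the tile's bounding box.
SELECT DISTINCT COALESCE(layer, 0) AS layernotnull
  FROM planet_osm_line
 WHERE (highway IS NOT NULL OR railway IS NOT NULL)
   AND bridge IN ('yes', 'boardwalk', 'cantilever', 'covered',
                  'low_water_crossing', 'movable', 'trestle', 'viaduct')
   AND way && !bbox!  -- the added spatial condition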

@pnorman
Collaborator Author

pnorman commented Jun 26, 2018

On a general note, doing a full table scan on z0-z2 would be a negligible performance hit since these zoom levels only contain one single meta tile. On the other hand, having to keep a spatial index up to date affects every single database update, so that has to be kept in mind.

Unless there's some kind of partial index (e.g. WHERE boundary='administrative' AND admin_level IN ('2','3','4')) that is very selective, a sequential scan of the entire table will be faster than an index scan, and postgres will correctly choose to do it. The same is true on z3 and possibly z4.

Slow rendering, within reason, doesn't matter for z0-z4, because there are a total of 33 metatiles to be rendered. There are 16k z10 metatiles.
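
For illustration only, a selective partial index of the kind mentioned above could be created like this; the index name is made up, and openstreetmap-carto's actual recommended indexes may differ:

-- Hypothetical example of a very selective partial index: only rows matching
-- the WHERE predicate are indexed, so low-zoom admin-boundary queries can use
-- it while the index stays small and cheap to keep up to date.
CREATE INDEX planet_osm_line_admin_low
    ON planet_osm_line USING gist (way)
    WHERE boundary = 'administrative' AND admin_level IN ('2', '3', '4');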

@sommerluk
Collaborator

I will not oppose a revert…

@giggls

giggls commented Jun 27, 2018

I cannot confirm the suspicion that this is not happening on zoom levels 16+.
Here is what I get at zoom level 19, rendering a single tile only:

$ time render_single_tile.py -s osm.xml -o test4120.png -u /19/274450/180045.png
real	8m2.860s
user	0m1.448s
sys	0m0.472s

$ time render_single_tile.py -s osm.xml -o test4110.png -u /19/274450/180045.png
real	0m14.980s
user	0m1.208s
sys	0m0.400s

@kocio-pl
Collaborator

Thanks for checking it, Sven!

This proves that an automated testing framework is needed anyway. I was able to detect such a problem before v4.10.0 (see #3159 (comment)), but not this time. I think the tiles I was using were not so slow to render, but we still need something more solid than just my gut feeling.

@matthijsmelissen
Collaborator

I could not reproduce this on z19.

This was with a small database and old indexes. Perhaps the issue only shows up on databases of reasonable size?

@giggls

giggls commented Jun 27, 2018

I did not change anything database-related. I just called the single-tile rendering script with both versions of the style. However, this might be related to my hstore-only database layout.

@sommerluk
Collaborator

@pnorman Thanks.

@kocio-pl
Collaborator

Does it mean we can release v4.12.1 now? Could anyone do the actual release? I'm a bit busy with other things lately.

@kocio-pl
Collaborator

And before the release - @giggls @woodpeck , could you just check if now the problem is fixed?

@giggls

giggls commented Jun 28, 2018

Current master works fine again.

@mboeringa

mboeringa commented Jul 1, 2018

Unless there's some kind of partial index (e.g. WHERE boundary='administrative' AND admin_level IN ('2','3','4')) that is very selective, a sequential scan of the entire table will be faster than an index scan, and postgres will correctly choose to do it. The same is true on z3 and possibly z4.

I think this is a bit too generic a statement. A sequential scan has its cost as well, and a considerable one. You do not need to have only very specific partial indexes to take advantage of indexing. In my experience, at least one important thing to look for is OR clauses in your queries and the keys they combine.

E.g. look at the query below, which is quite typical for situations you may encounter in OSM, where due to tagging conflation and changes in tagging schemes, multiple keys may be needed to extract a specific thematic layer.

SELECT
	*
FROM
	gis.public.planet_osm_polygon_osv
WHERE
	( osm_man_made IN ('mineshaft')
	OR osm_disused_58_man_made IN ('mineshaft')
	OR osm_abandoned_58_man_made IN ('mineshaft'))
	AND ( osm_mineshaft_type IS NULL
	OR osm_mineshaft_type IN ('winding'))

Now look at the image below, which shows the query execution plan with just the osm_man_made field having an index. Unsurprisingly, this ends up as a sequential scan, because the OR clauses force the entire table to be evaluated for the osm_disused_58_man_made and osm_abandoned_58_man_made keys as well, which is fastest as a sequential scan if no other indexes are available. This does, however, come at a cost (this is the entire Europe extract, by the way): extracting just 612 records from a >165M record materialized view takes 165 seconds, so almost 3 minutes with the sequential scan (on SSD storage).

[image: query execution plan with only osm_man_made indexed, ending in a sequential scan (~165 s)]

Now watch what happens when I do add indexes for the osm_disused_58_man_made and osm_abandoned_58_man_made fields as well:

[image: query execution plan using the added attribute indexes instead of a sequential scan (~152 ms)]

Notice the time is now down to just 152 ms, so less than one second to extract the 612 records, with PostgreSQL using all three key indexes instead of a sequential scan.

Of course, there are costs to indexing all of these fields, in terms of disk space and of maintaining them, especially in the environment with constant diff updates that openstreetmap-carto seems to operate in. This is just an illustration to show that in specific cases it may still be beneficial to add indexes to solve a performance issue, although it all needs to be seen in the context of a primarily spatially accessed database as well, which diversifies the situation even more. Ultimately, the best thing to do is simply to test a specific query if it has a performance issue: add the index(es) you think are relevant and see how it does.
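
As a sketch of the kind of indexes added for the second plan above (the column and table names come from the example query, which uses a custom hstore-derived layout, so they will not exist in a default openstreetmap-carto database):

-- Illustrative only: partial attribute indexes for the two extra keys used in
-- the OR clauses, so each branch of the OR can be answered from an index
-- instead of forcing a sequential scan over the whole table.
CREATE INDEX ON planet_osm_polygon_osv (osm_disused_58_man_made)
    WHERE osm_disused_58_man_made IS NOT NULL;
CREATE INDEX ON planet_osm_polygon_osv (osm_abandoned_58_man_made)
    WHERE osm_abandoned_58_man_made IS NOT NULL;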

@sommerluk
Collaborator

Okay, I’ve played around a little bit with the code, and honestly I do not understand the results. I’ve fixed #3279 in the code for unpaved rendering. Then I’ve loaded a current Ivory Coast extract into the database and rendered with Kosmtik. I’ve measured the time until all tiles in the browser window are loaded. Rendering from z19 down to z0 gives these results:

zoom   4.12.1 (seconds)   with pattern (seconds)
  19         12                   17
  18          4                    8
  17          3                   16
  16          3                    9
  15          3                   10
  14          6                   25
  13         14                   55
  12         11                   40
  11         16                   65
  10         12                   60
   9          4                    6
   8          5                   13
   7          7                    8
   6          6                    6
   5         10                    6
   4          4                    5
   3          4                    4
   2          3                    4
   1          2                    2
   0          2                    1

The results are not very exact. Repeating the same steps, they can vary by up to 100%.

Anyway, the real problem seems to start with z10 and higher.

Given that the reports say the SQL is slower by a factor of 20, and that I see only a factor of 5 here at z10, I suppose that the SQL execution time is less important and the Mapnik rendering more important when the database is small (I have only Ivory Coast loaded), but that this will change when the database is big (the whole planet)?

Now I have tried to isolate what is so time-consuming. Up to z9, we use the roads-low-zoom SQL query. From z10 up, we use the roads-fill SQL query and the corresponding bridges and tunnels SQL queries. All of these (roads-low-zoom, roads-fill, bridges…) are handled by the same code in roads.mss. I’ve called fixed coordinates at z20 in Kosmtik. Then I’ve switched directly to z10.

  • This takes about 20 seconds for v4.12.1 (significantly slower than the 12 seconds in the table above going from z9 to z10).
  • With the unpaved pattern, it takes 60 seconds.
  • Interesting: With roads.mss from the unpaved pattern but project.mml from v4.12.1 (just with some NULL AS int_surface so that the code does not crash), it still takes 40 seconds even though nothing of the pattern is rendered. Even when I replace all surface-based code in project.mml with dummy NULL AS int_surface, it still takes 40 seconds. z0-z9, however, works as fast as v4.12.1 does. That’s something I do not understand. The SQL is the same as in v4.12.1. It renders the same as v4.12.1. But it’s slower. The only difference I see is the attachment order in roads.mss. But can that really be the reason? Because then it should cause performance problems at z9 as well. What’s so different at z10? As my modified v4.12.1 SQL query does not return any globalboundingbox objects nor any surface values, none of the comp-op operations is really triggered.

@imagico
Collaborator

imagico commented Jul 8, 2018

I created two sketches for possible solutions based on the 4.12.0 code:

https://github.com/imagico/osm-carto-alternative-colors/tree/surface-fix1
https://github.com/imagico/osm-carto-alternative-colors/tree/surface-fix2

The first led to a roughly 80-fold performance improvement for the query alone on a small test database, the second to about a 100-fold improvement.

Note i did not test whether this is functionally the same, i.e. whether it leads to the same rendering results. It should, but nonetheless this should be tested. I have also not tested whether there are other performance issues - this is just based on #3280 (comment) (which is pretty convincing).

@pnorman
Collaborator Author

pnorman commented Jul 10, 2018

I think this is a bit too generic a statement

The statement you quoted is about z3 and z4 rendering. The rest of your comment is not really relevant to roads.


I created two sketches for possible solutions based on the 4.12.0 code:

Diffs are
v4.12.1...imagico:surface-fix1
v4.12.1...imagico:surface-fix2

sommerluk added a commit to sommerluk/openstreetmap-carto that referenced this issue Jul 20, 2018
@sommerluk
Collaborator

Okay. The branch at https://github.com/sommerluk/openstreetmap-carto/tree/unpaved20 has the code for unpaved rendering, fixes #110, does not re-introduce #3279, and has imagico:surface-fix2 applied. It still feels slow on z10-z13, but faster on the higher zoom levels. How could I measure the performance in a more objective way?

@imagico
Collaborator

imagico commented Jul 20, 2018

The easiest way to benchmark queries (AFAIK) is to enable extended logging in postgres, as explained in:

https://gis.stackexchange.com/questions/60448/making-mapnik-verbose

You can also take specific queries from those logs and EXPLAIN ANALYZE them - as @woodpeck did.
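
For example, a query copied from the log can be benchmarked in psql by prefixing it with EXPLAIN ANALYZE; the query body below is only a placeholder, and in practice you would paste the exact statement from the log, including its bounding box:

-- Placeholder example: wrap a logged query in EXPLAIN ANALYZE to see whether
-- it uses an index scan or falls back to a sequential scan.
EXPLAIN ANALYZE
SELECT way, highway
  FROM planet_osm_line
 WHERE highway IS NOT NULL
   AND way && ST_SetSRID('BOX3D(-488779 5688239,-478994 5698023)'::box3d, 3857);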

@giggls

giggls commented Jul 20, 2018

As an addition to this comment:
The output of EXPLAIN ANALYZE will likely be OK in any case other than a full table scan. The perfect output would be an "Index Scan" using one of the geospatial indexes in our case.

@kocio-pl
Collaborator

Maybe render_speedtest might be useful here, see #1287 (comment).

@sommerluk
Collaborator

Okay, I need some more help with the SQL setup. I can’t get postgres -D /path/to/postgres/directory -c log_min_duration_statement=0 working.

Usually, I start Postgres with sudo systemctl start postgresql. I can find the path to my config file with # SHOW config_file; but when trying postgres -D /path/to/postgres/directory -c log_min_duration_statement=0 (as a normal user), it complains that I do not have access rights to the file. If I do the same as root, postgres informs me that it cannot be executed as root for security reasons. I suppose another type of Postgres setup is needed?

@imagico
Collaborator

imagico commented Jul 21, 2018

You need to run postgres as the dedicated user created for that (usually called postgres). You stop the normal service running and then manually start it using sudo. Like:

sudo -u postgres /usr/lib/postgresql96/bin/postgres -D /var/lib/pgsql/data -c log_min_duration_statement=0 -c log_destination=stderr -c logging_collector=false 2> debug.log

@sommerluk
Collaborator

@imagico Thanks a lot! That works!

The result of EXPLAIN ANALYZE for the query with fix2 from @imagico (the bbox is copied from the Postgres log of a Kosmtik run at z10) is:

                                                                                                                                                                                                                                                                                                                                               QUERY PLAN                                                                                                                                                                                                                                                                                                                                               
--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
 Limit  (cost=12176.48..12176.49 rows=1 width=358) (actual time=0.001..0.001 rows=0 loops=1)
   ->  Subquery Scan on bridges  (cost=12176.48..12185.33 rows=708 width=358) (never executed)
         ->  Sort  (cost=12176.48..12178.25 rows=708 width=386) (never executed)
               Sort Key: features.layernotnull, features.z_order, (CASE WHEN ("substring"(features.feature, 1, 8) = 'railway_'::text) THEN 2 ELSE 1 END), (CASE WHEN (features.feature = ANY ('{railway_INT-preserved-ssy,railway_INT-spur-siding-yard,railway_tram-service}'::text[])) THEN 0 ELSE 1 END), (CASE WHEN (features.feature = 'highway_road'::text) THEN 0 ELSE 1 END), (CASE WHEN (features.feature = ANY ('{highway_platform,railway_platform}'::text[])) THEN 0 ELSE 1 END), (CASE WHEN (features.access = ANY ('{no,private}'::text[])) THEN 0 WHEN (features.access = 'destination'::text) THEN 1 ELSE 2 END), (CASE WHEN (features.int_surface = 'unpaved'::text) THEN 0 ELSE 2 END)
               ->  Subquery Scan on features  (cost=6055.92..12142.96 rows=708 width=386) (never executed)
                     ->  Append  (cost=6055.92..12108.45 rows=708 width=362) (never executed)
                           CTE features_real
                             ->  Seq Scan on planet_osm_line planet_osm_line_1  (cost=0.00..6055.92 rows=504 width=559) (never executed)
                                   Filter: ((highway IS NOT NULL) AND (way && '01030000A0110F00000100000005000000010000E0FFFFEFC7010000E0FFFFEFC70000000000000000010000E0FFFFEFC7010000E0FFFFEF470000000000000000010000E0FFFFEF47010000E0FFFFEF470000000000000000010000E0FFFFEF47010000E0FFFFEFC70000000000000000010000E0FFFFEFC7010000E0FFFFEFC70000000000000000'::geometry) AND (bridge = ANY ('{yes,boardwalk,cantilever,covered,low_water_crossing,movable,trestle,viaduct}'::text[])))
                           ->  CTE Scan on features_real  (cost=0.00..10.08 rows=504 width=360) (never executed)
                           ->  Subquery Scan on layers_list  (cost=11.34..15.34 rows=200 width=360) (never executed)
                                 ->  HashAggregate  (cost=11.34..13.34 rows=200 width=4) (never executed)
                                       Group Key: features_real_1.layernotnull
                                       ->  CTE Scan on features_real features_real_1  (cost=0.00..10.08 rows=504 width=4) (never executed)
                           ->  Seq Scan on planet_osm_line  (cost=0.00..6027.06 rows=4 width=559) (never executed)
                                 Filter: ((railway IS NOT NULL) AND (way && '01030000A0110F00000100000005000000010000E0FFFFEFC7010000E0FFFFEFC70000000000000000010000E0FFFFEFC7010000E0FFFFEF470000000000000000010000E0FFFFEF47010000E0FFFFEF470000000000000000010000E0FFFFEF47010000E0FFFFEFC70000000000000000010000E0FFFFEFC7010000E0FFFFEFC70000000000000000'::geometry) AND (bridge = ANY ('{yes,boardwalk,cantilever,covered,low_water_crossing,movable,trestle,viaduct}'::text[])))
 Planning time: 1.608 ms
 Execution time: 0.104 ms

@sommerluk
Collaborator

The database in the above example had only Ivory Coast loaded.

@imagico
Collaborator

imagico commented Jul 21, 2018

I think you got the wrong query, probably something Kosmtik runs as an advance test. You need a query from rendering an actual tile, with a real bounding box and without a zero limit. Preferably from a tile where the query returns some data.

And then you can compare the performance with and without the change at various zoom levels.

This will not be quantitatively representative for a full database but it will tell you if there are any serious issues.

@giggls

giggls commented Jul 21, 2018

Maybe my render_single_tile.py script (https://github.com/giggls/openstreetmap-carto-de/blob/master/scripts/render_single_tile.py) might be useful for you. I use this script in combination with Makefile targets to do manual tests when merging upstream changes.

sommerluk added a commit to sommerluk/openstreetmap-carto that referenced this issue Jul 29, 2018
sommerluk added a commit to sommerluk/openstreetmap-carto that referenced this issue Aug 21, 2018
@sommerluk
Collaborator

New PR at #3357

@giggls Thanks for the render_single_tile.py script. That’s nice! Could you take a look at the performance of #3357?

@giggls

giggls commented Aug 22, 2018

My makefile will probably also be useful for you then:
https://github.com/giggls/openstreetmap-carto-de/blob/upstream%2Bl10n/Makefile
Rendering time for running "make test" is roughly the same for me in your branch and origin/master now.

@kocio-pl
Collaborator

I'm curious what the numbers are in your case. Are they comparable with #3355?

@giggls

giggls commented Aug 22, 2018

Running "make test" which will render one single tile per zoom layer in my setup took 5m10s in master and unpaved20 branches (second pass, thus query cache filled).
In comparison rendering v4.12.0 is still ongoing. I will post the results later but I would expect an order of magnitude.

Whether or not there are full table scans should be more meaningful than rendering times anyway.
I'm using a slightly different database layout and views to be able to render upstream carto, so @rrzefox should be able to provide more comparable results.

@giggls

giggls commented Aug 22, 2018

v4.12.0 was 93 minutes

@sommerluk
Collaborator

@giggls Thanks a lot for testing this!
