Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

queryRenderFeatures continued [not ready] #2224

Merged
merged 48 commits into from
Mar 24, 2016

Conversation

ansis
Copy link
Contributor

@ansis ansis commented Mar 5, 2016

This builds on #2052.

This is not ready, but it could use some high level 👀. I probably should have gotten some 👀 a bit earlier. Let me know if any of the high level ideas here concern you. I'll comment again later when this is more polished.

The raw vector tile pbf, FeatureTree and CollisionTile are now transferred to the main thread when a tile finishes loading. This lets use have a synchronous queryRenderedFeatures, it lets us eliminate includeGeometry without a performance hit, and it starts moving us towards more state-less workers.

A synchronous queryRenderedFeatures

This adds a synchronous version of queryRenderedFeatures.

  • it's faster
  • it's more convenient to use, especially when you want to make multiple queries
  • it's more consistent. It never gets delayed for 100s of ms while a worker is busy parsing a tile
  • you don't have to worry about callbacks being called out of order
  • things like Add a "layers" parameter to "Evented#on"  #1002 could be done lazily

There is still an async version. It transfers data back to workers.

Despite being parallelized, the async version is not faster than the sync version. It ends up parsing from the pbfs twice instead of once. It spends less time on the main thread though which could be useful for large, expensive queries. I think it could be worthwhile to keep both versions as queryRenderedFeatures and queryRenderedFeaturesAsync

Getting rid of includeGeometry

The includeGeometry param for queryRenderedFeatures only exists because sending geometries from the worker back to the main thread was slow. When the pbf is on the main thread individual geometries can be deserialized lazily. This lets us drop includeGeometry without a performance penalty.

indexing with Grid

I replaced rbush with a Grid, a different kind of spatial index. Grid can be serialized to an ArrayBuffer to make it transferrable. The serialized version can be queried directly without deserializing individual elements to javascript objects. It is faster than rbush for collision detection and comparable for rendered feature querying. It's faster because insertions are faster.

Grid is a grid of cells. When a bbox is inserted it is added to all the cells it intersects. When a bbox is queried it looks in all cells that intersects with that bbox. That's it.

It isn't faster than rbush generally, but it is faster for what we use it for: a fixed extent, mostly evenly sized bboxes, and a high number of insertions. More importantly, it's transferable.

Storing arrays of "structs" in ArrayBuffers

There are a bunch of cases where we have giant arrays of fixed size objects. I added a new abstraction, StructArray, that makes it easier to store these kinds of objects in ArrayBuffers. It's inspired by the new Buffer implementation. It:

  • saves a lot of space. Switching collision boxes saved 500KB per tile.
  • makes transferring or even copying these arrays a lot faster
  • avoids creating a ton of garbage

see 9a2e5fd 's commit message for an example

Reducing worker state

Both Grid and StructArrays are transferrable and don't need to be deserialized to individual objects before they can be used. This makes moving them between workers and the main thread really fast and reduces the amount of data that is "stuck" on workers. When symbolInstances is made completely transferable we could get rid of WorkerTiles that live after parsing has finished.

The work requests would transfer all the data that is needed and not rely on the worker having any unique data. I think this is also more similar to how we want work requests to work in -native.

Since work won't be tied to a specific thread, it should be possible to distribute work more consistently and speed things up.

Benchmarks

Compared to #2052


point query
http://localhost:9966/debug/#15.37/38.9047/-77.0299
mapbox-streets-v6 (before all features were merged)
new sync                        1 ms
new sync with geometry          1 ms
new async                       5-6 ms
new async with geometry         5-6 ms
old async                       1-2 ms
old async with geometry         2 ms

point query
http://localhost:9966/debug/#15.37/38.9047/-77.0299
mapbox-streets-v7 (after many layers are merged into a single huge multi-geometry feature)
new sync                        2 ms
new sync with geometry          2 ms
new async                       5 ms
new async with geometry         5-6 ms
old async                       5 ms
old async with geometry         15-30 ms

full screen box query
http://localhost:9966/debug/#15.37/38.9047/-77.0299
mapbox-streets-v7 (after many layers are merged into a single huge multi-geometry feature)
new sync                        23 ms
new sync with geometry          70 ms
new async                       19-25 ms
new async with geometry         75 ms
old async                       45 ms
old async with geometry         770 ms


full screen bbox query
http://localhost:9966/debug/#11.52/38.9262/-77.0420
mapbox-streets-v7 (after many layers are merged into a single huge multi-geometry feature)
new sync                        115 ms
new sync with geometry          175 ms
new aysnc                       125 ms
new async with geometry         180 ms
old async                       265 ms
old async with geometry         560 ms
  • The old and new sync queryRenderedFeatures are both really fast for point queries.
  • The new queryRenderedFeatures is much faster for large box queries, especially if geometries are included.
  • The new sync and async queryRenderedFeatures take a similar amount of time, but the async version spends less time on the main thread.

next steps

Cleaning things up. Fixing tests. Fixing a couple remaining bugs.

Do all the high level changes sound ok?

@lucaswoj @jfirebaugh @mourner

@mourner
Copy link
Member

mourner commented Mar 7, 2016

What's the memory impact of switching to grid index and storing it in both the main thread and the worker?

@lucaswoj
Copy link
Contributor

lucaswoj commented Mar 7, 2016

I'm very 👍 on the high-level changes in this PR.

Before shipping, I'd like for us to think about

  • what it'd take to merge StructArray and Buffer (LOC is proportional to # of bugs)
  • how stateless workers can be used to write better integration tests for mapbox-gl-js (the tests can be another PR but lets think about the architecture that enables those tests now)
  • how we can codify the performance benchmarks used in this PR into something reusable

@ansis ansis force-pushed the query-rendered-features-continued branch from 9b3a74d to d342793 Compare March 7, 2016 22:26
@ansis
Copy link
Contributor Author

ansis commented Mar 8, 2016

What's the memory impact of switching to grid index and storing it in both the main thread and the worker?

Good question. I hadn't benchmarked that yet.

It is generally only stored in one thread at a time. Async queries create a duplicate but that duplicate is discarded at the end of the query. We could maintain a duplicate version to reduce the cpu cost, but the cpu cost is pretty low and async queries should only be used for expensive queries.

For each of the benchmarks I cropped the window so that it loaded only one tile. I used mapbox-streets-v6 because v7 merges features into huge multigeometries. This branch implements subfeature indexing (a bbox for each ring, not each feature) but the old one doesn't so it isn't a fair comparison when using v7.

At z12 with many lines and few labels.
http://localhost:9966/debug/#12.89/38.9188/-77.0388

FeatureTree with rbush          1843 KB         343 KB for the tree, 1500 KB for the bbox arrays that are inserted
CollisionTile with rbush        20 KB
CollisionBoxes

FeatureTree with Grid           340 KB          240 KB for the Grid, 100 KB for the featureIndexArray
CollisionTile with Grid         225 KB

At z17 with less lines but more labels.
http://localhost:9966/debug/#17.09/38.90398/-77.03116

FeatureTree with rbush          420 KB          50 KB for the tree, 270 KB for the bbox arrays that are inserted
CollisionTile with rbush        200 KB

FeatureTree with Grid           117 KB          100 KB for the Grid, 17 KB for the featureIndexArray
CollisionTile with Grid         280 KB


z12
CollisionBoxes used with rbush  140 KB
CollisionBoxes used with Grid   26 KB

z17
CollisionBoxes used with rbush  1160 KB
CollisionBoxes used with grid   235 KB

rbush could be switched to use the StructArray collision boxes currently used
by Grid but it still might need regular bbox arrays to avoid a cpu performance hit.

The Grid in FeatureTree uses far less memory than rbush. 340 KB vs 1843 KB and 117 KB vs 420 KB.

The Grid in CollisionTile uses more memory than rbush. 225 KB vs 20 KB and 280 KB.

  • this excludes the cost of the actual CollisionBoxes that are inserted into the tree, which is way higher in the rbush version. But the rbush version could possibly be rewritten to use the typed array collision boxes which could reduce that cost
  • this is mostly the memory cost of creating a Int32 subarray for each cell (112 bytes per cell). I'll look into optimizing this for cases where no or few cells are actually used (like the geojson source that is part of the benchmark)

Summary:

  • lower memory usage overall
  • much lower usage for feature indexes
  • possibly high and possibly lower usage for collision indexes: it's hard to separate from other optimizations

@ansis
Copy link
Contributor Author

ansis commented Mar 8, 2016

Buffer does a whole bunch of extra calculations like scaling values and it doesn't provide an interface for reading values. I think it might make sense to wait until dds settles down before trying to merge the two.

  • how stateless workers can be used to write better integration tests for mapbox-gl-js (the tests can be another PR but lets think about the architecture that enables those tests now)

How would stateless workers help us write better integration tests?

  • how we can codify the performance benchmarks used in this PR into something reusable

Yes, I definitely want to write reusable benchmarks. Do we want to start using a framework for this or continue writing it ourselves (like with the buffer benchmark)?

@ansis
Copy link
Contributor Author

ansis commented Mar 8, 2016

ceb13b8 reduces Grid memory usage from 225 KB to 85 KB for the z12 case and from 280 KB to 169 KB for the z17 case.

@mourner
Copy link
Member

mourner commented Mar 8, 2016

@ansis fantastic! I'm all 👍 on the changes then. I'll review the code in more detail.

@lucaswoj
Copy link
Contributor

lucaswoj commented Mar 8, 2016

Buffer does a whole bunch of extra calculations like scaling values

That's not correct. Those "extra calculations" are done in Bucket. Buffer is designed to be a lightweight generic StructArray implementation.

and it doesn't provide an interface for reading values.

It has a Buffer#get method for reading values. The docs say that it is "only used for debugging" as an indication that it is dead code, not as an indication of any technical limitations.

I think it might make sense to wait until dds settles down before trying to merge the two.

The Buffer class will not change in any future dds work.


How would stateless workers help us write better integration tests?

Out of scope for this PR. I'm excited about the implications of a simpler interface for writing better integration tests through skipWorker-like functionality #1979.

test-suite is slow and limited in what it can test. #2223 is a perfect example of a bug that we won't be able to write regression tests for and maybe would've been caught if we had better integration tests.


Yes, I definitely want to write reusable benchmarks. Do we want to start using a framework for this or continue writing it ourselves (like with the buffer benchmark)?

I'm going to design a simple benchmark interface specification today. It'll be similar to how the buffer benchmark works.

@ansis
Copy link
Contributor Author

ansis commented Mar 8, 2016

That's not correct. Those "extra calculations" are done in Bucket. Buffer is designed to be a lightweight generic StructArray implementation.

Yeah, you're right. I confused the two. I'll take a closer look at this today.

The Buffer class will not change in any future dds work.

Great!

I'm going to design a simple benchmark interface specification today. It'll be similar to how the buffer benchmark works.

Is it worth writing our own? Should we just use something like https://github.com/bestiejs/benchmark.js/ ?

@lucaswoj
Copy link
Contributor

lucaswoj commented Mar 8, 2016

benchmark interface specification pushed to #2231 😄

@ansis ansis force-pushed the query-rendered-features-continued branch from ceb13b8 to 806a77b Compare March 9, 2016 01:30
@ansis ansis force-pushed the query-rendered-features-continued branch from 806a77b to ca23cca Compare March 9, 2016 02:25
@ansis
Copy link
Contributor Author

ansis commented Mar 9, 2016

Rebased on master. Latest commit used to be 3fc75bd3127436715d0ec2e88d52aade617ea526.

@ansis
Copy link
Contributor Author

ansis commented Mar 14, 2016

d40a059 merges the similar parts of Buffer and StructArray. After the changes, Buffer is a thing that turns a StructArray into a WebGL buffer. @lucaswoj can you review? (I still need to update tests).

@ansis ansis force-pushed the query-rendered-features-continued branch 2 times, most recently from e7b49f1 to 3530bb2 Compare March 15, 2016 02:13
@lucaswoj
Copy link
Contributor

A convention I've been using, and have found fruitful:

  • object constructors take a single options argument
  • serialize methods output an object compatible with the constructor options argument

@ansis ansis force-pushed the query-rendered-features-continued branch from 1ac623a to 1d8b595 Compare March 16, 2016 01:32
@ansis
Copy link
Contributor Author

ansis commented Mar 16, 2016

A convention I've been using, and have found fruitful
...

switched in fe8dad8


and then I squashed all those commits that fixed things from d40a059

@ansis ansis force-pushed the query-rendered-features-continued branch 2 times, most recently from 21dddbc to 4a48755 Compare March 16, 2016 23:41
@ansis ansis force-pushed the query-rendered-features-continued branch from 6e8bcef to 056fbc0 Compare March 24, 2016 18:49
@ansis ansis merged commit 056fbc0 into features-at-symbols Mar 24, 2016
@ansis ansis deleted the query-rendered-features-continued branch March 24, 2016 19:11
@samanpwbb
Copy link
Contributor

🎉

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

7 participants