Performance degradation with large point clouds and show condition on M1 Mac #11270

Open
ulrichson opened this issue May 5, 2023 · 22 comments

@ulrichson

Hello!

I observed a performance degradation when rendering large point clouds that use the conditional show attribute. I'm certain that the performance used to be much better: in one project I use this technique to filter on attributes from the tileset's batch table, and now the framerate drops a lot when the filters are applied. Maybe it only affects macOS / arm architecture; on a Linux / Intel workstation with an NVIDIA GPU it works fine.

A quick look at the performance monitor in the dev tools showed that readPixels, which is called for scene picking, consumes a lot of time. Is there a way to turn off picking for Cesium3DTileset? That might help in my case.

I can reproduce this in a sandcastle with the Melbourne point cloud, and you can see the framerate drops from ~100 fps to ~10 fps when zooming in:

Screen.Recording.2023-05-05.at.12.22.49.720p.mov

Sandcastle example:
https://sandcastle.cesium.com/#c=hVNha9swEP0rRxiNA0FO10K71AkbKYyMjkBT9slfFPkai8m6IJ2TuiX/vXJsZ0nZmMFg3b33dO9JjmP47qRlmKHXZfFjCVIp9B6YoKLSgSYL0ntkn1pF1jNsNe7QwQQs7lqa+HWoRWlPHdYzsiy1RZf2BnepTW0kfWUVRAOYTOEttQDsquYDWkGBL4w2i1rFptgsru6ftEE/t36Disn91C/a1sI1uxmKawBymErupO7siDM+snh2VMzJfqsNzbPo+urLzW0n1EqIDWnLM0Nltsxlpu06iLaThkdymLKUHHIZBxMlDo8trPCeCnzQ65wD7T/tJTu0a87HcPlPzGPYv/R/EPu7s8S8Qoti43ShWW/RC5llUWujc9XE0RJeiYon+gjpjHuuDJ6f62l+y7odHZM4xLTUrziG66MBRYbcGPpuvYo+j0bD9h30jwif0y4APr3NFg+Lx71wMIWRuIGLCzipJaF2229NN3PuQUlWOUToHLlBdyT18ZNBYWjddhp0aveDqP7uDXvJwdm0rn/VxYYcQ+lMJETMWGyMZPTxqlS/QwbK+5qUxB0lyfQWdDb5y9UGZcKfETrPpTF1EmlvmsQBf0YzdLhEiy06I6sAqcdI8svpQ9MQQiRxWNabfuQykVlJd6L7Dg

Browser: Google Chrome Version 113.0.5672.63 (Official Build) (arm64)

I see similar behavior in Safari and Firefox.

Operating System: macOS 13.3.1 (a) on M1 Max

@ggetz
Contributor

ggetz commented May 11, 2023

Thanks for the report @ulrichson!

I'm certain that the performance was much better

Do you mean in a previous release, or just when an M1 machine is not being used?

We have seen some rendering problems on M1 machines, but they tend to be rendering artifacts rather than degraded performance.

In general, readPixels is an expensive operation, and we'd (hopefully soon) like to optimize picking to avoid it as much as possible.

@ulrichson
Author

@ggetz Glad to hear, thanks!

I mean with a previous Cesium release. To verify I just ran an older version of my project where Cesium 1.91.0 was used and here the performance is good on M1. With 1.105.0 the performance is worse. I don't know between which versions the performance drop happened. If it helps, I can try to figure it out.
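One way to narrow down the release is a binary search over the versions between the known-good and known-bad ones. The following is a minimal sketch, assuming one release per minor number from 1.91 to 1.105 and that every release after the first slow one is also slow. `firstSlowRelease` and the `isSlow` callback are hypothetical names; `isSlow` stands in for a manual check such as loading the test scene with that release and watching the frame rate.

```javascript
// Hypothetical helper: return the first entry of an ordered version list for
// which isSlow(version) is true, or null if no entry is slow. Assumes that
// once a release is slow, all later releases are slow too.
function firstSlowRelease(versions, isSlow) {
  let lo = 0;
  let hi = versions.length - 1;
  let firstSlow = null;
  while (lo <= hi) {
    const mid = Math.floor((lo + hi) / 2);
    if (isSlow(versions[mid])) {
      // Slow here: remember it and keep looking among earlier releases.
      firstSlow = versions[mid];
      hi = mid - 1;
    } else {
      // Fast here: the regression must be in a later release.
      lo = mid + 1;
    }
  }
  return firstSlow;
}

// Candidate releases between the two versions named above (assumed list).
const releases = [];
for (let minor = 91; minor <= 105; minor++) {
  releases.push(`1.${minor}`);
}
```

With 15 candidates this needs at most four manual checks instead of fifteen.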

Is there a way to disable picking for Cesium3DTileset (assuming it then wouldn't call the expensive readPixels)? I wouldn't use it for point clouds anyway.

@ggetz
Contributor

ggetz commented May 12, 2023

I mean with a previous Cesium release. To verify I just ran an older version of my project where Cesium 1.91.0 was used and here the performance is good on M1. I don't know between which versions the performance drop happened. If it helps, I can try to figure it out.

Thanks for clarifying! And we'd definitely appreciate the help. 1.97 would be the first release I'd suspect that this began occurring. There was a major 3D Tiles and model refactor that went out as a part of that release.

Is there a way to disable picking for Cesium3DTileset (assuming it then wouldn't call the expensive readPixels)? I wouldn't use it for point clouds anyway.

Not through the public API. If you're willing to modify the source code, it's possible to skip the 3D Tiles pass when picking.

@ulrichson
Author

@ggetz I re-tested with 1.97 and I can confirm that the performance drop was introduced with this version. The previous release had better performance.

Can you please give me a hint where in the source code this change would be required?

@ggetz
Contributor

ggetz commented May 19, 2023

There was a fairly large refactor that went in with that change. @j9liu would you be able to recommend a place to start looking regarding changes from ModelExperimental that affect the performance of picking point clouds?

@ulrichson
Author

@j9liu @ggetz Any updates on how to disable point cloud picking in the Cesium code? Thanks!

@j9liu
Contributor

j9liu commented Jul 5, 2023

Hey @ulrichson,

Sorry about the delay. This completely slipped under my radar. I'm happy to take a look -- I should have a moment by the end of this week.

@ulrichson
Author

Thanks @j9liu 😊

@j9liu
Contributor

j9liu commented Jul 6, 2023

Hi @ulrichson,

I don't have an M1 Mac, and I can't reproduce this behavior on Windows. But I'm curious to see the debug profile that demonstrates the difference with / without this condition. readPixels has always been slow, so I'd want to know why the readPixels call is slower with the show condition on (if that's what you're seeing). It'd be different if the time leading up to the readPixels is slower -- I imagine the buildDrawCommands in Model could be slowing things down, since it's called every time a style is applied.

In any case, there's an allowPicking option at the individual Model level. Even though there's no equivalent on Cesium3DTileset, the tileset itself uses Model to render tile content. You can experiment with setting it to false in the constructor and seeing if it helps. Let us know how it goes so we can try to troubleshoot from there.

@ulrichson
Author

Hi @j9liu, thanks, I'll give it a try!

Here's a snapshot of the profile on my M1 machine when the show is set as in the sandcastle example above:

Hello World - Cesium Sandcastle 2023-07-07 16-52-31

@ulrichson
Author

Update: maybe it's not related to readPixels (and picking) as I thought. The project I'm working on is using the show attribute, and here readPixels doesn't seem to be an issue. Did something possibly change in the shaders that could cause this?

@j9liu
Contributor

j9liu commented Jul 7, 2023

In 1.97 we switched to a completely different Model implementation, so yes. But I have a hard time understanding what in the new architecture could be slowing things down.

Here's the state of CesiumJS in the 1.96 release. You can see how PointCloud.js was used to render point cloud content, instead of Model. Here is where the point cloud shaders were created.

In the new Model architecture, the shaders of a Model are built incrementally using various "pipeline stages". So the pipeline stage responsible for adding point cloud styling code is PointCloudStylingPipelineStage.js. But you can see at line 290, the show condition function is derived the same way as it was in PointCloud.js.

So that's why I'm confused. If the new Model architecture was slower as a whole, I would understand, because it takes more time to construct / reconstruct shaders for a model, and that can get exacerbated by picking. But I don't see how simply adding a show styling condition makes everything slower. If you're able to save the performance profiles you gather in Chrome (both with the show condition and without), and attach them, I can look at them more closely.

@ulrichson
Author

Sure, thanks for looking into it!

Here's the profile with show: trace-with-show.json.zip

And without: trace-without-show.json.zip

@j9liu
Contributor

j9liu commented Jul 17, 2023

Thanks @ulrichson ! I'll try to take a look by the end of this week :)

@j9liu
Contributor

j9liu commented Jul 21, 2023

@ulrichson

Thanks for your patience. I took a look at the profiles and could also see that the readPixels function was taking 3x as much time with the show condition, vs. without.

I'm trying to think of why this could be. Picking actually involves re-rendering the scene in a small area, then sampling that rectangle for an object. So technically, with picking, the scene (and all of its models) is updated and drawn twice. This is definitely not the most efficient method for picking, but perhaps picking is exacerbated by things that slow down the render loop itself.
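The sampling step described here can be sketched on the CPU side. This is a simplified illustration, not CesiumJS code: it assumes each pickable object was drawn into the small rectangle with a unique RGBA color, that a fully zero pixel means "nothing here", and that the pixel closest to the rectangle's center wins. `pickFromRectangle` and `objectsByColorKey` are hypothetical names.

```javascript
// Simplified sketch: pick an object from a small RGBA rectangle such as the
// one a pixel readback returns. `pixels` holds 4 bytes per pixel, row by row.
function pickFromRectangle(pixels, width, height, objectsByColorKey) {
  const centerX = Math.floor(width / 2);
  const centerY = Math.floor(height / 2);
  let best;
  let bestDistance = Infinity;
  for (let y = 0; y < height; y++) {
    for (let x = 0; x < width; x++) {
      const i = 4 * (y * width + x);
      // Pack the four color bytes into one unsigned 32-bit key.
      const key =
        ((pixels[i] << 24) |
          (pixels[i + 1] << 16) |
          (pixels[i + 2] << 8) |
          pixels[i + 3]) >>>
        0;
      if (key === 0) {
        continue; // background, nothing was drawn here
      }
      const object = objectsByColorKey.get(key);
      const distance = (x - centerX) ** 2 + (y - centerY) ** 2;
      if (object !== undefined && distance < bestDistance) {
        best = object;
        bestDistance = distance;
      }
    }
  }
  return best; // undefined when nothing pickable is in the rectangle
}
```

The loop itself is cheap for a rectangle of a few pixels. The cost discussed in this thread would sit in the extra render pass and in the readback that produces `pixels`, not in this lookup.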

I've looked at these parts of the profile, and it does seem like the biggest offender is PickFrameBuffer.end (which then calls readPixels), but the time that render takes with the show condition is strangely less than the time without the condition. Granted, these profiles could vary depending on how much the mouse is actually wiggling across the screen during the recording. But it's hard to tell why that's happening and what's actually going on.

[Screenshots: two rows of profile captures, each comparing "Without condition" with "With condition"]

Unfortunately, in the profiles I can't go any level deeper than "Animation Frame Fired" -- I can't see if there's any bottleneck in particular when the scene renders. But the GPU performance is definitely worse with the show condition. It looks like the longest time it takes on the GPU without the show condition is 6ms, where the GPU sometimes goes above 100ms with the show condition.
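To relate these GPU timings to the frame rates in the original report, the conversion between frame time and frame rate can be written out. This is plain arithmetic; the helper names are made up for illustration, and GPU time is only one part of a frame, so the results are upper bounds.

```javascript
// Time available per frame, in milliseconds, at a given frame rate.
function frameTimeMs(fps) {
  return 1000 / fps;
}

// Highest frame rate possible if each frame takes `ms` milliseconds.
function maxFps(ms) {
  return 1000 / ms;
}

const budgetAt100Fps = frameTimeMs(100); // 10 ms per frame
const capAt100Ms = maxFps(100); // 10 fps
const capAt6Ms = maxFps(6); // roughly 167 fps
```

A frame that spends 100 ms on the GPU cannot run faster than 10 fps, which is in line with the reported drop from ~100 fps to ~10 fps.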

If I could reproduce this myself, I would go through the Model architecture and try to leave out extraneous parts of the pipeline, like the PickingPipelineStage or the PointCloudStylingPipelineStage, and see what happens. This part of the shader code is what is responsible for the show condition in point clouds.

I also wonder if it makes a difference if the show condition is always set to true. In other words, is it the presence of the show condition that slows things down? Or is it the evaluation of the condition itself?

Update: maybe it's not related to readPixels (and picking) as I thought. The project I'm working on is using the show attribute, and here readPixels doesn't seem to be an issue. Did something possibly change in the shaders that could cause this?

Does this imply that the show attribute still causes issues, even with picking disabled for point clouds?

@ilyaly
Contributor

ilyaly commented Jul 25, 2023

We are experiencing the same performance issue, see #11196. Our tilesets are generated with Agisoft Metashape. Prior to version 1.97 there were no problems, but starting from 1.97 we observe the same behavior as reported in this issue. The only difference is that we do not apply any styling.

@ulrichson
Author

@j9liu Thanks! I tested with show: 'true' (and also an expression that always evaluates to true, i.e. show: '${COLOR}.r > -1 && ${COLOR}.r < 2'), and then there's no performance drop. Same observation as without the show.

Does this imply that the show attribute still causes issues, even with picking disabled for point clouds?

It seems so. I made the following change in packages/engine/Source/Scene/Model/ModelRuntimePrimitive.js:

if (model.allowPicking) {
  console.warn('Disabled PickingPipelineStage')
  // pipelineStages.push(PickingPipelineStage);
}

With the disabled PickingPipelineStage the performance still drops - so it's really not related to picking.

I did another test and tried to fake the show behavior with a conditional color style, i.e. color: '${COLOR}.r > 0.7 && ${COLOR}.r < 0.8 ? rgb(200,200,200) : rgba(0,0,0,0)' (see Sandcastle example). Now, the performance doesn't drop, so I think it can be narrowed down to the show-behavior. Unfortunately, I can't use this workaround since the pointCloudShading doesn't work with transparent colors.
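The style variants compared in this thread can be collected in one place for repeatable testing. This is a sketch, not the Sandcastle code: the expressions are copied from the comments above, except the filtering show expression, which is an assumption that mirrors the condition of the color workaround. `reportedDrop` records what was reported here on the M1 machine and is not a new measurement. `applyVariant` is a hypothetical helper that takes the `Cesium` namespace as a parameter and uses `Cesium3DTileStyle` and the tileset's `style` property.

```javascript
// Note: these are ordinary strings, not template literals, so "${COLOR}" is
// passed through unchanged as a 3D Tiles styling variable.
const styleVariants = [
  { name: "show: 'true'", style: { show: "true" }, reportedDrop: false },
  {
    name: "show expression that is always true",
    style: { show: "${COLOR}.r > -1 && ${COLOR}.r < 2" },
    reportedDrop: false,
  },
  {
    name: "color workaround with transparent points",
    style: {
      color:
        "${COLOR}.r > 0.7 && ${COLOR}.r < 0.8 ? rgb(200,200,200) : rgba(0,0,0,0)",
    },
    reportedDrop: false,
  },
  {
    name: "show expression that filters points (assumed expression)",
    style: { show: "${COLOR}.r > 0.7 && ${COLOR}.r < 0.8" },
    reportedDrop: true,
  },
];

// Hypothetical helper: apply one variant to a tileset.
function applyVariant(Cesium, tileset, variant) {
  tileset.style = new Cesium.Cesium3DTileStyle(variant.style);
  return tileset.style;
}
```

In a Sandcastle, `applyVariant(Cesium, tileset, styleVariants[3])` would switch an already loaded tileset to the filtering variant, so the frame rate of each variant can be compared from the same camera position.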

@ulrichson
Author

Another observation: I checked if pointCloudShading together with show causes the performance drop - but that's not the case.

@aurivus-ph

I have a similar observation with a large performance hit on integrated Intel graphics (in particular Intel® UHD Graphics for 10th Gen Intel® Processors as in the i7-10510U). The issue does not show on the dedicated GPUs I have access to.

I have found a one-line fix that improves the performance a lot in my case:

In packages/engine/Source/Shaders/Model/CPUStylingStageVS.glsl in cpuStylingStage(), comment out the following line:

void cpuStylingStage(inout vec3 positionMC, inout SelectedFeature feature)
{
    float show = ceil(feature.color.a);
    //positionMC *= show; // this line causes the performance hit

    #if defined(HAS_SELECTED_FEATURE_ID_ATTRIBUTE) && !defined(HAS_CLASSIFICATION)
    filterByPassType(positionMC, feature.color);
    #endif
}

Note that you may need to re-build Cesium to re-generate the CPUStylingStageVS.js file. Otherwise the changes won't take effect.

To anyone who is able to reproduce this performance issue, please check if this improves the performance for you.

My theory is as follows:

  • the line sets positionMC to zero, in case it is hidden (i.e. show is zero)
    • I think this was intended to move it off-screen, skipping the rasterizer and making it invisible
    • however, this is the wrong place to do it: positionMC is in model coordinates (not screen space yet), so we just move the points to the origin of the tile / tileset, which is likely to still be in view
  • the "hidden" points all get rasterized in one position, causing significant overdraw, which causes the performance hit
  • the "hidden" points finally get discarded in the fragment shader

A proper fix for that would be to actually move the hidden points off-screen, i.e. setting gl_Position accordingly.
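The geometry behind this theory can be checked numerically. The sketch below is not Cesium code: the model-view and projection matrices, the point coordinates, and the off-screen clip position are all made-up values chosen for illustration. It shows that scaling a model-space position by zero collapses every hidden point onto the model origin, which here still projects inside the clip volume, while a clip-space position outside the volume would be clipped before rasterization.

```javascript
// Multiply a column-major 4x4 matrix by a 4-component vector.
function transform(m, v) {
  const out = [0, 0, 0, 0];
  for (let row = 0; row < 4; row++) {
    for (let col = 0; col < 4; col++) {
      out[row] += m[col * 4 + row] * v[col];
    }
  }
  return out;
}

function translation(tx, ty, tz) {
  return [1, 0, 0, 0, 0, 1, 0, 0, 0, 0, 1, 0, tx, ty, tz, 1];
}

// Standard OpenGL-style perspective projection, column-major.
function perspective(fovyRadians, aspect, near, far) {
  const f = 1 / Math.tan(fovyRadians / 2);
  return [
    f / aspect, 0, 0, 0,
    0, f, 0, 0,
    0, 0, (far + near) / (near - far), -1,
    0, 0, (2 * far * near) / (near - far), 0,
  ];
}

// A clip-space position is inside the view volume when -w <= x, y, z <= w.
function insideClipVolume([x, y, z, w]) {
  return w > 0 && Math.abs(x) <= w && Math.abs(y) <= w && Math.abs(z) <= w;
}

// Assumed setup: the tile origin sits 50 units in front of the camera.
const modelView = translation(0, 0, -50);
const projection = perspective(Math.PI / 3, 1, 1, 1000);

function toClip(positionMC) {
  return transform(projection, transform(modelView, [...positionMC, 1]));
}

// Mirrors `positionMC *= show` from the shader: show is 0 or 1.
function hideByScaling(positionMC, show) {
  return positionMC.map((c) => c * show);
}

const hiddenA = toClip(hideByScaling([10, 5, 0], 0));
const hiddenB = toClip(hideByScaling([3, 8, 2], 0));
// Arbitrary clip-space position outside the view volume (|x| > w).
const offscreen = [2, 2, 2, 1];

const collapsedStillVisible = insideClipVolume(hiddenA);
const offscreenIsClipped = !insideClipVolume(offscreen);
```

Under these assumptions `hiddenA` and `hiddenB` are identical and inside the clip volume, so both "hidden" points would be rasterized at the same screen position. Whether that overdraw explains the measured slowdown is the theory stated above, not something this sketch can confirm.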

@katSchmid

katSchmid commented Aug 17, 2023

We see this on all OS and hardware variants. I have a GUI that sets the shader dynamically. Even applying a new style with nothing but a point size keeps frame rates low. readPixels is also big in my profile when setting pixel colors.

Applying a new default style is a large performance hit on fps

@katSchmid

Any updates? We are now changing our data as a workaround, but we would love to get this back.

@jjrise

jjrise commented Sep 13, 2024

@katSchmid did you ever find any resolution to this?


7 participants