-
-
Notifications
You must be signed in to change notification settings - Fork 366
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Fix dynspread not to count already-spread pixels and respect array bnds #1001
Conversation
Nice! My main question I have is if you looked at possible performance implications?
😬 |
I have not done any benchmarking, but I wasn't able to spot performance differences in my informal testing. It gets slower as the image size increases and as max_px increases (but only for images sparse enough to approach max_px). |
Very nice! I'll give this PR a go tomorrow morning and review. |
I've pushed a version of dynspread that defines density for a given radius as "fraction of non-empty pixels that have a neighbor within the radius". This is the first time dynspread has had a consistent, accurate, and describable interpretation, rather than being an opaque metric whose threshold can only be set empirically. It also seems to work very well for points plots, enlarging the spread just enough in the various points cases so that points become visible up to the However, it no longer works at all for line plots, because by construction all points along the line always have an immediate neighbor, and so no spreading is ever applied. I'm still pondering whether there is a meaningful dynspread metric that (a) works using pixels alone, as it does now, yet (b) appropriately spreads line plots with low overlap while not spreading collections of many overlapping lines. |
I've always felt that a single dynamic spreading approach for multiple glyphs (e.g points versus lines) would have such tradeoffs. Imho what lines really need is a dynamic thickness (with antialiasing) instead of a convolution type approach. At the holoviews level we could know which glyph is being rasterized and choose different code paths that way. At the datashader level it would be up to the user to choose a dynspread approach appropriate to what is being rasterized. Of course, it would be even better If you can think of a single, easy-to-explain approach that works for both points and lines! I also think that the work in this PR to improve dynspread behavior would be good to make available, even if it only works for points. |
To-do items: Ok, after discussion with @jlstevens and @philippjfr, we agreed:
To-do list before merging: (Jim)
To-do after merging:
To-do after releasing datashader 0.13:
|
Tests are passing on python 3.6, but failing on later versions because of holoviz/spatialpandas#68 and possibly other issues with related libraries. Merging this now, but still need to address the downstream issues. |
Fixes #1000. As described in #1000, dynspread was not working as documented or designed, because it was iteratively measuring the density of already-spread pixels, which produced large dense regions that were indistinguishable from dense regions of datapoints. Also, it turns out, the density calculations were reading outside the bounds of the array, which is dangerous and likely to give incorrect readings for density, even though it does not involve writing outside the array.
Fixed both problems by (1) respecting the bounds of the array when computing density, and (2) iteratively measuring density using an increasing neighborhood (as before) but no longer with actual spreading for each interative step (unlike before). I.e., instead of spreading then checking to see if the density metric is exceeded, simply tries increasingly large values of a neighborhood radius to count how many pixels have neighbors within this radius, and stopping when that normalized number reaches the specified threshold. The neighborhood search is now twice as large as the spreading amount being tested, since unlike actual spreading each datapoint has to check distance to the neighboring datapoint's center, not edge, but it no longer has to do actual spreading, so the speed may not decrease much.
In any case, the behavior is now vastly more appropriate, and is approximately how I first envisioned dynspread working, smoothly switching to larger point sizes (even for large
max_px
values of e.g. 100) on zooming in. The default threshold of 0.5 actually still seems about right to me, so no need to update that. Themax_px
default of 3 now seems low; 5 seems more visible and clear once zooming in. But I'll leave the default value unchanged here since really it's the default value in the plotting library (e.g. HoloViews hv.operation.datashader.dynspread) that affects interactive use.Here's an example using the datashader_dashboard, with
max_px=10
(fairly big!):DynspreadZoom.mp4