-
-
Notifications
You must be signed in to change notification settings - Fork 366
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Implementation of first and last reduction #1093
Conversation
That's super helpful, thanks! I approve adding these to Datashader and at a quick glance the code looks good, but I only had a minute to look at it. @ianthomas23 , can you look a little closer just in case? |
Codecov Report
@@ Coverage Diff @@
## master #1093 +/- ##
==========================================
+ Coverage 83.37% 83.50% +0.13%
==========================================
Files 34 34
Lines 7495 7512 +17
==========================================
+ Hits 6249 6273 +24
+ Misses 1246 1239 -7
Continue to review full report at Codecov.
|
Thank you @jbednar for the reply! Glad to contribute. Please let me know if further changes are needed. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is looking good so far, I've left a few review comments.
The NotImplementedError
needs the mention of dask because the _combine
function is called for dask DataFrames
to combine the results of the multiple dask partitions. We don't need to support this method of working because we will be unable to decide which was the overall first
/last
, so we just need to explicitly state that use of dask is not supported here.
We also need a test here, which will remove the Code Coverage complaints. I'd probably opt for a new test in datashader/tests/test_pandas.py
using a small (5x5?) canvas and a single line with x
, y
and z
values that crosses itself. Then use canvas.lines(..., agg=ds.first("z"), ...)
and check the return against a known result, then do the same for ds.last("z")
. If this isn't enough information I can help more!
Thank you @ianthomas23 for the review! Sorry for the late response. Your review comments and suggestions for test implementation were really helpful! Thanks! Hope I understood corectly. Please let me know if further changes are needed. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is very nearly there. The tests are very clear, but you could make them simpler by only comparing the numpy arrays agg.data
and sol
rather than the xarray DataArrays. Then the axis
and lincoords
aren't needed at the top of the functions. If you can change this in both test functions and the extra couple of whitespace characters that I just found, it will be ready to merge.
add whitespace characters Co-authored-by: Ian Thomas <ianthomas23@gmail.com>
compare directly the numpy arrays in testing Co-authored-by: Ian Thomas <ianthomas23@gmail.com>
add whitespace character Co-authored-by: Ian Thomas <ianthomas23@gmail.com>
…r-phd into first_last_reduction * 'first_last_reduction' of github.com:tselea/my-datashader-phd: Update datashader/reductions.py
I’ve updated the code based on your last review. Please let me know if further changes should be implemented. I’m grateful for you taking the time to review the code and for all your suggestions. Thanks, @ianthomas23! |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks @tselea for your first PR. We hope to see you again soon!
For my current work, it is useful for me to make use of the first([column]) and last([column]) reductions, which at the moment are implemented only for rasters.
For both reductions, I've initialised a numpy array with numpy.nan values ( create() method). The append() method is implemented as follows:
For both reductions, the finalize() method wraps the numpy array to a xarray DataArray (similar to the count() reduction).
Please let me know if you think this may be a good fit for the project.