Implementation of first and last reduction #1093

tselea · 2022-06-12T08:03:06Z

For my current work, I would find it useful to have the first([column]) and last([column]) reductions, which at the moment are implemented only for rasters.

For both reductions, I've initialised a numpy array filled with numpy.nan values (the create() method). The append() method is implemented as follows:

first([column]): check if field is not null and if agg[y,x] is null (has not yet been filled), then set agg[y,x] = field
last([column]): check if field is not null, then set agg[y,x] = field

For both reductions, the finalize() method wraps the numpy array in an xarray DataArray (similar to the count() reduction).
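The create() and append() steps described above can be sketched with plain NumPy. This is only a minimal illustration, not the code from this PR: the function names `create`, `append_first`, and `append_last` and the list of `(y, x, field)` visits are assumptions made for the example, and the finalize() step (wrapping the array in an xarray DataArray) is left out.

```python
import numpy as np


def create(shape):
    """Return an aggregate array in which every pixel starts as NaN (unfilled)."""
    return np.full(shape, np.nan, dtype=np.float64)


def append_first(agg, y, x, field):
    """Keep only the first non-null value that lands on pixel (y, x)."""
    if not np.isnan(field) and np.isnan(agg[y, x]):
        agg[y, x] = field


def append_last(agg, y, x, field):
    """Overwrite pixel (y, x) with every non-null value, so the last one wins."""
    if not np.isnan(field):
        agg[y, x] = field


# Hypothetical sequence of (y, x, field) visits: pixel (1, 1) is hit twice,
# and pixel (0, 2) only ever receives a null value.
visits = [(0, 0, 1.0), (1, 1, 2.0), (0, 2, np.nan), (1, 1, 3.0)]

first_agg = create((2, 3))
last_agg = create((2, 3))
for y, x, field in visits:
    append_first(first_agg, y, x, field)
    append_last(last_agg, y, x, field)
```

In this sketch the pixel that is visited twice ends up holding the earlier value in `first_agg` and the later value in `last_agg`, while pixels that never receive a non-null value stay NaN.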

Please let me know if you think this may be a good fit for the project.

jbednar · 2022-06-12T15:53:21Z

That's super helpful, thanks! I approve adding these to Datashader and at a quick glance the code looks good, but I only had a minute to look at it. @ianthomas23, can you look a little closer just in case?

codecov · 2022-06-12T16:15:20Z

Codecov Report

Merging #1093 (cf4bd43) into master (a795a9a) will increase coverage by 0.13%.
The diff coverage is 87.50%.

@@            Coverage Diff             @@
##           master    #1093      +/-   ##
==========================================
+ Coverage   83.37%   83.50%   +0.13%     
==========================================
  Files          34       34              
  Lines        7495     7512      +17     
==========================================
+ Hits         6249     6273      +24     
+ Misses       1246     1239       -7

Impacted Files	Coverage Δ
datashader/reductions.py	`84.67% <87.50%> (+1.18%)`	⬆️
datashader/transfer_functions/__init__.py	`86.90% <0.00%> (+0.29%)`	⬆️
datashader/glyphs/trimesh.py	`92.36% <0.00%> (+0.50%)`	⬆️

Legend: Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Last update a795a9a...cf4bd43.

tselea · 2022-06-12T17:06:13Z

Thank you @jbednar for the reply! Glad to contribute. Please let me know if further changes are needed.

ianthomas23

This is looking good so far. I've left a few review comments.

The NotImplementedError needs the mention of dask because the _combine function is called for dask DataFrames to combine the results of the multiple dask partitions. We don't need to support this method of working because we will be unable to decide which was the overall first/last, so we just need to explicitly state that use of dask is not supported here.
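A hedged sketch of the kind of explicit failure described here is shown below. The class name `FirstSketch` and the exact message wording are assumptions for illustration only; they are not the text used in the PR.

```python
class FirstSketch:
    """Hypothetical stand-in for a reduction that cannot merge dask partitions."""

    @staticmethod
    def _combine(aggs):
        # `aggs` would hold one aggregate per dask partition. There is no way
        # to decide from them which value was the overall first, so the
        # method fails loudly and names dask in the message.
        raise NotImplementedError("first is not implemented for dask DataFrames")


# Calling the method with two placeholder partition results raises the error.
try:
    FirstSketch._combine([object(), object()])
except NotImplementedError as err:
    message = str(err)
```

Naming dask in the message tells users why the call failed rather than leaving them with a bare NotImplementedError.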

We also need a test here, which will remove the Code Coverage complaints. I'd probably opt for a new test in datashader/tests/test_pandas.py using a small (5x5?) canvas and a single line with x, y and z values that crosses itself. Then use canvas.line(..., agg=ds.first("z"), ...) and check the return against a known result, then do the same for ds.last("z"). If this isn't enough information I can help more!

tselea · 2022-06-15T07:39:13Z

Thank you @ianthomas23 for the review! Sorry for the late response.

Your review comments and suggestions for the test implementation were really helpful! Thanks! I hope I understood correctly.
The example of a line that crosses itself, which I used for testing, is depicted below.

Please let me know if further changes are needed.

ianthomas23

This is very nearly there. The tests are very clear, but you could make them simpler by only comparing the numpy arrays agg.data and sol rather than the xarray DataArrays. Then the axis and lincoords aren't needed at the top of the functions. If you can change this in both test functions and fix the couple of whitespace issues that I just found, it will be ready to merge.
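As a rough illustration of comparing the numpy arrays directly, the sketch below uses two small hand-written arrays. `agg_data` stands in for agg.data and `sol` for the expected result; both are hypothetical values, not the arrays from the actual tests.

```python
import numpy as np

nan = np.nan

# Hypothetical stand-ins for agg.data and the expected solution array.
agg_data = np.array([[1.0, nan], [nan, 4.0]])
sol = np.array([[1.0, nan], [nan, 4.0]])

# assert_array_equal treats NaNs in matching positions as equal,
# so unfilled pixels do not make the comparison fail.
np.testing.assert_array_equal(agg_data, sol)

# array_equal needs equal_nan=True to behave the same way.
matches = np.array_equal(agg_data, sol, equal_nan=True)

# A plain elementwise == comparison reports a mismatch because NaN != NaN.
naive = bool((agg_data == sol).all())
```

Since the first and last aggregates start out filled with NaN, a NaN-aware comparison matters here: the elementwise `==` check would fail on every unfilled pixel.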

tselea · 2022-06-18T08:49:18Z

I’ve updated the code based on your last review. Please let me know if further changes should be implemented.

I’m grateful for you taking the time to review the code and for all your suggestions. Thanks, @ianthomas23!

ianthomas23

Thanks @tselea for your first PR. We hope to see you again soon!

tselea added 4 commits June 12, 2022 10:14

implement append, create and finalize methods for first() reduction

19f9d9e

implement append, create, finalize for last() reduction

c995d94

add numba.jit() decorator

cb6bbd1

add cuda=False parameter to last finalize() method

385eaa5

ianthomas23 reviewed Jun 13, 2022

datashader/reductions.py (outdated)

ianthomas23 requested changes Jun 13, 2022

datashader/reductions.py (outdated)

datashader/reductions.py

datashader/reductions.py (outdated)

tselea added 3 commits June 14, 2022 15:40

fix missing whitespace after ","

75eafe7

update NotImplementedError message for _combine() method in first/las…

6f54b0d

…t reduction

added tests for first/last reduction on pandas DF

59e26e4

Merge branch 'master' into first_last_reduction

c57e45a

tselea requested a review from ianthomas23 June 17, 2022 15:37

ianthomas23 requested changes Jun 17, 2022

datashader/reductions.py (outdated)

datashader/reductions.py (outdated)

datashader/tests/test_pandas.py (outdated)

tselea and others added 5 commits June 18, 2022 10:51

Update datashader/reductions.py

f002add

add whitespace characters Co-authored-by: Ian Thomas <ianthomas23@gmail.com>

Update datashader/tests/test_pandas.py

7a9b780

compare directly the numpy arrays in testing Co-authored-by: Ian Thomas <ianthomas23@gmail.com>

Update datashader/reductions.py

dd98b58

add whitespace character Co-authored-by: Ian Thomas <ianthomas23@gmail.com>

compare the numpy arrays directly for first() and last() reduction's …

e110bf8

…tests

Merge branch 'first_last_reduction' of github.com:tselea/my-datashade…

cf4bd43

…r-phd into first_last_reduction * 'first_last_reduction' of github.com:tselea/my-datashader-phd: Update datashader/reductions.py

tselea requested a review from ianthomas23 June 18, 2022 08:49

ianthomas23 approved these changes Jun 18, 2022

ianthomas23 merged commit d5b5635 into holoviz:master Jun 18, 2022
