Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Implementation of first and last reduction #1093

Merged
merged 13 commits into from
Jun 18, 2022

Conversation

tselea
Copy link
Contributor

@tselea tselea commented Jun 12, 2022

For my current work, it is useful for me to make use of the first([column]) and last([column]) reductions, which at the moment are implemented only for rasters.

For both reductions, I've initialised a numpy array with numpy.nan values ( create() method). The append() method is implemented as follows:

  • first([column]): check if field is not null and if agg[y,x] is null (has not yet been filled), then set agg[y,x] = field
  • last([column]): check if field is not null, then set agg[y,x] = field

For both reductions, the finalize() method wraps the numpy array to a xarray DataArray (similar to the count() reduction).

Please let me know if you think this may be a good fit for the project.

@jbednar
Copy link
Member

jbednar commented Jun 12, 2022

That's super helpful, thanks! I approve adding these to Datashader and at a quick glance the code looks good, but I only had a minute to look at it. @ianthomas23 , can you look a little closer just in case?

@codecov
Copy link

codecov bot commented Jun 12, 2022

Codecov Report

Merging #1093 (cf4bd43) into master (a795a9a) will increase coverage by 0.13%.
The diff coverage is 87.50%.

@@            Coverage Diff             @@
##           master    #1093      +/-   ##
==========================================
+ Coverage   83.37%   83.50%   +0.13%     
==========================================
  Files          34       34              
  Lines        7495     7512      +17     
==========================================
+ Hits         6249     6273      +24     
+ Misses       1246     1239       -7     
Impacted Files Coverage Δ
datashader/reductions.py 84.67% <87.50%> (+1.18%) ⬆️
datashader/transfer_functions/__init__.py 86.90% <0.00%> (+0.29%) ⬆️
datashader/glyphs/trimesh.py 92.36% <0.00%> (+0.50%) ⬆️

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update a795a9a...cf4bd43. Read the comment docs.

@tselea
Copy link
Contributor Author

tselea commented Jun 12, 2022

Thank you @jbednar for the reply! Glad to contribute. Please let me know if further changes are needed.

Copy link
Member

@ianthomas23 ianthomas23 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is looking good so far, I've left a few review comments.

The NotImplementedError needs the mention of dask because the _combine function is called for dask DataFrames to combine the results of the multiple dask partitions. We don't need to support this method of working because we will be unable to decide which was the overall first/last, so we just need to explicitly state that use of dask is not supported here.

We also need a test here, which will remove the Code Coverage complaints. I'd probably opt for a new test in datashader/tests/test_pandas.py using a small (5x5?) canvas and a single line with x, y and z values that crosses itself. Then use canvas.lines(..., agg=ds.first("z"), ...) and check the return against a known result, then do the same for ds.last("z"). If this isn't enough information I can help more!

datashader/reductions.py Outdated Show resolved Hide resolved
datashader/reductions.py Show resolved Hide resolved
datashader/reductions.py Outdated Show resolved Hide resolved
@tselea
Copy link
Contributor Author

tselea commented Jun 15, 2022

Thank you @ianthomas23 for the review! Sorry for the late response.

Your review comments and suggestions for test implementation were really helpful! Thanks! Hope I understood corectly.
The line exemple that crosses itself that I used for testing is depicted below.
line_example

Please let me know if further changes are needed.

@tselea tselea requested a review from ianthomas23 June 17, 2022 15:37
Copy link
Member

@ianthomas23 ianthomas23 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is very nearly there. The tests are very clear, but you could make them simpler by only comparing the numpy arrays agg.data and sol rather than the xarray DataArrays. Then the axis and lincoords aren't needed at the top of the functions. If you can change this in both test functions and the extra couple of whitespace characters that I just found, it will be ready to merge.

datashader/reductions.py Outdated Show resolved Hide resolved
datashader/reductions.py Outdated Show resolved Hide resolved
datashader/tests/test_pandas.py Outdated Show resolved Hide resolved
tselea and others added 5 commits June 18, 2022 10:51
add whitespace characters

Co-authored-by: Ian Thomas <ianthomas23@gmail.com>
compare directly the numpy arrays in testing

Co-authored-by: Ian Thomas <ianthomas23@gmail.com>
add whitespace character

Co-authored-by: Ian Thomas <ianthomas23@gmail.com>
…r-phd into first_last_reduction

* 'first_last_reduction' of github.com:tselea/my-datashader-phd:
  Update datashader/reductions.py
@tselea
Copy link
Contributor Author

tselea commented Jun 18, 2022

I’ve updated the code based on your last review. Please let me know if further changes should be implemented.

I’m grateful for you taking the time to review the code and for all your suggestions. Thanks, @ianthomas23!

@tselea tselea requested a review from ianthomas23 June 18, 2022 08:49
Copy link
Member

@ianthomas23 ianthomas23 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks @tselea for your first PR. We hope to see you again soon!

@ianthomas23 ianthomas23 merged commit d5b5635 into holoviz:master Jun 18, 2022
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants