Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Implicit bounding box filtering by aggregate_polygon #101

Closed
jdries opened this issue Dec 10, 2019 · 8 comments
Closed

Implicit bounding box filtering by aggregate_polygon #101

jdries opened this issue Dec 10, 2019 · 8 comments
Labels
help wanted Extra attention is needed work in progress
Milestone

Comments

@jdries
Copy link
Contributor

jdries commented Dec 10, 2019

In the geopyspark backend, we currently use the bounding box of the geometry/feature collection specified in 'aggregate_polygon' as an alternative to 'filter_bbox' or spatial_extent in load_collection.

This is very convenient for the user, as he does not have to extract the bounds himself, which can be annoying, especially when working with large files, or when trying to write process graphs that take any file as input.

Is this something that we should clarify in the process definition, or can we leave it implicit? (Compatibility across backends would be affected if other backends choose not to support this.).

This unit test in our backend tries to show the type of process graphs that we want to support:
https://github.com/Open-EO/openeo-python-driver/blob/5c32f52363de96977745e6c9d43214e8086b4e70/tests/test_views_execute.py#L499

@m-mohr
Copy link
Member

m-mohr commented Dec 10, 2019

I guess the best way to find out would be to just make a PR and propose this change on the call on Thursday.

@m-mohr m-mohr added the help wanted Extra attention is needed label Dec 10, 2019
@lforesta
Copy link
Contributor

This is very convenient for the user, as he does not have to extract the bounds himself, which can be annoying, especially when working with large files, or when trying to write process graphs that take any file as input.

For this the user can use filter_polygon instead of filter_bbox, no?

Anyway, I also would implicitly filter data when using aggregate_polygon (we have not yet implemented this process)

@jdries
Copy link
Contributor Author

jdries commented Dec 12, 2019

Telco conclusion: agreed to improve documentation.
Some notes:

  • Explain the mechanism and conditions when a spatial filter can be 'pushed down'
  • Explain that backends might still throw an exception if the push-down behavior is not supported
  • Point user to 'filter_polygon' which can make his process graph more robust across different backends

@m-mohr
Copy link
Member

m-mohr commented Dec 13, 2019

What is "push-down behavior", @jdries? Seems like an internal optimization?! I don't think this is something a user would really understand or care about. It sounds reasonable to just say that aggregate_* also implicitly applies filter_*.

@m-mohr m-mohr added this to the v1.0 milestone Dec 13, 2019
m-mohr added a commit that referenced this issue Dec 17, 2019
…e bounds of the polygons as if filter_polygon would have been used afterwards. #101
@m-mohr
Copy link
Member

m-mohr commented Dec 17, 2019

See PR #109 for a proposal.

@lforesta
Copy link
Contributor

PR #109 ?

@m-mohr
Copy link
Member

m-mohr commented Dec 17, 2019

Indeed, 101 is this issue.

@m-mohr m-mohr closed this as completed Jan 6, 2020
@m-mohr
Copy link
Member

m-mohr commented Jan 6, 2020

Merged.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
help wanted Extra attention is needed work in progress
Projects
None yet
Development

No branches or pull requests

3 participants