Investigate enabling date range queries for the /timelapse endpoint #127

Closed
1 task
leothomas opened this issue May 19, 2021 · 1 comment
Labels
enhancement New feature or request

Comments

@leothomas
Contributor

This issue builds off of PR #113

The /timelapse endpoint returns zonal stats for a single time unit (month or day), which can be cumbersome for API consumers (e.g. a 2-year timelapse of a monthly dataset requires 24 API queries).

It would be very straightforward to implement a date_range parameter in the query that causes the API to return zonal stats for each date in the range, i.e.:

# current behaviour:
GET /timelapse -d {"date": "YYYY_MM_DD", ... } --> {"mean": ..., "median": ... }

# additional behaviour to implement:
GET /timelapse -d {"date_range": ["YYYY_MM_DD_start", "YYYY_MM_DD_end"], ... } --> {
    "YYYY_MM_DD-1": {"mean": ..., "median": ... },
    "YYYY_MM_DD-2": {"mean": ..., "median": ... },
    ...
}
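
One way the date_range behaviour above could work is to expand the two endpoints into individual dates and run the existing single-date calculation for each one. The sketch below assumes a daily dataset and the YYYY_MM_DD format from the example; `expand_date_range`, `timelapse_for_range`, and the `stats_for_date` callback are hypothetical names, not the API's actual internals.

```python
from datetime import datetime, timedelta
from typing import Callable, Dict, List

DATE_FORMAT = "%Y_%m_%d"  # matches the YYYY_MM_DD strings in the example above


def expand_date_range(start: str, end: str) -> List[str]:
    """Return every daily date string from start to end, inclusive."""
    first = datetime.strptime(start, DATE_FORMAT).date()
    last = datetime.strptime(end, DATE_FORMAT).date()
    if last < first:
        raise ValueError("date_range end must not precede its start")
    n_days = (last - first).days
    return [
        (first + timedelta(days=offset)).strftime(DATE_FORMAT)
        for offset in range(n_days + 1)
    ]


def timelapse_for_range(
    start: str, end: str, stats_for_date: Callable[[str], dict]
) -> Dict[str, dict]:
    """Build the date-keyed response by calling the (assumed) single-date
    stats function once per date, sequentially."""
    return {d: stats_for_date(d) for d in expand_date_range(start, end)}
```

A monthly dataset would need a different step (one entry per month rather than per day), which this sketch does not cover.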

NOTE: calculating zonal stats requires loading the COG into memory. For a big enough AOI this calculation can already be quite time-consuming. Repeating it in a loop for a large time range (e.g. a daily dataset for 2 years is over 700 zonal_stats calculations) might break things, or at best be very slow.

TODO:

  • Implement looping zonal stats calculation --> stress test the feature and review the implementation if large date ranges/AOIs cause excessive response times
@leothomas
Contributor Author

Update:

I've implemented a threaded stats calculation that processes the requested date range in batches of 15. I ran some tests to get an idea of run time for this endpoint with different date ranges and areas. Here are the results:
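
For readers unfamiliar with the pattern, here is a minimal sketch of what a threaded, batched calculation could look like. It is not the code deployed to the test stack: `batched`, `stats_for_dates`, and the `stats_for_date` callback are hypothetical names, and only the batch size of 15 comes from the description above.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, Dict, Iterator, List

BATCH_SIZE = 15  # batch size mentioned in the comment above


def batched(items: List[str], size: int) -> Iterator[List[str]]:
    """Yield consecutive slices of `items` with at most `size` elements."""
    for start in range(0, len(items), size):
        yield items[start:start + size]


def stats_for_dates(
    dates: List[str],
    stats_for_date: Callable[[str], dict],
    batch_size: int = BATCH_SIZE,
) -> Dict[str, dict]:
    """Compute stats for each date, running one batch of threads at a time.

    `stats_for_date` stands in for the per-date zonal stats call
    (an assumption; the real function name is not given in the issue).
    """
    results: Dict[str, dict] = {}
    for batch in batched(dates, batch_size):
        # One worker per date in the batch; pool.map preserves input order.
        with ThreadPoolExecutor(max_workers=len(batch)) as pool:
            for d, stats in zip(batch, pool.map(stats_for_date, batch)):
                results[d] = stats
    return results
```

Processing one batch at a time caps the number of COGs being read concurrently, which is one plausible reason to batch rather than start a thread for every date at once.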

Dataset: CO2 (daily COGs)

| Date range | Los Angeles (106 994 km^2) | California (999 714 km^2) | Continental U.S. (14 376 697 km^2) |
| --- | --- | --- | --- |
| 2 days | 0.6 s | 0.7 s | 0.8 s |
| 1 week | 0.8 s | 1.1 s | 1.5 s |
| 1 month | 2.4 s | 2.9 s | 4.9 s |
| 6 months | 11.6 s | 13.9 s | 26.5 s |
| 1 year (~330 COGs to load) | 21.8 s | 27.0 s | breaks API 30 s timeout |
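
Since the longest ranges run into the 30 s timeout, an API consumer could split a long range into shorter windows and issue one request per window. The helper below is a hypothetical client-side sketch, not part of the API; `split_date_range` and `max_days` are illustrative names, and a suitable window length would have to be chosen from measurements like the ones above.

```python
from datetime import datetime, timedelta
from typing import List, Tuple

DATE_FORMAT = "%Y_%m_%d"  # same YYYY_MM_DD format the endpoint uses


def split_date_range(start: str, end: str, max_days: int) -> List[Tuple[str, str]]:
    """Split [start, end] into consecutive windows of at most `max_days` days.

    Each tuple could be sent as one `date_range` value in a separate request.
    """
    if max_days < 1:
        raise ValueError("max_days must be at least 1")
    first = datetime.strptime(start, DATE_FORMAT).date()
    last = datetime.strptime(end, DATE_FORMAT).date()

    windows: List[Tuple[str, str]] = []
    window_start = first
    while window_start <= last:
        # A window covers max_days dates, clipped to the end of the range.
        window_end = min(window_start + timedelta(days=max_days - 1), last)
        windows.append(
            (window_start.strftime(DATE_FORMAT), window_end.strftime(DATE_FORMAT))
        )
        window_start = window_end + timedelta(days=1)
    return windows
```

For example, `split_date_range("2021_01_01", "2021_01_10", 4)` yields three windows covering days 1 to 4, 5 to 8, and 9 to 10.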

I've deployed this work into a test stack accessible at this endpoint: https://l47o73bjpk.execute-api.us-east-1.amazonaws.com/v1/timelapse
