Metrics API enhancement #7177

Closed
qqmyers opened this issue Aug 11, 2020 · 3 comments · Fixed by #7178

Comments

qqmyers commented Aug 11, 2020

Multiple metrics-related issues (such as #6766, #3313, #3527) envision per-dataset metrics. The existing metrics APIs (/api/info/metrics and /admin/makeDataCount) do not yet support this and differ in other ways (e.g. how a time series is reported, how errors are reported, whether published or draft datasets are counted, and whether objects or SQL queries are used).

As part of a QDR effort to implement metrics reporting, I've started trying to standardize aspects of these APIs and, drawing on the code from DANS mentioned in #6766, to add per-dataverse capabilities. I'm submitting a draft PR to make the current state of this work visible to the community and am looking for any/all feedback as to how this can be adapted/extended to address related metrics issues and other community needs.

qqmyers commented Aug 21, 2020

https://docs.google.com/spreadsheets/d/1MxcaTtK4Uq_7-4HGt-X2C6hYo7FzrSYYH2mt3ln4M3w/edit?usp=sharing has two sheets:

  • the first lists the existing APIs related to metrics
  • the second shows the additions/changes made to the APIs in this PR, with the changes/additions highlighted in yellow

The most significant changes include:

  • /api/info/metrics/* endpoints can now be called per-dataverse, with results scoped to the tree of dataverses under a specified parent alias (root is the default). This leverages the work from DANS - thanks!
  • most /api/info/metrics/* endpoints that return more than a single number can now return either JSON or CSV output
  • there are time-series outputs (so one doesn't have to make one call per month to assemble a time series)
  • there is an endpoint listing the tree of sub-dataverses under a specified parent
  • there are endpoints that report an MDC metric for all datasets within a specified parent dataverse (whereas the existing MDC API handles one dataset at a time)
  • new endpoints report file count and aggregate size per MIME type
  • new 'uniquedownloads' endpoints count the number of unique downloaders per dataset: all downloads over time by one person, for one or more files in a dataset, count as one. This is intended to help assess which datasets are popular when datasets may have very different numbers of files.
  • general cleanup
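
As a rough illustration of how a client might use the scoped endpoints and CSV time series described above, here is a minimal Python sketch. Only the /api/info/metrics/ prefix comes from this thread; the `parentAlias` query parameter name, the example metric paths, and the two-column CSV layout are assumptions made for the example and should be checked against the PR and the spreadsheet.

```python
import csv
import io
from urllib.parse import quote, urlencode


def metrics_url(base_url, metric, parent_alias=None):
    """Build a URL under /api/info/metrics/ for one metric.

    `metric` is a path such as "datasets/monthly" (hypothetical here).
    `parentAlias` is an ASSUMED query-parameter name for scoping results
    to the tree of dataverses under a parent; omit it to use the default.
    """
    url = f"{base_url.rstrip('/')}/api/info/metrics/{quote(metric)}"
    if parent_alias:
        url += "?" + urlencode({"parentAlias": parent_alias})
    return url


def parse_timeseries_csv(text):
    """Parse a CSV time series, ASSUMING a header row followed by
    two columns per row: a date label and an integer count."""
    reader = csv.reader(io.StringIO(text))
    next(reader, None)  # skip the header row
    return [(row[0], int(row[1])) for row in reader if row]
```

With a single time-series response, a client can plot the whole history without assembling it from one call per month. The URL builder makes no network request, so it can be combined with whatever HTTP client the caller prefers.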

I'm working now on changes to the dataverse-metrics app to allow use of these endpoints. Other than fixing any bugs I find, I'm not currently planning to do more work on the APIs themselves unless there are comments/requests/feedback on this PR. Depending on what that feedback is I may be able to make changes or we can add/update other issues.

qqmyers commented Aug 26, 2020

FYI: The newmetrics branch of dataverse-metrics splits that app into two: an installation-level app showing many of the metrics in this PR (and allowing per-sub-dataverse metrics), and the original global app that aggregates from any/all Dataverse installations around the world. I'm still doing some testing/tweaking before making a PR (or PRs) there, but any feedback is welcome, from whether these two apps should really be in the same repo to look and feel.

qqmyers commented Sep 9, 2020

Just added endpoints for download counts per file (by id or PID) and for unique download counts per file, and made minor fixes (added CSV output for uniquedownloads). Also added a graph of file downloads by id in dataverse-metrics.
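
To make the difference between plain and unique download counts concrete, here is a small Python sketch of the counting rule as described in this thread: repeated downloads by the same person count once per dataset (or once per file). The `(dataset_id, file_id, downloader)` record layout is hypothetical and is not the Dataverse schema; the PR may well compute this in SQL instead.

```python
from collections import defaultdict


def unique_downloads(records, key_index):
    """Count distinct downloaders per key.

    records: iterable of (dataset_id, file_id, downloader) tuples
             (a HYPOTHETICAL layout used only for illustration)
    key_index: 0 to group by dataset, 1 to group by file
    """
    seen = defaultdict(set)
    for rec in records:
        seen[rec[key_index]].add(rec[2])
    return {key: len(users) for key, users in seen.items()}


records = [
    ("ds1", "f1", "alice"),
    ("ds1", "f2", "alice"),  # same person, second file in the same dataset
    ("ds1", "f1", "alice"),  # repeat download of the same file
    ("ds1", "f2", "bob"),
    ("ds2", "f3", "alice"),
]

per_dataset = unique_downloads(records, 0)  # {"ds1": 2, "ds2": 1}
per_file = unique_downloads(records, 1)     # {"f1": 1, "f2": 2, "f3": 1}
```

Here ds1 has four raw downloads but only two unique downloaders, which is why the unique count is less sensitive to how many files a dataset contains.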
