Automated reporting [WiP] #46

Open
7 of 19 tasks
adam3smith opened this issue Jun 29, 2020 · 7 comments

adam3smith commented Jun 29, 2020

We would like automated (or at least automatable) reports on QDR users and usage at regular (likely monthly) intervals.
These reports fulfill several user stories:

  • As QDR, we want to be able to easily survey and track key usage metrics of our site and data holdings
  • As QDR, we want to be able to easily provide an overview of key metrics to our advisory boards 4 times a year
  • As QDR, we want to provide specific metrics about usage by researchers of a given institution to our institutional members as well as prospective institutional members.

General Reporting

Dataverse

  • overall number of datasets and files
  • new datasets since last month/report
  • total data project views since last month/report
  • total data file downloads since last month/report
  • most viewed and downloaded data projects (don't bias towards data projects with lots of files)
  • total number of visitors to data.qdr.syr.edu
  • ability to create timelines for up to 24 months for these
  • ability to create custom reports for a given timeline
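
One way to avoid the file-count bias mentioned above is to rank data projects by distinct downloaders rather than raw file downloads. This is only a minimal sketch: the `(project_id, file_id, user_id)` event shape and the `rank_projects` name are assumptions for illustration, not anything Dataverse provides.

```python
from collections import defaultdict


def rank_projects(download_events):
    """Rank data projects by distinct downloaders, then by raw downloads.

    download_events: iterable of (project_id, file_id, user_id) tuples.
    This tuple shape is hypothetical; adapt it to the real download log.
    Returns a list of (project_id, distinct_downloaders, raw_downloads).
    """
    raw = defaultdict(int)
    downloaders = defaultdict(set)
    for project_id, _file_id, user_id in download_events:
        raw[project_id] += 1
        downloaders[project_id].add(user_id)
    rows = [(p, len(downloaders[p]), raw[p]) for p in raw]
    # Sort on distinct downloaders first so a project with many files
    # does not win just because one user fetched all of them.
    return sorted(rows, key=lambda r: (r[1], r[2]), reverse=True)


# Project A: one user downloads 10 files. Project B: three users, one file each.
events = [("A", f"f{i}", "u1") for i in range(10)]
events += [("B", "g1", "u2"), ("B", "g1", "u3"), ("B", "g1", "u4")]
ranking = rank_projects(events)
```

With raw counts project A would come first; with this ranking project B does, since three different people used it.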

Drupal

  • total number of registered users
  • new registered users since last month/report
  • total site views since last month/report
  • most visited pages since last month/report
  • any spike in user registration from a given domain (>3/month)
  • returning users: list of people logging in >3 times or after a hiatus of >2 weeks
  • % of users with no login over the last 12 months
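
The three threshold-based items above could be computed roughly as follows. This is a sketch under stated assumptions: the input shapes (email/date pairs, a dict of login timestamps per user) and all function names are hypothetical and do not reflect what the Drupal endpoints actually return.

```python
from collections import Counter
from datetime import datetime, timedelta


def registration_spikes(registrations, threshold=3):
    """registrations: iterable of (email, registered_at) pairs (assumed shape).

    Returns {(year, month, domain): count} for domains with more than
    `threshold` new registrations in a calendar month.
    """
    counts = Counter()
    for email, registered_at in registrations:
        domain = email.rsplit("@", 1)[-1].lower()
        counts[(registered_at.year, registered_at.month, domain)] += 1
    return {key: n for key, n in counts.items() if n > threshold}


def returning_users(logins_by_user, min_logins=3, hiatus=timedelta(weeks=2)):
    """logins_by_user: {user: [datetime, ...]} (assumed shape).

    A user counts as returning with more than `min_logins` logins, or with
    any gap between consecutive logins longer than `hiatus`.
    """
    result = []
    for user, times in logins_by_user.items():
        times = sorted(times)
        long_gap = any(b - a > hiatus for a, b in zip(times, times[1:]))
        if len(times) > min_logins or long_gap:
            result.append(user)
    return sorted(result)


def inactive_share(last_login_by_user, now, window=timedelta(days=365)):
    """Percentage of users whose last login is missing or older than `window`."""
    if not last_login_by_user:
        return 0.0
    inactive = sum(
        1 for last in last_login_by_user.values()
        if last is None or now - last > window
    )
    return 100.0 * inactive / len(last_login_by_user)


june = lambda day: datetime(2020, 6, day)
spikes = registration_spikes(
    [(f"user{i}@x.edu", june(i + 1)) for i in range(4)]
    + [("a@y.org", june(1)), ("b@y.org", june(2))]
)
returning = returning_users({
    "u1": [june(1), june(20)],                  # gap of 19 days
    "u2": [june(1), june(2)],                   # two logins, short gap
    "u3": [june(1), june(2), june(3), june(4)], # four logins
})
share = inactive_share(
    {"a": None, "b": datetime(2020, 1, 1),
     "c": datetime(2018, 1, 1), "d": datetime(2020, 5, 1)},
    now=datetime(2020, 6, 29),
)
```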

Institution specific reporting

  • DV statistics for institutional dataverse (if it exists)
  • Registered users with email address from institution
  • Datasets (published/unpublished) from users at institution (using affiliation of authors or email if affiliation is QDR IP)
  • [maybe] Visitors from institution's IP Range
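
For the "registered users with email address from institution" item, a simple domain match might look like the sketch below. The function name and the plain list of emails are assumptions; the point is only that subdomains such as `law.syr.edu` should match while lookalike domains should not.

```python
def users_from_institution(emails, institution_domain):
    """Return the emails whose host is the institution's domain or a subdomain.

    Both arguments are hypothetical inputs: a list of registered-user emails
    and a bare domain such as "syr.edu".
    """
    domain = institution_domain.lower().lstrip(".")
    matches = []
    for email in emails:
        host = email.rsplit("@", 1)[-1].lower()
        # Exact match, or a subdomain ending in ".<domain>".
        if host == domain or host.endswith("." + domain):
            matches.append(email)
    return matches


matched = users_from_institution(
    ["a@syr.edu", "b@law.syr.edu", "c@notsyr.edu", "d@syr.edu.example.com"],
    "syr.edu",
)
```
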
@adam3smith
Author

@qqmyers could we please rename "Dataverse" to "Collection" and "Dataset" to "Data project" in line with standard QDR terminology?

@qqmyers
Member

qqmyers commented Sep 10, 2020

updated

@adam3smith
Author

Could we please also re-order this:

  1. All DV download stats
  2. All MDC stats
  3. Data Project counts & subjects
  4. All File stats
  5. All collection stats

@adam3smith
Author

And also, as discussed on Slack, let's add start dates to the MDC metrics

@qqmyers
Member

qqmyers commented Sep 10, 2020

How's this? I rearranged the graphs, added panels that let you collapse the download metrics, MDC, and holdings-related metrics separately, and gave each of those sections a configurable title, which I used to add the MDC start dates.

Also, for 5.0, time series graphs will only start when the total is non-zero, so the MDC Unique graphs will actually start with June 2020, when the first counts came in. You can see that on dev. I could add it to v4.20 as well and push that out if you want that change now.
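
The "start when the total is non-zero" behaviour can be illustrated with a short sketch. This is not the code from the metrics app; the `(month, total)` pair shape and the function name are assumptions, and the numbers are made up for the example.

```python
from itertools import dropwhile


def trim_leading_zeros(series):
    """Drop leading points whose cumulative total is still zero.

    series: chronological list of (month_label, cumulative_total) pairs
    (hypothetical shape). Zeros after the first non-zero point are kept.
    """
    return list(dropwhile(lambda point: point[1] == 0, series))


mdc_unique = [("2020-04", 0), ("2020-05", 0), ("2020-06", 12), ("2020-07", 30)]
trimmed = trim_leading_zeros(mdc_unique)
```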

@adam3smith
Author

The relevant Drupal endpoints are https://qdr.syr.edu/login-history and https://qdr.syr.edu/use-stats
We'll add the DV endpoint here once we're happy with it.

@qqmyers
Member

qqmyers commented Jul 30, 2021

One issue I've noticed with the DV collection subject area graph is that Dataverse appears to add any subjects from a child dataverse to the parent. Further, those subjects aren't deleted if the child collection is. So, on prod, the graph shows ~14 entries for 7 collections. Nominally, I think Dataverse should at least remove 'orphaned' subjects when a child DV is deleted, but there are open questions beyond that: whether parent collections should add the subjects of child collections at all, whether only leaf collections (or only those that contain datasets themselves) should be graphed, etc. Technically, the graph is a correct representation of what's in the db at this point, but that may not give the clearest indication of subject range/popularity.
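
To make the leaf-only option concrete, here is a rough sketch of how the counts would differ. The dictionary fields (`id`, `parent`, `subjects`) are a hypothetical in-memory representation, not the Dataverse schema or API.

```python
from collections import Counter


def subject_counts(collections, leaves_only=True):
    """Count collections per subject.

    collections: list of dicts with "id", "parent" (None for the root) and
    "subjects" keys. These field names are assumptions for illustration.
    With leaves_only=True, collections that have children are skipped, so
    subjects copied up to a parent are not counted a second time.
    """
    parent_ids = {c["parent"] for c in collections if c["parent"] is not None}
    counts = Counter()
    for c in collections:
        if leaves_only and c["id"] in parent_ids:
            continue
        counts.update(set(c["subjects"]))
    return counts


example = [
    {"id": 1, "parent": None, "subjects": ["Social Sciences", "Law"]},
    {"id": 2, "parent": 1, "subjects": ["Social Sciences"]},
    {"id": 3, "parent": 1, "subjects": ["Law"]},
]
leaf_counts = subject_counts(example)
all_counts = subject_counts(example, leaves_only=False)
```

Counting every collection doubles each subject in this example, which is the same kind of inflation as on prod.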

qqmyers pushed a commit that referenced this issue Apr 12, 2023
* set default when no config is set for signposting
* modification according to reviews
* move long json string from code to bunddle
* allow empty config on the level 2 profile
* revision based on Herbert feedback
* coding style cleanup SignpostingResources
* remove leading comma
* fix capitalize with header name