Epic: Harvard Dataverse Repository NIH Metrics #217
Status: March 2024
cmbz added the labels GREI 4 (Analytics and Reporting) and Project: NIH GREI (Tasks related to the NIH GREI project) on Apr 1, 2024
21 tasks
Status: April 2024
Status: May 2024
Closed
5 tasks
Status: June 2024
2 tasks
Status: July 2024
Status: August 2024
Status: September 2024
Status: October 2024
Overview
Tracking issue for monthly reports of NIH-funded datasets in Harvard Dataverse.
Resources
How these metrics are gathered for the monthly reports
At the beginning of each month @jggautier runs a Python script that:
@jggautier then reviews any datasets that were included in previous months but have since been removed, and checks the metadata of newly added datasets to confirm that it actually indicates NIH funding. Datasets that aren't from NIH-funded research are removed, and the script is adjusted so that it ignores them in future runs. The script is also adjusted to include datasets that @jggautier and colleagues know were funded by the NIH but whose metadata lacks any such indication.
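The review step described above can be sketched as simple set arithmetic. This is only an illustration, not the actual script: the function name `review_month`, its parameters, and the placeholder DOIs in the usage note are all hypothetical.

```python
def review_month(previous, found, excluded, manually_included):
    """Combine search results with the manually curated lists (hypothetical helper).

    previous          -- dataset PIDs reported last month
    found             -- dataset PIDs returned by this month's search
    excluded          -- PIDs reviewed and judged not to be NIH-funded
    manually_included -- PIDs known to be NIH-funded despite missing metadata
    """
    # Drop reviewed false positives, then add the known NIH-funded datasets.
    current = (set(found) - set(excluded)) | set(manually_included)
    return {
        "current": current,
        # In last month's report but not this month's: needs a manual look.
        "dropped": set(previous) - current,
        # First appearance this month: metadata should be checked for NIH funding.
        "new": current - set(previous),
    }
```

For example, with placeholder identifiers, `review_month(["doi:10.5072/FK2/A", "doi:10.5072/FK2/B"], ["doi:10.5072/FK2/B", "doi:10.5072/FK2/C"], ["doi:10.5072/FK2/C"], ["doi:10.5072/FK2/D"])` would flag `A` as dropped and `D` as new, while `C` stays excluded.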
Search details
The Python script uses the Search API to search four metadata fields (Funding Information Agency, Contributor Name, Description, and Notes) for the full name of the NIH and its acronym, the full names of all NIH centers and institutes, and most of their acronyms.
When looking through metadata in the Description field and Notes field, the script also looks for variations of the words "fund", "sponsor", "award", and "support" to increase the chances that it finds only datasets with metadata that acknowledges NIH funding.
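As a rough sketch of how such a query might be assembled, the snippet below builds a Search API URL that applies the funding-related keywords only to the two free-text fields. The Solr field names (`grantNumberAgency`, `contributorName`, `dsDescriptionValue`, `notesText`), the shortened name list, and the helper names are assumptions made for illustration; the actual script may structure its queries differently.

```python
from urllib.parse import urlencode

# Assumed Solr field names for the four metadata fields; verify them
# against the target Dataverse installation before relying on them.
AGENCY_FIELDS = ["grantNumberAgency", "contributorName"]
FREE_TEXT_FIELDS = ["dsDescriptionValue", "notesText"]

# Shortened, illustrative list; the real script covers all NIH
# institutes and centers and most of their acronyms.
NIH_NAMES = ["National Institutes of Health", "NIH",
             "National Cancer Institute", "NCI"]

# Stems whose variations ("funded", "sponsorship", ...) signal funding.
FUNDING_STEMS = ["fund", "sponsor", "award", "support"]


def or_group(terms):
    """Join terms into a parenthesized OR group."""
    return "(" + " OR ".join(terms) + ")"


def build_query(names=NIH_NAMES, stems=FUNDING_STEMS):
    """Build one query string covering all four fields."""
    quoted = or_group(f'"{name}"' for name in names)
    wildcards = or_group(f"{stem}*" for stem in stems)
    # Agency-like fields: a name match alone is enough.
    clauses = [f"{field}:{quoted}" for field in AGENCY_FIELDS]
    # Free-text fields: require a name match AND a funding-related word.
    clauses += [f"({field}:{quoted} AND {field}:{wildcards})"
                for field in FREE_TEXT_FIELDS]
    return " OR ".join(clauses)


def build_search_url(base_url, start=0, per_page=100):
    """Return a Search API URL limited to datasets."""
    params = {"q": build_query(), "type": "dataset",
              "start": start, "per_page": per_page}
    return f"{base_url}/api/search?{urlencode(params)}"
```

Calling `build_search_url("https://dataverse.harvard.edu")` yields a URL that could be fetched page by page by increasing `start`. Requiring both conditions within the same field does not guarantee the funding word refers to the NIH, which is one reason the manual review step is still needed.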