Flag use of GISAID data in exported Auspice JSON #691
Comments
I think this is a good direction, but I'd rather have this more generic. For example
I think this would be very useful, but I think Richard's idea is a good one: it would leave the door open to other forms of acknowledgement in the future, if desired.
Great idea @rneher. Perhaps in this case we'd want an array? I could imagine entries like:
Do we want this to be free-form or would it be a restricted library of options? There could be entries like:
for Ebola. And in this direction, you could imagine asking for full URLs so others can better track down data. The above would then be:
Or if we wanted to get to the point of automatically flagging data sources, we'd want something like we have for maintainers, i.e.:
In this direction, the name GISAID or the URL gisaid.org would be special-cased in Auspice to use the "enabled by data from GISAID" logo, but other uses could work like maintainers do now.
I'd probably opt for the most expressive solution. Seems most future-proof.
What if the data provenance was an annotation in the metadata itself (e.g., the
These annotations could allow Auspice to determine the data source attributions from the metadata itself and save us from manual curation of an Auspice config file that could get out of sync with the data (for instance, if we add COG data to a build but forget to update the Auspice JSON).
With those annotations in the metadata, we could also enable filtering in Auspice by the data source field. We track this information for some (most?) pathogens in fauna already.
We would still need to maintain some mapping of the terms in the metadata to data source records that contain full name, URL, etc. for display in Auspice (something like the last example above, but maybe indexed by metadata field values). That mapping could live wherever makes the most sense technically.
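The mapping described in this comment could be sketched roughly as follows. This is only an illustration under stated assumptions: the metadata column name `data_source`, the `SOURCE_RECORDS` table, and the function name are all hypothetical and do not come from Augur, Auspice, or fauna.

```python
# Hypothetical lookup from metadata values to display records.
# The keys, the "name"/"url" fields, and the URLs' schemes are assumptions.
SOURCE_RECORDS = {
    "gisaid": {"name": "GISAID", "url": "https://gisaid.org"},
    "cog uk": {"name": "COG UK", "url": "https://www.cogconsortium.uk"},
}


def provenance_from_metadata(rows, field="data_source"):
    """Collect the distinct data sources named in metadata rows.

    `rows` is an iterable of dicts (one per strain). Sources are returned
    in first-seen order; values with no known record fall back to a
    name-only entry so nothing is silently dropped.
    """
    first_seen = {}
    for row in rows:
        raw = (row.get(field) or "").strip()
        if raw:
            # Match case-insensitively but remember the original spelling.
            first_seen.setdefault(raw.lower(), raw)
    return [SOURCE_RECORDS.get(key, {"name": raw}) for key, raw in first_seen.items()]
```

Deriving the list this way keeps the attribution in step with the data, at the cost of maintaining the lookup table somewhere.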
I see what you're going for here John. But I think a more manageable initial solution is the
block in the Auspice JSON. This would be manually updated for the moment, but could be something that's eventually automatically generated from
I'm bumping priority of this issue as it's now clear that it's necessary for others to surface this information.
This generalized approach sounds great to me. So then what does the expression for GISAID look like then?
And Auspice would turn that into a "enabled by data from
Excellent point. I was imagining that we'd special-case a situation of
would be "Enabled by data from ((GISAID)) and COG UK" where "((GISAID))" is the logo that links to gisaid.org and "COG UK" is text that links to www.cogconsortium.uk. However, automatically grabbing an image is an interesting idea. There'd need to be some way to size this appropriately, however.
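To make the special-casing concrete, here is a minimal sketch of how a byline could be assembled from a list of provenance entries. It is written in Python for illustration only (Auspice itself is JavaScript), and the entry shape (`name`, optional `url`) and the `((GISAID))` logo placeholder are assumptions taken from the discussion, not the actual Auspice implementation.

```python
def byline(data_provenance):
    """Build a plain-text byline from provenance entries.

    Each entry is assumed to be a dict with a "name" and an optional "url".
    GISAID is special-cased to a logo placeholder; other sources are shown
    as text, followed by their URL when one is given.
    """
    parts = []
    for entry in data_provenance:
        name = entry["name"]
        if name.upper() == "GISAID":
            parts.append("((GISAID))")  # stands in for the linked logo
        elif "url" in entry:
            parts.append(f'{name} <{entry["url"]}>')
        else:
            parts.append(name)
    if not parts:
        return ""
    if len(parts) == 1:
        joined = parts[0]
    else:
        joined = ", ".join(parts[:-1]) + " and " + parts[-1]
    return "Enabled by data from " + joined
```

With this shape the user never has to supply a URL for GISAID, since the logo and its link are built in.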
Yeah I think either GISAID gets special treatment in this spec or it doesn't, so it's either something like I originally proposed (where even the image url is fully described.. but then yes... dimensions) or it could be as simple as:
No point asking the user to supply the URL to GISAID if we're already supplying the logo.
Adds support for a `data_provenance` field in the auspice v2 config and exported auspice v2 JSONs through additions of schema definitions for `data_provenance` and inclusion of an example provenance entry in the Zika build's auspice config. Fixes #691
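As a rough illustration of what such a schema addition would check, the sketch below validates a `data_provenance` value by hand. It assumes entries are objects with a required `name` and an optional `url`, as discussed in the comments; it is not the schema definition shipped with Augur, and the function name is hypothetical.

```python
def validate_data_provenance(value):
    """Return a list of error messages; an empty list means the value is valid.

    Assumed shape (not the real Augur schema): a list of objects, each with
    a non-empty string "name" and, optionally, a string "url".
    """
    if not isinstance(value, list):
        return ["data_provenance must be a list"]
    errors = []
    for i, entry in enumerate(value):
        if not isinstance(entry, dict):
            errors.append(f"entry {i} must be an object")
            continue
        name = entry.get("name")
        if not isinstance(name, str) or not name:
            errors.append(f"entry {i} needs a non-empty string 'name'")
        if "url" in entry and not isinstance(entry["url"], str):
            errors.append(f"entry {i} has a non-string 'url'")
    return errors
```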
Context
Currently, Auspice checks whether the URL contains nextstrain.org/ncov to decide whether to insert "enabled by data from GISAID" into the byline (https://github.com/nextstrain/auspice/blob/master/src/components/info/byline.js#L26). However, this is pretty limiting, and there are emerging pages via `/groups` and `/community` that are using GISAID data.

Description
I think we should allow another element in the Auspice JSON schema of: `gisaid_data: true`. Auspice would look at this value to decide whether or not to flag "enabled by data from GISAID" in the byline.

This would require updating `augur export` to allow `gisaid_data` in the `--auspice-config` JSON file and would require updating the JSON v2 schema here: https://github.com/nextstrain/augur/blob/master/augur/data/schema-export-v2.json.

With Augur updated, we'd then need to update the `ncov` build, and then once live JSON files have been updated with this field, we can update Auspice to work from this input rather than from the URL string.
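The proposed switch from URL matching to an explicit flag could look something like the sketch below. It is an illustration in Python rather than Auspice's JavaScript, the function name is hypothetical, and where exactly `gisaid_data` would sit in the exported JSON is an assumption; here it is read from a plain dict.

```python
def uses_gisaid(dataset_meta, url=""):
    """Decide whether to show the "enabled by data from GISAID" byline.

    Prefer the explicit `gisaid_data` flag when the dataset provides it,
    and fall back to the current URL check for JSONs that predate the flag.
    """
    if "gisaid_data" in dataset_meta:
        return dataset_meta["gisaid_data"] is True
    return "nextstrain.org/ncov" in url
```

Keeping the URL fallback means existing datasets behave as before until their JSON files are regenerated with the new field.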