Retrospectively assign group attribution for existing GO-CAMs #458
Obviously this is not working as intended. Let's break this down. The first thing to check is the model itself: let's see if your work is attributed to the relevant groups in the model. Can you post a link to a model?
Hello @cmungall, you're welcome to use the model entitled 'SynGO no. 1a SLC1A1_PMID:7914198' as an example: It falls under SynGO-UCL. Thanks,
Thanks! OK, so in this model there is no providedBy at the level of the evidence assertions. That's as expected, as you didn't know about the groups ability when making these in March. Now don't worry, we're not going to make you go back and retrospectively tag every evidence instance with your project by hand; we'll figure out a solution here. One idea we kicked around today was this one: the idea would be that 'orphan' evidences would inherit providedBy from the model level (prospectively, you would try to make sure you wear the correct 'hat' whenever making new assertions). However, we need to discuss this further. Thanks for your patience!
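To make the inheritance idea concrete, here is a minimal sketch of how an 'orphan' evidence could fall back to the model-level providedBy. The `Model` and `Evidence` classes, the function name, and the example.org group IRIs are all hypothetical illustrations and not Noctua or Minerva code.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Evidence:
    """One evidence assertion; provided_by is None for an 'orphan'."""
    evidence_id: str
    provided_by: Optional[str] = None


@dataclass
class Model:
    """A model with an optional model-level providedBy (hypothetical shape)."""
    title: str
    provided_by: Optional[str] = None
    evidences: List[Evidence] = field(default_factory=list)


def effective_provided_by(model: Model, evidence: Evidence) -> Optional[str]:
    """Prefer the evidence-level group; otherwise inherit the model-level one."""
    if evidence.provided_by is not None:
        return evidence.provided_by
    return model.provided_by


# Hypothetical group IRIs, used only for illustration.
model = Model(
    title="example model",
    provided_by="http://example.org/groups/model-level-group",
    evidences=[
        Evidence("ev1"),  # orphan: inherits from the model
        Evidence("ev2", "http://example.org/groups/other-group"),
    ],
)
for ev in model.evidences:
    print(ev.evidence_id, effective_provided_by(model, ev))
```

Under this rule an explicit evidence-level value always wins, so curators who already wear the correct 'hat' are unaffected.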
I think a model-level providedBy could be a suitable solution.
For looking at the scope of the providedBy issues, here are some numbers: Of the 595 non-SYNGO models...
(Also, we have 98 models that include some kind of GOC xref for the user; I believe we should map those forward: geneontology/go-site#453 (comment)) Given that this problem is unlikely to be manually tractable, I would suggest the following approach once the metadata is updated, but would like feedback and agreement from all parties:
The middle step there is in the hope that groups that are using the groups to their full effect (to track different funding sources, etc.) can give us a list of those at the model level so we can add those with a quick script. This is more a @cmungall or @balhoff question, but I'm not actually sure of the state of our use of model-level
For those who want to look at getting these numbers out of the files on the command line:
grep -c provided 0*.ttl 5*.ttl | grep "\:0" | cut -d ':' -f 1 | xargs -d '\n' bash -c 'for filename; do grep -H orcid "$filename" | sort | uniq; done;' _ | cut -d ':' -f 1 | sort | uniq -c | sort -nr | grep "1 " | wc
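For readers who prefer not to parse the pipeline, the same count can be approximated in Python. This is a rough sketch under stated assumptions: it works on in-memory file contents rather than the `.ttl` files themselves, the function name is made up, the sample text and ORCID-like strings are invented, and it requires exactly one distinct `orcid` line where the final `grep "1 "` is looser.

```python
from typing import Dict, List


def single_contributor_models_without_groups(models: Dict[str, str]) -> List[str]:
    """Return filenames that never mention 'provided' and have exactly
    one distinct line mentioning 'orcid' (roughly: one contributor)."""
    result = []
    for filename, text in sorted(models.items()):
        lines = text.splitlines()
        # Same spirit as `grep -c provided` followed by keeping the zero counts.
        if any("provided" in line for line in lines):
            continue
        # Same spirit as `grep orcid | sort | uniq`: distinct matching lines.
        orcid_lines = {line for line in lines if "orcid" in line}
        if len(orcid_lines) == 1:
            result.append(filename)
    return result


# Invented sample content, only to show the shape of the input.
sample = {
    "a.ttl": "c <http://orcid.org/0000-0000-0000-0001>\nc <http://orcid.org/0000-0000-0000-0001>\n",
    "b.ttl": "providedBy <g>\nc <http://orcid.org/0000-0000-0000-0001>\n",
    "c.ttl": "c <http://orcid.org/0000-0000-0000-0001>\nc <http://orcid.org/0000-0000-0000-0002>\n",
}
print(len(single_contributor_models_without_groups(sample)))
```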
Moving forward:
For now, make the report public; later, make a sanity check. All people in users.yaml who are Noctua/GO-CAM editors need to understand that the top group is the default one. If we cannot make this clear, we should add a new required field:
Once the above migration is done, create a spreadsheet that has:
(We may tag @dougli1sqrd to help out with that spreadsheet generation.) Once that is filled in, it will be back to the software group to figure out the implementation strategy.
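The "top group is the default one" rule can be sketched as follows. The record layout here (a `nickname` and a `groups` list per user) and the example.org IRIs are assumptions made for illustration; the actual users.yaml fields may differ.

```python
from typing import Any, Dict, List, Optional


def default_group(user: Dict[str, Any]) -> Optional[str]:
    """Return the first listed group, treated here as the user's default."""
    groups = user.get("groups") or []
    return groups[0] if groups else None


def users_missing_groups(users: List[Dict[str, Any]]) -> List[str]:
    """Return nicknames of users who have no group at all."""
    return [u.get("nickname", "?") for u in users if default_group(u) is None]


# Hypothetical records, shaped like parsed YAML entries.
users = [
    {"nickname": "curator-a", "groups": ["http://example.org/groups/g1",
                                         "http://example.org/groups/g2"]},
    {"nickname": "curator-b", "groups": []},
    {"nickname": "curator-c"},
]
print(default_group(users[0]))
print(users_missing_groups(users))
```

A check like `users_missing_groups` is the kind of thing a nightly sanity check could run, though how the real report is produced is not shown in this thread.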
@vanaukenk looking at the output of the scripts, we have 177 violations. The script will run in a public place tonight; I'll try and add the link soon.
@kltm - thanks. If you have the link for the script output, I can add it to the agenda for the annotation call tomorrow:
@vanaukenk
Okay, thanks. Right now I'm getting this when I click on the link above: Not Found The requested URL /snapshot/metadata/users-and-groups-report.txt was not found on this server.
Clicking on the above works for me (now). It is possible that you looked at the snapshot while the data products were rebuilding.
Okay, thanks @kltm. I can see the report now, and will add the link to the 2017-11-28 annotation call minutes on the wiki.
Please keep in mind that the report is generated as part of the new build and a new attempt is made every night. This means that 1) if the load fails the report may not be there (which is a good thing) and 2) there will be times every day during regeneration when the report is not yet there.
Some additional notes for my benefit to run the report, from g-site:
Looks like we have a lot of people missing groups. Many may no longer be active, so it would be good to indicate these.
@cmungall @kltm there are also group IRIs in Noctua models for which there is no entry in
Okay, these are getting in via two of our workbenches. Essentially:
reqs.use_groups([
    // WARNING: We're minting money that we
    // might not honor here.
    'http://purl.obolibrary.org/go/groups/' + assby
]);
This also relates to #539
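One way to catch group IRIs that were minted by concatenation but never registered is to compare the IRIs used in models against the known list. This is only a sketch: the function names and the group names are hypothetical, and only the IRI prefix comes from the snippet above.

```python
from typing import Iterable, List

# Prefix taken from the workbench snippet in this thread.
GROUP_PREFIX = "http://purl.obolibrary.org/go/groups/"


def mint_group_iri(assigned_by: str) -> str:
    """Mirror the workbench behaviour: prefix plus whatever name is given."""
    return GROUP_PREFIX + assigned_by


def unregistered_groups(used: Iterable[str], registered: Iterable[str]) -> List[str]:
    """Return group IRIs that appear in models but not in the registry."""
    return sorted(set(used) - set(registered))


# Hypothetical group names, for illustration only.
registered = [mint_group_iri("known-group")]
used = [mint_group_iri("known-group"), mint_group_iri("never-registered")]
print(unregistered_groups(used, registered))
```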
@cmungall
List of production models without model-level providedBy: http://yasgui.org/short/HJSFxyvFM I tried a similar query that dug into providedBy on particular statements and got the same number of models, so I think it's fine to use this simpler query. 61 models: maybe not too many to just look at individually?
Once
A bit of a slog, but not impossible. If we make a spreadsheet or something that curators can get and update, I have a tool that we can use to just inject those into the models. Just to confirm: currently, if there is no contributes/provided by annotation "locally" present, the exporter grabs it from the model-level annotations?
@kltm yes, that is true.
This query adds contributor ID and name: http://yasgui.org/short/Hy0bwkwtG It only creates 4 additional rows, so nearly all of these models have only one contributor. It's a very short list of folks; I think this will be easy to sort out.
Query for groups used in production models: http://yasgui.org/short/SJvdsyvFz
@balhoff @dougli1sqrd - when was that triplestore last updated? Will we be able to sparql the source store (or something closely synced to it) soon?
Yes, let's do this.
@cmungall I updated the triplestores earlier today.
As a step in this process (and discussion today), @dougli1sqrd will be looking at fixing up the users.yaml and groups.yaml. As part of this change, any Noctua editors who do not have an ORCID as their
covered by geneontology/noctua-models#84, and done |
Hi
I have raised this issue a couple of times; obviously I should have created a ticket so that I can be kept in the loop about progress in this area.
Currently there is no incentive for my team at UCL to use Noctua to create annotations, because the annotations we create will not appear in AmiGO or QuickGO as attributed to one of the UCL teams. I will therefore not be able to create a report for my grant funding bodies on the annotations created and demonstrate that I have met my funded targets.
In addition, it is important that any SynGO Noctua annotations that we make that are exported to the GOC annotation files are attributed to SynGO-UCL, as SynGO VU curators want to be able to distinguish their 'expert created' annotations from those UCL is creating.
In addition, Tony has pointed out that currently he is unable to pick up the Noctua annotations. Consequently the GO annotations displayed by QuickGO, NCBI Gene, Ensembl, UniProt and the files incorporated into many functional analysis tools are out of sync with the data displayed in AmiGO.
Best
Ruth
@cmungall @tonysawfordebi @BarbaraCzub @thomaspd