Add a script to pull a new version of an ontology on demand #59

Merged

Conversation

syphax-bouazzouni

Issue

The only place where we check whether a new file exists is the pull_location CRON job, which runs once a day.

The problem is that when we reprocess an ontology with the ncbo_ontology_process script, it does not check whether a new version exists.

The solutions

There are two ways to solve this:

  1. The first and simpler one is to add a script called ncbo_ontology_pull that does the pull on demand.
  2. The second, more complex one is to add a step called do_pull_location to the submission process workflow, before the generate_rdf step, that downloads the file and creates a new submission if a new version is found.
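To illustrate the second option, here is a rough sketch of a workflow where a pull step runs before RDF generation. Only the step names do_pull_location and generate_rdf come from the description above; the SubmissionWorkflow class, the checksum comparison used to detect a new version, and the log symbols are all hypothetical and are not the actual code.

```ruby
# Hypothetical stand-in for the submission workflow. Only the step names
# do_pull_location and generate_rdf are taken from the PR description.
class SubmissionWorkflow
  # The pull step is listed first so it always runs before generate_rdf.
  STEPS = %i[do_pull_location generate_rdf].freeze

  attr_reader :log

  def initialize(remote_checksum:, local_checksum:)
    @remote_checksum = remote_checksum
    @local_checksum = local_checksum
    @log = []
  end

  # Runs every step in order and returns the log of what happened.
  def run
    STEPS.each { |step| send(step) }
    log
  end

  private

  # Assumption: a new version is detected by comparing checksums.
  def do_pull_location
    if @remote_checksum != @local_checksum
      @local_checksum = @remote_checksum
      log << :new_submission_created
    else
      log << :no_new_version
    end
  end

  def generate_rdf
    log << :generate_rdf
  end
end
```

With this shape, a reprocess would always check for a new version first, at the cost of touching the processing workflow itself.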

This PR implements the first option.

How to use

Usage: ncbo_ontology_pull [options]
    -o, --ontology ACRONYM           Ontology acronym to pull if new version exist
    -h, --help                       Display this screen
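For reference, usage text like the one above can be produced with Ruby's standard OptionParser. The following is a minimal sketch, not the script's actual source; parse_pull_options is a hypothetical helper name, and the shape of the returned hash is an assumption.

```ruby
require 'optparse'

# Hypothetical helper: parses the flags shown in the usage text and
# returns the parsed options together with the parser (for printing help).
def parse_pull_options(argv)
  options = {}
  parser = OptionParser.new do |opts|
    opts.banner = 'Usage: ncbo_ontology_pull [options]'
    opts.on('-o', '--ontology ACRONYM', 'Ontology acronym to pull if new version exist') do |acronym|
      options[:ontology] = acronym
    end
    opts.on('-h', '--help', 'Display this screen') do
      options[:help] = true
    end
  end
  # Parse a copy so the caller's argument array is left untouched.
  parser.parse!(argv.dup)
  [options, parser]
end
```

A caller could then print the parser and exit when options[:help] is set, or abort with the usage text when options[:ontology] is missing.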

@syphax-bouazzouni syphax-bouazzouni changed the title Add a script to pull a new version of an ontology on demand Add a script to pull a new version of an ontology on demand (in progess) Aug 11, 2022
@syphax-bouazzouni syphax-bouazzouni changed the title Add a script to pull a new version of an ontology on demand (in progess) Add a script to pull a new version of an ontology on demand Aug 11, 2022
@alexskr
Member

alexskr commented Aug 11, 2022

This would be a very useful script.

@alexskr
Member

alexskr commented Dec 22, 2022

I have noticed that the script runs owlapi when pulling an ontology. Is that really required? It's not a big deal, but the owlapi wrapper will run a second time when the ontology gets processed.

@syphax-bouazzouni
Author

> I have noticed that the script runs owlapi when pulling an ontology. Is that really required? It's not a big deal, but the owlapi wrapper will run a second time when the ontology gets processed.

Yeah, this behavior was already there before my PR.

The existing code tests whether the remote file is parsable (with owlapi) before creating its corresponding submission. This prevents us from getting spammed with submissions for staging changes that don't parse.

A possible optimization is to make a submission delete itself when it is not parsable and an auto_delete option is set to true, e.g. sub.process_submission(auto_delete: true). This would call owlapi only once and delete the submission if it is not parsable.

I can do that if wanted, but I think it is beyond the scope of this PR.
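To make the idea concrete, here is a rough sketch of what such an auto_delete option might look like. SketchSubmission is a hypothetical stand-in, not the real submission model, and its parse and delete methods only simulate the owlapi call and the deletion; the auto_delete keyword is the proposed option and does not exist yet.

```ruby
# Hypothetical stand-in for a submission; parse simulates the owlapi call.
class SketchSubmission
  attr_reader :parse_calls

  def initialize(parsable:)
    @parsable = parsable
    @parse_calls = 0
    @deleted = false
  end

  def deleted?
    @deleted
  end

  # Parses exactly once. On failure, deletes the submission when
  # auto_delete is true. Returns whether processing succeeded.
  def process_submission(auto_delete: false)
    return true if parse

    delete if auto_delete
    false
  end

  private

  def parse
    @parse_calls += 1
    @parsable
  end

  def delete
    @deleted = true
  end
end
```

The point of the sketch is that the parser is invoked once per submission, whether or not the file turns out to be parsable.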

@alexskr alexskr merged commit d1f8aa7 into ncbo:master Jan 9, 2023