Add discovery algorithm #93
Amazing effort, made a first round of comments on this! THANKS
src/curies/discovery.py
records = []
record_number = 0
for uri_prefix, luids in sorted(counter.items()):
    if len(luids) > cutoff:
This may just be a lack of brain power on my side - given http://purl.obolibrary.org/yoyo/TMP_123, how does this code avoid adding both
http://purl.obolibrary.org/yoyo/TMP_
and http://purl.obolibrary.org/yoyo/
as prefixes?
It tries the delimiters in priority order: / first, then #, then _. When it starts with /, it sees that the remainder, TMP_123,
isn't alphanumeric, so it moves on to the next delimiter and therefore does not discover http://purl.obolibrary.org/yoyo/
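To make the priority order concrete, here is a minimal sketch of the behavior described in this reply. It is not the code in this PR: `split_uri` and `discover_prefixes` are hypothetical names, and the alphanumeric check and the `len(luids) > cutoff` filter are taken from the discussion and the snippet above.

```python
from collections import defaultdict
from typing import Dict, Iterable, Optional, Set, Tuple

# Delimiters tried in priority order, per the reply above (assumption: no others).
DELIMITERS = ("/", "#", "_")


def split_uri(uri: str) -> Optional[Tuple[str, str]]:
    """Split at the first delimiter (by priority) whose trailing part is alphanumeric."""
    for delimiter in DELIMITERS:
        head, sep, tail = uri.rpartition(delimiter)
        # sep is empty when the delimiter does not occur in the URI
        if sep and tail.isalnum():
            return head + sep, tail
    return None


def discover_prefixes(uris: Iterable[str], cutoff: int) -> Dict[str, str]:
    """Map dummy CURIE prefixes (ns1, ns2, ...) to URI prefixes with more than `cutoff` local IDs."""
    counter: Dict[str, Set[str]] = defaultdict(set)
    for uri in uris:
        result = split_uri(uri)
        if result is not None:
            uri_prefix, luid = result
            counter[uri_prefix].add(luid)
    kept = [p for p, luids in sorted(counter.items()) if len(luids) > cutoff]
    return {f"ns{i}": p for i, p in enumerate(kept, start=1)}
```

Because each URI is split exactly once, at the first delimiter that leaves an alphanumeric local ID, http://purl.obolibrary.org/yoyo/TMP_123 only ever contributes to http://purl.obolibrary.org/yoyo/TMP_ and never to http://purl.obolibrary.org/yoyo/.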
This PR adds a workflow that iterates through a list of URIs and tries to discover new common URI prefixes. Docs: https://curies.readthedocs.io/en/discovery/discovery.html
Algorithm
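It follows this basic algorithm given by @matentzn: split each URI at a delimiter (#, /) to get a candidate URI prefix, keep the candidate prefixes that occur often enough, and generate dummy CURIE prefixes (ns1, ns2, ...) for each discovered URI prefix.
Demo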
Assume you have some ontology with randomly generated URIs with the prefix http://ran.dom/ (but you don't know this ahead of time). You can use the curies.discover function to create a converter that has a dummy CURIE prefix ns1 for this URI prefix.
Use Cases