Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Converter.prefixmap should be a bimap #96

Closed
matentzn opened this issue Nov 24, 2023 · 4 comments
Closed

Converter.prefixmap should be a bimap #96

matentzn opened this issue Nov 24, 2023 · 4 comments

Comments

@matentzn
Copy link
Collaborator

Right now, the prefixmap in the converter object (converter.prefixmap) is a 1:n object, which means that any prefix can be linked to a number of prefixes, which makes it ambiguous (or rather, for those who like splitting hairs, order-dependent). In my opinion, this here should pass (but it does not):

def test_bimap(self):
    epm = [{
    "prefix": "Orphanet",
    "prefix_synonyms": [
        "orphanet.ordo"
    ],
    "uri_prefix": "http://www.orpha.net/ORDO/Orphanet_" }]
    converter = Converter.from_extended_prefix_map(epm)
    self.assertTrue('Orphanet' in converter.prefix_map)
    self.assertFalse('orphanet.ordo' in converter.prefix_map)

This is important, because otherwise I cannot control, as a user, which prefix (not, uri-prefix) should be used in SSSOM. Right now, both are included in the exported curie map, eg.

#   Orphanet: http://www.orpha.net/ORDO/Orphanet_
#   orphanet.ordo: http://www.orpha.net/ORDO/Orphanet_

but only the second, the one I do not want, decides over which prefix should be used during compression. So there are two issues here:

  1. I want a prefix map that is a bimap to ship with my data asset (i.e. the sssom file)
  2. I want to be certain that the "prefix", not the "prefix_synonyms" get to dictate the prefix during compression.

Is this an implementation issue with the prefixmap, or do we need a special extension, converter.bimap to cover this.

See mapping-commons/sssom-py#469

@matentzn matentzn changed the title prefixmap should be a bimap Converter.prefixmap should be a bimap Nov 24, 2023
@cthoyt
Copy link
Member

cthoyt commented Nov 24, 2023

@matentzn
Copy link
Collaborator Author

Hmm weird my ide didn't see it. Thanks! Anyhow, this only solves problem number 1. what about compress?

@cthoyt
Copy link
Member

cthoyt commented Nov 27, 2023

I am not sure I understand what you are asking. When compressing, it only ever goes to the preferred prefix, which is dictated by the structure of the EPM.

If you use the prefix map data structure that has duplicates of the URI prefix, then there can't be any guarantees which one gets made the "primary" (I think the implementations picks the first)

@matentzn
Copy link
Collaborator Author

matentzn commented Dec 5, 2023

Not relevant anymore

@matentzn matentzn closed this as completed Dec 5, 2023
@matentzn matentzn closed this as not planned Won't fix, can't repro, duplicate, stale Dec 5, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants