Add a new entity type Subject to add keywords to data publications #327

Open
RKrahl opened this issue Mar 15, 2024 · 1 comment
Labels
schema this involves changes to the ICAT schema


RKrahl commented Mar 15, 2024

The DataPublication class has a property subject with the description "List of keywords". It is a simple string property, although it is typically multi-valued in practice.

This was done according to my own proposal (#200). The intention back then was to keep things simple. In practice, you can set multiple keywords by putting a comma- (or semicolon-) separated list in the string value, e.g. subject = "Light Management; Perovskite/Silicon Tandem Solar Cells; Nanotexture; Bayesian Optimization; Finite Element Method". When generating DataCite metadata from that, this may translate to:

  <subjects>
    <subject>Light Management</subject>
    <subject>Perovskite/Silicon Tandem Solar Cells</subject>
    <subject>Nanotexture</subject>
    <subject>Bayesian Optimization</subject>
    <subject>Finite Element Method</subject>
  </subjects>
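The translation above could be sketched as follows. This is only a minimal Python illustration, assuming that both commas and semicolons act as separators and that empty items are dropped; the helper name `subjects_element` is hypothetical and not part of ICAT or any existing tool.

```python
import re
import xml.etree.ElementTree as ET


def subjects_element(subject):
    """Build a DataCite <subjects> element from a delimited keyword string.

    Hypothetical helper: splits on commas and semicolons, strips
    surrounding whitespace, and skips empty items.
    """
    root = ET.Element("subjects")
    for keyword in re.split(r"[;,]", subject):
        keyword = keyword.strip()
        if keyword:
            ET.SubElement(root, "subject").text = keyword
    return root


subject = "Light Management; Perovskite/Silicon Tandem Solar Cells; Nanotexture"
print(ET.tostring(subjects_element(subject), encoding="unicode"))
```

Note that a keyword that itself contains a comma or semicolon cannot be represented this way, which is one limitation of the delimited-string approach.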

This works fine as long as you only need to set simple keywords. But the DataCite property Subject has the subproperties subjectScheme, schemeURI, and valueURI, and we might want to use them to set subjects from a controlled vocabulary, as in:

  <subjects>
    <subject schemeURI="http://purl.org/pan-science/PaNET/" subjectScheme="The Photon and Neutron Experimental Techniques Ontology" valueURI="http://purl.org/pan-science/PaNET/PaNET01217">neutron diffraction</subject>
  </subjects>

That is very difficult to encode in the current ICAT schema.
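To make the difficulty concrete: each such subject carries four separate values (the keyword text plus three attributes), which a single delimited string has no natural place for. A minimal Python sketch of building one such element, where the helper name `scheme_subject` and its parameter names are assumptions made for illustration:

```python
import xml.etree.ElementTree as ET


def scheme_subject(text, subject_scheme, scheme_uri, value_uri):
    """Build one DataCite <subject> element with controlled-vocabulary attributes.

    Hypothetical helper; the attribute names follow the DataCite example above.
    """
    elem = ET.Element(
        "subject",
        {
            "subjectScheme": subject_scheme,
            "schemeURI": scheme_uri,
            "valueURI": value_uri,
        },
    )
    elem.text = text
    return elem


elem = scheme_subject(
    "neutron diffraction",
    "The Photon and Neutron Experimental Techniques Ontology",
    "http://purl.org/pan-science/PaNET/",
    "http://purl.org/pan-science/PaNET/PaNET01217",
)
print(ET.tostring(elem, encoding="unicode"))
```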

So I suggest adding the following new entity type:

Subject

Subject, keyword, classification code, or key phrase describing a data publication

Constraint: dataPublication, name

Relationships:

| Card | Class | Field |
|------|-------|-------|
| 1,1 | DataPublication | dataPublication |

Other fields:

| Field | Type |
|-------|------|
| name | String[255] NOT NULL |
| pid | String[255] |
| subjectScheme | String[255] |
| schemeURI | String[255] |
| valueURI | String[255] |
| classificationCode | String[255] |

(Obviously, this would also add the corresponding new one-to-many relation to DataPublication.)
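The proposed entity and its constraint can be mirrored in a few lines of plain Python. This is only an illustrative sketch, not the ICAT server's entity code nor any client API: the classes below are stand-ins, `add_subject` is a hypothetical method, and the checks approximate what the database would enforce (NOT NULL on name, the 255-character limit, and uniqueness of name within one data publication).

```python
from dataclasses import dataclass, field
from typing import List, Optional

MAX_LEN = 255  # String[255] in the proposed schema


@dataclass
class Subject:
    name: str  # NOT NULL
    pid: Optional[str] = None
    subjectScheme: Optional[str] = None
    schemeURI: Optional[str] = None
    valueURI: Optional[str] = None
    classificationCode: Optional[str] = None

    def __post_init__(self):
        if self.name is None:
            raise ValueError("name must not be None")
        # Every field is a String[255], so check the length of each set value.
        for attr, value in vars(self).items():
            if value is not None and len(value) > MAX_LEN:
                raise ValueError(f"{attr} exceeds {MAX_LEN} characters")


@dataclass
class DataPublication:
    # The new one-to-many relation: one publication, many subjects.
    subjects: List[Subject] = field(default_factory=list)

    def add_subject(self, subject):
        # Constraint (dataPublication, name): name is unique per publication.
        if any(s.name == subject.name for s in self.subjects):
            raise ValueError(f"duplicate subject name: {subject.name!r}")
        self.subjects.append(subject)


pub = DataPublication()
pub.add_subject(Subject(name="Nanotexture"))
pub.add_subject(
    Subject(
        name="neutron diffraction",
        subjectScheme="The Photon and Neutron Experimental Techniques Ontology",
        schemeURI="http://purl.org/pan-science/PaNET/",
        valueURI="http://purl.org/pan-science/PaNET/PaNET01217",
    )
)
```

With this shape, a simple keyword sets only name, while a controlled-vocabulary term fills in the scheme fields as well.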

@RKrahl RKrahl added the schema this involves changes to the ICAT schema label Mar 15, 2024

RKrahl commented Apr 25, 2024

As discussed in the ICAT Schema Discussion on April 2nd and in the collaboration meeting today, this change should be in a version 7.0 release that we aim to make in the second half of this year.

@RKrahl RKrahl added this to the 7.0.0 milestone Apr 25, 2024