GDCC/8605-add-archival-status-support #8696
Changes from 12 commits
Original file line number | Diff line number | Diff line change |
---|---|---|
|
@@ -1873,6 +1873,61 @@ The API call requires a Json body that includes the list of the fileIds that the | |
export JSON='{"fileIds":[300,301]}' | ||
|
||
curl -H "X-Dataverse-key: $API_TOKEN" -H "Content-Type:application/json" "$SERVER_URL/api/datasets/:persistentId/files/actions/:unset-embargo?persistentId=$PERSISTENT_IDENTIFIER" -d "$JSON" | ||
|
||
|
||
Get the Archival Status of a Dataset By Version | ||
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | ||
|
||
Archiving is an optional feature that may be configured for a Dataverse instance. When enabled, this API call can be used to retrieve the status. Note that this requires "superuser" credentials. | ||
|
||
/api/datasets/submitDatasetVersionToArchive/$dataset-id/$version/status returns the archival status of the specified dataset version. | ||
|
||
The response is a JSON object that will contain a "status", which may be "success", "pending", or "failure", and a "message", which is specific to the archival system. For "success" the message should provide an identifier or link to the archival copy. For example: | ||
|
||
.. code-block:: bash | ||
|
||
export API_TOKEN=xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx | ||
export SERVER_URL=https://demo.dataverse.org | ||
export PERSISTENT_IDENTIFIER=doi:10.5072/FK2/7U7YBV | ||
export VERSION=1.0 | ||
|
||
curl -H "X-Dataverse-key: $API_TOKEN" -H "Accept:application/json" "$SERVER_URL/api/datasets/submitDatasetVersionToArchive/$VERSION/status?persistentId=$PERSISTENT_IDENTIFIER" | ||
|
||
Set the Archival Status of a Dataset By Version | ||
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | ||
|
||
Archiving is an optional feature that may be configured for a Dataverse instance. When enabled, this API call can be used to set the status. Note that this is intended to be used by the archival system and requires "superuser" credentials. | ||
You have to give DRS (or whatever archival system) a superuser token? Hmm, seems a bit suboptimal but I suppose anything else is not an easy fix.
Yeah - this is an example of info that shouldn't be editable by a user who can touch that dataset (as it represents the state of an external archiving system) yet doesn't seem to fit being in /api/admin (limited to localhost usually which would make it inaccessible, or with unblock-key access would allow the archiver to make all the other admin calls). It may be that signed URLs would help here, e.g. giving the archiver URLs to set archival status for a limited time. |
||
|
||
/api/datasets/submitDatasetVersionToArchive/$dataset-id/$version/status sets the archival status of the specified dataset version. | ||
|
||
The body is a JSON object that must contain a "status", which may be "success", "pending", or "failure", and a "message", which is specific to the archival system. For "success" the message should provide an identifier or link to the archival copy. For example: | ||
|
||
.. code-block:: bash | ||
|
||
export API_TOKEN=xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx | ||
export SERVER_URL=https://demo.dataverse.org | ||
export PERSISTENT_IDENTIFIER=doi:10.5072/FK2/7U7YBV | ||
export VERSION=1.0 | ||
export JSON='{"status":"failure","message":"Something went wrong"}' | ||
|
||
curl -H "X-Dataverse-key: $API_TOKEN" -H "Content-Type:application/json" -X PUT "$SERVER_URL/api/datasets/submitDatasetVersionToArchive/$VERSION/status?persistentId=$PERSISTENT_IDENTIFIER" -d "$JSON" | ||
|
||
Delete the Archival Status of a Dataset By Version | ||
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | ||
|
||
Archiving is an optional feature that may be configured for a Dataverse instance. When enabled, this API call can be used to delete the status. Note that this is intended to be used by the archival system and requires "superuser" credentials. | ||
|
||
/api/datasets/submitDatasetVersionToArchive/$dataset-id/$version/status deletes the archival status of the specified dataset version. | ||
|
||
.. code-block:: bash | ||
|
||
export API_TOKEN=xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx | ||
export SERVER_URL=https://demo.dataverse.org | ||
export PERSISTENT_IDENTIFIER=doi:10.5072/FK2/7U7YBV | ||
export VERSION=1.0 | ||
|
||
curl -H "X-Dataverse-key: $API_TOKEN" -H "Content-Type:application/json" -X DELETE "$SERVER_URL/api/datasets/submitDatasetVersionToArchive/$VERSION/status?persistentId=$PERSISTENT_IDENTIFIER" | ||
Do we need application/json on DELETE? Less is more, right? |
||
|
||
|
||
Files | ||
----- | ||
|
Original file line number | Diff line number | Diff line change |
---|---|---|
|
@@ -1187,4 +1187,12 @@ private DatasetVersion getPreviousVersionWithUnf(DatasetVersion datasetVersion) | |
return null; | ||
} | ||
|
||
/** | ||
* Merges the passed datasetversion to the persistence context. | ||
* @param ver the DatasetVersion whose new state we want to persist. | ||
* @return The managed entity representing {@code ver}. | ||
*/ | ||
public DatasetVersion merge( DatasetVersion ver ) { | ||
return em.merge(ver); | ||
} | ||
Comment on lines +1195 to +1197
I'm surprised this merge method doesn't already exist on DatasetVersionServiceBean.java. Is it because most changes to versions happen through commands? Is it because once a version is published there's no need to go back and change the version (except for deaccessioning, I guess, which is a command)? I don't think it's bad to add this method but I wonder why we're only adding it now.
Yeah - I think everything uses a Command of some sort. I was also surprised that it didn't exist as the dataset service has a merge() and the file service has several methods that don't do much more than a merge. |
||
} // end class |
Original file line number | Diff line number | Diff line change |
---|---|---|
|
@@ -87,6 +87,7 @@ | |
import edu.harvard.iq.dataverse.util.json.JSONLDUtil; | ||
import edu.harvard.iq.dataverse.util.json.JsonLDTerm; | ||
import edu.harvard.iq.dataverse.util.json.JsonParseException; | ||
import edu.harvard.iq.dataverse.util.json.JsonUtil; | ||
import edu.harvard.iq.dataverse.search.IndexServiceBean; | ||
import static edu.harvard.iq.dataverse.util.json.JsonPrinter.*; | ||
import static edu.harvard.iq.dataverse.util.json.NullSafeJsonBuilder.jsonObjectBuilder; | ||
|
@@ -216,6 +217,9 @@ public class Datasets extends AbstractApiBean { | |
@Inject | ||
DataverseRoleServiceBean dataverseRoleService; | ||
|
||
@EJB | ||
DatasetVersionServiceBean datasetversionService; | ||
|
||
/** | ||
* Used to consolidate the way we parse and handle dataset versions. | ||
* @param <T> | ||
|
@@ -3282,4 +3286,107 @@ public Response getCurationStates() throws WrappedResponse { | |
csvSB.append("\n"); | ||
return ok(csvSB.toString(), MediaType.valueOf(FileUtil.MIME_TYPE_CSV), "datasets.status.csv"); | ||
} | ||
|
||
//APIs to manage archival status | ||
|
||
@GET | ||
@Produces(MediaType.APPLICATION_JSON) | ||
@Path("/submitDatasetVersionToArchive/{id}/{version}/status") | ||
submitDatasetVersionToArchive is a weird name. submitDataVersionToArchive (Data instead of Dataset) is under /api/admin and documented under installation/config.html
Yes. So far it ~mirrors the /api/admin/submitDatasetVersionToArchive call (name changed to say 'Dataset' in #8610 which hasn't merged yet), which seemed reasonable when it was a single call. With the status calls, I initially had them in /api/admin as well, but eventually decided they should move to /api/datasets (see the comment about superuser being required on those). With that, they could be renamed - e.g. to /api/datasets/<id>/<version>/archivalStatus .
I like the new name ending with |
||
public Response getDatasetVersionToArchiveStatus(@PathParam("id") String dsid, | ||
@PathParam("version") String versionNumber) { | ||
|
||
try { | ||
AuthenticatedUser au = findAuthenticatedUserOrDie(); | ||
if (!au.isSuperuser()) { | ||
return error(Response.Status.FORBIDDEN, "Superusers only."); | ||
} | ||
Dataset ds = findDatasetOrDie(dsid); | ||
|
||
DatasetVersion dv = datasetversionService.findByFriendlyVersionNumber(ds.getId(), versionNumber); | ||
Any reason not to use getDatasetVersionOrDie here (and in the other two calls to findByFriendlyVersionNumber in this PR)?
Not sure I saw it but looking now, getDatasetVersionOrDie doesn't support the friendlyVersionNumber syntax which is a ~requirement here (that's the convention used in the Bag naming and metadata that the archiver gets). I can go ahead and add parsing for that which would have the presumably useful side effect of letting other datasetversion api calls support the friendly version number as well.
It should. I'm seeing
Yep - you're right. I missed the string parsing in handleVersion(). I'll update the PR to use it.
Hmm - calls to this are counted with MakeDataCounts. I guess since these are API calls they should count? (although they are clearly system-level interactions and not end-user interaction with the data). In any case, I went ahead for now.
I dunno. I'd leave this out of Make Data Count. Like you said, these are systems setting and retrieving archival status messages. The spirit of Make Data Count is views/investigations and downloads/requests. People and machines looking at data. |
||
if (dv.getArchivalCopyLocation() == null) { | ||
return error(Status.NO_CONTENT, "This dataset version has not been archived"); | ||
} else { | ||
JsonObject status = JsonUtil.getJsonObject(dv.getArchivalCopyLocation()); | ||
return ok(status); | ||
} | ||
} catch (WrappedResponse wr) { | ||
return wr.getResponse(); | ||
} | ||
} | ||
|
||
@PUT | ||
@Consumes(MediaType.APPLICATION_JSON) | ||
@Path("/submitDatasetVersionToArchive/{id}/{version}/status") | ||
public Response setDatasetVersionToArchiveStatus(@PathParam("id") String dsid, | ||
@PathParam("version") String versionNumber, JsonObject update) { | ||
|
||
logger.info(JsonUtil.prettyPrint(update)); | ||
pdurbin marked this conversation as resolved.
|
||
try { | ||
AuthenticatedUser au = findAuthenticatedUserOrDie(); | ||
|
||
if (!au.isSuperuser()) { | ||
return error(Response.Status.FORBIDDEN, "Superusers only."); | ||
} | ||
} catch (WrappedResponse wr) { | ||
return wr.getResponse(); | ||
} | ||
if (update.containsKey(DatasetVersion.STATUS) | ||
&& update.containsKey(DatasetVersion.MESSAGE)) { | ||
String status = update.getString(DatasetVersion.STATUS); | ||
if (status.equals(DatasetVersion.PENDING) | ||
|| status.equals(DatasetVersion.FAILURE) | ||
|| status.equals(DatasetVersion.SUCCESS)) { | ||
pdurbin marked this conversation as resolved.
|
||
|
||
try { | ||
Dataset ds; | ||
|
||
ds = findDatasetOrDie(dsid); | ||
|
||
DatasetVersion dv = datasetversionService.findByFriendlyVersionNumber(ds.getId(), versionNumber); | ||
if(dv==null) { | ||
return error(Status.NOT_FOUND, "Dataset version not found"); | ||
} | ||
|
||
dv.setArchivalCopyLocation(JsonUtil.prettyPrint(update)); | ||
dv = datasetversionService.merge(dv); | ||
logger.info("location now: " + dv.getArchivalCopyLocation()); | ||
logger.info("status now: " + dv.getArchivalCopyLocationStatus()); | ||
logger.info("message now: " + dv.getArchivalCopyLocationMessage()); | ||
|
||
return ok("Status updated"); | ||
|
||
} catch (WrappedResponse wr) { | ||
return wr.getResponse(); | ||
} | ||
} | ||
} | ||
return error(Status.BAD_REQUEST, "Unacceptable status format"); | ||
} | ||
|
||
@DELETE | ||
@Produces(MediaType.APPLICATION_JSON) | ||
@Path("/submitDatasetVersionToArchive/{id}/{version}/status") | ||
public Response deleteDatasetVersionToArchiveStatus(@PathParam("id") String dsid, | ||
@PathParam("version") String versionNumber) { | ||
|
||
try { | ||
AuthenticatedUser au = findAuthenticatedUserOrDie(); | ||
if (!au.isSuperuser()) { | ||
return error(Response.Status.FORBIDDEN, "Superusers only."); | ||
} | ||
Dataset ds = findDatasetOrDie(dsid); | ||
|
||
DatasetVersion dv = datasetversionService.findByFriendlyVersionNumber(ds.getId(), versionNumber); | ||
if (dv == null) { | ||
return error(Status.NOT_FOUND, "Dataset version not found"); | ||
} | ||
dv.setArchivalCopyLocation(null); | ||
dv = datasetversionService.merge(dv); | ||
|
||
return ok("Status deleted"); | ||
|
||
} catch (WrappedResponse wr) { | ||
return wr.getResponse(); | ||
} | ||
} | ||
} |
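The body checks in setDatasetVersionToArchiveStatus above can be read as a single predicate over the request. The following is only a minimal sketch of that idea, assuming the keys are the literal strings "status" and "message" and the allowed values are the three listed in the API guide text in this PR. The class name ArchivalStatusCheck and the use of a plain Map in place of JsonObject are illustrative assumptions, not part of the PR.

```java
import java.util.Map;
import java.util.Set;

// Hypothetical helper, not part of the PR: mirrors the body checks of the PUT endpoint.
public class ArchivalStatusCheck {

    // Allowed values as listed in the API guide text ("success", "pending", "failure").
    public static final Set<String> ALLOWED = Set.of("success", "pending", "failure");

    // Returns true only when both keys are present and the status value is recognized.
    public static boolean isAcceptable(Map<String, String> body) {
        if (body == null || !body.containsKey("status") || !body.containsKey("message")) {
            return false;
        }
        String status = body.get("status");
        // Guard against null before the lookup: Set.of(...).contains(null) throws.
        return status != null && ALLOWED.contains(status);
    }

    public static void main(String[] args) {
        System.out.println(isAcceptable(Map.of("status", "failure", "message", "Something went wrong")));
        System.out.println(isAcceptable(Map.of("status", "archived", "message", "n/a")));
    }
}
```

Pulling the check into one place like this would let the endpoint return the "Unacceptable status format" error from a single branch instead of falling through nested conditionals.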
How come :persistentId isn't in the URL? Are database IDs supported as well as PIDs? They should be, like all other native API endpoints.
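To illustrate the question above, here is a minimal sketch of how a client might build the status URL for either kind of identifier. It assumes the {id} path parameter is resolved the way other native API endpoints resolve it: a numeric database id in the path, or the literal :persistentId in the path plus a persistentId query parameter. The class and method names are hypothetical, and whether this PR's endpoint accepts both forms is exactly what the reviewer is asking.

```java
// Hypothetical client-side helper, not part of the PR.
public class ArchivalStatusUrl {

    // Builds the status URL from a server URL, a dataset reference, and a friendly version such as "1.0".
    // A reference made only of digits is treated as a database id; anything else is treated as a PID.
    public static String build(String serverUrl, String datasetRef, String version) {
        String base = serverUrl + "/api/datasets/submitDatasetVersionToArchive/";
        if (datasetRef.matches("\\d+")) {
            return base + datasetRef + "/" + version + "/status";
        }
        // The PID is passed through unencoded, as in the curl examples in the guide.
        return base + ":persistentId/" + version + "/status?persistentId=" + datasetRef;
    }

    public static void main(String[] args) {
        System.out.println(build("https://demo.dataverse.org", "24", "1.0"));
        System.out.println(build("https://demo.dataverse.org", "doi:10.5072/FK2/7U7YBV", "1.0"));
    }
}
```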