Configuring file stores (storage drivers) for individual datasets #7272
Conversation
Looks good to me. The only ~issue I see is that there's no UI indicator that a Dataset isn't inheriting the storageDriver from its Dataverse, so it may not be obvious when a Dataset has been switched. No reason that can't be handled as a future issue, though.
We don't otherwise have a UI indicator showing which store the files are going to when it's inherited from the dataverse... do we?
… checked... all needed to be changed not to assume that it's inherited from the parent dataverse. (#6872)
…IQSS/dataverse into 6872-direct-upload-for-datasets
Good catch! I was just thinking that an admin can check (or change) the Dataverse setting under General Information, but that value may not be what the dataset has.
(I meant to reply to this comment, not to edit it earlier; on my phone, sorry)
Please note that this is not part of my PR; this is an existing API that was merged some months ago.
FWIW: TDL has s3 with label "TDL" and s3tacc with label "TACC". Harvard might want labels like 'Normal' and 'Large Data', etc.
I was thinking something along these lines - there may be a situation where we need to spell out the name of the institution that owns a specific storage location (or even the name of the grant that pays for it)... We may need/want to display this kind of info on the page later on...
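For context, the labeled stores discussed above are defined via JVM options. A hedged sketch, following the multiple-store configuration described in the Dataverse Installation Guide; the store id "s3tacc" and label "TACC" come from the TDL example above, and the bucket name is a placeholder:

```shell
# Define an S3 store with id "s3tacc" and a human-readable label
# (shown in the storage selection UI for dataverses):
./asadmin create-jvm-options "-Ddataverse.files.s3tacc.type=s3"
./asadmin create-jvm-options "-Ddataverse.files.s3tacc.label=TACC"
./asadmin create-jvm-options "-Ddataverse.files.s3tacc.bucket-name=my-tacc-bucket"
# Enable direct (redirected) S3 upload for this store:
./asadmin create-jvm-options "-Ddataverse.files.s3tacc.upload-redirect=true"
```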
True; just checked in a fix. Should be printing the same "no such dataset" error message as the PUT and DELETE versions now. |
What this PR does / why we need it:
Even though the title of the issue specifically mentions "direct S3 upload", this PR adds something more general: the ability to designate a file store for a specific dataset. The file store in question does not have to be S3, but enabling direct S3 upload is the main use case behind this PR. For example, to enable direct upload for a specific dataset in production without opening it up for everybody, we would run
curl -H "X-Dataverse-key: XXX" -X PUT -d s3direct http://localhost:8080/api/datasets/NNNN/storageDriver
The words "API based" in the issue title refer to the fact that a file store can be configured for a dataset via the API only; i.e., there is no GUI for it (as there is for dataverses). But once a direct-upload-enabled store has been assigned to a dataset, uploads work both via the dataset page and via the API (using the DVUploader utility), just as when direct upload is enabled dataverse-wide.
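For completeness, the same endpoint also supports GET and DELETE (the review comment above mentions the PUT and DELETE versions). A hedged sketch, reusing the placeholder token "XXX" and dataset id "NNNN" from the PUT example; the `|| true` just tolerates connection errors when no local Dataverse server is running:

```shell
API_TOKEN="XXX"     # placeholder admin API token, as in the PUT example
DATASET_ID="NNNN"   # placeholder numeric dataset id
BASE="http://localhost:8080/api/datasets/${DATASET_ID}/storageDriver"

# Check which store the dataset is currently assigned to:
curl -H "X-Dataverse-key: $API_TOKEN" "$BASE" || true

# Remove the override, so the dataset inherits the storage driver
# from its parent dataverse again:
curl -H "X-Dataverse-key: $API_TOKEN" -X DELETE "$BASE" || true
```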
Which issue(s) this PR closes:
Closes #6872
Special notes for your reviewer:
Suggestions on how to test this:
Does this PR introduce a user interface change? If mockups are available, please link/include them here:
Is there a release notes update needed for this change?:
Additional documentation: