Merge branch 'develop' of github.com:IQSS/dataverse into 9692-files-api-extension-display-data
GPortas committed Aug 1, 2023
2 parents 1ff9d90 + d9f0952 commit a8a367a
Showing 17 changed files with 279 additions and 153 deletions.
15 changes: 15 additions & 0 deletions doc/release-notes/8889-2-filepids-in-collections-changes.md
Original file line number Diff line number Diff line change
@@ -0,0 +1,15 @@
The default for whether PIDs are registered for files or not is now false.

Installations where file PIDs were enabled by default will have to add the :FilePIDsEnabled = true setting to maintain the existing functionality.

Add step to install:

If your installation did not have :FilePIDsEnabled set, you will need to set it to true to keep file PIDs enabled:

curl -X PUT -d 'true' http://localhost:8080/api/admin/settings/:FilePIDsEnabled



It is now possible to allow File PIDs to be enabled/disabled per collection. See the [:AllowEnablingFilePIDsPerCollection](https://guides.dataverse.org/en/latest/installation/config.html#allowenablingfilepidspercollection) section of the Configuration guide for details.

For example, registration of PIDs for files can now be enabled in a specific collection when it is disabled instance-wide. Or it can be disabled in specific collections where it is enabled by default.
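
As a rough illustration of how the instance-wide default and a per-collection override might combine, here is a minimal Java sketch. The class and method names (`FilePidPolicySketch`, `filePidsEnabledFor`, `maySetCollectionOverride`) are hypothetical and are not taken from the Dataverse codebase. The sketch assumes that a collection-level value, when present, simply takes precedence over `:FilePIDsEnabled`, and that only superusers may set it while `:AllowEnablingFilePIDsPerCollection` is true. It does not model sub-collections, or what happens to existing overrides if that setting is later turned off.

```java
import java.util.Optional;

/** Hypothetical helper for illustration only; not part of Dataverse. */
final class FilePidPolicySketch {

    /**
     * @param instanceWide       value of :FilePIDsEnabled (false when the setting is absent)
     * @param collectionOverride the collection's filePIDsEnabled attribute, empty if never set
     * @return whether file PIDs would be registered for files in this collection
     */
    static boolean filePidsEnabledFor(boolean instanceWide, Optional<Boolean> collectionOverride) {
        // Assumption: an explicit collection value always wins over the instance default.
        return collectionOverride.orElse(instanceWide);
    }

    /** Assumption: changing the override needs a superuser and the "allow" setting. */
    static boolean maySetCollectionOverride(boolean isSuperuser, boolean allowPerCollection) {
        return isSuperuser && allowPerCollection;
    }
}
```

Under these assumptions, an installation that leaves `:FilePIDsEnabled` unset (false) can still register file PIDs in one collection by setting that collection's override to true.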
7 changes: 5 additions & 2 deletions doc/sphinx-guides/source/admin/dataverses-datasets.rst
@@ -154,6 +154,8 @@ In the following example, the database id of the file is 42::

export FILE_ID=42
curl "http://localhost:8080/api/admin/$FILE_ID/registerDataFile"
This method will return a FORBIDDEN response if minting of file PIDs is not enabled for the collection the file is in. (Note that it is possible to have file PIDs enabled for a specific collection, even when it is disabled for the Dataverse installation as a whole. See :ref:`collection-attributes-api` in the Native API Guide.)

Mint PIDs for all unregistered published files in the specified collection
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
@@ -162,7 +164,8 @@ The following API will register the PIDs for all the yet unregistered published

curl "http://localhost:8080/api/admin/registerDataFiles/{collection_alias}"

It will not attempt to register the datafiles in its sub-collections, so this call will need to be repeated on any sub-collections where files need to be registered as well. File-level PID registration must be enabled on the collection. (Note that it is possible to have it enabled for a specific collection, even when it is disabled for the Dataverse installation as a whole. See :ref:`collection-attributes-api` in the Native API Guide.)
It will not attempt to register the datafiles in its sub-collections, so this call will need to be repeated on any sub-collections where files need to be registered as well.
File-level PID registration must be enabled on the collection. (Note that it is possible to have it enabled for a specific collection, even when it is disabled for the Dataverse installation as a whole. See :ref:`collection-attributes-api` in the Native API Guide.)

This API will sleep for 1 second between registration calls by default. A longer sleep interval can be specified with an optional ``sleep=`` parameter::

@@ -171,7 +174,7 @@ This API will sleep for 1 second between registration calls by default. A longer
Mint PIDs for ALL unregistered files in the database
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

The following API will attempt to register the PIDs for all the published files in your instance that do not yet have them::
The following API will attempt to register the PIDs for all the published files in your instance, in collections that allow file PIDs, that do not yet have them::

curl http://localhost:8080/api/admin/registerDataFileAll
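
The pause between registration calls described above for the collection-level API can be pictured with the following Java sketch. It is only an illustration of a throttled loop, not the Dataverse implementation: `ThrottledRegistrationSketch`, `registerAll`, and the `registerPid` callback are hypothetical names, and the interval is passed in milliseconds here for convenience.

```java
import java.util.List;
import java.util.function.Consumer;

/** Hypothetical illustration of sleeping between PID registration calls. */
final class ThrottledRegistrationSketch {

    /**
     * Calls registerPid for each file id, pausing sleepMillis between calls.
     *
     * @return the number of files for which registration was attempted
     */
    static int registerAll(List<Long> fileIds, Consumer<Long> registerPid, long sleepMillis) {
        int attempted = 0;
        for (int i = 0; i < fileIds.size(); i++) {
            registerPid.accept(fileIds.get(i));
            attempted++;
            if (i < fileIds.size() - 1) {
                try {
                    // Pause so the PID provider is not flooded with requests.
                    Thread.sleep(sleepMillis);
                } catch (InterruptedException e) {
                    // Restore the interrupt flag and stop early.
                    Thread.currentThread().interrupt();
                    break;
                }
            }
        }
        return attempted;
    }
}
```

With the default of one second, `sleepMillis` would be 1000; a larger value corresponds to passing a longer ``sleep=`` interval.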

2 changes: 2 additions & 0 deletions doc/sphinx-guides/source/admin/make-data-count.rst
@@ -72,6 +72,8 @@ Enable or Disable Display of Make Data Count Metrics

By default, when MDC logging is enabled (when ``:MDCLogPath`` is set), your Dataverse installation will display MDC metrics instead of its internal (legacy) metrics. You can avoid this (e.g. to collect MDC metrics for some period of time before starting to display them) by setting ``:DisplayMDCMetrics`` to false.

You can also decide to display MDC metrics along with Dataverse's traditional download counts from the time before MDC was enabled. To do this, set the :ref:`:MDCStartDate` to when you started MDC logging.

The following discussion assumes ``:MDCLogPath`` has been set to ``/usr/local/payara5/glassfish/domains/domain1/logs/mdc``

Configure Counter Processor
2 changes: 1 addition & 1 deletion doc/sphinx-guides/source/api/native-api.rst
@@ -746,7 +746,7 @@ The following attributes are supported:
* ``name`` Name
* ``description`` Description
* ``affiliation`` Affiliation
* ``filePIDsEnabled`` ("true" or "false") Enables or disables registration of file-level PIDs in datasets within the collection (overriding the instance-wide setting).
* ``filePIDsEnabled`` ("true" or "false") Restricted to use by superusers and only when the :ref:`:AllowEnablingFilePIDsPerCollection <:AllowEnablingFilePIDsPerCollection>` setting is true. Enables or disables registration of file-level PIDs in datasets within the collection (overriding the instance-wide setting).


Datasets
45 changes: 40 additions & 5 deletions doc/sphinx-guides/source/installation/config.rst
@@ -248,7 +248,7 @@ this provider.
- :ref:`:Shoulder <:Shoulder>`
- :ref:`:IdentifierGenerationStyle <:IdentifierGenerationStyle>` (optional)
- :ref:`:DataFilePIDFormat <:DataFilePIDFormat>` (optional)
- :ref:`:FilePIDsEnabled <:FilePIDsEnabled>` (optional, defaults to true)
- :ref:`:FilePIDsEnabled <:FilePIDsEnabled>` (optional, defaults to false)

.. _pids-handle-configuration:

@@ -297,7 +297,7 @@ Here are the configuration options for PermaLinks:
- :ref:`:Shoulder <:Shoulder>`
- :ref:`:IdentifierGenerationStyle <:IdentifierGenerationStyle>` (optional)
- :ref:`:DataFilePIDFormat <:DataFilePIDFormat>` (optional)
- :ref:`:FilePIDsEnabled <:FilePIDsEnabled>` (optional, defaults to true)
- :ref:`:FilePIDsEnabled <:FilePIDsEnabled>` (optional, defaults to false)

.. _auth-modes:

@@ -2775,14 +2775,35 @@ timestamps.
:FilePIDsEnabled
++++++++++++++++

Toggles publishing of file-level PIDs for the entire installation. By default this setting is absent and Dataverse Software assumes it to be true. If enabled, the registration will be performed asynchronously (in the background) during publishing of a dataset.
Toggles publishing of file-level PIDs for the entire installation. By default this setting is absent and Dataverse Software assumes it to be false. If enabled, the registration will be performed asynchronously (in the background) during publishing of a dataset.

If you don't want to register file-based PIDs for your installation, set:
It is possible to override the installation-wide setting for specific collections, see :ref:`:AllowEnablingFilePIDsPerCollection <:AllowEnablingFilePIDsPerCollection>`. For example, registration of PIDs for files can be enabled in a specific collection when it is disabled instance-wide. Or it can be disabled in specific collections where it is enabled by default. See :ref:`collection-attributes-api` for details.

To enable file-level PIDs for the entire installation::

``curl -X PUT -d 'true' http://localhost:8080/api/admin/settings/:FilePIDsEnabled``


If you don't want to register file-based PIDs for your entire installation::

``curl -X PUT -d 'false' http://localhost:8080/api/admin/settings/:FilePIDsEnabled``

.. _:AllowEnablingFilePIDsPerCollection:

:AllowEnablingFilePIDsPerCollection
+++++++++++++++++++++++++++++++++++

Toggles whether superusers can change the File PIDs policy per collection. By default this setting is absent and Dataverse Software assumes it to be false.

For example, if this setting is true, registration of PIDs for files can be enabled in a specific collection when it is disabled instance-wide. Or it can be disabled in specific collections where it is enabled by default. See :ref:`collection-attributes-api` for details.

To enable setting file-level PIDs per collection::

``curl -X PUT -d 'true' http://localhost:8080/api/admin/settings/:AllowEnablingFilePIDsPerCollection``


When :AllowEnablingFilePIDsPerCollection is true, setting File PIDs to be enabled/disabled for a given collection can be done via the Native API - see :ref:`collection-attributes-api` in the Native API Guide.

It is possible to override the installation-wide setting for specific collections. For example, registration of PIDs for files can be enabled in a specific collection when it is disabled instance-wide. Or it can be disabled in specific collections where it is enabled by default. See :ref:`collection-attributes-api` for details.

.. _:IndependentHandleService:

@@ -3469,6 +3490,20 @@ Sets the path where the raw Make Data Count logs are stored before being process

``curl -X PUT -d 'false' http://localhost:8080/api/admin/settings/:DisplayMDCMetrics``

.. _:MDCStartDate:

:MDCStartDate
+++++++++++++

It is possible to display MDC metrics (as of the start date of MDC logging) along with legacy download counts, generated before MDC was enabled.
This is enabled via the new setting ``:MDCStartDate`` that specifies the cut-over date. If a dataset has any legacy access counts collected prior to that date, those numbers will be displayed in addition to the MDC views and downloads recorded since then.
(Nominally, this date should be when your installation started logging MDC metrics but it can be any date after that if desired.)


``curl -X PUT -d '2019-10-01' http://localhost:8080/api/admin/settings/:MDCStartDate``
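
To make the cut-over idea concrete, the following Java sketch adds legacy download counts recorded before the start date to the MDC count recorded since then. This is an assumption-laden illustration: `MdcCutoverSketch` and `displayedDownloads` are hypothetical names, the per-day map of legacy counts is an invented data shape, and the source does not say whether Dataverse shows a single sum or separate figures.

```java
import java.time.LocalDate;
import java.util.Map;

/** Hypothetical illustration of combining legacy and MDC download counts. */
final class MdcCutoverSketch {

    /**
     * @param legacyDownloadsByDay   legacy download counts keyed by day (assumed shape)
     * @param mdcDownloadsSinceStart MDC downloads recorded on or after mdcStartDate
     * @param mdcStartDate           the value of :MDCStartDate
     * @return legacy downloads strictly before the start date plus the MDC downloads
     */
    static long displayedDownloads(Map<LocalDate, Long> legacyDownloadsByDay,
                                   long mdcDownloadsSinceStart,
                                   LocalDate mdcStartDate) {
        long legacyBeforeCutover = legacyDownloadsByDay.entrySet().stream()
                // Only counts collected prior to the cut-over date are kept.
                .filter(entry -> entry.getKey().isBefore(mdcStartDate))
                .mapToLong(Map.Entry::getValue)
                .sum();
        return legacyBeforeCutover + mdcDownloadsSinceStart;
    }
}
```

Legacy counts dated on or after the start date are ignored in this sketch, on the assumption that MDC logging already covers that period.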



.. _:Languages:

:Languages
149 changes: 81 additions & 68 deletions src/main/java/edu/harvard/iq/dataverse/DatasetPage.java
@@ -346,7 +346,7 @@ public void setSelectedHostDataverse(Dataverse selectedHostDataverse) {

private Boolean hasRsyncScript = false;

private Boolean hasTabular = false;
/*private Boolean hasTabular = false;*/


/**
@@ -355,6 +355,12 @@ public void setSelectedHostDataverse(Dataverse selectedHostDataverse) {
* sometimes you want to know about the current version ("no tabular files
* currently"). Like all files, tabular files can be deleted.
*/
/**
* There doesn't seem to be an actual real life case where we need to know
* if this dataset "has ever had a tabular file" - for all practical purposes
* only the versionHasTabular appears to be in use. I'm going to remove the
* other boolean.
*/
private boolean versionHasTabular = false;

private boolean showIngestSuccess;
@@ -1881,30 +1887,58 @@ private String init(boolean initFull) {
if (persistentId != null) {
setIdByPersistentId();
}

if (this.getId() != null) {
// Set Working Version and Dataset by Dataset Id and Version
dataset = datasetService.findDeep(this.getId());

// We are only performing these lookups to obtain the database id
// of the version that we are displaying, and then we will use it
// to perform a .findDeep(versionId); see below.

// TODO: replace the code block below, the combination of
// datasetService.find(id) and datasetVersionService.selectRequestedVersion()
// with some optimized, direct query-based way of obtaining
// the numeric id of the requested DatasetVersion (and that's
// all we need, we are not using any of the entities produced
// below.

dataset = datasetService.find(this.getId());

if (dataset == null) {
// dataset is null on this branch, so log the requested id instead
logger.warning("No such dataset: " + this.getId());
return permissionsWrapper.notFound();
}
//retrieveDatasetVersionResponse = datasetVersionService.retrieveDatasetVersionById(dataset.getId(), version);
retrieveDatasetVersionResponse = datasetVersionService.selectRequestedVersion(dataset.getVersions(), version);
if (retrieveDatasetVersionResponse == null) {
return permissionsWrapper.notFound();
}
this.workingVersion = retrieveDatasetVersionResponse.getDatasetVersion();
logger.fine("retrieved version: id: " + workingVersion.getId() + ", state: " + this.workingVersion.getVersionState());

} else if (versionId != null) {
// TODO: 4.2.1 - this method is broken as of now!
// Set Working Version and Dataset by DatasetVersion Id
//retrieveDatasetVersionResponse = datasetVersionService.retrieveDatasetVersionByVersionId(versionId);


versionId = workingVersion.getId();

this.workingVersion = null;
this.dataset = null;

}

// ... And now the "real" working version lookup:

if (versionId != null) {
this.workingVersion = datasetVersionService.findDeep(versionId);
dataset = workingVersion.getDataset();
}

if (workingVersion == null) {
logger.warning("Failed to retrieve version");
return permissionsWrapper.notFound();
}

this.maxFileUploadSizeInBytes = systemConfig.getMaxFileUploadSizeForStore(dataset.getEffectiveStorageDriverId());


if (retrieveDatasetVersionResponse == null) {
return permissionsWrapper.notFound();
}


switch (selectTab){
case "dataFilesTab":
@@ -1921,16 +1955,6 @@ private String init(boolean initFull) {
break;
}

//this.dataset = this.workingVersion.getDataset();

// end: Set the workingVersion and Dataset
// ---------------------------------------
// Is the DatasetVersion or Dataset null?
//
if (workingVersion == null || this.dataset == null) {
return permissionsWrapper.notFound();
}

// Is the Dataset harvested?

if (dataset.isHarvested()) {
@@ -1958,7 +1982,7 @@ private String init(boolean initFull) {
return permissionsWrapper.notAuthorized();
}

if (!retrieveDatasetVersionResponse.wasRequestedVersionRetrieved()) {
if (retrieveDatasetVersionResponse != null && !retrieveDatasetVersionResponse.wasRequestedVersionRetrieved()) {
//msg("checkit " + retrieveDatasetVersionResponse.getDifferentVersionMessage());
JsfHelper.addWarningMessage(retrieveDatasetVersionResponse.getDifferentVersionMessage());//BundleUtil.getStringFromBundle("dataset.message.metadataSuccess"));
}
@@ -2105,23 +2129,18 @@ private String init(boolean initFull) {
displayLockInfo(dataset);
displayPublishMessage();

// TODO: replace this loop, and the loop in the method that calculates
// the total "originals" size of the dataset with direct custom queries;
// then we'll be able to drop the lookup hint for DataTable from the
// findDeep() method for the version and further speed up the lookup
// a little bit.
for (FileMetadata fmd : workingVersion.getFileMetadatas()) {
if (fmd.getDataFile().isTabularData()) {
versionHasTabular = true;
break;
}
}
for(DataFile f : dataset.getFiles()) {
// TODO: Consider uncommenting this optimization.
// if (versionHasTabular) {
// hasTabular = true;
// break;
// }
if(f.isTabularData()) {
hasTabular = true;
break;
}
}

//Show ingest success message if refresh forces a page reload after ingest success
//This is needed to display the explore buttons (the fileDownloadHelper needs to be reloaded via page
if (showIngestSuccess) {
@@ -2405,9 +2424,9 @@ private DefaultTreeNode createFileTreeNode(FileMetadata fileMetadata, TreeNode p
return fileNode;
}

public boolean isHasTabular() {
/*public boolean isHasTabular() {
return hasTabular;
}
}*/

public boolean isVersionHasTabular() {
return versionHasTabular;
@@ -2844,48 +2863,29 @@ public String refresh() {

//dataset = datasetService.find(dataset.getId());
dataset = null;
workingVersion = null;

logger.fine("refreshing working version");

DatasetVersionServiceBean.RetrieveDatasetVersionResponse retrieveDatasetVersionResponse = null;

if (persistentId != null) {
setIdByPersistentId();
if (this.getId() == null) {
logger.warning("No such dataset: "+persistentId);
return permissionsWrapper.notFound();
}
dataset = datasetService.findDeep(this.getId());
if (dataset == null) {
logger.warning("No such dataset: "+persistentId);
return permissionsWrapper.notFound();
}
retrieveDatasetVersionResponse = datasetVersionService.selectRequestedVersion(dataset.getVersions(), version);
} else if (versionId != null) {
retrieveDatasetVersionResponse = datasetVersionService.retrieveDatasetVersionByVersionId(versionId);
}
if (versionId != null) {
// versionId must have been set by now, in the init() method,
// regardless of how the page was originally called - by the dataset
// database id, by the persistent identifier, or by the db id of
// the version.
this.workingVersion = datasetVersionService.findDeep(versionId);
dataset = workingVersion.getDataset();
}


if (retrieveDatasetVersionResponse == null) {
if (this.workingVersion == null) {
// TODO:
// should probably redirect to the 404 page, if we can't find
// this version anymore.
// -- L.A. 4.2.3
return "";
}
this.workingVersion = retrieveDatasetVersionResponse.getDatasetVersion();

if (this.workingVersion == null) {
// TODO:
// same as the above

return "";
}

if (dataset == null) {
// this would be the case if we were retrieving the version by
// the versionId, above.
this.dataset = this.workingVersion.getDataset();
}

fileMetadatasSearch = selectFileMetadatasForDisplay();

@@ -3064,19 +3064,32 @@ public void setTooLargeToDownload(boolean tooLargeToDownload) {
this.tooLargeToDownload = tooLargeToDownload;
}

private Long sizeOfDatasetArchival = null;
private Long sizeOfDatasetOriginal = null;


public Long getSizeOfDatasetNumeric() {
if (this.hasTabular){
if (this.versionHasTabular){
return Math.min(getSizeOfDatasetOrigNumeric(), getSizeOfDatasetArchivalNumeric());
}
return getSizeOfDatasetOrigNumeric();
}

public Long getSizeOfDatasetOrigNumeric() {
return DatasetUtil.getDownloadSizeNumeric(workingVersion, true);
if (versionHasTabular) {
if (sizeOfDatasetOriginal == null) {
sizeOfDatasetOriginal = DatasetUtil.getDownloadSizeNumeric(workingVersion, true);
}
return sizeOfDatasetOriginal;
}
return getSizeOfDatasetArchivalNumeric();
}

public Long getSizeOfDatasetArchivalNumeric() {
return DatasetUtil.getDownloadSizeNumeric(workingVersion, false);
if (sizeOfDatasetArchival == null) {
sizeOfDatasetArchival = DatasetUtil.getDownloadSizeNumeric(workingVersion, false);
}
return sizeOfDatasetArchival;
}

public String getSizeOfSelectedAsString(){
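
The size getters in this hunk compute each value once and reuse it on later calls. A standalone Java sketch of that lazy-caching pattern might look like the following. `SizeCacheSketch` and its two `LongSupplier` parameters are hypothetical stand-ins for the page bean and the `DatasetUtil.getDownloadSizeNumeric` calls. The sketch is not thread-safe and assumes the cached values never need to be invalidated.

```java
import java.util.function.LongSupplier;

/** Hypothetical illustration of caching expensive size lookups in nullable fields. */
final class SizeCacheSketch {
    private final boolean versionHasTabular;
    private final LongSupplier originalSizeQuery;  // stand-in for the "originals" size lookup
    private final LongSupplier archivalSizeQuery;  // stand-in for the archival size lookup
    private Long cachedOriginal = null;            // null means "not computed yet"
    private Long cachedArchival = null;

    SizeCacheSketch(boolean versionHasTabular,
                    LongSupplier originalSizeQuery,
                    LongSupplier archivalSizeQuery) {
        this.versionHasTabular = versionHasTabular;
        this.originalSizeQuery = originalSizeQuery;
        this.archivalSizeQuery = archivalSizeQuery;
    }

    long archival() {
        if (cachedArchival == null) {
            cachedArchival = archivalSizeQuery.getAsLong();
        }
        return cachedArchival;
    }

    long original() {
        // Without tabular files the two sizes coincide, so skip the second lookup.
        if (!versionHasTabular) {
            return archival();
        }
        if (cachedOriginal == null) {
            cachedOriginal = originalSizeQuery.getAsLong();
        }
        return cachedOriginal;
    }

    long downloadSize() {
        return versionHasTabular ? Math.min(original(), archival()) : original();
    }
}
```

Repeated calls, for example from a page that evaluates the same getter several times during rendering, then trigger at most one lookup per size.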