Merge branch 'develop' into 9670-5.14-release-notes
sekmiller committed Aug 1, 2023
2 parents 699e77c + 54fd71e commit 9793336
Showing 25 changed files with 403 additions and 211 deletions.
2 changes: 1 addition & 1 deletion conf/docker-aio/readme.md
@@ -1,7 +1,7 @@
# Docker All-In-One

> :information_source: **NOTE: Sunsetting of this module is imminent.** There is no schedule yet, but expect it to go away.
> Please let the [Dataverse Containerization Working Group](https://dc.wgs.gdcc.io) know if you are a user and
> Please let the [Dataverse Containerization Working Group](https://ct.gdcc.io) know if you are a user and
> what should be preserved.
First pass docker all-in-one image, intended for running integration tests against.
15 changes: 15 additions & 0 deletions doc/release-notes/8889-2-filepids-in-collections-changes.md
@@ -0,0 +1,15 @@
The default for whether PIDs are registered for files is now false.

Installations where file PIDs were enabled by default will have to set `:FilePIDsEnabled` to true to maintain the existing functionality.

Add this step to the installation/upgrade instructions:

If your installation did not have :FilePIDsEnabled set, you will need to set it to true to keep file PIDs enabled:

curl -X PUT -d 'true' http://localhost:8080/api/admin/settings/:FilePIDsEnabled



It is now possible to allow File PIDs to be enabled/disabled per collection. See the [:AllowEnablingFilePIDsPerCollection](https://guides.dataverse.org/en/latest/installation/config.html#allowenablingfilepidspercollection) section of the Configuration guide for details.

For example, registration of PIDs for files can now be enabled in a specific collection when it is disabled instance-wide. Or it can be disabled in specific collections where it is enabled by default.
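
Concretely, a superuser could first allow per-collection control and then enable file PIDs for a single collection. The sketch below assumes the collection attributes endpoint described in the Native API guide (`COLLECTION_ALIAS` and the API token are placeholders):

curl -X PUT -d 'true' http://localhost:8080/api/admin/settings/:AllowEnablingFilePIDsPerCollection

curl -X PUT -H "X-Dataverse-key:$API_TOKEN" "http://localhost:8080/api/dataverses/COLLECTION_ALIAS/attribute/filePIDsEnabled?value=true"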
1 change: 1 addition & 0 deletions doc/release-notes/9704-GeotifShp.md
@@ -0,0 +1 @@
Two file previewers, for GeoTIFF files and shapefiles, are now available for visualizing GeoTIFF image files and zipped shapefiles on a map.
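
Like the other file previewers, they are registered as external tools through the admin API. A sketch (the manifest filenames here are hypothetical; the actual manifests are provided in the dataverse-previewers repository linked from the External Tools table):

curl -X POST -H 'Content-type: application/json' http://localhost:8080/api/admin/externalTools --upload-file geotiff-previewer.json
curl -X POST -H 'Content-type: application/json' http://localhost:8080/api/admin/externalTools --upload-file shapefile-previewer.json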
@@ -2,5 +2,5 @@ Tool Type Scope Description
Data Explorer explore file A GUI which lists the variables in a tabular data file allowing searching, charting and cross tabulation analysis. See the README.md file at https://github.com/scholarsportal/dataverse-data-explorer-v2 for the instructions on adding Data Explorer to your Dataverse.
Whole Tale explore dataset A platform for the creation of reproducible research packages that allows users to launch containerized interactive analysis environments based on popular tools such as Jupyter and RStudio. Using this integration, Dataverse users can launch Jupyter and RStudio environments to analyze published datasets. For more information, see the `Whole Tale User Guide <https://wholetale.readthedocs.io/en/stable/users_guide/integration.html>`_.
Binder explore dataset Binder allows you to spin up custom computing environments in the cloud (including Jupyter notebooks) with the files from your dataset. `Installation instructions <https://github.com/data-exp-lab/girder_ythub/issues/10>`_ are in the Data Exploration Lab girder_ythub project. See also :ref:`binder`.
File Previewers explore file A set of tools that display the content of files - including audio, html, `Hypothes.is <https://hypothes.is/>`_ annotations, images, PDF, text, video, tabular data, spreadsheets, GeoJSON, zip, HDF5, NetCDF, and NcML files - allowing them to be viewed without downloading the file. The previewers can be run directly from github.io, so the only required step is using the Dataverse API to register the ones you want to use. Documentation, including how to optionally brand the previewers, and an invitation to contribute through github are in the README.md file. Initial development was led by the Qualitative Data Repository and the spreadsheet previewer was added by the Social Sciences and Humanities Open Cloud (SSHOC) project. https://github.com/gdcc/dataverse-previewers
File Previewers explore file A set of tools that display the content of files - including audio, html, `Hypothes.is <https://hypothes.is/>`_ annotations, images, PDF, text, video, tabular data, spreadsheets, GeoJSON, GeoTIFF, Shapefile, zip, HDF5, NetCDF, and NcML files - allowing them to be viewed without downloading the file. The previewers can be run directly from github.io, so the only required step is using the Dataverse API to register the ones you want to use. Documentation, including how to optionally brand the previewers, and an invitation to contribute through github are in the README.md file. Initial development was led by the Qualitative Data Repository and the spreadsheet previewer was added by the Social Sciences and Humanities Open Cloud (SSHOC) project. https://github.com/gdcc/dataverse-previewers
Data Curation Tool configure file A GUI for curating data by adding labels, groups, weights and other details to assist with informed reuse. See the README.md file at https://github.com/scholarsportal/Dataverse-Data-Curation-Tool for the installation instructions.
7 changes: 5 additions & 2 deletions doc/sphinx-guides/source/admin/dataverses-datasets.rst
@@ -154,6 +154,8 @@ In the following example, the database id of the file is 42::

export FILE_ID=42
curl "http://localhost:8080/api/admin/$FILE_ID/registerDataFile"
This method will return a FORBIDDEN response if minting of file PIDs is not enabled for the collection the file is in. (Note that it is possible to have file PIDs enabled for a specific collection, even when it is disabled for the Dataverse installation as a whole. See :ref:`collection-attributes-api` in the Native API Guide.)

Mint PIDs for all unregistered published files in the specified collection
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
@@ -162,7 +164,8 @@ The following API will register the PIDs for all the yet unregistered published

curl "http://localhost:8080/api/admin/registerDataFiles/{collection_alias}"

It will not attempt to register the datafiles in its sub-collections, so this call will need to be repeated on any sub-collections where files need to be registered as well. File-level PID registration must be enabled on the collection. (Note that it is possible to have it enabled for a specific collection, even when it is disabled for the Dataverse installation as a whole. See :ref:`collection-attributes-api` in the Native API Guide.)
It will not attempt to register the datafiles in its sub-collections, so this call will need to be repeated on any sub-collections where files need to be registered as well.
File-level PID registration must be enabled on the collection. (Note that it is possible to have it enabled for a specific collection, even when it is disabled for the Dataverse installation as a whole. See :ref:`collection-attributes-api` in the Native API Guide.)

This API will sleep for 1 second between registration calls by default. A longer sleep interval can be specified with an optional ``sleep=`` parameter::
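
    # A sketch assuming the same endpoint as above; the 5-second interval is illustrative.
    curl "http://localhost:8080/api/admin/registerDataFiles/{collection_alias}?sleep=5"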

@@ -171,7 +174,7 @@ This API will sleep for 1 second between registration calls by default. A longer
Mint PIDs for ALL unregistered files in the database
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

The following API will attempt to register the PIDs for all the published files in your instance that do not yet have them::
The following API will attempt to register the PIDs for all the published files in your instance, in collections that allow file PIDs, that do not yet have them::

curl http://localhost:8080/api/admin/registerDataFileAll

2 changes: 2 additions & 0 deletions doc/sphinx-guides/source/admin/make-data-count.rst
@@ -72,6 +72,8 @@ Enable or Disable Display of Make Data Count Metrics

By default, when MDC logging is enabled (when ``:MDCLogPath`` is set), your Dataverse installation will display MDC metrics instead of its internal (legacy) metrics. You can avoid this (e.g. to collect MDC metrics for some period of time before starting to display them) by setting ``:DisplayMDCMetrics`` to false.

You can also decide to display MDC metrics along with Dataverse's traditional download counts from the time before MDC was enabled. To do this, set the :ref:`:MDCStartDate` to when you started MDC logging.
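
For example, if MDC logging began on October 1, 2019 (a sketch mirroring the example under :ref:`:MDCStartDate` in the Installation Guide; substitute your own cut-over date)::

    curl -X PUT -d '2019-10-01' http://localhost:8080/api/admin/settings/:MDCStartDate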

The following discussion assumes ``:MDCLogPath`` has been set to ``/usr/local/payara5/glassfish/domains/domain1/logs/mdc``

Configure Counter Processor
55 changes: 26 additions & 29 deletions doc/sphinx-guides/source/api/native-api.rst
@@ -298,6 +298,8 @@ The fully expanded example above (without environment variables) looks like this
curl -H X-Dataverse-key:xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx -X POST -H "Content-type:application/json" https://demo.dataverse.org/api/dataverses/root/metadatablockfacets/isRoot -d 'true'
.. _create-role-in-collection:

Create a New Role in a Dataverse Collection
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

@@ -317,16 +319,7 @@ The fully expanded example above (without environment variables) looks like this
curl -H X-Dataverse-key:xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx -X POST -H "Content-type:application/json" https://demo.dataverse.org/api/dataverses/root/roles --upload-file roles.json
Where ``roles.json`` looks like this::

{
"alias": "sys1",
"name": “Restricted System Role”,
"description": “A person who may only add datasets.”,
"permissions": [
"AddDataset"
]
}
For ``roles.json``, see :ref:`json-representation-of-a-role`.

.. note:: Only a Dataverse installation account with superuser permissions is allowed to create roles in a Dataverse Collection.

@@ -753,7 +746,7 @@ The following attributes are supported:
* ``name`` Name
* ``description`` Description
* ``affiliation`` Affiliation
* ``filePIDsEnabled`` ("true" or "false") Enables or disables registration of file-level PIDs in datasets within the collection (overriding the instance-wide setting).
* ``filePIDsEnabled`` ("true" or "false") Restricted to use by superusers and only when the :ref:`:AllowEnablingFilePIDsPerCollection <:AllowEnablingFilePIDsPerCollection>` setting is true. Enables or disables registration of file-level PIDs in datasets within the collection (overriding the instance-wide setting).
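
For example, to enable file PIDs for datasets in a single collection, a sketch (assuming the attribute endpoint takes the form ``$SERVER_URL/api/dataverses/$ID/attribute/$ATTRIBUTE?value=$VALUE``; a superuser API token and the :ref:`:AllowEnablingFilePIDsPerCollection <:AllowEnablingFilePIDsPerCollection>` setting are required):

.. code-block:: bash

  export API_TOKEN=xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx
  export SERVER_URL=https://demo.dataverse.org
  export ID=root

  curl -H "X-Dataverse-key:$API_TOKEN" -X PUT "$SERVER_URL/api/dataverses/$ID/attribute/filePIDsEnabled?value=true"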


Datasets
@@ -3094,26 +3087,14 @@ Optionally, you may use a third query parameter "sendEmailNotification=false" to
Roles
-----
Create a New Role in a Dataverse Collection
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
Creates a new role under Dataverse collection ``id``. Needs a json file with the role description:
.. code-block:: bash
export API_TOKEN=xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx
export SERVER_URL=https://demo.dataverse.org
export ID=root
curl -H X-Dataverse-key:$API_TOKEN -X POST -H "Content-type:application/json" $SERVER_URL/api/dataverses/$ID/roles --upload-file roles.json
A role is a set of permissions.
The fully expanded example above (without environment variables) looks like this:
.. code-block:: bash
.. _json-representation-of-a-role:
curl -H X-Dataverse-key:xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx -X POST -H "Content-type:application/json" https://demo.dataverse.org/api/dataverses/root/roles --upload-file roles.json
JSON Representation of a Role
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
Where ``roles.json`` looks like this::
The JSON representation of a role (``roles.json``) looks like this::
{
"alias": "sys1",
@@ -3124,8 +3105,12 @@ Where ``roles.json`` looks like this::
]
}
.. note:: Only a Dataverse installation account with superuser permissions is allowed to create roles in a Dataverse Collection.
.. note:: The ``alias`` is constrained to a length of 16 characters.
Create Role
~~~~~~~~~~~
Roles can be created globally (:ref:`create-global-role`) or for individual Dataverse collections (:ref:`create-role-in-collection`).
Show Role
~~~~~~~~~
@@ -3986,12 +3971,24 @@ List all global roles in the system. ::

GET http://$SERVER/api/admin/roles

.. _create-global-role:

Create Global Role
~~~~~~~~~~~~~~~~~~

Creates a global role in the Dataverse installation. The data POSTed are assumed to be a role JSON. ::

POST http://$SERVER/api/admin/roles

.. code-block:: bash
export API_TOKEN=xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx
export SERVER_URL=https://demo.dataverse.org
export ID=root
curl -H X-Dataverse-key:$API_TOKEN -X POST $SERVER_URL/api/admin/roles --upload-file roles.json
For ``roles.json``, see :ref:`json-representation-of-a-role`.

Delete Global Role
~~~~~~~~~~~~~~~~~~
2 changes: 1 addition & 1 deletion doc/sphinx-guides/source/developers/testing.rst
@@ -175,7 +175,7 @@ Running the full API test suite using Docker

.. note::
**Sunsetting of this module is imminent.** There is no schedule yet, but expect it to go away.
Please let the `Dataverse Containerization Working Group <https://dc.wgs.gdcc.io>`_ know if you are a user and
Please let the `Dataverse Containerization Working Group <https://ct.gdcc.io>`_ know if you are a user and
what should be preserved.

To run the full suite of integration tests on your laptop, we recommend using the "all in one" Docker configuration described in ``conf/docker-aio/readme.md`` in the root of the repo.
51 changes: 46 additions & 5 deletions doc/sphinx-guides/source/installation/config.rst
@@ -34,6 +34,12 @@ It is very important to keep the block in place for the "admin" endpoint, and to

It's also possible to prevent file uploads via API by adjusting the :ref:`:UploadMethods` database setting.

If you are using a load balancer or a reverse proxy, there are some additional considerations. If no additional configuration is made and the upstream is configured to redirect to localhost, the API will be accessible from the outside, because your installation will register localhost as the origin of any requests to the "admin" and "builtin-users" endpoints. To prevent this, you have two options:

- If your upstream is configured to redirect to localhost, you will need to set the :ref:`JVM option <useripaddresssourceheader>` to one of the following values ``%client.name% %datetime% %request% %status% %response.length% %header.referer% %header.x-forwarded-for%`` and configure the load balancer to populate the chosen header with the client IP address (see the sketch below).

- Another solution is to set the upstream to the client IP address. In this case, no further configuration is needed.
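
For example, if the load balancer passes the client IP address in the ``X-Forwarded-For`` header, a sketch of the corresponding JVM option (assuming the option name matches the :ref:`JVM option <useripaddresssourceheader>` reference above) would be::

    ./asadmin create-jvm-options '-Ddataverse.useripaddresssourceheader=X-Forwarded-For'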

Forcing HTTPS
+++++++++++++

@@ -248,7 +254,7 @@ this provider.
- :ref:`:Shoulder <:Shoulder>`
- :ref:`:IdentifierGenerationStyle <:IdentifierGenerationStyle>` (optional)
- :ref:`:DataFilePIDFormat <:DataFilePIDFormat>` (optional)
- :ref:`:FilePIDsEnabled <:FilePIDsEnabled>` (optional, defaults to true)
- :ref:`:FilePIDsEnabled <:FilePIDsEnabled>` (optional, defaults to false)

.. _pids-handle-configuration:

@@ -297,7 +303,7 @@ Here are the configuration options for PermaLinks:
- :ref:`:Shoulder <:Shoulder>`
- :ref:`:IdentifierGenerationStyle <:IdentifierGenerationStyle>` (optional)
- :ref:`:DataFilePIDFormat <:DataFilePIDFormat>` (optional)
- :ref:`:FilePIDsEnabled <:FilePIDsEnabled>` (optional, defaults to true)
- :ref:`:FilePIDsEnabled <:FilePIDsEnabled>` (optional, defaults to false)

.. _auth-modes:

@@ -2775,14 +2781,35 @@ timestamps.
:FilePIDsEnabled
++++++++++++++++

Toggles publishing of file-level PIDs for the entire installation. By default this setting is absent and Dataverse Software assumes it to be true. If enabled, the registration will be performed asynchronously (in the background) during publishing of a dataset.
Toggles publishing of file-level PIDs for the entire installation. By default this setting is absent and Dataverse Software assumes it to be false. If enabled, the registration will be performed asynchronously (in the background) during publishing of a dataset.

It is possible to override the installation-wide setting for specific collections, see :ref:`:AllowEnablingFilePIDsPerCollection <:AllowEnablingFilePIDsPerCollection>`. For example, registration of PIDs for files can be enabled in a specific collection when it is disabled instance-wide. Or it can be disabled in specific collections where it is enabled by default. See :ref:`collection-attributes-api` for details.

To enable file-level PIDs for the entire installation::

``curl -X PUT -d 'true' http://localhost:8080/api/admin/settings/:FilePIDsEnabled``

If you don't want to register file-based PIDs for your installation, set:

If you don't want to register file-based PIDs for your entire installation::

``curl -X PUT -d 'false' http://localhost:8080/api/admin/settings/:FilePIDsEnabled``

.. _:AllowEnablingFilePIDsPerCollection:

:AllowEnablingFilePIDsPerCollection
+++++++++++++++++++++++++++++++++++

Toggles whether superusers can change the File PIDs policy per collection. By default this setting is absent and Dataverse Software assumes it to be false.

For example, if this setting is true, registration of PIDs for files can be enabled in a specific collection when it is disabled instance-wide. Or it can be disabled in specific collections where it is enabled by default. See :ref:`collection-attributes-api` for details.

To enable setting file-level PIDs per collection::

``curl -X PUT -d 'true' http://localhost:8080/api/admin/settings/:AllowEnablingFilePIDsPerCollection``


When :AllowEnablingFilePIDsPerCollection is true, setting File PIDs to be enabled/disabled for a given collection can be done via the Native API - see :ref:`collection-attributes-api` in the Native API Guide.

It is possible to override the installation-wide setting for specific collections. For example, registration of PIDs for files can be enabled in a specific collection when it is disabled instance-wide. Or it can be disabled in specific collections where it is enabled by default. See :ref:`collection-attributes-api` for details.

.. _:IndependentHandleService:

@@ -3469,6 +3496,20 @@ Sets the path where the raw Make Data Count logs are stored before being process

``curl -X PUT -d 'false' http://localhost:8080/api/admin/settings/:DisplayMDCMetrics``

.. _:MDCStartDate:

:MDCStartDate
+++++++++++++

It is possible to display MDC metrics (collected since MDC logging started) along with the legacy download counts generated before MDC was enabled.
This is enabled via the new setting ``:MDCStartDate``, which specifies the cut-over date. If a dataset has any legacy access counts collected prior to that date, those numbers will be displayed in addition to the MDC views and downloads recorded since then.
(Nominally, this date should be when your installation started logging MDC metrics, but it can be any date after that if desired.)


``curl -X PUT -d '2019-10-01' http://localhost:8080/api/admin/settings/:MDCStartDate``



.. _:Languages:

:Languages
16 changes: 16 additions & 0 deletions doc/sphinx-guides/source/user/dataset-management.rst
@@ -208,6 +208,8 @@ Previewers are available for the following file types:
- Zip (preview and extract/download)
- HTML
- GeoJSON
- GeoTIFF
- Shapefile
- NetCDF/HDF5
- Hypothes.is

@@ -343,6 +345,20 @@ GeoJSON

A map will be shown as a preview of GeoJSON files when the previewer has been enabled (see :ref:`file-previews`). See also a `video demo <https://www.youtube.com/watch?v=EACJJaV3O1c&t=588s>`_ of the GeoJSON previewer by its author, Kaitlin Newson.

.. _geotiff:

GeoTIFF
-------

A map is also displayed as a preview of GeoTIFF image files once the previewer has been enabled (see :ref:`file-previews`). Since GeoTIFFs do not have their own MIME type, it is advisable to enable this previewer only when GeoTIFFs (rather than "normal" TIFFs) are expected. For performance reasons, this previewer has a file size limit of 15 MB and a row/column limit of 50,000, so larger files are not loaded.

.. _shapefile:

Shapefile
---------

Another previewer can be enabled for shapefiles (see :ref:`file-previews`). This previewer only works with zipped shapefiles (see :doc:`/developers/geospatial`). A file size limit of 20 MB is set for this previewer, also for performance reasons.

.. _netcdf-and-hdf5:

NetCDF and HDF5