Merge branch 'develop' into 8889-filepids-in-collections
landreev committed May 24, 2023
2 parents 78d68e2 + bef00db commit 2fd36ad
Showing 46 changed files with 817 additions and 55 deletions.
2 changes: 1 addition & 1 deletion conf/keycloak/oidc-keycloak-auth-provider.json
@@ -3,6 +3,6 @@
"factoryAlias": "oidc",
"title": "OIDC-Keycloak",
"subtitle": "OIDC-Keycloak",
"factoryData": "type: oidc | issuer: http://localhost:8090/auth/realms/oidc-realm | clientId: oidc-client | clientSecret: ss6gE8mODCDfqesQaSG3gwUwZqZt547E",
"factoryData": "type: oidc | issuer: http://keycloak.mydomain.com:8090/realms/oidc-realm | clientId: oidc-client | clientSecret: ss6gE8mODCDfqesQaSG3gwUwZqZt547E",
"enabled": true
}
5 changes: 5 additions & 0 deletions conf/solr/8.11.1/schema.xml
@@ -233,6 +233,9 @@
<field name="geolocation" type="location_rpt" multiValued="true" stored="true" indexed="true"/>
<!-- https://solr.apache.org/guide/8_11/spatial-search.html#bboxfield -->
<field name="boundingBox" type="bbox" multiValued="true" stored="true" indexed="true"/>

<!-- incomplete datasets issue 8822 -->
<field name="datasetValid" type="boolean" stored="true" indexed="true" multiValued="false"/>

<!--
METADATA SCHEMA FIELDS
@@ -470,6 +473,8 @@
<!-- <copyField source="*_ss" dest="_text_" maxChars="3000"/> -->
<!-- <copyField source="*_i" dest="_text_" maxChars="3000"/> -->

<copyField source="datasetValid" dest="_text_" maxChars="3000"/>

<!--
METADATA SCHEMA FIELDS
Now following: copyFields to copy the contents of the metadata fields above to a
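
The new `datasetValid` field is indexed, so once datasets are reindexed you can query it directly in Solr as a sanity check. A minimal sketch, assuming Solr on localhost and the default Dataverse core name `collection1`:

```bash
# Count datasets indexed as having incomplete metadata.
# "collection1" is the default Dataverse core name; adjust if yours differs.
curl "http://localhost:8983/solr/collection1/select?q=datasetValid:false&rows=0"
```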
14 changes: 14 additions & 0 deletions doc/release-notes/8822-incomplete-datasets-via-api.md
@@ -0,0 +1,14 @@
### Creating datasets with incomplete metadata through API

The create dataset API call (POST to /api/dataverses/#dataverseId/datasets) has been extended with a "doNotValidate" parameter. However, in order to create a dataset with incomplete metadata, the Solr configuration must first be updated with the new "schema.xml" file (do not forget to run the metadata fields update script if you use custom metadata). Reindexing is optional, but recommended. Even when this feature is not used, it is recommended to update the Solr configuration and reindex the metadata. Finally, this new feature can be activated with the "dataverse.api.allow-incomplete-metadata" JVM option.
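
For example, on a Payara-based installation the option can be enabled with `asadmin` (a minimal sketch; the `asadmin` path varies by installation):

```bash
# Enable incomplete-metadata deposition via API (Payara sketch; asadmin path may differ).
./asadmin create-jvm-options "-Ddataverse.api.allow-incomplete-metadata=true"
```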

You can also enable a valid/incomplete metadata filter on the "My Data" page using the "dataverse.ui.show-validity-filter" JVM option. By default, this filter is not shown. Before using this filter, you must reindex the datasets; otherwise, datasets with valid metadata will not be shown in the results.
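
A reindex can be started with the usual in-place reindexing call (a sketch, assuming the admin API is reachable on localhost):

```bash
# Kick off an in-place reindex so existing datasets get the validity flag.
curl http://localhost:8080/api/admin/index
```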

It is not possible to publish datasets with incomplete or invalid metadata. By default, you also cannot send such datasets for review. If you wish to enable sending datasets with incomplete metadata for review, turn on the "dataverse.ui.allow-review-for-incomplete" JVM option.

In order to customize the wording and add translations to the UI sections extended by this feature, you can edit the "Bundle.properties" file and the localized versions of that file. The property keys used by this feature are:
- incomplete
- valid
- dataset.message.incomplete.warning
- mydataFragment.validity
- dataverses.api.create.dataset.error.mustIncludeAuthorName
1 change: 1 addition & 0 deletions doc/release-notes/9229-bearer-api-auth.md
@@ -0,0 +1 @@
A feature flag called "api-bearer-auth" has been added. It allows OIDC user accounts to send authenticated API requests using bearer tokens. Note: this feature is limited to OIDC! For more information, see http://preview.guides.gdcc.io/en/develop/installation/config.html#feature-flags
17 changes: 17 additions & 0 deletions doc/sphinx-guides/source/api/auth.rst
@@ -63,3 +63,20 @@ Resetting Your API Token
------------------------

You can reset your API Token from your account page in your Dataverse installation as described in the :doc:`/user/account` section of the User Guide.

.. _bearer-tokens:

Bearer Tokens
-------------

Bearer tokens are defined in `RFC 6750`_ and can be used as an alternative to API tokens if your installation has been set up to use them (see :ref:`bearer-token-auth` in the Installation Guide).

.. _RFC 6750: https://tools.ietf.org/html/rfc6750

To test if bearer tokens are working, you can try something like the following (using the :ref:`User Information` API endpoint), substituting in parameters for your installation and user.

.. code-block:: bash

   export TOKEN=`curl -s -X POST --location "http://keycloak.mydomain.com:8090/realms/oidc-realm/protocol/openid-connect/token" -H "Content-Type: application/x-www-form-urlencoded" -d "username=kcuser&password=kcpassword&grant_type=password&client_id=oidc-client&client_secret=ss6gE8mODCDfqesQaSG3gwUwZqZt547E" | jq '.access_token' -r | tr -d "\n"`

   curl -H "Authorization: Bearer $TOKEN" http://localhost:8080/api/users/:me
75 changes: 75 additions & 0 deletions doc/sphinx-guides/source/api/native-api.rst
@@ -526,6 +526,60 @@ To create a dataset, you must supply a JSON file that contains at least the foll
- Description Text
- Subject

Submit Incomplete Dataset
^^^^^^^^^^^^^^^^^^^^^^^^^

**Note:** This feature requires :ref:`dataverse.api.allow-incomplete-metadata` to be enabled and your Solr
Schema to be up-to-date with the ``datasetValid`` field.

Providing a ``.../datasets?doNotValidate=true`` query parameter turns off the validation of metadata.
In this case, only the "Author Name" is required. For example, a minimal JSON file would look like this:

.. code-block:: json
   :name: dataset-incomplete.json

   {
     "datasetVersion": {
       "metadataBlocks": {
         "citation": {
           "fields": [
             {
               "value": [
                 {
                   "authorName": {
                     "value": "Finch, Fiona",
                     "typeClass": "primitive",
                     "multiple": false,
                     "typeName": "authorName"
                   }
                 }
               ],
               "typeClass": "compound",
               "multiple": true,
               "typeName": "author"
             }
           ],
           "displayName": "Citation Metadata"
         }
       }
     }
   }

The following is an example HTTP call with deactivated validation:

.. code-block:: bash

   export API_TOKEN=xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx
   export PARENT=root
   export SERVER_URL=https://demo.dataverse.org

   curl -H X-Dataverse-key:$API_TOKEN -X POST "$SERVER_URL/api/dataverses/$PARENT/datasets?doNotValidate=true" --upload-file dataset-incomplete.json -H 'Content-type:application/json'

**Note:** You may learn about an instance's support for deposition of incomplete datasets via :ref:`info-incomplete-metadata`.

Submit Dataset
^^^^^^^^^^^^^^

As a starting point, you can download :download:`dataset-finch1.json <../../../../scripts/search/tests/data/dataset-finch1.json>` and modify it to meet your needs. (:download:`dataset-finch1_fr.json <../../../../scripts/api/data/dataset-finch1_fr.json>` is a variant of this file that includes setting the metadata language (see :ref:`:MetadataLanguages`) to French (fr). In addition to this minimal example, you can download :download:`dataset-create-new-all-default-fields.json <../../../../scripts/api/data/dataset-create-new-all-default-fields.json>` which populates all of the metadata fields that ship with a Dataverse installation.)

The curl command below assumes you have kept the name "dataset-finch1.json" and that this file is in your current working directory.
@@ -3209,6 +3263,27 @@ The fully expanded example above (without environment variables) looks like this

   curl https://demo.dataverse.org/api/info/apiTermsOfUse

.. _info-incomplete-metadata:

Show Support Of Incomplete Metadata Deposition
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

Learn whether an instance has been configured to allow deposition of incomplete datasets via the API.
See also :ref:`create-dataset-command` and :ref:`dataverse.api.allow-incomplete-metadata`.

.. code-block:: bash

   export SERVER_URL=https://demo.dataverse.org

   curl $SERVER_URL/api/info/settings/incompleteMetadataViaApi

The fully expanded example above (without environment variables) looks like this:

.. code-block:: bash

   curl https://demo.dataverse.org/api/info/settings/incompleteMetadataViaApi
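
If you only want the boolean value, you can extract it with ``jq``. This is a sketch that assumes the standard Dataverse JSON envelope, with the payload under ``data``:

.. code-block:: bash

   # Assumes a response shaped like {"status":"OK","data":{"message":"true"}};
   # adjust the filter if your installation returns a different shape.
   curl -s https://demo.dataverse.org/api/info/settings/incompleteMetadataViaApi | jq -r '.data.message'
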
.. _metadata-blocks-api:

Metadata Blocks
4 changes: 4 additions & 0 deletions doc/sphinx-guides/source/developers/remote-users.rst
@@ -30,9 +30,13 @@ Now when you go to http://localhost:8080/oauth2/firstLogin.xhtml you should be p

----

.. _oidc-dev:

OpenID Connect (OIDC)
---------------------

STOP! ``oidc-keycloak-auth-provider.json`` was changed from http://localhost:8090 to http://keycloak.mydomain.com:8090 to test :ref:`bearer-tokens`. In addition, ``docker-compose-dev.yml`` in the root of the repo was updated to start up Keycloak. To use these, you should add ``127.0.0.1 keycloak.mydomain.com`` to your ``/etc/hosts`` file. If you'd like to use the Docker Compose file described below (``conf/keycloak/docker-compose.yml``), you should revert the change to ``oidc-keycloak-auth-provider.json``.
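
For example, the alias can be added like this (a sketch; requires sudo and appends to your hosts file):

.. code-block:: bash

   # Map the Keycloak alias to localhost so the OIDC flow resolves.
   echo "127.0.0.1 keycloak.mydomain.com" | sudo tee -a /etc/hosts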

If you are working on the OpenID Connect (OIDC) user authentication flow, you do not need to connect to a remote provider (as explained in :doc:`/installation/oidc`) to test this feature. Instead, you can use the available configuration that allows you to run a test Keycloak OIDC identity management service locally through a Docker container.

(Please note! The client secret (``ss6gE8mODCDfqesQaSG3gwUwZqZt547E``) is hard-coded in ``oidc-realm.json`` and ``oidc-keycloak-auth-provider.json``. Do not use this config in production! This is only for developers.)
56 changes: 54 additions & 2 deletions doc/sphinx-guides/source/installation/config.rst
Expand Up @@ -330,6 +330,19 @@ As for the "Remote only" authentication mode, it means that:
- ``:DefaultAuthProvider`` has been set to use the desired authentication provider
- The "builtin" authentication provider has been disabled (:ref:`api-toggle-auth-provider`). Note that disabling the "builtin" authentication provider means that the API endpoint for converting an account from a remote auth provider will not work. Converting directly from one remote authentication provider to another (i.e. from GitHub to Google) is not supported. Conversion from remote is always to "builtin". Then the user initiates a conversion from "builtin" to remote. Note that longer term, the plan is to permit multiple login options to the same Dataverse installation account per https://github.com/IQSS/dataverse/issues/3487 (so all this talk of conversion will be moot) but for now users can only use a single login option, as explained in the :doc:`/user/account` section of the User Guide. In short, "remote only" might work for you if you only plan to use a single remote authentication provider such that no conversion between remote authentication providers will be necessary.

.. _bearer-token-auth:

Bearer Token Authentication
---------------------------

Bearer tokens are defined in `RFC 6750`_ and can be used as an alternative to API tokens. This is an experimental feature hidden behind a feature flag.

.. _RFC 6750: https://tools.ietf.org/html/rfc6750

To enable bearer tokens, you must install and configure Keycloak (for now, see :ref:`oidc-dev` in the Developer Guide) and enable ``api-bearer-auth`` under :ref:`feature-flags`.

You can test that bearer tokens are working by following the example under :ref:`bearer-tokens` in the API Guide.
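
For example, the flag can be enabled via a JVM option or an environment variable (a sketch for a Payara-based installation; the ``asadmin`` path varies):

.. code-block:: bash

   # Option 1: JVM option via asadmin (Payara sketch; path may differ).
   ./asadmin create-jvm-options "-Ddataverse.feature.api-bearer-auth=true"

   # Option 2: environment variable, set before starting Payara.
   export DATAVERSE_FEATURE_API_BEARER_AUTH=1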

.. _database-persistence:

Database Persistence
@@ -2331,6 +2344,19 @@ Can also be set via any `supported MicroProfile Config API source`_, e.g. the en
**WARNING:** For security, do not use the sources "environment variable" or "system property" (JVM option) in a
production context! Rely on password alias, secrets directory or cloud based sources instead!

.. _dataverse.api.allow-incomplete-metadata:

dataverse.api.allow-incomplete-metadata
+++++++++++++++++++++++++++++++++++++++

When enabled, datasets with incomplete metadata can be submitted via API for later correction.
See :ref:`create-dataset-command` for details.

Defaults to ``false``.

Can also be set via any `supported MicroProfile Config API source`_, e.g. the environment variable
``DATAVERSE_API_ALLOW_INCOMPLETE_METADATA``. Will accept ``[tT][rR][uU][eE]|1|[oO][nN]`` as "true" expressions.

.. _dataverse.signposting.level1-author-limit:

dataverse.signposting.level1-author-limit
@@ -2370,6 +2396,29 @@ The default is false.

Can also be set via *MicroProfile Config API* sources, e.g. the environment variable ``DATAVERSE_MAIL_CC_SUPPORT_ON_CONTACT_EMAIL``.

dataverse.ui.allow-review-for-incomplete
++++++++++++++++++++++++++++++++++++++++

Determines whether datasets with incomplete metadata (submitted via API for later correction) can be sent for review
from the UI.

Defaults to ``false``.

Can also be set via any `supported MicroProfile Config API source`_, e.g. the environment variable
``DATAVERSE_UI_ALLOW_REVIEW_FOR_INCOMPLETE``. Will accept ``[tT][rR][uU][eE]|1|[oO][nN]`` as "true" expressions.

dataverse.ui.show-validity-filter
+++++++++++++++++++++++++++++++++

When enabled, a filter for metadata validity is shown on the "My Data" page.
**Note:** Before using this filter, you must reindex the datasets; otherwise, datasets with valid metadata
will not be shown in the results.

Defaults to ``false``.

Can also be set via any `supported MicroProfile Config API source`_, e.g. the environment variable
``DATAVERSE_UI_SHOW_VALIDITY_FILTER``. Will accept ``[tT][rR][uU][eE]|1|[oO][nN]`` as "true" expressions.


.. _feature-flags:

@@ -2391,6 +2440,9 @@ please find all known feature flags below. Any of these flags can be activated u
* - api-session-auth
- Enables API authentication via session cookie (JSESSIONID). **Caution: Enabling this feature flag exposes the installation to CSRF risks!** We expect this feature flag to be temporary (only used by frontend developers, see `#9063 <https://github.com/IQSS/dataverse/issues/9063>`_) and removed once support for bearer tokens has been implemented (see `#9229 <https://github.com/IQSS/dataverse/issues/9229>`_).
- ``Off``
* - api-bearer-auth
- Enables API authentication via bearer tokens for OIDC user accounts. **Note: this feature works only for OIDC user accounts!**
- ``Off``

**Note:** Feature flags can be set via any `supported MicroProfile Config API source`_, e.g. the environment variable
``DATAVERSE_FEATURE_XXX`` (e.g. ``DATAVERSE_FEATURE_API_SESSION_AUTH=1``). These environment variables can be set in your shell before starting Payara. If you are using :doc:`Docker for development </container/dev-usage>`, you can set them in the `docker compose <https://docs.docker.com/compose/environment-variables/set-environment-variables/>`_ file.
@@ -3839,7 +3891,7 @@ To use the current GDCC version directly:
:CategoryOrder
++++++++++++++

-A comma separated list of Category/Tag names defining the order in which files with those tags should be displayed. 
+A comma separated list of Category/Tag names defining the order in which files with those tags should be displayed.
The setting can include custom tag names along with the pre-defined tags (Documentation, Data, and Code are the defaults, but the :ref:`:FileCategories` setting can be used to define a different set of tags).
By default, category ordering is disabled.

@@ -3851,7 +3903,7 @@ A true(default)/false option determining whether datafiles listed on the dataset
:AllowUserManagementOfOrder
+++++++++++++++++++++++++++

-A true/false (default) option determining whether the dataset datafile table display includes checkboxes enabling users to turn folder ordering and/or category ordering (if an order is defined by :CategoryOrder) on and off dynamically. 
+A true/false (default) option determining whether the dataset datafile table display includes checkboxes enabling users to turn folder ordering and/or category ordering (if an order is defined by :CategoryOrder) on and off dynamically.

.. _supported MicroProfile Config API source: https://docs.payara.fish/community/docs/Technical%20Documentation/MicroProfile/Config/Overview.html

2 changes: 2 additions & 0 deletions doc/sphinx-guides/source/user/account.rst
@@ -146,6 +146,8 @@ Microsoft Azure AD, GitHub, and Google Log In

You can also convert your Dataverse installation account to use authentication provided by GitHub, Microsoft, or Google. These options may be found in the "Other options" section of the log in page, and function similarly to how ORCID is outlined above. If you would like to convert your account away from using one of these services for log in, then you can follow the same steps as listed above for converting away from the ORCID log in.

.. _my-data:

My Data
-------

20 changes: 20 additions & 0 deletions docker-compose-dev.yml
@@ -12,6 +12,7 @@ services:
      - DATAVERSE_DB_HOST=postgres
      - DATAVERSE_DB_PASSWORD=secret
      - DATAVERSE_DB_USER=${DATAVERSE_DB_USER}
      - DATAVERSE_FEATURE_API_BEARER_AUTH=1
    ports:
      - "8080:8080" # HTTP (Dataverse Application)
      - "4848:4848" # HTTP (Payara Admin Console)
@@ -98,6 +99,25 @@ services:
    tmpfs:
      - /mail:mode=770,size=128M,uid=1000,gid=1000

  dev_keycloak:
    container_name: "dev_keycloak"
    image: 'quay.io/keycloak/keycloak:19.0'
    hostname: keycloak
    environment:
      - KEYCLOAK_ADMIN=kcadmin
      - KEYCLOAK_ADMIN_PASSWORD=kcpassword
      - KEYCLOAK_LOGLEVEL=DEBUG
      - KC_HOSTNAME_STRICT=false
    networks:
      dataverse:
        aliases:
          - keycloak.mydomain.com # create a DNS alias within the network (add the same alias to your /etc/hosts to get a working OIDC flow)
    command: start-dev --import-realm --http-port=8090 # use port 8090 so the same port is used inside and outside the network
    ports:
      - "8090:8090"
    volumes:
      - './conf/keycloak/oidc-realm.json:/opt/keycloak/data/import/oidc-realm.json'

networks:
  dataverse:
    driver: bridge
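
With this service defined, Keycloak can be started on its own for testing (a sketch; run from the root of the repo):

```bash
# Start only the Keycloak service from the dev compose file.
docker compose -f docker-compose-dev.yml up -d dev_keycloak
```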
7 changes: 6 additions & 1 deletion src/main/java/edu/harvard/iq/dataverse/Dataset.java
@@ -881,7 +881,12 @@ public <T> T accept(Visitor<T> v) {
    @Override
    public String getDisplayName() {
        DatasetVersion dsv = getReleasedVersion();
-       return dsv != null ? dsv.getTitle() : getLatestVersion().getTitle();
+       String result = dsv != null ? dsv.getTitle() : getLatestVersion().getTitle();
+       boolean resultIsEmpty = result == null || "".equals(result);
+       if (resultIsEmpty && getGlobalId() != null) {
+           return getGlobalId().asString();
+       }
+       return result;
    }

@Override
16 changes: 15 additions & 1 deletion src/main/java/edu/harvard/iq/dataverse/DatasetPage.java
@@ -2168,10 +2168,24 @@ private void displayPublishMessage(){
        if (workingVersion.isDraft() && workingVersion.getId() != null && canUpdateDataset()
                && !dataset.isLockedFor(DatasetLock.Reason.finalizePublication)
                && (canPublishDataset() || !dataset.isLockedFor(DatasetLock.Reason.InReview) )){
-           JsfHelper.addWarningMessage(datasetService.getReminderString(dataset, canPublishDataset()));
+           JsfHelper.addWarningMessage(datasetService.getReminderString(dataset, canPublishDataset(), false, isValid()));
        }
    }

    public boolean isValid() {
        DatasetVersion version = dataset.getLatestVersion();
        if (!version.isDraft()) {
            return true;
        }
        DatasetVersion newVersion = version.cloneDatasetVersion();
        newVersion.setDatasetFields(newVersion.initDatasetFields());
        return newVersion.isValid();
    }

    public boolean isValidOrCanReviewIncomplete() {
        return isValid() || JvmSettings.UI_ALLOW_REVIEW_INCOMPLETE.lookupOptional(Boolean.class).orElse(false);
    }

    private void displayLockInfo(Dataset dataset) {
        // Various info messages, when the dataset is locked (for various reasons):
        if (dataset.isLocked() && canUpdateDataset()) {
