feat: curation API for datasets #3708

Merged: 12 commits merged into tsmith/3616-curation-api from ebezzi/curation-api-dataset on Dec 15, 2022

Member

@ebezzi ebezzi commented Dec 9, 2022

No description provided.

@@ -48,7 +49,7 @@
)


-class PortalApi:
+class PortalApi(ApiCommon):
Member Author

Ignore this for now

@@ -87,6 +87,14 @@ def get_cxguser_token(user="owner"):
class BaseAuthAPITest(BaseAPITest):
    def setUp(self):
        super().setUp()

        # TODO: this can be improved, but the current authorization method requires it
Member Author

@Bento007 do you know what's going on here? Looks like the mock below isn't enough since this line:
x = assert_authorized_token(token, CorporaAuthConfig().curation_audience)
is calling the config in the parameters, which happens before the mock is called.

Member Author

That should also exist in my function but it throws an error. I'll double check.

if not db_session.query(DbCollection.id).filter(DbCollection.id == collection_id).first():
business_logic = get_business_logic()

# TODO: double lookup
Member Author

Here we need to decide if we want to always do a double lookup or if those methods only make sense when using either the canonical or the version id. I can look at the notebooks and see what makes sense, or just add the double lookup anywhere.

Contributor

I added this function

def get_infered_collection_version_else_forbidden(collection_id: str) -> Optional[CollectionVersion]:
    """
    Infer the collection version from either a CollectionId or a CollectionVersionId and return the CollectionVersion.
    :param collection_id: identifies the collection version
    :return: The CollectionVersion if it exists.
    """
    version = get_business_logic().get_published_collection_version(CollectionId(collection_id))
    if version is None:
        version = get_business_logic().get_collection_version(CollectionVersionId(collection_id))
    if version is None:
        raise ForbiddenHTTPException()
    return version

We can punt on the double look-up decision and be inefficient for now for the sake of time.
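
For readers without the codebase at hand, here is a minimal self-contained sketch of the same fallback lookup. `InMemoryBusinessLogic`, `ForbiddenError`, and the `lookups` counter are hypothetical stand-ins introduced only for illustration. The sketch also assumes a collection has a single published version. It makes the cost under discussion visible: an ID that is a version ID always pays for two lookups.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass(frozen=True)
class CollectionVersion:
    version_id: str
    collection_id: str


class ForbiddenError(Exception):
    """Hypothetical stand-in for ForbiddenHTTPException."""


class InMemoryBusinessLogic:
    """Hypothetical stand-in for the business logic layer, backed by dicts."""

    def __init__(self, versions: List[CollectionVersion]) -> None:
        self._by_version_id: Dict[str, CollectionVersion] = {v.version_id: v for v in versions}
        # Assumption for this sketch: each collection has one published version.
        self._published: Dict[str, CollectionVersion] = {v.collection_id: v for v in versions}
        self.lookups = 0  # counts calls so the double lookup is observable

    def get_published_collection_version(self, collection_id: str) -> Optional[CollectionVersion]:
        self.lookups += 1
        return self._published.get(collection_id)

    def get_collection_version(self, version_id: str) -> Optional[CollectionVersion]:
        self.lookups += 1
        return self._by_version_id.get(version_id)


def get_inferred_version_else_forbidden(logic: InMemoryBusinessLogic, some_id: str) -> CollectionVersion:
    # Try the ID as a canonical collection ID first, then as a version ID.
    version = logic.get_published_collection_version(some_id)
    if version is None:
        version = logic.get_collection_version(some_id)
    if version is None:
        raise ForbiddenError(some_id)
    return version
```

Because the function raises whenever both lookups miss, it never returns `None`. The sketch therefore annotates the return type as `CollectionVersion` rather than `Optional[CollectionVersion]`.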

Member Author

I agree. I will use that function.

Contributor

@Bento007 Bento007 left a comment

A few things to address.

raise ForbiddenHTTPException(f"Dataset {dataset_id} does not exist")

collection_version = business_logic.get_collection_version(CollectionVersionId(collection_id))
# If the collection does not exist, it means that the dataset is orphaned and therefore we cannot
Contributor

How can this happen?

Member Author

I think this isn't supposed to happen, but I added it anyway. For what it's worth, get_collection_version returns an Optional so pylance will force you to add the if is not None condition, or you'll get a warning.
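
The point about `Optional` can be illustrated with a small sketch. The names and the in-memory `_VERSIONS` table are hypothetical. The exact diagnostic a type checker such as Pylance emits depends on its configuration, so none is quoted here.

```python
from dataclasses import dataclass
from typing import Dict, Optional


@dataclass
class CollectionVersion:
    version_id: str
    name: str


# Hypothetical in-memory table standing in for the persistence layer.
_VERSIONS: Dict[str, CollectionVersion] = {"v1": CollectionVersion("v1", "demo")}


def get_collection_version(version_id: str) -> Optional[CollectionVersion]:
    # Returns None when the ID is unknown, hence the Optional return type.
    return _VERSIONS.get(version_id)


def collection_name(version_id: str) -> str:
    version = get_collection_version(version_id)
    # Without this guard a type checker can flag `version.name`,
    # because `version` may be None at that point.
    if version is None:
        raise LookupError(f"Collection version {version_id} does not exist")
    return version.name
```

After the guard the checker narrows `version` to `CollectionVersion`, so the attribute access is accepted even when the `None` branch is believed unreachable in practice.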

Contributor

Is this a condition we can test for?

Member Author

I don't think it's necessary. Also, the test would have to rely on a database state that's impossible to achieve. If you prefer, we can add a TODO to double-check, or note that the check only guards against an edge case that should not be possible in the system.

Member Author

Actually nvm, this was a comment taken from the portal API. In this case, this will simply be raised if a wrong collection_id is specified (i.e., a collection_id that doesn't contain the specified dataset_id). I removed the comment.

# TODO: deduplicate from ApiCommon. We need to settle the class/module level debate before can do that
url = body.get("url", body.get("link"))
business_logic = get_business_logic()
dataset_version = business_logic.get_dataset_version(DatasetVersionId(dataset_id))
Contributor

Lines 130-136 duplicate logic from the delete handler. We should factor it out.
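
As a rough sketch of what factoring this out could look like, assume the duplicated lines check that the dataset exists and that it belongs to the given collection. The helper name `get_dataset_in_collection_else_forbidden`, the in-memory classes, and the two handlers are hypothetical. The actual handlers in `actions.py` are not shown in this thread.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional, Tuple


class ForbiddenError(Exception):
    """Hypothetical stand-in for ForbiddenHTTPException."""


@dataclass
class DatasetVersion:
    version_id: str


@dataclass
class CollectionVersion:
    version_id: str
    datasets: List[DatasetVersion] = field(default_factory=list)


class InMemoryLogic:
    """Hypothetical stand-in for the business logic layer."""

    def __init__(self, collections: List[CollectionVersion]) -> None:
        self._collections: Dict[str, CollectionVersion] = {c.version_id: c for c in collections}

    def get_collection_version(self, collection_id: str) -> Optional[CollectionVersion]:
        return self._collections.get(collection_id)

    def get_dataset_version(self, dataset_id: str) -> Optional[DatasetVersion]:
        for collection in self._collections.values():
            for dataset in collection.datasets:
                if dataset.version_id == dataset_id:
                    return dataset
        return None


def get_dataset_in_collection_else_forbidden(
    logic: InMemoryLogic, collection_id: str, dataset_id: str
) -> Tuple[CollectionVersion, DatasetVersion]:
    # Shared validation: the dataset must exist and belong to the collection.
    dataset = logic.get_dataset_version(dataset_id)
    if dataset is None:
        raise ForbiddenError(f"Dataset {dataset_id} does not exist")
    collection = logic.get_collection_version(collection_id)
    if collection is None or all(d.version_id != dataset_id for d in collection.datasets):
        raise ForbiddenError(f"Dataset {dataset_id} does not belong to collection {collection_id}")
    return collection, dataset


def delete_dataset(logic: InMemoryLogic, collection_id: str, dataset_id: str) -> str:
    _, dataset = get_dataset_in_collection_else_forbidden(logic, collection_id, dataset_id)
    return f"deleted {dataset.version_id}"


def update_dataset(logic: InMemoryLogic, collection_id: str, dataset_id: str) -> str:
    _, dataset = get_dataset_in_collection_else_forbidden(logic, collection_id, dataset_id)
    return f"updated {dataset.version_id}"
```

Both handlers then share one validation path, so a change to the forbidden-case rules only has to be made in one place.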

ebezzi and others added 3 commits December 14, 2022 14:52
…_id/datasets/dataset_id/actions.py

Co-authored-by: Trent Smith <1429913+Bento007@users.noreply.github.com>
…_id/datasets/dataset_id/actions.py

Co-authored-by: Trent Smith <1429913+Bento007@users.noreply.github.com>
@Bento007 Bento007 merged commit d1b5c45 into tsmith/3616-curation-api Dec 15, 2022
@Bento007 Bento007 deleted the ebezzi/curation-api-dataset branch December 15, 2022 18:11
Bento007 added a commit that referenced this pull request Dec 16, 2022
* fix typo

* resolve conflicts

Signed-off-by: Trent Smith <trent.smith@chanzuckerberg.com>

* curation api WIP

Signed-off-by: Trent Smith <trent.smith@chanzuckerberg.com>

* curation_api

Signed-off-by: Trent Smith <trent.smith@chanzuckerberg.com>

* curation_api

Signed-off-by: Trent Smith <trent.smith@chanzuckerberg.com>

* consolidate tests

Signed-off-by: Trent Smith <trent.smith@chanzuckerberg.com>

* resolve DOIs

Signed-off-by: Trent Smith <trent.smith@chanzuckerberg.com>

* Fix s3_upload_credentials.py

Signed-off-by: Trent Smith <trent.smith@chanzuckerberg.com>

* Fix some get collection id tests

Signed-off-by: Trent Smith <trent.smith@chanzuckerberg.com>

* Fix post Revisions

Fix some exiting tests after merging from target branch.

Signed-off-by: Trent Smith <trent.smith@chanzuckerberg.com>

* Fix Get Collections public test

Signed-off-by: Trent Smith <trent.smith@chanzuckerberg.com>

* Fix get collections tests

Signed-off-by: Trent Smith <trent.smith@chanzuckerberg.com>

* Fix get collection tests

Signed-off-by: Trent Smith <trent.smith@chanzuckerberg.com>

* Fix get collection tests

Signed-off-by: Trent Smith <trent.smith@chanzuckerberg.com>

* feat: curation API for datasets (#3708)

* init

* stuff

* baby steps

* new stuff

* end

* more stuff

* Remove ApiCommon

* Update backend/portal/api/curation/v1/curation/collections/collection_id/datasets/dataset_id/actions.py

Co-authored-by: Trent Smith <1429913+Bento007@users.noreply.github.com>

* Update backend/portal/api/curation/v1/curation/collections/collection_id/datasets/dataset_id/actions.py

Co-authored-by: Trent Smith <1429913+Bento007@users.noreply.github.com>

* PR comments

Co-authored-by: Trent Smith <1429913+Bento007@users.noreply.github.com>

* fixing patch tests

Signed-off-by: Trent Smith <trent.smith@chanzuckerberg.com>

* consolidating tests

Signed-off-by: Trent Smith <trent.smith@chanzuckerberg.com>

* fixing get portal api index

Signed-off-by: Trent Smith <trent.smith@chanzuckerberg.com>

* Update tests/unit/backend/layers/business/test_business.py

* Update backend/layers/business/business.py

* fix typos

Signed-off-by: Trent Smith <trent.smith@chanzuckerberg.com>

Signed-off-by: Trent Smith <trent.smith@chanzuckerberg.com>
Co-authored-by: Emanuele Bezzi <ebezzi@chanzuckerberg.com>
ebezzi added a commit that referenced this pull request Dec 19, 2022
* fix typo

* resolve conflicts

Signed-off-by: Trent Smith <trent.smith@chanzuckerberg.com>

* curation api WIP

Signed-off-by: Trent Smith <trent.smith@chanzuckerberg.com>

* curation_api

Signed-off-by: Trent Smith <trent.smith@chanzuckerberg.com>

* curation_api

Signed-off-by: Trent Smith <trent.smith@chanzuckerberg.com>

* consolidate tests

Signed-off-by: Trent Smith <trent.smith@chanzuckerberg.com>

* resolve DOIs

Signed-off-by: Trent Smith <trent.smith@chanzuckerberg.com>

* Fix s3_upload_credentials.py

Signed-off-by: Trent Smith <trent.smith@chanzuckerberg.com>

* Fix some get collection id tests

Signed-off-by: Trent Smith <trent.smith@chanzuckerberg.com>

* Fix post Revisions

Fix some exiting tests after merging from target branch.

Signed-off-by: Trent Smith <trent.smith@chanzuckerberg.com>

* Fix Get Collections public test

Signed-off-by: Trent Smith <trent.smith@chanzuckerberg.com>

* Fix get collections tests

Signed-off-by: Trent Smith <trent.smith@chanzuckerberg.com>

* Fix get collection tests

Signed-off-by: Trent Smith <trent.smith@chanzuckerberg.com>

* Fix get collection tests

Signed-off-by: Trent Smith <trent.smith@chanzuckerberg.com>

* feat: curation API for datasets (#3708)

* init

* stuff

* baby steps

* new stuff

* end

* more stuff

* Remove ApiCommon

* Update backend/portal/api/curation/v1/curation/collections/collection_id/datasets/dataset_id/actions.py

Co-authored-by: Trent Smith <1429913+Bento007@users.noreply.github.com>

* Update backend/portal/api/curation/v1/curation/collections/collection_id/datasets/dataset_id/actions.py

Co-authored-by: Trent Smith <1429913+Bento007@users.noreply.github.com>

* PR comments

Co-authored-by: Trent Smith <1429913+Bento007@users.noreply.github.com>

* fixing patch tests

Signed-off-by: Trent Smith <trent.smith@chanzuckerberg.com>

* consolidating tests

Signed-off-by: Trent Smith <trent.smith@chanzuckerberg.com>

* fixing get portal api index

Signed-off-by: Trent Smith <trent.smith@chanzuckerberg.com>

* Update tests/unit/backend/layers/business/test_business.py

* Update backend/layers/business/business.py

* fix typos

Signed-off-by: Trent Smith <trent.smith@chanzuckerberg.com>

* init

* rollback

* checkpoint

* checkpoint

* checkpoint

* checkpoint

* more tests

* Asset tests

* more test fixes

* more test fixes

* comment

* Linter

* fix

* linter

* fixes

* improve tests

* tests

* remove monkeypatch

* Update backend/portal/api/curation/v1/curation/collections/actions.py

Co-authored-by: Trent Smith <1429913+Bento007@users.noreply.github.com>

Signed-off-by: Trent Smith <trent.smith@chanzuckerberg.com>
Co-authored-by: Trent Smith <trent.smith@chanzuckerberg.com>
Co-authored-by: Trent Smith <1429913+Bento007@users.noreply.github.com>
ebezzi added a commit that referenced this pull request Dec 19, 2022
* prep

* more stuff

* missing line

* changes

* bunch of changes

* interface and tests

* Update backend/layers/common/entities.py

Co-authored-by: Andrew Tolopko <atolopko-czi@users.noreply.github.com>

* temp stuff

* refactor, add documentation

* more refactor

* temp

* typo

* Update backend/layers/persistence/persistence.py

Co-authored-by: Andrew Tolopko <atolopko-czi@users.noreply.github.com>

* more tests

* Update backend/layers/persistence/persistence.py

Co-authored-by: Andrew Tolopko <atolopko-czi@users.noreply.github.com>

* Update backend/layers/persistence/persistence.py

Co-authored-by: Andrew Tolopko <atolopko-czi@users.noreply.github.com>

* more tests

* refactor

* Update backend/layers/persistence/persistence.py

Co-authored-by: Andrew Tolopko <atolopko-czi@users.noreply.github.com>

* more tests

* even more stuff

* Changes

* PR comments

* PR suggestions, as code changes

* changes

* Comments

* Remove owner from collection metadata

* rename

* more work

* Remove all authorization from the business logic layer

* Assertions for metadata validation errorS

* More and more stuff

* convert everything to id classes

* First pass

* more files

* more stuff

* TestCreateCollection ✅

* TestGetCollectionVersion ✅

* TestGetAllCollections ✅

* Add one more test

* chore: business layer tests for update collection (#3524)

* chore: business layer tests for update collection

* Update tests/unit/backend/layers/business/test_business.py

Co-authored-by: Daniel Hegeman <daniel.hegeman@chanzuckerberg.com>

* remove old test

Co-authored-by: Daniel Hegeman <daniel.hegeman@chanzuckerberg.com>

* signature

* Overhaul dataset processing status

* missing files

* fix enum mixin

* comments

* few fixes

* TestUpdateCollectionDatasets ✅

* TestGetDataset ✅

* TestUpdateDataset ✅

* TestCollectionOperations ✅

* Link validation

* Move business interface to another file

* initial dump

* missing files

* more tests

* stuff

* missing files

* more stuff

* chore: add DatasetMetadata missing fields (#3572)

* changes

* Add name, x_approximate_distribution

* baby steps

* test__get_collection__ok ✅

* many more tests ✅

* small assertion

* Link names can be optional

* stuff

* fix bug + enforce immutability in several places

* many more tests ✅

* moving to datasets

* chore: deepcopy in persistence_mock (#3578)

* deepcopy in persistence_mock

* Added comment

* Add DatasetArtifactId

* more tests

* start

* functions

* baby steps

* ingest_dataset now also returns canonical dataset_id

* test__get_all_datasets_for_index_with_ontology_expansion ✅

* missing files

* test__get_dataset_assets ✅

* rework base class

* revision tests - first pass

* stuff

* TestRevision ✅

* TestDeleteRevision ✅

* Fix a bunch of pending tests

* chore: add dataset_version -> collection_id link

* temp

* chore: add methods to get versions from a canonical collection_id

* one more test

* upload tests

* more tests ✅

* upload link tests ✅

* Publish collection ✅

* chore: add dataset_version -> collection_id link (#3585)

* chore: collection versions can only be created one at a time (#3582)

* chore: collection versions can only be created one at a time

* add a case for non existing collections

* chore: add methods to get versions from a canonical collection_id (#3586)

* last minute changes

* initial stuff

* more stuff

* several advancements

* stuff

* more stuff

* fix: Migration task fails to start (#3588)

* Fix commits

* stuff

* stuff

* temp change

* unit tests attempt

* more unit tests

* more unit tests

* happy path

* s3 provider

* more implementations

* stuff

* Changes

* add mock

* stuff

* typo

* typo

* meta endpoint

* conflict

* remove outdated stuff

* Remove old test

* one more test

* fixes

* stuff

* tmp remove curator_name

* Add field

* one fix

* more stuff

* overhaul

* Rename class

* rename published_at -> originally_published_at

* feat: explicit canonical collection (#3622)

* stuff

* tmp remove curator_name

* Add field

* one fix

* more stuff

* overhaul

* Rename class

* rename published_at -> originally_published_at

* timestamps

* timestamps 2

* feat: persistence layer

Signed-off-by: nayib-jose-gloria <ngloria@chanzuckerberg.com>

* feat: add persistence implementation (first draft)

Signed-off-by: nayib-jose-gloria <ngloria@chanzuckerberg.com>

* feat: add finalization of canonical dataset on collection version publish

Signed-off-by: nayib-jose-gloria <ngloria@chanzuckerberg.com>

* feat: redesign canonical dataset (#3627)

* feat: canonical dataset

* published_at in API

* fix tests

* partial changes

* add set-up to create schema from persistence orm

Signed-off-by: nayib-jose-gloria <ngloria@chanzuckerberg.com>

* fixes to the ORM

* fix method call

* fix: use DatabaseProvider for test_business and update persistence layer accordingly

Signed-off-by: nayib-jose-gloria <ngloria@chanzuckerberg.com>

* stuff

* stuff

* fix TestCollectionOperations tests

Signed-off-by: nayib-jose-gloria <ngloria@chanzuckerberg.com>

* persistence layer changes (#3650)

* checkpoint

* more stuff

* one more test

* stuff and more stuff

* final

* tests

* changes

* pr feedback

Signed-off-by: nayib-jose-gloria <ngloria@chanzuckerberg.com>

* Various bugs&fixes

* fixes

Signed-off-by: nayib-jose-gloria <ngloria@chanzuckerberg.com>

* fix tests and pr feedback

Signed-off-by: nayib-jose-gloria <ngloria@chanzuckerberg.com>

* feat: processing layer (#3603)

* Fixes for API tests

* get_collection_index

* Missing files

* bunch of changes

* properly set revision_of

* failure

* failure layer

* more merge

* feat: migration script + some fixes (#3681)

* first pass

* second pass

* stuff

* stuff

* more stuff

* bunch of changes

* script

* parametrize db name

* docs

* more debug stuff

* remove debug stuff

* black and flake8

Signed-off-by: Trent Smith <trent.smith@chanzuckerberg.com>

* fix: publisher metadata correct type (#3699)

* fix: publisher metadata correct type

* fix: publisher metadata correct type

* typo

* fix: allows business tests to be run by both persistence mock and as an integration test with postgres docker container

Signed-off-by: nayib-jose-gloria <ngloria@chanzuckerberg.com>

* lint

Signed-off-by: nayib-jose-gloria <ngloria@chanzuckerberg.com>

* feat: revise at for curation api (#3675)

Add revised_at and validation message to backend.layers.common.entity

- change UserInfo.user_id to a property. Simplifies accessing the variable.
- use pydantic.datalclass. This gives us type checking for free.
- Moved errors in BusinessException to be an instance variable. Making it is class variable will have unintended consequences.
- Fixed pointer in persistence_mock.py
- Populating revised_at in persistence_mock.py

* fix merge conflict

Signed-off-by: Trent Smith <trent.smith@chanzuckerberg.com>

* fix: get portal API tests working with persistence mock

Signed-off-by: nayib-jose-gloria <ngloria@chanzuckerberg.com>

* fix: get portal API tests working with database provider implementation

Signed-off-by: nayib-jose-gloria <ngloria@chanzuckerberg.com>

* refactor: simplify session management + include option to run tests as unit tests or integration tests

Signed-off-by: nayib-jose-gloria <ngloria@chanzuckerberg.com>

* pr feedback

Signed-off-by: nayib-jose-gloria <ngloria@chanzuckerberg.com>

* fix: processing tests (#3715)

* fix: strip -> removesuffix (#3716)

* chore: redesign run tests in GHA (#3694)

* chore: redesign run tests in GHA

* pip

* switch to docker

* remove import

* all tests

* Lint

* overhaul test commands

* fix workflow

* feat: add canonical collection tombstoning to portal redesign (#3740)

* linting

Signed-off-by: nayib-jose-gloria <ngloria@chanzuckerberg.com>

* feat: support canonical collection tombstoning

Signed-off-by: nayib-jose-gloria <ngloria@chanzuckerberg.com>

* linting + pr feedback

Signed-off-by: nayib-jose-gloria <ngloria@chanzuckerberg.com>

Signed-off-by: nayib-jose-gloria <ngloria@chanzuckerberg.com>

* feat: set validation message (#3735)

* feat: set validation message

* Lint

* chore: redesign miscellaneous fixes (#3739)

* stuff

* lint

* fix test

* add published_at to ignored fields for revision diff

* add crossref provider

* PR comments & lint

* move explorer_url

* missing files

* chore: restore processing tests (#3736)

* not quite there yet

* stuff

* missing files

* fix

* lint

* chore: restore cloudfront invalidation (#3761)

* chore: restore cloudfront invalidation

* rename

* rename file

* missing files

* chore: curator name (#3780)

* feat: translate curation api for redesign (#3690)

* fix typo

* resolve conflicts

Signed-off-by: Trent Smith <trent.smith@chanzuckerberg.com>

* curation api WIP

Signed-off-by: Trent Smith <trent.smith@chanzuckerberg.com>

* curation_api

Signed-off-by: Trent Smith <trent.smith@chanzuckerberg.com>

* curation_api

Signed-off-by: Trent Smith <trent.smith@chanzuckerberg.com>

* consolidate tests

Signed-off-by: Trent Smith <trent.smith@chanzuckerberg.com>

* resolve DOIs

Signed-off-by: Trent Smith <trent.smith@chanzuckerberg.com>

* Fix s3_upload_credentials.py

Signed-off-by: Trent Smith <trent.smith@chanzuckerberg.com>

* Fix some get collection id tests

Signed-off-by: Trent Smith <trent.smith@chanzuckerberg.com>

* Fix post Revisions

Fix some exiting tests after merging from target branch.

Signed-off-by: Trent Smith <trent.smith@chanzuckerberg.com>

* Fix Get Collections public test

Signed-off-by: Trent Smith <trent.smith@chanzuckerberg.com>

* Fix get collections tests

Signed-off-by: Trent Smith <trent.smith@chanzuckerberg.com>

* Fix get collection tests

Signed-off-by: Trent Smith <trent.smith@chanzuckerberg.com>

* Fix get collection tests

Signed-off-by: Trent Smith <trent.smith@chanzuckerberg.com>

* feat: curation API for datasets (#3708)

* init

* stuff

* baby steps

* new stuff

* end

* more stuff

* Remove ApiCommon

* Update backend/portal/api/curation/v1/curation/collections/collection_id/datasets/dataset_id/actions.py

Co-authored-by: Trent Smith <1429913+Bento007@users.noreply.github.com>

* Update backend/portal/api/curation/v1/curation/collections/collection_id/datasets/dataset_id/actions.py

Co-authored-by: Trent Smith <1429913+Bento007@users.noreply.github.com>

* PR comments

Co-authored-by: Trent Smith <1429913+Bento007@users.noreply.github.com>

* fixing patch tests

Signed-off-by: Trent Smith <trent.smith@chanzuckerberg.com>

* consolidating tests

Signed-off-by: Trent Smith <trent.smith@chanzuckerberg.com>

* fixing get portal api index

Signed-off-by: Trent Smith <trent.smith@chanzuckerberg.com>

* Update tests/unit/backend/layers/business/test_business.py

* Update backend/layers/business/business.py

* fix typos

Signed-off-by: Trent Smith <trent.smith@chanzuckerberg.com>

Signed-off-by: Trent Smith <trent.smith@chanzuckerberg.com>
Co-authored-by: Emanuele Bezzi <ebezzi@chanzuckerberg.com>

* chore: curator API fix tests (#3782)

* fix typo

* resolve conflicts

Signed-off-by: Trent Smith <trent.smith@chanzuckerberg.com>

* curation api WIP

Signed-off-by: Trent Smith <trent.smith@chanzuckerberg.com>

* curation_api

Signed-off-by: Trent Smith <trent.smith@chanzuckerberg.com>

* curation_api

Signed-off-by: Trent Smith <trent.smith@chanzuckerberg.com>

* consolidate tests

Signed-off-by: Trent Smith <trent.smith@chanzuckerberg.com>

* resolve DOIs

Signed-off-by: Trent Smith <trent.smith@chanzuckerberg.com>

* Fix s3_upload_credentials.py

Signed-off-by: Trent Smith <trent.smith@chanzuckerberg.com>

* Fix some get collection id tests

Signed-off-by: Trent Smith <trent.smith@chanzuckerberg.com>

* Fix post Revisions

Fix some exiting tests after merging from target branch.

Signed-off-by: Trent Smith <trent.smith@chanzuckerberg.com>

* Fix Get Collections public test

Signed-off-by: Trent Smith <trent.smith@chanzuckerberg.com>

* Fix get collections tests

Signed-off-by: Trent Smith <trent.smith@chanzuckerberg.com>

* Fix get collection tests

Signed-off-by: Trent Smith <trent.smith@chanzuckerberg.com>

* Fix get collection tests

Signed-off-by: Trent Smith <trent.smith@chanzuckerberg.com>

* feat: curation API for datasets (#3708)

* init

* stuff

* baby steps

* new stuff

* end

* more stuff

* Remove ApiCommon

* Update backend/portal/api/curation/v1/curation/collections/collection_id/datasets/dataset_id/actions.py

Co-authored-by: Trent Smith <1429913+Bento007@users.noreply.github.com>

* Update backend/portal/api/curation/v1/curation/collections/collection_id/datasets/dataset_id/actions.py

Co-authored-by: Trent Smith <1429913+Bento007@users.noreply.github.com>

* PR comments

Co-authored-by: Trent Smith <1429913+Bento007@users.noreply.github.com>

* fixing patch tests

Signed-off-by: Trent Smith <trent.smith@chanzuckerberg.com>

* consolidating tests

Signed-off-by: Trent Smith <trent.smith@chanzuckerberg.com>

* fixing get portal api index

Signed-off-by: Trent Smith <trent.smith@chanzuckerberg.com>

* Update tests/unit/backend/layers/business/test_business.py

* Update backend/layers/business/business.py

* fix typos

Signed-off-by: Trent Smith <trent.smith@chanzuckerberg.com>

* init

* rollback

* checkpoint

* checkpoint

* checkpoint

* checkpoint

* more tests

* Asset tests

* more test fixes

* more test fixes

* comment

* Linter

* fix

* linter

* fixes

* improve tests

* tests

* remove monkeypatch

* Update backend/portal/api/curation/v1/curation/collections/actions.py

Co-authored-by: Trent Smith <1429913+Bento007@users.noreply.github.com>

Signed-off-by: Trent Smith <trent.smith@chanzuckerberg.com>
Co-authored-by: Trent Smith <trent.smith@chanzuckerberg.com>
Co-authored-by: Trent Smith <1429913+Bento007@users.noreply.github.com>

* chore: add curator_name to migration script (#3791)

* chore: redesign last minute fixes (#3792)

* fixes

* more fixes

* lint

* feat: add integration test mode as a gha test step + makefile command (#3788)

* feat: add integration test mode as a gha test step + makefile command

Signed-off-by: nayib-jose-gloria <ngloria@chanzuckerberg.com>

* pin dataclasses-json version in requirements + reintroduce step to rebuild backend container on changed files/dockerfiles

Signed-off-by: nayib-jose-gloria <ngloria@chanzuckerberg.com>

* add pydantic to processing requirements, rebuild containers with no cache

Signed-off-by: nayib-jose-gloria <ngloria@chanzuckerberg.com>

* remove --no-cache flag from rebuild commands

Signed-off-by: nayib-jose-gloria <ngloria@chanzuckerberg.com>

* remove pydantic to pass tests

Signed-off-by: nayib-jose-gloria <ngloria@chanzuckerberg.com>

* re-add dataclasses import from std lib

Signed-off-by: nayib-jose-gloria <ngloria@chanzuckerberg.com>

Signed-off-by: nayib-jose-gloria <ngloria@chanzuckerberg.com>
Co-authored-by: Emanuele Bezzi <ebezzi@chanzuckerberg.com>

* chore: last minute fixes #2 (#3794)

* update ontology test

Signed-off-by: nayib-jose-gloria <ngloria@chanzuckerberg.com>
Signed-off-by: Trent Smith <trent.smith@chanzuckerberg.com>
Co-authored-by: Andrew Tolopko <atolopko-czi@users.noreply.github.com>
Co-authored-by: Andrew Tolopko <atolopko@chanzuckerberg.com>
Co-authored-by: Daniel Hegeman <daniel.hegeman@chanzuckerberg.com>
Co-authored-by: Alex Lokshin <alokshin@chanzuckerberg.com>
Co-authored-by: nayib-jose-gloria <ngloria@chanzuckerberg.com>
Co-authored-by: Trent Smith <trent.smith@chanzuckerberg.com>
Co-authored-by: Nayib Gloria <55710092+nayib-jose-gloria@users.noreply.github.com>
Co-authored-by: Trent Smith <1429913+Bento007@users.noreply.github.com>
nayib-jose-gloria added a commit that referenced this pull request Jan 23, 2023
* feat(MUIv5): Upgrade SDS and MUI (#3734)

* feat(MUIv5): Upgrade SDS and MUI

* update ButtonIcon

* fix Filter

* upgrade sds

* wip fix highlight

* replace Blueprint HTMLTable

* fix QuickSelect style

* remove lint ignore

* fix QuickSelect style

* fix: Fix the re-tagging issue due to invalid registryId (#3781)

Fixes the following error in Docker re-tag:

```
[ERROR]: error getting Image: operation error ECR: BatchGetImage, https response error StatusCode: 400, RequestID: f00322e8-7a33-4e28-a07f-93f2bdaa25bb, InvalidParameterException: Invalid parameter at 'registryId' failed to satisfy constraint: 'must satisfy regular expression [0-9]{12}'
```

* feat: portal redesign (#3485)

* prep

* more stuff

* missing line

* changes

* bunch of changes

* interface and tests

* Update backend/layers/common/entities.py

Co-authored-by: Andrew Tolopko <atolopko-czi@users.noreply.github.com>

* temp stuff

* refactor, add documentation

* more refactor

* temp

* typo

* Update backend/layers/persistence/persistence.py

Co-authored-by: Andrew Tolopko <atolopko-czi@users.noreply.github.com>

* more tests

* Update backend/layers/persistence/persistence.py

Co-authored-by: Andrew Tolopko <atolopko-czi@users.noreply.github.com>

* Update backend/layers/persistence/persistence.py

Co-authored-by: Andrew Tolopko <atolopko-czi@users.noreply.github.com>

* more tests

* refactor

* Update backend/layers/persistence/persistence.py

Co-authored-by: Andrew Tolopko <atolopko-czi@users.noreply.github.com>

* more tests

* even more stuff

* Changes

* PR comments

* PR suggestions, as code changes

* changes

* Comments

* Remove owner from collection metadata

* rename

* more work

* Remove all authorization from the business logic layer

* Assertions for metadata validation errorS

* More and more stuff

* convert everything to id classes

* First pass

* more files

* more stuff

* TestCreateCollection ✅

* TestGetCollectionVersion ✅

* TestGetAllCollections ✅

* Add one more test

* chore: business layer tests for update collection (#3524)

* chore: business layer tests for update collection

* Update tests/unit/backend/layers/business/test_business.py

Co-authored-by: Daniel Hegeman <daniel.hegeman@chanzuckerberg.com>

* remove old test

Co-authored-by: Daniel Hegeman <daniel.hegeman@chanzuckerberg.com>

* signature

* Overhaul dataset processing status

* missing files

* fix enum mixin

* comments

* few fixes

* TestUpdateCollectionDatasets ✅

* TestGetDataset ✅

* TestUpdateDataset ✅

* TestCollectionOperations ✅

* Link validation

* Move business interface to another file

* initial dump

* missing files

* more tests

* stuff

* missing files

* more stuff

* chore: add DatasetMetadata missing fields (#3572)

* changes

* Add name, x_approximate_distribution

* baby steps

* test__get_collection__ok ✅

* many more tests ✅

* small assertion

* Link names can be optional

* stuff

* fix bug + enforce immutability in several places

* many more tests ✅

* moving to datasets

* chore: deepcopy in persistence_mock (#3578)

* deepcopy in persistence_mock

* Added comment

* Add DatasetArtifactId

* more tests

* start

* functions

* baby steps

* ingest_dataset now also returns canonical dataset_id

* test__get_all_datasets_for_index_with_ontology_expansion ✅

* missing files

* test__get_dataset_assets ✅

* rework base class

* revision tests - first pass

* stuff

* TestRevision ✅

* TestDeleteRevision ✅

* Fix a bunch of pending tests

* chore: add dataset_version -> collection_id link

* temp

* chore: add methods to get versions from a canonical collection_id

* one more test

* upload tests

* more tests ✅

* upload link tests ✅

* Publish collection ✅

* chore: add dataset_version -> collection_id link (#3585)

* chore: collection versions can only be created one at a time (#3582)

* chore: collection versions can only be created one at a time

* add a case for non existing collections

* chore: add methods to get versions from a canonical collection_id (#3586)

* last minute changes

* initial stuff

* more stuff

* several advancements

* stuff

* more stuff

* fix: Migration task fails to start (#3588)

* Fix commits

* stuff

* stuff

* temp change

* unit tests attempt

* more unit tests

* more unit tests

* happy path

* s3 provider

* more implementations

* stuff

* Changes

* add mock

* stuff

* typo

* typo

* meta endpoint

* conflict

* remove outdated stuff

* Remove old test

* one more test

* fixes

* stuff

* tmp remove curator_name

* Add field

* one fix

* more stuff

* overhaul

* Rename class

* rename published_at -> originally_published_at

* feat: explicit canonical collection (#3622)

* stuff

* tmp remove curator_name

* Add field

* one fix

* more stuff

* overhaul

* Rename class

* rename published_at -> originally_published_at

* timestamps

* timestamps 2

* feat: persistence layer

Signed-off-by: nayib-jose-gloria <ngloria@chanzuckerberg.com>

* feat: add persistence implementation (first draft)

Signed-off-by: nayib-jose-gloria <ngloria@chanzuckerberg.com>

* feat: add finalization of canonical dataset on collection version publish

Signed-off-by: nayib-jose-gloria <ngloria@chanzuckerberg.com>

* feat: redesign canonical dataset (#3627)

* feat: canonical dataset

* published_at in API

* fix tests

* partial changes

* add set-up to create schema from persistence orm

Signed-off-by: nayib-jose-gloria <ngloria@chanzuckerberg.com>

* fixes to the ORM

* fix method call

* fix: use DatabaseProvider for test_business and update persistence layer accordingly

Signed-off-by: nayib-jose-gloria <ngloria@chanzuckerberg.com>

* stuff

* stuff

* fix TestCollectionOperations tests

Signed-off-by: nayib-jose-gloria <ngloria@chanzuckerberg.com>

* persistence layer changes (#3650)

* checkpoint

* more stuff

* one more test

* stuff and more stuff

* final

* tests

* changes

* pr feedback

Signed-off-by: nayib-jose-gloria <ngloria@chanzuckerberg.com>

* Various bugs&fixes

* fixes

Signed-off-by: nayib-jose-gloria <ngloria@chanzuckerberg.com>

* fix tests and pr feedback

Signed-off-by: nayib-jose-gloria <ngloria@chanzuckerberg.com>

* feat: processing layer (#3603)

* Fixes for API tests

* get_collection_index

* Missing files

* bunch of changes

* properly set revision_of

* failure

* failure layer

* more merge

* feat: migration script + some fixes (#3681)

* first pass

* second pass

* stuff

* stuff

* more stuff

* bunch of changes

* script

* parametrize db name

* docs

* more debug stuff

* remove debug stuff

* black and flake8

Signed-off-by: Trent Smith <trent.smith@chanzuckerberg.com>

* fix: publisher metadata correct type (#3699)

* fix: publisher metadata correct type

* fix: publisher metadata correct type

* typo

* fix: allows business tests to be run by both persistence mock and as an integration test with postgres docker container

Signed-off-by: nayib-jose-gloria <ngloria@chanzuckerberg.com>

* lint

Signed-off-by: nayib-jose-gloria <ngloria@chanzuckerberg.com>

* feat: revise at for curation api (#3675)

Add revised_at and validation message to backend.layers.common.entity

- change UserInfo.user_id to a property. Simplifies accessing the variable.
- use pydantic.dataclasses. This gives us type checking for free.
- Moved errors in BusinessException to be an instance variable. Making it a class variable would have unintended consequences.
- Fixed pointer in persistence_mock.py
- Populating revised_at in persistence_mock.py

* fix merge conflict

Signed-off-by: Trent Smith <trent.smith@chanzuckerberg.com>

* fix: get portal API tests working with persistence mock

Signed-off-by: nayib-jose-gloria <ngloria@chanzuckerberg.com>

* fix: get portal API tests working with database provider implementation

Signed-off-by: nayib-jose-gloria <ngloria@chanzuckerberg.com>

* refactor: simplify session management + include option to run tests as unit tests or integration tests

Signed-off-by: nayib-jose-gloria <ngloria@chanzuckerberg.com>

* pr feedback

Signed-off-by: nayib-jose-gloria <ngloria@chanzuckerberg.com>

* fix: processing tests (#3715)

* fix: strip -> removesuffix (#3716)

* chore: redesign run tests in GHA (#3694)

* chore: redesign run tests in GHA

* pip

* switch to docker

* remove import

* all tests

* Lint

* overhaul test commands

* fix workflow

* feat: add canonical collection tombstoning to portal redesign (#3740)

* linting

Signed-off-by: nayib-jose-gloria <ngloria@chanzuckerberg.com>

* feat: support canonical collection tombstoning

Signed-off-by: nayib-jose-gloria <ngloria@chanzuckerberg.com>

* linting + pr feedback

Signed-off-by: nayib-jose-gloria <ngloria@chanzuckerberg.com>

Signed-off-by: nayib-jose-gloria <ngloria@chanzuckerberg.com>

* feat: set validation message (#3735)

* feat: set validation message

* Lint

* chore: redesign miscellaneous fixes (#3739)

* stuff

* lint

* fix test

* add published_at to ignored fields for revision diff

* add crossref provider

* PR comments & lint

* move explorer_url

* missing files

* chore: restore processing tests (#3736)

* not quite there yet

* stuff

* missing files

* fix

* lint

* chore: restore cloudfront invalidation (#3761)

* chore: restore cloudfront invalidation

* rename

* rename file

* missing files

* chore: curator name (#3780)

* feat: translate curation api for redesign (#3690)

* fix typo

* resolve conflicts

Signed-off-by: Trent Smith <trent.smith@chanzuckerberg.com>

* curation api WIP

Signed-off-by: Trent Smith <trent.smith@chanzuckerberg.com>

* curation_api

Signed-off-by: Trent Smith <trent.smith@chanzuckerberg.com>

* curation_api

Signed-off-by: Trent Smith <trent.smith@chanzuckerberg.com>

* consolidate tests

Signed-off-by: Trent Smith <trent.smith@chanzuckerberg.com>

* resolve DOIs

Signed-off-by: Trent Smith <trent.smith@chanzuckerberg.com>

* Fix s3_upload_credentials.py

Signed-off-by: Trent Smith <trent.smith@chanzuckerberg.com>

* Fix some get collection id tests

Signed-off-by: Trent Smith <trent.smith@chanzuckerberg.com>

* Fix post Revisions

Fix some existing tests after merging from target branch.

Signed-off-by: Trent Smith <trent.smith@chanzuckerberg.com>

* Fix Get Collections public test

Signed-off-by: Trent Smith <trent.smith@chanzuckerberg.com>

* Fix get collections tests

Signed-off-by: Trent Smith <trent.smith@chanzuckerberg.com>

* Fix get collection tests

Signed-off-by: Trent Smith <trent.smith@chanzuckerberg.com>

* Fix get collection tests

Signed-off-by: Trent Smith <trent.smith@chanzuckerberg.com>

* feat: curation API for datasets (#3708)

* init

* stuff

* baby steps

* new stuff

* end

* more stuff

* Remove ApiCommon

* Update backend/portal/api/curation/v1/curation/collections/collection_id/datasets/dataset_id/actions.py

Co-authored-by: Trent Smith <1429913+Bento007@users.noreply.github.com>

* Update backend/portal/api/curation/v1/curation/collections/collection_id/datasets/dataset_id/actions.py

Co-authored-by: Trent Smith <1429913+Bento007@users.noreply.github.com>

* PR comments

Co-authored-by: Trent Smith <1429913+Bento007@users.noreply.github.com>

* fixing patch tests

Signed-off-by: Trent Smith <trent.smith@chanzuckerberg.com>

* consolidating tests

Signed-off-by: Trent Smith <trent.smith@chanzuckerberg.com>

* fixing get portal api index

Signed-off-by: Trent Smith <trent.smith@chanzuckerberg.com>

* Update tests/unit/backend/layers/business/test_business.py

* Update backend/layers/business/business.py

* fix typos

Signed-off-by: Trent Smith <trent.smith@chanzuckerberg.com>

Signed-off-by: Trent Smith <trent.smith@chanzuckerberg.com>
Co-authored-by: Emanuele Bezzi <ebezzi@chanzuckerberg.com>

* chore: curator API fix tests (#3782)

* fix typo

* resolve conflicts

Signed-off-by: Trent Smith <trent.smith@chanzuckerberg.com>

* curation api WIP

Signed-off-by: Trent Smith <trent.smith@chanzuckerberg.com>

* curation_api

Signed-off-by: Trent Smith <trent.smith@chanzuckerberg.com>

* curation_api

Signed-off-by: Trent Smith <trent.smith@chanzuckerberg.com>

* consolidate tests

Signed-off-by: Trent Smith <trent.smith@chanzuckerberg.com>

* resolve DOIs

Signed-off-by: Trent Smith <trent.smith@chanzuckerberg.com>

* Fix s3_upload_credentials.py

Signed-off-by: Trent Smith <trent.smith@chanzuckerberg.com>

* Fix some get collection id tests

Signed-off-by: Trent Smith <trent.smith@chanzuckerberg.com>

* Fix post Revisions

Fix some existing tests after merging from target branch.

Signed-off-by: Trent Smith <trent.smith@chanzuckerberg.com>

* Fix Get Collections public test

Signed-off-by: Trent Smith <trent.smith@chanzuckerberg.com>

* Fix get collections tests

Signed-off-by: Trent Smith <trent.smith@chanzuckerberg.com>

* Fix get collection tests

Signed-off-by: Trent Smith <trent.smith@chanzuckerberg.com>

* Fix get collection tests

Signed-off-by: Trent Smith <trent.smith@chanzuckerberg.com>

* feat: curation API for datasets (#3708)

* init

* stuff

* baby steps

* new stuff

* end

* more stuff

* Remove ApiCommon

* Update backend/portal/api/curation/v1/curation/collections/collection_id/datasets/dataset_id/actions.py

Co-authored-by: Trent Smith <1429913+Bento007@users.noreply.github.com>

* Update backend/portal/api/curation/v1/curation/collections/collection_id/datasets/dataset_id/actions.py

Co-authored-by: Trent Smith <1429913+Bento007@users.noreply.github.com>

* PR comments

Co-authored-by: Trent Smith <1429913+Bento007@users.noreply.github.com>

* fixing patch tests

Signed-off-by: Trent Smith <trent.smith@chanzuckerberg.com>

* consolidating tests

Signed-off-by: Trent Smith <trent.smith@chanzuckerberg.com>

* fixing get portal api index

Signed-off-by: Trent Smith <trent.smith@chanzuckerberg.com>

* Update tests/unit/backend/layers/business/test_business.py

* Update backend/layers/business/business.py

* fix typos

Signed-off-by: Trent Smith <trent.smith@chanzuckerberg.com>

* init

* rollback

* checkpoint

* checkpoint

* checkpoint

* checkpoint

* more tests

* Asset tests

* more test fixes

* more test fixes

* comment

* Linter

* fix

* linter

* fixes

* improve tests

* tests

* remove monkeypatch

* Update backend/portal/api/curation/v1/curation/collections/actions.py

Co-authored-by: Trent Smith <1429913+Bento007@users.noreply.github.com>

Signed-off-by: Trent Smith <trent.smith@chanzuckerberg.com>
Co-authored-by: Trent Smith <trent.smith@chanzuckerberg.com>
Co-authored-by: Trent Smith <1429913+Bento007@users.noreply.github.com>

* chore: add curator_name to migration script (#3791)

* chore: redesign last minute fixes (#3792)

* fixes

* more fixes

* lint

* feat: add integration test mode as a gha test step + makefile command (#3788)

* feat: add integration test mode as a gha test step + makefile command

Signed-off-by: nayib-jose-gloria <ngloria@chanzuckerberg.com>

* pin dataclasses-json version in requirements + reintroduce step to rebuild backend container on changed files/dockerfiles

Signed-off-by: nayib-jose-gloria <ngloria@chanzuckerberg.com>

* add pydantic to processing requirements, rebuild containers with no cache

Signed-off-by: nayib-jose-gloria <ngloria@chanzuckerberg.com>

* remove --no-cache flag from rebuild commands

Signed-off-by: nayib-jose-gloria <ngloria@chanzuckerberg.com>

* remove pydantic to pass tests

Signed-off-by: nayib-jose-gloria <ngloria@chanzuckerberg.com>

* re-add dataclasses import from std lib

Signed-off-by: nayib-jose-gloria <ngloria@chanzuckerberg.com>

Signed-off-by: nayib-jose-gloria <ngloria@chanzuckerberg.com>
Co-authored-by: Emanuele Bezzi <ebezzi@chanzuckerberg.com>

* chore: last minute fixes #2 (#3794)

* update ontology test

Signed-off-by: nayib-jose-gloria <ngloria@chanzuckerberg.com>
Signed-off-by: Trent Smith <trent.smith@chanzuckerberg.com>
Co-authored-by: Andrew Tolopko <atolopko-czi@users.noreply.github.com>
Co-authored-by: Andrew Tolopko <atolopko@chanzuckerberg.com>
Co-authored-by: Daniel Hegeman <daniel.hegeman@chanzuckerberg.com>
Co-authored-by: Alex Lokshin <alokshin@chanzuckerberg.com>
Co-authored-by: nayib-jose-gloria <ngloria@chanzuckerberg.com>
Co-authored-by: Trent Smith <trent.smith@chanzuckerberg.com>
Co-authored-by: Nayib Gloria <55710092+nayib-jose-gloria@users.noreply.github.com>
Co-authored-by: Trent Smith <1429913+Bento007@users.noreply.github.com>

* fix: missing 'e' in explorer_url + dataset id in meta endpoint (#3795)

* fix: missing 'e' in explorer_url

* more

* feat: dynamic sitemap and robots.txt file (#3753)

Co-authored-by: Timmy Huang <thuang@chanzuckerberg.com>

* fix: version_id in meta endpoint (#3809)

* fix: version_id in meta endpoint

* fix tests

* fix: support for collection partial updates (#3812)

* fix: support for collection partial updates

* Lint

* fix e2e after MUIv5 upgrade (#3813)

Co-authored-by: atarashansky <atarashansky@CZIMACOS3990.hsd1.ma.comcast.net>

* fix: y-axis scrolling jump on heatmap download (#3691)

* fix: y-axis scrolling jump on heatmap download

* restoring scrollTop position after create image and before download

* feat: adding tint screen on download load

* fix: functional tests portal redesign compatibility (#3815)

* fix: failing functional tests related to portal redesign

Signed-off-by: nayib-jose-gloria <ngloria@chanzuckerberg.com>

* fix: failing functional tests related to portal redesign

Signed-off-by: nayib-jose-gloria <ngloria@chanzuckerberg.com>

* fix: lint

Signed-off-by: nayib-jose-gloria <ngloria@chanzuckerberg.com>

* fix: grab id from DatasetId

Signed-off-by: nayib-jose-gloria <ngloria@chanzuckerberg.com>

* fix: rollback container-functionaltest change

Signed-off-by: nayib-jose-gloria <ngloria@chanzuckerberg.com>

Signed-off-by: nayib-jose-gloria <ngloria@chanzuckerberg.com>

* test(wmg): Fix Source Data (#3816)

* test(wmg): Fix Data Source part 2 (#3817)

* fix: only add original_ids on unpublished revisions of published datasets (#3820)

* fix: only add original_ids on unpublished revisions of published datasets

Signed-off-by: nayib-jose-gloria <ngloria@chanzuckerberg.com>

* fix: fix and add portal API tests + update test method return type hints

Signed-off-by: nayib-jose-gloria <ngloria@chanzuckerberg.com>

* lint

Signed-off-by: nayib-jose-gloria <ngloria@chanzuckerberg.com>

Signed-off-by: nayib-jose-gloria <ngloria@chanzuckerberg.com>

* fix: bad CrossrefProvider imports + reintroduce auth0 retry (#3819)

* fix: bad CrossrefProvider imports

* fix tests

* lint

* last minute fix

* fix: for portal API dataset status and delete endpoints, fetch latest collection version (#3822)

Signed-off-by: nayib-jose-gloria <ngloria@chanzuckerberg.com>

Signed-off-by: nayib-jose-gloria <ngloria@chanzuckerberg.com>

* chore: fix test snapshot (#3818)

Co-authored-by: atarashansky <atarashansky@CZIMACOS3990.local>
Co-authored-by: Daniel Hegeman <daniel.hegeman@chanzuckerberg.com>

* chore: support INITIALIZED dataset status (#3824)

* chore: support INITIALIZED dataset status

* try to fix tests

* feat: PATCH collection curator API redesign (#3814)

* init

* more tests

* end

* PR comments

* fix import

* fix: remove original_id and fix post-deploy tests (#3826)

Signed-off-by: nayib-jose-gloria <ngloria@chanzuckerberg.com>

Signed-off-by: nayib-jose-gloria <ngloria@chanzuckerberg.com>

* fix: test_api_key_crud functional test (#3828)

* fix: remove original_id and fix post-deploy tests

Signed-off-by: nayib-jose-gloria <ngloria@chanzuckerberg.com>

* fix: test_api_key_crud functional test

Signed-off-by: nayib-jose-gloria <ngloria@chanzuckerberg.com>

Signed-off-by: nayib-jose-gloria <ngloria@chanzuckerberg.com>

* chore: redesign dataset submissions lambda (#3827)

* init

* more

* fix tests

* lint

* mock config

* Bunch of fixes

* more changes

* Missing dependencies

* fix: redesign submissions lambda fix 1 (#3832)

* fix: redesign submissions lambda fix 1

* lint

* fix: redesign submissions lambda fix 2 (#3833)

* fix: redesign submissions lambda fix 2

* missing providers

* chore: double lookup for GET dataset in curation API (#3834)

* chore: double lookup for GET dataset in curation API

* linter

* add comment

* fix: send slack alerts on failing post-deploy tests (#3821)

* fix: send slack alerts on failing post-deploy tests

Signed-off-by: nayib-jose-gloria <ngloria@chanzuckerberg.com>

* fix: use workflow status GHA action step to check overall workflow status at the end of workflow

Signed-off-by: nayib-jose-gloria <ngloria@chanzuckerberg.com>

* fix workflow overall failure check

Signed-off-by: nayib-jose-gloria <ngloria@chanzuckerberg.com>

Signed-off-by: nayib-jose-gloria <ngloria@chanzuckerberg.com>

* feat: marker genes unavailable for cell types <25 (#3825)

* feat: marker genes unavailable for cell types <25

Co-authored-by: atarashansky <atarashansky@chanzuckerberg.com>

* chore: reintroduce wmg unit tests + migrate to redesign (#3837)

* chore: reintroduce wmg unit tests to a separate action
* step 1: fix get_dataset_s3_uris
* add layers to dockerfile
* 2nd endpoint
* remove raw h5ad

* chore: add filter for local.h5ad to wmg extract function (#3843)

* chore: reintroduce wmg unit tests to a separate action

* rename

* step 1: fix get_dataset_s3_uris

* add layers to dockerfile

* dep

* Fix

* checkpoint

* test #1

* 2nd endpoint

* lint

* remove raw h5ad

* fix

* fix: add local.h5ad trailing

* Linter

* fix(fmg): fix ordering and remove p-value (#3831)

* fix: maintaining top-n ordering for marker genes

* accept new response in frontend

* remove p-value from chart

* add back p-value to api yml

* fmt + lint

* update tests

* p-value back in example

* fix remaining tests

* blacked and fixed unit tests

* blacked unit tests

* fixed formatting

Co-authored-by: atarashansky <atarashansky@chanzuckerberg.com>
Co-authored-by: Seve Badajoz <severiano.badajoz@chanzuckerberg.com>
Co-authored-by: atarashansky <atarashansky@CZIMACOS3990.local>
Co-authored-by: atarashansky <atarashansky@CZIMACOS3990.hsd1.ma.comcast.net>

* feat: Exclude Blood from FMG (#3839)

* feat: Exclude Blood from FMG

* change effect size to marker score (#3855)

Co-authored-by: atarashansky <atarashansky@CZIMACOS3990.hsd1.ma.comcast.net>

* chore(deps): bump json5 from 1.0.1 to 1.0.2 in /frontend (#3842)

Bumps [json5](https://github.com/json5/json5) from 1.0.1 to 1.0.2.
- [Release notes](https://github.com/json5/json5/releases)
- [Changelog](https://github.com/json5/json5/blob/main/CHANGELOG.md)
- [Commits](json5/json5@v1.0.1...v1.0.2)

---
updated-dependencies:
- dependency-name: json5
  dependency-type: indirect
...

Signed-off-by: dependabot[bot] <support@github.com>

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Co-authored-by: Timmy Huang <tihuan@users.noreply.github.com>

* docs: add discover api data model img for cell census schema docs reference (#3844)

* chore: RAW_H5AD type applied in migration + processing (#3862)

* chore: RAW_H5AD type applied in migration + processing

* Lint

Co-authored-by: Nayib Gloria <55710092+nayib-jose-gloria@users.noreply.github.com>

* feat(WMG+FMG): improve GE layout, rebuild y axis chart, and fix inconsistent gene spacing (#3811)

* redo GE layout and rebuild y axis chart

* remove vestigial variables

* update

* checkpoint

* update

Co-authored-by: atarashansky <atarashansky@CZIMACOS3990.local>
Co-authored-by: atarashansky <atarashansky@CZIMACOS3990.hsd1.ma.comcast.net>

* update tests (#3868)

Co-authored-by: atarashansky <atarashansky@CZIMACOS3990.hsd1.ma.comcast.net>

* fix(portal redesign): handle raw h5ads at api layer in portal and curation API (#3873)

* fix(portal redesign): handle raw h5ads at api layer in portal and curation API

Signed-off-by: nayib-jose-gloria <ngloria@chanzuckerberg.com>

* remove unused import

Signed-off-by: nayib-jose-gloria <ngloria@chanzuckerberg.com>

Signed-off-by: nayib-jose-gloria <ngloria@chanzuckerberg.com>

* fix(wmg): don't send requests if only genes or tissues have been removed (#3859)

* fix(wmg): don't send requests if only genes or tissues have been removed

* lint

Co-authored-by: Seve Badajoz <severiano.badajoz@chanzuckerberg.com>

* feat(fmg): shift marker genes to front if already selected when adding (#3874)

* branch init

* feat(fmg): shift marker genes to front if already selected when adding

Co-authored-by: Seve Badajoz <severiano.badajoz@chanzuckerberg.com>

* fix: downloaded image being cut off (#3869)

* fix(wmg): y axis is now properly sticky (#3877)

* fix y axis stickiness when scrolling horizontally

* add comment

* remove inline container

Co-authored-by: atarashansky <atarashansky@CZIMACOS3990.hsd1.ma.comcast.net>

* fix(wmg): image download (#3880)

* chore(wmg): rewrite x axis using react (#3879)

* rewrite x axis as react

* update

* update

* prettier

* remove dup prop

Co-authored-by: atarashansky <atarashansky@CZIMACOS3990.hsd1.ma.comcast.net>

* fix(WMG): remove step 2 when genes are added and heatmap is loading (#3763)

* branch init

* remove isLoading check for gene step removal

Co-authored-by: atarashansky <atarashansky@CZIMACOS3990.hsd1.ma.comcast.net>

* fix(wmg): Select Tissues/Genes dropdown not triggering onClick event (#3881)

* chore(fmg): fmg documentation (#3870)

* fmg documentation

* update deps

* update deps

Co-authored-by: atarashansky <atarashansky@CZIMACOS3990.hsd1.ma.comcast.net>

* feat: FMG tooltip doc link (#3852)

* feat: FMG tooltip doc link

* adding doc link

* Update routes.ts

Co-authored-by: atarashansky <atarashansky@chanzuckerberg.com>

* fix(wmg): echarts init (#3884)

* change link (#3887)

Co-authored-by: atarashansky <atarashansky@CZIMACOS3990.hsd1.ma.comcast.net>

* fix e2e after x axis rewrite (#3888)

Co-authored-by: atarashansky <atarashansky@CZIMACOS3990.hsd1.ma.comcast.net>

* chore: only return dataset_deployments on CXG completion (#3851)

* chore: only return dataset_deployments on CXG completion

* fix test + lint

* empty list instead of None

* chore: translates the portal API module from class-based to function-based (#3878)

* init

* chore: migrate portal API to be function based instead of class based

* linter

* rename router.py -> providers.py

* Add missing files

* chore: update alembic migration to use new database (#3841)

- Upgrade SQLAlchemy for the processing container so it matches the other containers
- Upgrade Alembic for local development.
- factor out "persistance_schema" into a constant to avoid naming issues.
- Update alembic to support both the new and legacy ORM for now. The legacy ORM can be removed in a future migration.
- Add an Alembic migration for the redesign database changes. This will not affect the legacy database.
- Modify `migrate_redesign_write` so it no longer creates the new database tables. This should be handled by the automated migration process in GHA.
- Fix the legacy ORM to match the actual database.
- Update `create_db` to use the new ORM.
- updated sqlalchemy in the processing container to be current with other containers.
- Add Make recipe `db/check` to make it easier to detect if the ORM is different from the database.
- Add a try..except block in `DatabaseProvider._drop_schema` to prevent breaking if the schema does not exist.
- Add support for creating both the legacy and new database.
- removing CORPORA_LOCAL_DEV from setup_dev_data.sh.

* fix(wmg): still check for equality on non-checked filters when checking to clobber queries (#3898)

Co-authored-by: Seve Badajoz <severiano.badajoz@chanzuckerberg.com>

* update (#3886)

Co-authored-by: atarashansky <atarashansky@CZIMACOS3990.hsd1.ma.comcast.net>

* update (#3901)

Co-authored-by: atarashansky <atarashansky@CZIMACOS3990.hsd1.ma.comcast.net>

* chore: update marker gene unit test  (#3902)

* update test

* empty

Co-authored-by: atarashansky <atarashansky@CZIMACOS3990.hsd1.ma.comcast.net>

* update (#3885)

Co-authored-by: atarashansky <atarashansky@CZIMACOS3990.hsd1.ma.comcast.net>

* refactor: persistence layer clean-up (#3835)

* refactor: persistence layer clean-up

Signed-off-by: nayib-jose-gloria <ngloria@chanzuckerberg.com>

* lint

Signed-off-by: nayib-jose-gloria <ngloria@chanzuckerberg.com>

* update missing id changes

Signed-off-by: nayib-jose-gloria <ngloria@chanzuckerberg.com>

* lint fix

Signed-off-by: nayib-jose-gloria <ngloria@chanzuckerberg.com>

* more missing id's

Signed-off-by: nayib-jose-gloria <ngloria@chanzuckerberg.com>

* remove unused import

Signed-off-by: nayib-jose-gloria <ngloria@chanzuckerberg.com>

* correct id name

Signed-off-by: nayib-jose-gloria <ngloria@chanzuckerberg.com>

* lint fix

Signed-off-by: nayib-jose-gloria <ngloria@chanzuckerberg.com>

* id fix

Signed-off-by: nayib-jose-gloria <ngloria@chanzuckerberg.com>

* id fix

Signed-off-by: nayib-jose-gloria <ngloria@chanzuckerberg.com>

* more ids

Signed-off-by: nayib-jose-gloria <ngloria@chanzuckerberg.com>

* id fix

Signed-off-by: nayib-jose-gloria <ngloria@chanzuckerberg.com>

* version id fix

Signed-off-by: nayib-jose-gloria <ngloria@chanzuckerberg.com>

* lint fix

Signed-off-by: nayib-jose-gloria <ngloria@chanzuckerberg.com>

* autogenerate migration script for db field changes + update README links

Signed-off-by: nayib-jose-gloria <ngloria@chanzuckerberg.com>

* update makefile test command documentation

Signed-off-by: nayib-jose-gloria <ngloria@chanzuckerberg.com>

* pr feedback on docs

Signed-off-by: nayib-jose-gloria <ngloria@chanzuckerberg.com>

* update migration script + db/local/load-schema + migration docs

Signed-off-by: nayib-jose-gloria <ngloria@chanzuckerberg.com>

* lint fixes

Signed-off-by: nayib-jose-gloria <ngloria@chanzuckerberg.com>

* remove unused imports

Signed-off-by: nayib-jose-gloria <ngloria@chanzuckerberg.com>

* tabs

Signed-off-by: nayib-jose-gloria <ngloria@chanzuckerberg.com>

* tabs

Signed-off-by: nayib-jose-gloria <ngloria@chanzuckerberg.com>

Signed-off-by: nayib-jose-gloria <ngloria@chanzuckerberg.com>

* feat: unpublished collections can be looked up by canonical id (#3907)

* feat: unpublished collections can be looked up by canonical collection_id

* complete docstring

* change

* valid -> invalid

* fix: Documentation Tooltip on Icon Hover Only (#3893)

* fix: icon doc hover only

* using question mark SVG for tooltip icon hover

* tooltip icon hover colors

* fixing linting error "Type 'Element' is not assignable to type"

* linting unused var

* linting

* importing svg directly and changing brightness

Co-authored-by: atarashansky <atarashansky@chanzuckerberg.com>

* fix(coverage): adding code coverage (#3894)

- add code coverage for python unit tests
- run coverage for every PR and upload report to Codecov
- add make recipes to generate coverage report
- add allure test report for unit tests
- update coverage version

* fix: curation API differentiates unpublished collections for revisions (#3912)

* Remove feature branch from push_test (#3915)

* chore(redesign): Clean up processing container files (#3911)

* Clean up old processing container files
- adding json logs to processing container
- clean up old processing container files

* Update backend/layers/processing/process.py
* remove old upload_failure, upload_success, and submission lambda code

* feat(fmg): add analytics (#3896)

* branch init

* branch init

* feat(fmg): add analytics

Co-authored-by: Seve Badajoz <severiano.badajoz@chanzuckerberg.com>

* feat(fmg): default to ttest (#3919)

* branch init

* feat(fmg): default to ttest

Co-authored-by: Seve Badajoz <severiano.badajoz@chanzuckerberg.com>

* fix processing logs (#3922)

* feat(fmg): remove feature flag (#3943)

* branch init

* feat(fmg): remove feature flag

Co-authored-by: Seve Badajoz <severiano.badajoz@chanzuckerberg.com>

* feat: newly unpublished collections return canonical collection_id in API (#3944)

* feat: newly published collections return canonical collection_id in the API

* refactor to function

* lint

* add test for canonical id

Co-authored-by: Trent Smith <trent.smith@chanzuckerberg.com>

* feat: newly unpublished collections return canonical collection_id in API (#3944)

* feat: newly published collections return canonical collection_id in the API

* refactor to function

* lint

* add test for canonical id

Co-authored-by: Trent Smith <trent.smith@chanzuckerberg.com>
(cherry picked from commit ec2f61f)

* chore: release main->staging (#4019)

* fix: FMG icon not hiding for < 25 cells (#3946)

* feat(curation API): newly unpublished collections return canonical collection_id in API (#3948)

* feat: data submitters want collections to document their consortia (#3875)

feat: data submitters want collections to document their consortia

feat: add default empty consortia to create collection

fix: fix texts for consortia on portal-api

fix: update default for consortia to empty list

feat: Add optional Consortia dropdown in New and Edit Collection with the required enumeration of approved consortia (#3642)

feat: update edit collection form consortia dropdown with latest SDS dropdown component and style adjustments.

fix: Linting and minor style updates.

feat: added tests for updating consortia

feat: validate consortia

feat: add consortia to curation/discover API

refactor: mock valid consortia

fix: also override valid_consortia in business layer tests for #348

refactor: add explicit exception type for invalid collection metadata for #348

fix: fix format errors for #348

fix: fix lint errors for #348

fix: make consortia required on portal API collection response for #348

* fix: disable portal on consortia dropdown to maintain focus within create collection dialog.

* refactor: break apart large tests

* feat: added new discovery api test

Co-authored-by: Fran McDade <franmcdade@Frans-MacBook-Pro.local>

* chore: cxg links are permalinks for new unpublished collections (#3957)

* chore: cxg links are permalinks for new unpublished collections

* curation API

* rename

* lint

* chore: use canonical collection_id in curation API responses (#3964)

* Fix curation post dataset for unpublished collections (#3966)

* fix: support null metadata and double lookup for write datasets (#3968)

* fix: support null metadata and double lookup for write datasets

* PR changes

* split test into 2

* fix: improve dataset identifiers (#3967)

* fix: improve dataset identifiers

* Linter

* Test declutter

* new test+overhaul

* refactor: Refactor consortia type (#3973) (#3974)

Co-authored-by: Fran McDade <franmcdade@Frans-MacBook-Pro.local>

* fix: local dev database config / secret name (#3985)

Recent commit 4e6c4e2 simplified the '../database_local' secret name
to '../database'. However, local dev case exceptions still caused the
database config to look for a '../database_local' secret.

* fix: handle null metadata in curation API GET collection (take 2) (#3986)

* fix: handle null metadata case (take 2)

* Linter

* Update scripts/setup_dev_data.sh

* fix: Portal Get /collections (#3980)

* Fix Portal Get /collections

- We need to return the canonical collection_id for new unpublished collections when listing all collections. This isn't technically a user-facing problem, but without this fix the functional tests no longer work: since we do not openly disclose the version ID, and creating a collection returns only the canonical collection ID, we cannot determine whether a collection was successfully listed by GET collections.

* test: update the revision functional test to use the right explorer_url (#3993)

* test: update the revision functional test to use the correct explorer_url

* lint

Co-authored-by: Trent Smith <1429913+Bento007@users.noreply.github.com>

* fix: add double lookup in dataset submissions lambda (#3994)

* feat: adding search and details to dataset filter (#3905)

* chore: POST /datasets returns the canonical dataset_id (#3999)

* fix(curation api): GET /collections/{collection_id}/datasets/{dataset_id} (#3995)

* fix(curation api): GET /collections/{collection_id}/datasets/{dataset_id}
- added tests for revision_of in GET /collections/{collection_id}/datasets/{dataset_id}
- revision_of only refers to the published dataset when revising a dataset. Otherwise it is NULL.
- fix typing in test code

* return the canonical dataset id when POSTing a dataset.

* fix: check whether revision's consortia list has changed when determining whether collection has updates (#4003)

Signed-off-by: nayib-jose-gloria <ngloria@chanzuckerberg.com>

Signed-off-by: nayib-jose-gloria <ngloria@chanzuckerberg.com>

* chore: GET collections/:collection_id should return the correct dataset_id (#4006)

* fix: add retries to dataset pipeline for failure and success handlers. (#4004)

* fix: add retries to dataset pipeline for failure and success handlers.

* fix typo

* feat: cronjob to remove old rdev stacks (#4008)

* fix: version of happy-cleanup

* fix: Adapt RevisionStatusTag to use redesigned data model (#4009)

* fix: Adapt RevisionStatusTag to use redesigned data model

Signed-off-by: nayib-jose-gloria <ngloria@chanzuckerberg.com>

* fix 'updated' logic

Signed-off-by: nayib-jose-gloria <ngloria@chanzuckerberg.com>

* fix: stack name parsing bug

This version update fixes a JSON parsing bug in the GitHub Action.

* docs: add valid consortia to swagger apis (#3997) (#4010)

* fix: sort consortia on collection create or edit (#3996) (#4015)

* fix: sort consortia on collection create or edit (#3996)

* fix: update collection handles DOI correctly (#4017)

* fix: check for Revision Changes accounts for deleted datasets in new data model + refactor Collection checks (#4016)

* fix: check for Revision Changes accounts for deleted datasets in new data model + refactor Collection checks

Signed-off-by: nayib-jose-gloria <ngloria@chanzuckerberg.com>

* pr feedback

Signed-off-by: nayib-jose-gloria <ngloria@chanzuckerberg.com>
Co-authored-by: ashin-czi <109984998+ashin-czi@users.noreply.github.com>
Co-authored-by: David Rogers <dave@clevercanary.com>
Co-authored-by: Fran McDade <franmcdade@Frans-MacBook-Pro.local>
Co-authored-by: Emanuele Bezzi <ebezzi@chanzuckerberg.com>
Co-authored-by: Fran McDade <frano-m@users.noreply.github.com>
Co-authored-by: Daniel Hegeman <daniel.hegeman@chanzuckerberg.com>
Co-authored-by: Nayib Gloria <55710092+nayib-jose-gloria@users.noreply.github.com>
Co-authored-by: Jake Heath <76011913+jakeyheath@users.noreply.github.com>

Signed-off-by: nayib-jose-gloria <ngloria@chanzuckerberg.com>
Signed-off-by: Trent Smith <trent.smith@chanzuckerberg.com>
Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: Timmy Huang <tihuan@users.noreply.github.com>
Co-authored-by: Alex Lokshin <alokshin@chanzuckerberg.com>
Co-authored-by: Emanuele Bezzi <ebezzi@chanzuckerberg.com>
Co-authored-by: Andrew Tolopko <atolopko-czi@users.noreply.github.com>
Co-authored-by: Andrew Tolopko <atolopko@chanzuckerberg.com>
Co-authored-by: Daniel Hegeman <daniel.hegeman@chanzuckerberg.com>
Co-authored-by: Trent Smith <trent.smith@chanzuckerberg.com>
Co-authored-by: Trent Smith <1429913+Bento007@users.noreply.github.com>
Co-authored-by: SethFeingold <52686508+sethfeingold@users.noreply.github.com>
Co-authored-by: Timmy Huang <thuang@chanzuckerberg.com>
Co-authored-by: atarashansky <atarashansky@chanzuckerberg.com>
Co-authored-by: atarashansky <atarashansky@CZIMACOS3990.hsd1.ma.comcast.net>
Co-authored-by: ashin-czi <109984998+ashin-czi@users.noreply.github.com>
Co-authored-by: atarashansky <atarashansky@CZIMACOS3990.local>
Co-authored-by: Severiano Badajoz <sbadajoz@chanzuckerberg.com>
Co-authored-by: Seve Badajoz <severiano.badajoz@chanzuckerberg.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Co-authored-by: David Rogers <dave@clevercanary.com>
Co-authored-by: Fran McDade <franmcdade@Frans-MacBook-Pro.local>
Co-authored-by: Fran McDade <frano-m@users.noreply.github.com>
Co-authored-by: Jake Heath <76011913+jakeyheath@users.noreply.github.com>