Error on cohort or label duplicates #889
Merged
Closes #872
Adds a check for duplicates in the labels or cohort table in order to error earlier and with a more helpful message when this occurs (currently, this tends to cause a duplicate primary key error far downstream when saving predictions, which can be difficult to debug).
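As a rough illustration of the kind of check described above (not the code in this PR), here is a minimal sketch using an in-memory SQLite database. The helper names `find_duplicates` and `assert_no_duplicates`, the table name `cohort`, and the key columns `entity_id` and `as_of_date` are assumptions made for the example; the actual tables and keys in triage may differ.

```python
import sqlite3


def find_duplicates(conn, table, key_columns):
    """Return (key..., row_count) tuples for keys that appear more than once.

    Hypothetical helper: `table` and `key_columns` are trusted identifiers
    here, not user input, so they are interpolated directly into the query.
    """
    cols = ", ".join(key_columns)
    query = (
        f"SELECT {cols}, COUNT(*) AS num_rows FROM {table} "
        f"GROUP BY {cols} HAVING COUNT(*) > 1 ORDER BY {cols}"
    )
    return conn.execute(query).fetchall()


def assert_no_duplicates(conn, table, key_columns):
    """Raise early, with a readable message, if any key is duplicated."""
    dupes = find_duplicates(conn, table, key_columns)
    if dupes:
        raise ValueError(
            f"Duplicate rows in {table} for key {tuple(key_columns)}; "
            f"first offenders (key..., count): {dupes[:5]}"
        )


if __name__ == "__main__":
    conn = sqlite3.connect(":memory:")
    # Assumed cohort layout for illustration only.
    conn.execute("CREATE TABLE cohort (entity_id INTEGER, as_of_date TEXT)")
    conn.executemany(
        "INSERT INTO cohort VALUES (?, ?)",
        [(1, "2016-01-01"), (2, "2016-01-01"), (1, "2016-01-01")],
    )
    try:
        assert_no_duplicates(conn, "cohort", ["entity_id", "as_of_date"])
    except ValueError as exc:
        print(exc)
```

Running the check right after the table is built means the failure names the offending keys, rather than surfacing later as a primary key violation when predictions are saved.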
Note that I haven't added a unit test for catching duplicates in the cohort, because the two routes for generating it (via query or via labels) already do a `distinct` or `group by`, so duplicates shouldn't actually occur there. We could consider removing this logic and putting the burden on the user (which might enforce a better understanding of what triage is doing/expecting), but I'm not sure that's worthwhile.
@thcrock: is the fact that there are two (essentially identical) `database_reflection.py` files (one in `src/triage/` and the other in `src/triage/component/architect/`) an artifact of the merge to a single repo? I'd like to consolidate. Any thoughts/preferences on which is the better place for it to live?