Enable filtering on relationships with mongo #234

Merged
ml-evs merged 5 commits into master from ml-evs/relationship_filtering
Apr 7, 2020

Conversation

ml-evs
Member

@ml-evs ml-evs commented Mar 16, 2020

Now that #222 is done, this PR adds the ability to filter on relationships, e.g. /structures?filter=references.id HAS "dijkstra1968". The relevant part of the spec is section 6.2.6 (please check that I am interpreting it correctly).

This is done with another post-processing method that looks for the pattern <entry_type>.<property> in a query, and replaces it with a search over the relationships dictionary.
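As a rough illustration of the post-processing idea described above (not the code in this PR), the sketch below walks an already-transformed Mongo query dict and renames any key of the form <entry_type>.<property> to the nested relationships path. The function name rewrite_relationship_keys, the ENTRY_TYPES tuple, and the relationships.<type>.data.<property> layout are assumptions made for the example.

```python
# Hypothetical list of entry types; a real server would get these from its config.
ENTRY_TYPES = ("structures", "references")


def rewrite_relationship_keys(query):
    """Recursively rename `<entry_type>.<property>` keys in a Mongo-style query.

    Keys such as "references.id" become "relationships.references.data.id".
    All other keys (including operators like "$and") are left untouched.
    """
    if isinstance(query, list):
        return [rewrite_relationship_keys(item) for item in query]
    if not isinstance(query, dict):
        return query

    rewritten = {}
    for key, value in query.items():
        prefix, _, prop = key.partition(".")
        if prefix in ENTRY_TYPES and prop:
            # Assumed storage layout: relationships.<type>.data is a list of links.
            key = f"relationships.{prefix}.data.{prop}"
        rewritten[key] = rewrite_relationship_keys(value)
    return rewritten


# Example input shaped like a filter on `references.id`.
example = {"$and": [{"references.id": {"$in": ["dijkstra1968"]}}, {"nelements": 2}]}
print(rewrite_relationship_keys(example))
```

Doing the rename as a separate pass keeps the grammar-level transformer unchanged, at the cost of one extra walk over the query.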

@ml-evs ml-evs force-pushed the ml-evs/relationship_filtering branch from 32a5eab to 773e91c Compare March 17, 2020 14:28
@ml-evs ml-evs force-pushed the ml-evs/relationship_filtering branch from 773e91c to 0e49c16 Compare March 30, 2020 17:20
@ml-evs ml-evs changed the title [BLOCKED] Filtering on relationships Enable filtering on relationships with mongo Mar 30, 2020
@ml-evs ml-evs marked this pull request as ready for review March 30, 2020 17:21
@codecov

codecov bot commented Mar 30, 2020

Codecov Report

Merging #234 into master will increase coverage by 0.14%.
The diff coverage is 100.00%.

@@            Coverage Diff             @@
##           master     #234      +/-   ##
==========================================
+ Coverage   87.62%   87.77%   +0.14%     
==========================================
  Files          45       45              
  Lines        1932     1955      +23     
==========================================
+ Hits         1693     1716      +23     
  Misses        239      239              
Flag          Coverage Δ
#unittests    87.77% <100.00%> (+0.14%) ⬆️

Impacted Files                           Coverage Δ
optimade/filtertransformers/mongo.py     96.68% <100.00%> (+0.32%) ⬆️
optimade/server/exception_handlers.py    79.36% <100.00%> (+2.17%) ⬆️
optimade/server/main.py                  77.96% <100.00%> (+0.37%) ⬆️

Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Last update d34f7a5...16c56ac.

@ml-evs ml-evs requested review from CasperWA and shyamd March 30, 2020 19:35
Member

@CasperWA CasperWA left a comment

Great job @ml-evs !

Seems all right to me.
Looking through the spec, I see no issue with this approach, as long as the database is self-consistent, in the sense that each relationship link exists twice (once for each linked resource): a structure resource linked to a reference is aware of this link, and the reference is likewise aware of its link to the structure. If this is always upheld, then this approach is fine, even though it differs from the multi-step query approach mentioned in the spec.

The only thing I have is whether or not you will put in a debug logger for the transform method. Either is fine :)

optimade/filtertransformers/mongo.py (outdated, resolved)
@ml-evs
Member Author

ml-evs commented Mar 31, 2020

> Looking through the spec, I see no issue with this approach, as long as the database is self-consistent, in the sense that each relationship link exists twice (once for each linked resource): a structure resource linked to a reference is aware of this link, and the reference is likewise aware of its link to the structure. If this is always upheld, then this approach is fine, even though it differs from the multi-step query approach mentioned in the spec.

So this doesn't circumvent the need for a multi-step query, as it only works on the fields stored under relationships.<type>.data, i.e. id and description. Without this PR, we can't filter on relationships at all, as the query ?filter=relationships.structures.data.id HAS "x" doesn't work due to the nested representation of the relationships in the db (and would be overly verbose anyway). This just allows you to do the second part of the multi-step query. (Though it would be nice if we could support some way of doing this in one step...)
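To make the nested representation concrete, here is a small sketch of how a stored entry might look and what a check on relationships.<type>.data amounts to. The document shape and the helper names related_ids and has_related_id are illustrative assumptions, not taken from the database or this PR.

```python
# Hypothetical stored structure entry; field values are made up for illustration.
entry = {
    "id": "structure-1",
    "relationships": {
        "references": {
            "data": [
                {"type": "references", "id": "dijkstra1968", "description": "Example citation"},
            ]
        }
    },
}


def related_ids(entry, entry_type):
    """Return the ids stored under relationships.<entry_type>.data (empty if absent)."""
    data = entry.get("relationships", {}).get(entry_type, {}).get("data", [])
    return [item["id"] for item in data]


def has_related_id(entry, entry_type, wanted_id):
    """Mimic what a filter like `references.id HAS "x"` is asking of one entry."""
    return wanted_id in related_ids(entry, entry_type)


print(has_related_id(entry, "references", "dijkstra1968"))
```

Only id and description live in the linked items here, which is why a property such as a DOI cannot be reached without a second lookup.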

@ml-evs ml-evs force-pushed the ml-evs/relationship_filtering branch from 03dee04 to 1e1e2fa Compare March 31, 2020 11:18
@ml-evs ml-evs requested a review from CasperWA March 31, 2020 11:22
@ml-evs
Member Author

ml-evs commented Mar 31, 2020

Thinking about ways that this could be extended to cover that multi-step process... Without much of a logic change, in post-processing we could return a succession of queries to run, e.g. structures?filter=references.doi HAS "10/123" could do the look-up in references first, then filter the structures by that. I guess if we don't implement that, we should raise a NotImplementedError when querying on anything other than <entry_type>.id.

Probably a separate PR for another day though X)

EDIT: this would not be possible unless collections can see/talk to each other, as the first step needs to query e.g. references and the second step needs to query structures.
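A minimal sketch of the two-step idea, assuming one piece of code can read both collections (the limitation noted in the EDIT). Plain Python lists stand in for the references and structures collections, and the function name two_step_filter and the doi field are hypothetical.

```python
def two_step_filter(references, structures, doi):
    """Step 1: find reference ids by DOI. Step 2: keep structures linked to them."""
    # Step 1: look up the matching references.
    matching_ids = {ref["id"] for ref in references if ref.get("doi") == doi}

    # Step 2: filter structures on relationships.references.data.id.
    selected = []
    for structure in structures:
        linked = structure.get("relationships", {}).get("references", {}).get("data", [])
        if any(item["id"] in matching_ids for item in linked):
            selected.append(structure)
    return selected


# Made-up sample data standing in for the two collections.
references = [{"id": "ref-1", "doi": "10/123"}, {"id": "ref-2", "doi": "10/456"}]
structures = [
    {"id": "s1", "relationships": {"references": {"data": [{"id": "ref-1"}]}}},
    {"id": "s2", "relationships": {"references": {"data": [{"id": "ref-2"}]}}},
    {"id": "s3"},
]

print([s["id"] for s in two_step_filter(references, structures, "10/123")])
```

With a real database, step 1 and step 2 would each be a separate query, so the collection handling structures would need access to the references collection.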

@ml-evs
Member Author

ml-evs commented Mar 31, 2020

I've added an error for queries on e.g. references.doi HAS x which we don't support at the moment. Due to some peculiarities in the test suite, I had to add a dedicated exception handler for NotImplementedError, otherwise the tests were not able to check the response without the exception being raised (thus failing the tests that the server responds without crashing). This is strange, as I could follow the error response being created perfectly well with the general exception handler, but at least now we can neatly give it the correct 501 status code.
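The behaviour described above can be sketched without any web framework: reject unsupported relationship properties with NotImplementedError, and map that exception to a 501 response. This is not the handler added in this PR; SUPPORTED_RELATIONSHIP_FIELDS, check_relationship_field, and to_error_response are hypothetical names, and restricting support to id alone is an assumption.

```python
# Assumption: only `<entry_type>.id` is supported for relationship filters.
SUPPORTED_RELATIONSHIP_FIELDS = ("id",)


def check_relationship_field(field):
    """Raise NotImplementedError for relationship properties that cannot be filtered."""
    entry_type, _, prop = field.partition(".")
    if prop not in SUPPORTED_RELATIONSHIP_FIELDS:
        raise NotImplementedError(
            f"Cannot filter on {entry_type}.{prop}: only "
            f"{', '.join(SUPPORTED_RELATIONSHIP_FIELDS)} is supported for relationships."
        )


def to_error_response(exc):
    """Map an exception to a simple error payload, using 501 for NotImplementedError."""
    status = 501 if isinstance(exc, NotImplementedError) else 500
    return {
        "status": status,
        "errors": [{"status": str(status), "title": type(exc).__name__, "detail": str(exc)}],
    }


try:
    check_relationship_field("references.doi")
except NotImplementedError as exc:
    print(to_error_response(exc)["status"])
```

A dedicated branch for NotImplementedError lets the client see "not implemented" rather than a generic server error.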

@ml-evs ml-evs force-pushed the ml-evs/relationship_filtering branch from 49be114 to 8471265 Compare April 6, 2020 16:38
CasperWA
CasperWA previously approved these changes Apr 7, 2020
Member

@CasperWA CasperWA left a comment

Looks all right by me, thanks @ml-evs ! :)

Great with the additional error handler as well.

@CasperWA
Member

CasperWA commented Apr 7, 2020

> EDIT: this would not be possible unless collections can see/talk to each other, as the first step needs to query e.g. references and the second step needs to query structures.

Yeah - then you need to move into SQL territory to benefit the most from it :) Or create mixed-collections upon starting the server?

@ml-evs ml-evs merged commit 05a2cbc into master Apr 7, 2020
@ml-evs ml-evs deleted the ml-evs/relationship_filtering branch April 7, 2020 17:58
@CasperWA CasperWA mentioned this pull request Apr 22, 2020