Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Consolidate query interface params #717

Merged
merged 28 commits into from
Aug 25, 2023
Merged

Consolidate query interface params #717

merged 28 commits into from
Aug 25, 2023

Conversation

DevonFulcher
Copy link
Contributor

@DevonFulcher DevonFulcher commented Aug 9, 2023

Resolves SLA-218 (internal to dbt Labs)

Description

This PR makes the Dimension, TimeDimension, and Entity objects that are passed into the where filter Jinja template implement protocols. These protocols allow for different implementations of these objects depending on the context (group_by, where, order_by parameters) while also constraining these implementations to all have the same method signatures. This will create consistency across these contexts.

The Dimension protocol is also implemented for parameters in certain methods (i.e. GroupByOrderByDimension). This will allow for more complex Dimension objects in the future that aren't feasible to serialize into a string, such as Dimension('demographic').grain('month').alias('monthly_demographics'). These parameters are added as optional parameters to these methods, so everything should be backward compatible.

I added some unit tests. This change is mostly refactoring existing behavior to implement protocols and adding additional optional parameters. These optional parameters just transform into the existing parameters. So, I relied on the existing tests to ensure no breaking behavior.

@cla-bot cla-bot bot added the cla:yes label Aug 9, 2023
@github-actions
Copy link

github-actions bot commented Aug 9, 2023

Thank you for your pull request! We could not find a changelog entry for this change. For details on how to document a change, see the contributing guide.

metricflow/specs/query_interface.py Show resolved Hide resolved
metricflow/specs/query_interface.py Outdated Show resolved Hide resolved
metricflow/specs/query_interface.py Outdated Show resolved Hide resolved
metricflow/specs/where_filter_dimension.py Outdated Show resolved Hide resolved
@DevonFulcher DevonFulcher marked this pull request as ready for review August 14, 2023 16:17
@tlento tlento self-requested a review August 14, 2023 21:43
Copy link
Contributor

@WilliamDee WilliamDee left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There's this whole thing about inheriting the Protocol classes in the implementation class that we had, don't really remember what we landed at there, but @tlento or @plypaul would know. In DSI, Paul made the ProtocolHint class for this, not sure if we would want to use that here too.

metricflow/query/query_parser.py Outdated Show resolved Hide resolved
@DevonFulcher
Copy link
Contributor Author

Added ProtocolHint to the implementations.

Copy link
Contributor

@tlento tlento left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Checkpoint review, I'm not quite done.

I'm not sold on the factory class pattern but I have nothing better to offer that will work with jinja, so this is probably one of those cases where it's better for us to proceed with minor readability updates and revisit if we need to.

.changes/unreleased/Features-20230814-111602.yaml Outdated Show resolved Hide resolved
metricflow/engine/metricflow_engine.py Outdated Show resolved Hide resolved
metricflow/engine/metricflow_engine.py Outdated Show resolved Hide resolved
metricflow/engine/metricflow_engine.py Show resolved Hide resolved
metricflow/query/query_parser.py Outdated Show resolved Hide resolved
metricflow/specs/group_by_order_by_dimension.py Outdated Show resolved Hide resolved
metricflow/specs/group_by_order_by_dimension.py Outdated Show resolved Hide resolved
metricflow/specs/query_interface.py Outdated Show resolved Hide resolved
metricflow/specs/where_filter_transform.py Outdated Show resolved Hide resolved
metricflow/specs/where_filter_dimension.py Show resolved Hide resolved
@DevonFulcher
Copy link
Contributor Author

I'm not sold on the factory class pattern

That pattern isn't strictly necessary. It helps to ensure that all of the constructors in Jinja have the same interfaces. I think it is worth the added indirection for greater consistency, but I can remove it if you prefer.

Copy link
Contributor

@tlento tlento left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Per our discussion, we are going to do the following here:

  1. Clean up the protocols a bit in some way in order to ensure there are property accessors for MetricFlow to use for building specs and so forth behind the API, and property configuration methods for the callers to implement so users can configure things on the client side
  2. Leave anything other than minor cleanup for follow-up PRs
  3. Figure out if we want to move to something like group_by: Optional[Sequence[Union[str, NonAggregateParameter]]] and, if so, have a separate PR where we get that all working nicely

I'm requesting changes just so I can get an alert when this is ready - please request review from me when you update. GitHub makes you do this manually, which always annoys me, but I haven't found a better mechanism.

@DevonFulcher
Copy link
Contributor Author

I changed the factories to return individualized protocols instead of all sharing the QueryParameter protocol. While the parameters are similar in some respects (they all might have an alias), they can also be different (only Dimension has grain). I wanted these protocols to be separate so they could evolve independently. This can be changed later.

Instead, I use QueryParameter as the read-only protocol for MF to get grain and name for an object. Since, both QueryParameter and QueryInterfaceDimension have a grain property, they can't be implemented by the same class. This is somewhat limiting, but I found it an okay tradeoff to have a more Pythonic interface instead of QueryParameter having a get_grain property or something similar.

Copy link
Contributor

@tlento tlento left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looking much better, thank you!

I'm mildly concerned about the layers of indirection being confusing to readers of the future, particularly if they read by module instead of in coordination as these changes appear in this PR.

For now I think this is best addressed via documentation, and I've left a couple of minor pointers inline.

Unless you feel like you want another review pass you may update and merge at your convenience! This might merit a squash and merge operation, or at least a rebase -i and force push to clean up the history in this branch. It's a little gnarly commit-wise.

Thanks!

metricflow/engine/metricflow_engine.py Show resolved Hide resolved
metricflow/specs/where_filter_entity.py Show resolved Hide resolved
metricflow/specs/where_filter_entity.py Outdated Show resolved Hide resolved
metricflow/test/query/test_query_parser.py Show resolved Hide resolved
@DevonFulcher DevonFulcher merged commit 3e2cea0 into main Aug 25, 2023
9 checks passed
@DevonFulcher DevonFulcher deleted the feature/sla-218 branch August 25, 2023 00:37
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants