Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Use Templates for Defining Metric Filters #506

Merged
merged 16 commits into from
May 11, 2023
Merged

Conversation

plypaul
Copy link
Contributor

@plypaul plypaul commented May 8, 2023

Description

This PR resolves the issue in #505. Since we renamed constraint -> filter, there are renames that need to be done to make function signatures more consistent, but that will be handled in a follow up PR.

@cla-bot cla-bot bot added the cla:yes label May 8, 2023
@plypaul plypaul temporarily deployed to DW_INTEGRATION_TESTS May 8, 2023 20:44 — with GitHub Actions Inactive
@plypaul plypaul temporarily deployed to DW_INTEGRATION_TESTS May 8, 2023 20:44 — with GitHub Actions Inactive
@plypaul plypaul temporarily deployed to DW_INTEGRATION_TESTS May 8, 2023 20:44 — with GitHub Actions Inactive
@plypaul plypaul temporarily deployed to DW_INTEGRATION_TESTS May 8, 2023 20:44 — with GitHub Actions Inactive
@plypaul plypaul temporarily deployed to DW_INTEGRATION_TESTS May 8, 2023 20:44 — with GitHub Actions Inactive
@plypaul plypaul marked this pull request as ready for review May 8, 2023 22:35
@plypaul plypaul requested a review from tlento May 8, 2023 22:35
@plypaul plypaul changed the title Use Templated Filters in Metric Filters Use Templates in Metric Filters May 8, 2023
@plypaul plypaul changed the title Use Templates in Metric Filters Use Templates for Defining Metric Filters May 8, 2023
import jinja2

from dbt_semantic_interfaces.references import DimensionReference, EntityReference
from metricflow.time.time_granularity import TimeGranularity
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This got moved 🙂

Suggested change
from metricflow.time.time_granularity import TimeGranularity
from dbt_semantic_interfaces.objects.time_granularity import TimeGranularity

Copy link
Contributor

@tlento tlento left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Working on it....


import pydantic
from pydantic import BaseModel

from dbt_semantic_interfaces.pretty_print import pformat_big_objects
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This commit message is now very confusing, but ok. :)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, this commit is no longer needed since pformat_big_objects got moved from metricflow to dbt_semantic_interfaces.

Comment on lines +125 to +149
class _DummyCallRenderer(FilterFunctionCallRenderer):
def render_dimension_call(self, dimension_call_parameter_set: DimensionCallParameterSet) -> str: # noqa: D
dimension_call_parameter_sets.append(dimension_call_parameter_set)
return FilterRenderer._DUMMY_PLACEHOLDER

def render_time_dimension_call( # noqa: D
self, time_dimension_call_parameter_set: TimeDimensionCallParameterSet
) -> str:
time_dimension_call_parameter_sets.append(time_dimension_call_parameter_set)
return FilterRenderer._DUMMY_PLACEHOLDER

def render_entity_call(self, entity_call_parameter_set: EntityCallParameterSet) -> str: # noqa: D
entity_call_parameter_sets.append(entity_call_parameter_set)
return FilterRenderer._DUMMY_PLACEHOLDER

FilterRenderer.render(
templated_filter_sql=templated_filter,
call_renderer=_DummyCallRenderer(),
)

return FilterCallParameterSets(
dimension_call_parameter_sets=tuple(dimension_call_parameter_sets),
time_dimension_call_parameter_sets=tuple(time_dimension_call_parameter_sets),
entity_call_parameter_sets=tuple(entity_call_parameter_sets),
)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is super weird. Maybe this is the only reasonable way to do this? Either way, I guess it'll do for now but it's really odd to overload the renderer for this extraction.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There is a way to do this by parsing the AST that Jinja can generate for you. However, it seemed like a more complex approach so I didn't go into it too deeply. I agree that this is a little odd though.

metricflow/dataflow/builder/dataflow_plan_builder.py Outdated Show resolved Hide resolved
@tlento tlento self-requested a review May 10, 2023 03:37
Copy link
Contributor

@tlento tlento left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I have a bunch of quibbles inline but all of them can (and probably should, given the merge/rebase you're about to experience) be addressed in follow-ups.

The main concern I have is the FilterRenderer interface is confusing. I think it's a combination of overloading FilterRenderer.render() to do things which are not rendering, plus the use of the not-terribly-pythonic function scoped inner class definitions.

I think the simplest cleanup there is to do some renaming and move those inner classes out of the functions and into a coherent module where they can live together.

Instead of FilterRenderer and a render method, how about FilterTransformer and a transform method? We could call it something else if we want, but that's the easiest.

All of this is follow-up-PR territory, though, so let's get this merged (with whatever minor tweaks you see fit to make) and unblock things.

Comment on lines +752 to +748
def convert_to_dimension_spec(parameter_set: DimensionCallParameterSet) -> DimensionSpec: # noqa: D
return DimensionSpec(
element_name=parameter_set.dimension_reference.element_name,
entity_links=parameter_set.entity_path,
)


def convert_to_time_dimension_spec(parameter_set: TimeDimensionCallParameterSet) -> TimeDimensionSpec: # noqa: D
return TimeDimensionSpec(
element_name=parameter_set.time_dimension_reference.element_name,
entity_links=parameter_set.entity_path,
time_granularity=parameter_set.time_granularity,
)


def convert_to_entity_spec(parameter_set: EntityCallParameterSet) -> EntitySpec: # noqa: D
return EntitySpec(
element_name=parameter_set.entity_reference.element_name,
entity_links=parameter_set.entity_path,
)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not that it matters much, but these are kind of hanging in space. It seems like they should be factory methods on the spec classes themselves, or attached to some kind of converter class (or module) dedicated to operating on these parameter sets.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, I'll just move them into the spec class.

Comment on lines +799 to +775
# Debating whether where_sql / bind_parameters belongs here. where_sql may become dialect specific if we introduce
# quoted identifiers later.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, this is basically doing rendering, so we'll want to move it out to the rendering layer at some point. If we ever add a method to do time-based filters on measures or something we'll have to do dialect-specific rendering. Let's leave it here for now just to get this rolling.

column_association_resolver: ColumnAssociationResolver,
bind_parameters: SqlBindParameters = SqlBindParameters(),
) -> ResolvedWhereFilter:
class _CallRenderer(FilterFunctionCallRenderer): # noqa: D
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Tell me you're a Java dev without telling me you're a Java dev.... 😹

This would look a lot less strange if the class was defined at module scope (preferably in the module with all the FilterRenderer logic). You can lead with double underscore if you want to tell people you really really don't want them using it, although if you do you probably should put a public interface in that module.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Let me see how best to organize that.

Comment on lines +817 to +797
def render_time_dimension_call( # noqa: D
self, time_dimension_call_parameter_set: TimeDimensionCallParameterSet
) -> str:
return column_association_resolver.resolve_time_dimension_spec(
convert_to_time_dimension_spec(time_dimension_call_parameter_set)
).column_name
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Argh, I guess we can't put these transformation methods on the parameter set itself because of the split repo, eh? That would've streamlined things.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, that's correct. We wanted to leave the ColumnAssociationResolver out of db_semantic_interfaces.

column_association_resolver=column_association_resolver,
)

assert resolved_where_filter.where_sql == "listing__created_at__month = '2020-01-01'"
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do we want to dunder the month on the end here? Is this because we'll do the dialect-specific DATE_TRUNC() internally and alias the output to the __month? I'm worried about that causing issues with optimizer passes, although I think right now it won't be a problem.

I suspect this is a question for later - right now the system works on that rendered dunder so this should match current behavior, right?

Comment on lines 88 to 89
identifiers: Sequence[Entity] = [],
measures: Sequence[Measure] = [],
dimensions: Sequence[Dimension] = [],
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Whoops

Comment on lines 158 to 166
# original_metric_constraint = WhereConstraintConverter.convert_to_spec_where_constraint(
# data_source_semantics=self._data_source_semantics,
# where_constraint=original_metric_obj.constraint,
# )
# combined_spec_constraint = (
# combined_spec_constraint.combine(original_metric_constraint)
# if combined_spec_constraint
# else original_metric_constraint
# )
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We should probably remove these if you haven't already.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oh yeah, will remove.

@@ -476,7 +483,7 @@ def _parse_and_validate_query(
order_by_specs=order_by_specs,
output_column_name_overrides=tuple(output_column_name_overrides),
time_range_constraint=time_constraint,
where_constraint=spec_where_constraint,
where_constraint=resolved_where_filter,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I can't tell if all of the changes here are correct. Clearly, _parse_and_validate_query is due for some tidying. But not now.

Comment on lines +151 to +162
return f"{{{{ dimension('{dimension_name}', entity_path={repr(entity_path)}) }}}}"

def render_entity_template(self, entity_name: str, entity_path: Sequence[str] = ()) -> str:
"""Similar to render_dimension_template() but for entities."""
return f"{{{{ entity('{entity_name}', entity_path={repr(entity_path)}) }}}}"

def render_time_dimension_template(
self, time_dimension_name: str, time_granularity: str, entity_path: Sequence[str] = ()
) -> str:
"""Similar to render_dimension_template() but for time dimensions."""
return (
f"{{{{ time_dimension('{time_dimension_name}', '{time_granularity}', entity_path={repr(entity_path)}) }}}}"
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yikes. Nothing for it..... but that's a mark in favor of not using jinja if we don't end up needing to do so.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Up for other ideas!

@@ -49,76 +36,3 @@ def convert_to_measure_spec(measure: Measure) -> MeasureSpec:
element_name=measure.name,
non_additive_dimension_spec=non_additive_dimension_spec,
)


class WhereConstraintConverter:
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Huzzah!

@plypaul
Copy link
Contributor Author

plypaul commented May 11, 2023

Yeah, the FilterRenderer class is a little odd right now. FilterTransformer is a possibility, but also thinking about having two separate classes FilterParser and FilterRenderer at the expense of some repetition. What do you think?

@plypaul
Copy link
Contributor Author

plypaul commented May 11, 2023

Actually, maybe FilterTransformer is sufficient.

@plypaul plypaul force-pushed the plypaul--06--templated-filters branch from d2e11ee to 761435b Compare May 11, 2023 07:26
@plypaul plypaul temporarily deployed to DW_INTEGRATION_TESTS May 11, 2023 07:27 — with GitHub Actions Inactive
@plypaul plypaul had a problem deploying to DW_INTEGRATION_TESTS May 11, 2023 07:27 — with GitHub Actions Failure
@plypaul plypaul temporarily deployed to DW_INTEGRATION_TESTS May 11, 2023 07:27 — with GitHub Actions Inactive
@plypaul plypaul temporarily deployed to DW_INTEGRATION_TESTS May 11, 2023 07:27 — with GitHub Actions Inactive
@plypaul plypaul temporarily deployed to DW_INTEGRATION_TESTS May 11, 2023 07:27 — with GitHub Actions Inactive
@plypaul plypaul temporarily deployed to DW_INTEGRATION_TESTS May 11, 2023 08:01 — with GitHub Actions Inactive
@plypaul plypaul temporarily deployed to DW_INTEGRATION_TESTS May 11, 2023 08:01 — with GitHub Actions Inactive
@plypaul plypaul had a problem deploying to DW_INTEGRATION_TESTS May 11, 2023 08:01 — with GitHub Actions Failure
@plypaul plypaul temporarily deployed to DW_INTEGRATION_TESTS May 11, 2023 08:01 — with GitHub Actions Inactive
@plypaul plypaul temporarily deployed to DW_INTEGRATION_TESTS May 11, 2023 08:01 — with GitHub Actions Inactive
@plypaul plypaul temporarily deployed to DW_INTEGRATION_TESTS May 11, 2023 08:24 — with GitHub Actions Inactive
@plypaul plypaul temporarily deployed to DW_INTEGRATION_TESTS May 11, 2023 08:24 — with GitHub Actions Inactive
@plypaul plypaul temporarily deployed to DW_INTEGRATION_TESTS May 11, 2023 08:24 — with GitHub Actions Inactive
@plypaul plypaul temporarily deployed to DW_INTEGRATION_TESTS May 11, 2023 08:24 — with GitHub Actions Inactive
@plypaul plypaul temporarily deployed to DW_INTEGRATION_TESTS May 11, 2023 08:24 — with GitHub Actions Inactive
@plypaul plypaul merged commit b73c5fa into main May 11, 2023
@plypaul plypaul deleted the plypaul--06--templated-filters branch May 11, 2023 09:09
raise ValueError(f"Expected input to be of type string, but got type {type(input)} with value: {input}")

@property
def call_parameter_sets(self) -> FilterCallParameterSets:
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@QMalcolm - FYI this was the original implementation.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants