[macro] [false positives] environment-aware logic in config should not cause resources to Always selected by state:modified #9564

graciegoheen · 2024-02-13T02:49:57Z

Is this your first time submitting a feature request?

I have read the expectations for open source contributors
I have searched the existing issues, and I could not find an existing issue for this feature
I am requesting a straightforward extension of existing dbt functionality, rather than a Big Idea better suited to a discussion

Describe the feature

Many folks want to have logic in their dbt project that purposely differs by environment.

For example, let's say I want to materialize a model as a view in dev but as a table in prod.

To accomplish this, I use a macro in my config block:

{{
    config(
        materialized = set_materialized_config()
    )
}}

with the following macro:

{% macro set_materialized_config() %}
  {% if target.name == 'prod' %}
    {% set mat = 'table' %}
  {% else %}
    {% set mat = 'view' %}
  {% endif %}
  
  {{ return(mat) }}
{% endmacro %}

When I run with --select state:modified and compare my dev environment to a manifest from prod, this model will ALWAYS be marked as modified. This is because the rendered materialized config differs: in prod it's table, in dev it's view.

Instead, if I've changed nothing about this model OR macro, it shouldn't be marked as modified because the macro logic hasn't changed.

If I've changed the model OR the macro, it should be marked as modified.
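The false positive can be reproduced with a toy model of the comparison. The sketch below is plain Python, not dbt's implementation: `set_materialized_config` mirrors the macro above, while `rendered_config` and `unrendered_config` are hypothetical helpers that contrast comparing the resolved values with comparing the config as it was written.

```python
def set_materialized_config(target_name: str) -> str:
    """Python stand-in for the Jinja macro shown above."""
    return "table" if target_name == "prod" else "view"


def rendered_config(target_name: str) -> dict:
    """The config after the macro call has been resolved for a target."""
    return {"materialized": set_materialized_config(target_name)}


def unrendered_config() -> dict:
    """The config exactly as written in the model file (hypothetical form)."""
    return {"materialized": "set_materialized_config()"}


# Comparing rendered values: prod vs. dev always differ, even though
# neither the model nor the macro has changed.
false_positive = rendered_config("prod") != rendered_config("dev")

# Comparing unrendered values: identical unless the model's config changes.
unrendered_changed = unrendered_config() != unrendered_config()

print(false_positive, unrendered_changed)
```

Comparing the unrendered value covers the "model changed" case; detecting a change to the macro body itself would need a separate check, which this sketch does not attempt.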

Describe alternatives you've considered

A current workaround is to pull the raw Jinja out of my macro and put all environment-based logic directly into my dbt_project.yml file (configs in dbt_project.yml can't be set to the output of a macro).

If, instead, I were to configure this model like so:

models:
  <path_to_my_model>:
    +materialized: "{{ 'table' if target.name == 'prod' else 'view' }}"

it would NOT be picked up when selecting --select state:modified.

But this is cumbersome (the code becomes less DRY), unexpected, and can lead to an unnecessarily massive dbt_project.yml file.

Who will this benefit?

Anyone who wants to use state:modified and has environment-based logic in their dbt project.

See internal use case:

Are you interested in contributing this feature?

No response

Anything else?

Potential solution #6170 (comment)

This is relevant for all resources that can be configured, not just models.


jtcohen6 · 2024-02-13T10:35:43Z

Historical context on why we've said that this is hard, at least in the past:

Use static analyzer to extract unrendered_config from config() #3680
Parse the dbt ast for config calls and evaluate them outside of the parsing context #2714

We'd need some way to statically extract + save the "unrendered" value of materialized, as set_materialized_config(). Right now, both macros are called + resolved in the same pass.
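As a rough illustration of what "statically extract" could mean, here is a minimal Python sketch that pulls the unrendered keyword arguments out of a `config()` call without rendering any Jinja. The regex, the `extract_unrendered_config` helper, and the sample model are all assumptions made for illustration; this is not how dbt parses models, and it only handles simple `key = value` pairs with no nested commas.

```python
import re

# Raw model file contents, as in the example from the issue description.
MODEL_SQL = """
{{
    config(
        materialized = set_materialized_config()
    )
}}

select 1 as id
"""

# Capture everything between `config(` and the `)` that closes the Jinja block.
CONFIG_CALL = re.compile(r"config\((.*?)\)\s*\}\}", re.DOTALL)


def extract_unrendered_config(raw_sql: str) -> dict[str, str]:
    """Return config kwargs as raw strings, without calling any macro."""
    match = CONFIG_CALL.search(raw_sql)
    if match is None:
        return {}
    config = {}
    # Naive split: breaks on values that themselves contain commas.
    for pair in match.group(1).split(","):
        key, sep, value = pair.partition("=")
        if sep:
            config[key.strip()] = value.strip()
    return config


unrendered = extract_unrendered_config(MODEL_SQL)
print(unrendered)
```

A real implementation would need a proper parser rather than a regex, which is part of why the linked issues describe this as hard.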

An alternative approach is to let users put macros in their yaml configs, such as:

# models/path/to/my_model.yml
models:
  - name: my_model
    config:
      materialized: "{{ set_materialized_config() }}"

This would make it easier for us to save "{{ set_materialized_config() }}" as the unrendered materialized config. It's also much DRYer for the end user, compared with copy-pasting the same Jinja if expression over and over. But it also risks being substantially trickier & slower to parse, which is the biggest reason why we haven't done it in the past.

If we do go down that route, we might want a different UX — a "snippet"? a "pure macro"? — to make clear that these macros can only be static input-output machines. They can reference vars + env vars + target values, but they cannot make introspective queries against the data warehouse. This category already includes the "special" generate_x_name macros (for database/schema/alias), which dbt fully resolves at parse time instead of at runtime.
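To make the "static input-output machine" idea concrete, here is a small Python sketch. It assumes a hypothetical `resolve_at_parse_time` helper that hands a macro only the target values and nothing else; because no warehouse connection is passed in, the function cannot run introspective queries and always returns the same output for the same inputs.

```python
from typing import Callable, Mapping


def set_materialized_config(target: Mapping[str, str]) -> str:
    """Pure: the result depends only on the target values passed in."""
    return "table" if target.get("name") == "prod" else "view"


def resolve_at_parse_time(
    macro: Callable[[Mapping[str, str]], str],
    target: Mapping[str, str],
) -> str:
    """Hypothetical parse-time resolver: the macro sees only the target."""
    return macro(target)


prod = resolve_at_parse_time(set_materialized_config, {"name": "prod"})
dev = resolve_at_parse_time(set_materialized_config, {"name": "dev"})
print(prod, dev)
```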

graciegoheen · 2024-04-04T17:24:05Z

Related to #3277

sasawatc · 2024-07-03T05:00:07Z

I have a use case similar to the one @graciegoheen mentioned: dynamically setting my Snowflake warehouse based on my environment (assuming from the title, as I don't have access). Personally, I find that being able to specify all model-specific config overrides in the .sql file itself makes that model's override settings more intuitive.

marius-sb1 · 2024-07-04T13:58:59Z

If we do go down that route, we might want a different UX — a "snippet"? a "pure macro"? — to make clear that these macros can only be static input-output machines. They can reference vars + env vars + target values, but they cannot make introspective queries against the data warehouse. This category already includes the "special" generate_x_name macros (for database/schema/alias), which dbt fully resolves at parse time instead of at runtime.

This might be a different issue, but it could also be related: the quoted paragraph touches on the generate_x_name macros, which are definitely related to my problem. We have a setup where we override generate_database_name to separate environments and projects into separate databases (it uses target.name from a dynamically generated profiles.yml plus the database/custom database).

While we have the exact same database config across our environments, it seems that only the output of generate_database_name is stored in the manifest, so all our models are marked as modified when comparing dev to prod. The use case here is slim CI.

graciegoheen added enhancement New feature or request triage labels Feb 13, 2024

graciegoheen mentioned this issue Feb 13, 2024

[Epic] state:modified should Actually (only) select the modified resources #9562

Open

graciegoheen removed the triage label Feb 13, 2024

graciegoheen changed the title ~~[macro] environment-aware logic in config should not cause resources to Always selected by state:modified~~ [macro] [false positives] environment-aware logic in config should not cause resources to Always selected by state:modified Feb 14, 2024

graciegoheen added state Stateful selection (state:modified, defer) state: modified labels Feb 14, 2024

MichelleArk mentioned this issue Sep 25, 2024

[state:modified] persist unrendered_config from schema.yml, and more reliably compute unrendered_config from .sql files #10487

Merged

5 tasks

MichelleArk closed this as completed in #10487 Sep 26, 2024

dbeatty10 mentioned this issue Oct 3, 2024

[Core] state_modified_compare_more_unrendered_values behavior change flag dbt-labs/docs.getdbt.com#6185

Closed
