Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

model versions #7287

Merged
merged 18 commits into from
Apr 12, 2023
Merged

model versions #7287

merged 18 commits into from
Apr 12, 2023

Conversation

MichelleArk
Copy link
Contributor

@MichelleArk MichelleArk commented Apr 6, 2023

resolves #7263

Description

  • Splits out ModelPatchParser and UnparsedModelUpdate for model-specific parsing. Also introduces VersionedTestBlock.
  • Returns a ParseResult instead of a list of test blocks from base parser parse method
  • Updates node.refs from List[List[str]] to List[RefArgs]

TODO

  • Handle disabled node lookup for versioned nodes
  • Partial parsing for versioned blocks
  • fully overridable UnparsedVersions tests
  • new integration + unit tests

Checklist

🎩

schema.yml:

models:
  - name: dim_customers
    latest_version: 2
    config:
      materialized: table
      test: test
    tests:
      - unique:
          column_name: "location_id"
    columns:
      - name: customer_id
        description: This is the primary key
        data_type: int
      - name: location_id
        description: id of location
        data_type: int
        tests: 
          - not_null
    versions:
      - v: 3
        columns: 
          - include: '*'
            exclude: ['location_id']
          - name: extra
        defined_in: arbitrary_file_name
        tests: []
      - v: 2
        config: 
          materialized: view
        columns:
          - include: '*'

models/dim_customers_v2.sql

select 1 as location_id

models/arbitrary_file_name.sql

SELECT 1 as a
UNION ALL 
SELECT null as a

models/test_version.sql

select * from {{ ref('dim_customers', v=3) }}
UNION ALL 
select * from {{ ref('dim_customers') }}

@cla-bot cla-bot bot added the cla:yes label Apr 6, 2023
@MichelleArk MichelleArk changed the title first pass - model versions model versions Apr 10, 2023
source_file.append_patch(
versioned_model_patch.yaml_key, versioned_model_node.unique_id
)
self.manifest.rebuild_ref_lookup()
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

TODO: look into whether this can be more granular and moved into the loop for a performance improvement

@@ -405,6 +427,11 @@ def patch(self, patch: "ParsedNodePatch"):
self.created_at = time.time()
self.description = patch.description
self.columns = patch.columns

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

consider moving this to parse_patch code on PatchParser

@MichelleArk MichelleArk marked this pull request as ready for review April 11, 2023 16:12
@MichelleArk MichelleArk requested review from a team as code owners April 11, 2023 16:12
@MichelleArk MichelleArk requested review from nathaniel-may, emmyoop, gshank and peterallenwebb and removed request for nathaniel-may April 11, 2023 16:12
Copy link
Contributor

@gshank gshank left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks really good! I just had one question about "build_model_str"...

core/dbt/parser/generic_test_builders.py Outdated Show resolved Hide resolved
Copy link
Contributor

@jtcohen6 jtcohen6 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I found a small partial parsing bug, and you just fixed it.

That's ... really it! I've done my best to break this, and it's been holding up well. A handful of comments, none of them a blocker to merging.

Nice work :)

core/dbt/context/providers.py Show resolved Hide resolved
core/dbt/contracts/graph/unparsed.py Outdated Show resolved Hide resolved
core/dbt/contracts/graph/unparsed.py Show resolved Hide resolved
core/dbt/contracts/graph/unparsed.py Show resolved Hide resolved
core/dbt/node_types.py Show resolved Hide resolved
Comment on lines 432 to 433
# TODO: version, is_latest_version, and access are specific to ModelNodes, consider splitting out to ModelNode
if self.resource_type != NodeType.Model:
Copy link
Contributor

@jtcohen6 jtcohen6 Apr 11, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

When would we expect these validation warnings to come up in practice (if ever)? e.g. I've tried adding versions / version attributes to a seed patch, and I don't see any warnings

core/dbt/parser/schemas.py Outdated Show resolved Hide resolved
@MichelleArk
Copy link
Contributor Author

Added support for ref resolution for python models in this commit (thank you @jtcohen6!!). Updated adapter tests as well, and verified those work in Snowflake (BigQuery skips these tests, Spark CI is broken - but there is no adapter-specific code so this should work across all adapters if it works for Snowflake)

@MichelleArk
Copy link
Contributor Author

Latest change adds a latest_version attribute to the ModelNode, and switches is_latest_version to a property: 6e49ea5. Was considering this during spiking but couldn't find a reason to keep latest_version on ModelNode at that point. Keeping latest_version on the model directly makes a whole lot of sense to implement new version selector method (#7330), so I've added it in this PR.

@MichelleArk MichelleArk merged commit c7ebc89 into main Apr 12, 2023
@MichelleArk MichelleArk deleted the 6866/model-versions2 branch April 12, 2023 13:50
dbeatty10 added a commit to dbt-labs/docs.getdbt.com that referenced this pull request Apr 16, 2023
resolves #3194

[Page
preview](https://deploy-preview-3195--docs-getdbt-com.netlify.app/docs/collaborate/govern/model-versions#how-to-create-a-new-version-of-a-model)

## What are you changing in this pull request and why?

Update code examples for `defined_in` and `exclude` to align with the
implementation (and code examples) in
[dbt-labs/dbt-core#7287](dbt-labs/dbt-core#7287):

- e27c2da Example using `defined_in` property
- 9da3e00 Fix the example to `exclude` column(s)
- d0f5396 Use postgres-specific data types (since it is a neutral
database platform and its data types can be easily ported)

## Checklist

- [x] Review the [Content style
guide](https://github.com/dbt-labs/docs.getdbt.com/blob/current/contributing/content-style-guide.md)
and [About
versioning](https://github.com/dbt-labs/docs.getdbt.com/blob/current/contributing/single-sourcing-content.md#adding-a-new-version)
so my content adheres to these guidelines.
- [x] Add a checklist item for anything that needs to happen before this
PR is merged, such as "needs technical review" or "change base branch."
@MichelleArk MichelleArk mentioned this pull request Apr 20, 2023
6 tasks
Copy link
Contributor

@dbeatty10 dbeatty10 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I found some un-submitted review comments relating to changie 😅

time: 2023-04-06T10:10:19.794672-04:00
custom:
Author: michelleark
Issue: '#7263'
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Does Issue: '#7263' work?

Or does it need to be like the following?

Suggested change
Issue: '#7263'
Issue: "7263"

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

To close the loop, Issue: '#7263' didn't work:

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

Successfully merging this pull request may close these issues.

[CT-2352] Versioned Models - parsing and referencing
4 participants