Expected result of nested struct in BigQuery #105

Merged
merged 3 commits into from
Dec 21, 2022

dbeatty10
Contributor

@dbeatty10 dbeatty10 commented Dec 20, 2022

resolves #98

This is a:

  • documentation update
  • bug fix with no breaking changes
  • new functionality
  • a breaking change

All pull requests from community contributors should target the main branch (default).

Checklist

  • This code is associated with an issue which has been triaged and accepted for development.
  • I have verified that these changes work locally
  • I have updated the README.md (if applicable)
  • I have added tests & descriptions to my models (and macros if applicable)
  • I have added an entry to CHANGELOG.md

@Zatte

Zatte commented Dec 20, 2022

Adding this as a note, as it is more of a feature/discussion than a bugfix.

I've seen that you can document nested columns in multiple ways.

One way is:

models:
  - name: model_struct
    description: ""
    columns:
      - name: analytics
      - name: analytics.source
      - name: analytics.medium
      - name: analytics.source_medium

But it is also possible to write the following (note the nested `columns` key inside a column):

models:
  - name: model_struct
    columns:
      - name: analytics
        columns:
          - name: source
          - name: medium
          - name: source_medium

Since this macro favors one over the other, it would be nice to add a short note about this decision. I believe the two are identical, but I haven't looked into it enough to be sure.

A feature request could be to support both, or to deprecate one method upstream (dbt-core) to encourage consistency. At the very least, the official docs should state the canonical way to document nested fields (this might already be done, but my google-fu wasn't strong enough).

@dbeatty10 dbeatty10 marked this pull request as ready for review December 20, 2022 23:31
@dbeatty10
Contributor Author

> I've seen that you can document nested columns in multiple ways.

Good point!

In my experiments, the two variants behaved differently when adding descriptions and then generating the documentation:

dbt docs generate
dbt docs serve

In my quick tests, your first example rendered the descriptions, but the second didn't.

Are you able to get it to include the descriptions when generating and serving the docs?

Here's the exact code I used:

models/model_struct.sql

select 
   struct(
      "a1" as source, 
      "b1" as medium, 
      "c1" as source_medium
   ) as analytics

Worked:
models/_models.yml

version: 2

models:
  - name: model_struct
    columns:
      - name: analytics
        description: "This is the name of the STRUCT"
      - name: analytics.source
        description: "This is the first attribute in the STRUCT"
      - name: analytics.medium
        description: "This is the second attribute in the STRUCT"
      - name: analytics.source_medium
        description: "This is the third attribute in the STRUCT"

Didn't work:
models/_models.yml

version: 2

models:
  - name: model_struct
    columns:
      - name: analytics
        description: "This is the name of the STRUCT"
        columns:
          - name: source
            description: "This is the first attribute in the STRUCT"
          - name: medium
            description: "This is the second attribute in the STRUCT"
          - name: source_medium
            description: "This is the third attribute in the STRUCT"
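
Since only the dotted-name variant rendered descriptions in the tests above, one option is to convert the nested form into the dotted form before committing it. The following is a minimal sketch of that conversion, not part of dbt or of the macro in this PR: `flatten_columns` is a hypothetical helper, and it assumes the `columns` list has already been parsed into plain Python dicts (for example by a YAML loader).

```python
from __future__ import annotations

from typing import Any


def flatten_columns(columns: list[dict[str, Any]], prefix: str = "") -> list[dict[str, Any]]:
    """Turn nested `columns` entries into flat entries with dotted names.

    Hypothetical helper for illustration only; dbt does not ship this function.
    """
    flat: list[dict[str, Any]] = []
    for column in columns:
        dotted_name = f"{prefix}{column['name']}"
        # Copy every key (description, etc.) except the nested `columns` list.
        entry = {key: value for key, value in column.items() if key != "columns"}
        entry["name"] = dotted_name
        flat.append(entry)
        # Recurse into child fields, prefixing them with the parent's dotted name.
        flat.extend(flatten_columns(column.get("columns", []), prefix=f"{dotted_name}."))
    return flat


# The "Didn't work" layout from this thread, as already-parsed data.
nested = [
    {
        "name": "analytics",
        "description": "This is the name of the STRUCT",
        "columns": [
            {"name": "source", "description": "This is the first attribute in the STRUCT"},
            {"name": "medium", "description": "This is the second attribute in the STRUCT"},
            {"name": "source_medium", "description": "This is the third attribute in the STRUCT"},
        ],
    }
]

flat = flatten_columns(nested)
for entry in flat:
    print(f"{entry['name']}: {entry['description']}")
```

The result has one entry per field, named `analytics`, `analytics.source`, `analytics.medium`, and `analytics.source_medium`, which matches the layout of the "Worked" file.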

@dbeatty10 dbeatty10 merged commit 46586ea into main Dec 21, 2022
@Zatte

Zatte commented Dec 21, 2022

> Are you able to get it to include the descriptions when generating and serving the docs?

No, you are correct: the current macro approach works better, and the two are not the same! Thanks for clarifying, and for the great velocity on this issue/PR 💯

@dbeatty10
Contributor Author

Thanks for reporting this, and for all your detailed information, @Zatte! It wouldn't have happened without you 🏅

jeremyholtzman pushed a commit that referenced this pull request Apr 10, 2023
* Expected result of nested struct in BigQuery

* Restore the intended rendering of nested `STRUCT` fields in BigQuery

* Restore the intended rendering of nested `STRUCT` fields in BigQuery
@gwenwindflower gwenwindflower deleted the dbeatty/bigquery-nested-struct branch February 28, 2024 23:01
Development

Successfully merging this pull request may close these issues.

Alias nested sub-structs for bigquery
2 participants