persist view column comments #866

jurasan · 2023-08-09T20:56:55Z

resolves #372

Problem

View column descriptions are not persisted in the view schema.

Solution

View column descriptions are persisted as column-level comments.
I used this dbt-snowflake commit as an example.

Checklist

I have read the contributing guide and understand what's expected of me
I have run this code in development and it appears to resolve the stated issue
This PR includes tests, or tests are not required/relevant for this PR
This PR has no interface changes (e.g. macros, cli, logs, json artifacts, config files, adapter interface, etc) or this PR has already received feedback and approval from Product or DX

I was not able to run tests for this change. Here is the PR with the test, but inside the tests it cannot call dbt-core macros:

@internalcode
    def _fail_with_undefined_error(
        self, *args: t.Any, **kwargs: t.Any
    ) -> "te.NoReturn":
        """Raise an :exc:`UndefinedError` when operations are performed
        on the undefined value.
        """
>       raise self._undefined_exception(self._undefined_message)
E       jinja2.exceptions.UndefinedError: 'get_columns_in_query' is undefined

../../miniconda3/lib/python3.10/site-packages/jinja2/runtime.py:852: UndefinedError

cla-bot · 2023-08-09T20:56:59Z

Thanks for your pull request, and welcome to our community! We require contributors to sign our Contributor License Agreement and we don't seem to have your signature on file. Check out this article for more information on why we have a CLA.

In order for us to review and merge your code, please submit the Individual Contributor License Agreement form attached above. If you have questions about the CLA, or if you believe you've received this message in error, please reach out through a comment on this PR.

CLA has not been signed by users: @jurasan

jurasan · 2023-08-10T07:30:40Z

@cla-bot check:

cla-bot · 2023-08-10T07:30:46Z

The cla-bot has been summoned, and re-checked this pull request!

colin-rogers-dbt · 2023-08-10T20:19:09Z

@jurasan can you add a changelog summarizing this change?

mikealfare · 2023-08-10T20:39:09Z

@jurasan The failing code check is for whitespace in dbt/include/spark/macros/adapters.sql. You can either fix it manually or run pre-commit run --all-files to have it fix it for you.

Also, I don't understand why the test in your linked PR wouldn't be able to run in this PR. We regularly call dbt-core macros. Could you talk more about that?

Fleid · 2023-08-10T23:38:02Z

@JCZuurmond for information only

jurasan · 2023-08-14T10:27:23Z

@mikealfare

> @jurasan The failing code check is for whitespace in dbt/include/spark/macros/adapters.sql. You can either fix it manually or run pre-commit run --all-files to have it fix it for you.

Fixed this.

> Also, I don't understand why the test in your linked PR wouldn't be able to run in this PR. We regularly call dbt-core macros. Could you talk more about that?

Yes, we do call dbt-core macros, but if you look at test_macros.py, I don't think those are tested.
Maybe it's because of the way those macros are called in the __run_macro method?

dbt/include/spark/macros/adapters.sql

colin-rogers-dbt · 2023-08-17T22:10:54Z

Set this to auto-merge, looks like all tests are passing. @jurasan can you update your branch with the latest from main?

JCZuurmond

@jurasan : Thank you for your contribution.

If I understand correctly, the code does roughly the following:
If the persist_docs.columns flag is set to true, then comments are added to the view's columns by matching the columns output by the view query against the columns in the corresponding model definition (yml file).

I have some doubts about executing the SQL when creating a view. A benefit of views is that they are computationally cheap to create. To me, it would be an unexpected side effect if setting persist_docs.columns to true caused the view SQL to be executed when creating a view.
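For readers following along, the matching described above can be sketched in plain Python. This is only an illustration, not the macro code from this PR: `view_column_list` and its arguments are hypothetical names, the model columns are assumed to be a dict keyed by column name with a `description` entry, and names are looked up exactly as written.

```python
from typing import Dict, List


def view_column_list(query_columns: List[str], model_columns: Dict[str, dict]) -> str:
    """Build the parenthesized column list for a `create or replace view` statement.

    Hypothetical helper that mirrors the matching described in the review,
    not the adapter's actual macro.
    """
    parts = []
    for name in query_columns:
        # Exact-name lookup only: a column documented as "id" does not match "ID".
        description = model_columns.get(name, {}).get("description")
        if description:
            # Escape single quotes so the description is a valid SQL string literal.
            escaped = description.replace("'", "\\'")
            parts.append(f"{name} comment '{escaped}'")
        else:
            # Undocumented columns are listed without a comment clause.
            parts.append(name)
    return "(" + ", ".join(parts) + ")"


model_columns = {"id": {"description": "Primary key"}, "name": {"description": ""}}
print(view_column_list(["id", "name", "ID"], model_columns))
```

Only `id` gets a comment clause here: `name` has an empty description, and `ID` has no entry in the model definition under that exact spelling.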

JCZuurmond · 2023-08-27T19:04:40Z

dbt/include/spark/macros/adapters.sql


 {% macro spark__create_view_as(relation, sql) -%}
  create or replace view {{ relation }}
+  {% if config.persist_column_docs() -%}
+    {% set model_columns = model.columns %}


Where are the model columns coming from?

JCZuurmond · 2023-08-27T19:06:03Z

dbt/include/spark/macros/adapters.sql


 {% macro spark__create_view_as(relation, sql) -%}
  create or replace view {{ relation }}
+  {% if config.persist_column_docs() -%}
+    {% set model_columns = model.columns %}
+    {% set query_columns = get_columns_in_query(sql) %}


Does this mean that the SQL of the view is executed when creating the view with persist_docs.columns = true?

Issuing the get_columns_in_query query against one of the largest tables internally at Databricks returned in 148 ms. It is sufficiently performant.

Regardless of the time, does it execute the SQL to get the columns?

It executes a limit 0.
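A small sketch of what "a limit 0" buys: the query is wrapped so the engine has to resolve its output columns but returns no rows. The example uses Python's built-in sqlite3 as a stand-in for Spark, and the wrapper text and the `columns_in_query` helper are assumptions made for illustration, not the exact SQL that dbt issues. How much work the engine does for such a probe is engine-dependent.

```python
import sqlite3
from typing import List


def columns_in_query(conn: sqlite3.Connection, sql: str) -> List[str]:
    """Return the output column names of `sql` without fetching any rows."""
    # Hypothetical probe: the always-false predicate and `limit 0` keep the result set empty.
    probe = f"select * from ({sql}) as probe_subquery where 1 = 0 limit 0"
    cursor = conn.execute(probe)
    # cursor.description carries one entry per output column, even for an empty result.
    return [column[0] for column in cursor.description]


conn = sqlite3.connect(":memory:")
conn.execute("create table people (id integer, name text)")
conn.execute("insert into people values (1, 'Ada')")

print(columns_in_query(conn, "select id, name as full_name from people"))
```

The column names come back from the statement's metadata, so the table contents never need to be read into the result.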

JCZuurmond · 2023-08-27T19:16:17Z

dbt/include/spark/macros/adapters.sql

@@ -229,9 +229,43 @@
  {% endfor %}
 {% endmacro %}

+{% macro get_matched_column(column_name, column_dict) %}


Spark columns are case sensitive, I would not do the upper/lower matching

benc-db · 2023-09-19T22:18:09Z

dbt/include/spark/macros/adapters.sql

@@ -229,9 +229,29 @@
  {% endfor %}
 {% endmacro %}

+{% macro get_column_comment_sql(column_name, column_dict) -%}
+  {% if column_name in column_dict and column_dict[column_name]["description"] -%}
+    {% set column_comment_clause = "comment '" ~ column_dict[column_name]["description"] ~ "'" %}


Filter with

| replace("'", "\\'")

Is this what you mean? Could you TL;DR the filter's purpose for me?

Suggested change

{% set column_comment_clause = "comment '" ~ column_dict[column_name]["description"] ~ "'" %}

{% set column_comment_clause = "comment '" ~ column_dict[column_name]["description"] ~ "'" | replace("'", "\\'")%}

It's to escape single quotes so that they can be used in the comments.

You need it to operate only on the description; I think here you have it operating either on the whole string or on the final ' (I'm not the best at Jinja). In my local copy I moved the extraction and filtering to a separate 'set' statement.

{% set escaped_description = column_dict[column_name]["description"] | replace("'", "\\'") %}
{% set column_comment_clause = "comment '" ~ escaped_description ~ "'" %}
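The difference between the two placements can be reproduced in plain Python, where a method call binds more tightly than `+` in the same way a Jinja filter binds more tightly than `~`. This is a sketch for illustration only; both function names are hypothetical and neither is code from the adapter.

```python
def column_comment_clause(description: str) -> str:
    """Escape the description first, then wrap it in quotes."""
    escaped = description.replace("'", "\\'")
    return "comment '" + escaped + "'"


def misplaced_filter_clause(description: str) -> str:
    """Mirror the suggested change: the replace applies only to the closing quote."""
    return "comment '" + description + "'".replace("'", "\\'")


print(column_comment_clause("customer's id"))
print(misplaced_filter_clause("customer's id"))
```

The first call escapes the apostrophe inside the description and leaves the closing quote alone. The second leaves the apostrophe unescaped and escapes the closing quote instead, which would produce an unterminated string literal in the generated SQL.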

colin-rogers-dbt · 2023-09-26T20:13:13Z

resolved in #893

persist view column comments

82ee516

jurasan requested a review from a team as a code owner August 9, 2023 20:56

jurasan requested a review from McKnight-42 August 9, 2023 20:56

cla-bot bot added the cla:yes label Aug 10, 2023

jurasan mentioned this pull request Aug 10, 2023

[CT-764] Persist Column level comments when creating views #372

Closed

mikealfare added the backport 1.6.latest label Aug 10, 2023

format: whitespace

2bc4914

mikealfare reviewed Aug 14, 2023

mikealfare self-assigned this Aug 14, 2023

jurasan added 2 commits August 17, 2023 12:08

extracted get_matched_column macro

2b8fef6

move parenthesis to the calling macro

a15cada

jurasan requested a review from mikealfare August 17, 2023 12:26

changelog

9c5138f

jurasan closed this Aug 17, 2023

jurasan reopened this Aug 17, 2023

colin-rogers-dbt enabled auto-merge (squash) August 17, 2023 22:04

Merge branch 'main' into persist_view_column_comments

5658e5b

mikealfare added the ok to test label Aug 21, 2023

JCZuurmond reviewed Aug 27, 2023

colin-rogers-dbt disabled auto-merge August 31, 2023 23:38

jurasan added 2 commits September 8, 2023 16:32

fix: remove matching column in different case

50e602e

fix: remove get_matched_column macro - not much logic left there.

1ea4178

Merge branch 'main' into persist_view_column_comments

9916483

jurasan requested a review from JCZuurmond September 8, 2023 17:29

Merge branch 'main' into persist_view_column_comments

f8ff20a

benc-db reviewed Sep 19, 2023

benc-db mentioned this pull request Sep 21, 2023

Added comment_clause adapter to support persist_docs databricks/dbt-databricks#353

Closed

colin-rogers-dbt mentioned this pull request Sep 25, 2023

persist view column comments #893

Merged

colin-rogers-dbt closed this Sep 26, 2023
