Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[dagster-dbt] rework dagster dbt logging, cleanup #8681

Merged
merged 3 commits into from
Jun 30, 2022

Conversation

OwenKephart
Copy link
Contributor

@OwenKephart OwenKephart commented Jun 29, 2022

Summary & Motivation

resolves: #8601 and unticked requests for the ability to turn off --log-format json.

This is a whole rabbit hole, but essentially the way we were handling the output of the dbt cli command was a bit confusing and somewhat incorrect. This corrects some issues and refactors a bit of code around that area.

First, this resolves a longstanding issue where, if your dbt command had an error, the error message would not show human-readable information (as we were not correctly parsing the json lines). The below screenshot is taken after the changes. Prior to the changes, the parsed output (the thing that you can actually read) would be empty, and the main visible metadata would be the bottom thing (which is pretty unreadable).

Screen Shot 2022-06-29 at 3 49 45 PM

Next, this allows the user to tell Dagster not to insert the "--log-format json" flag when invoking dbt. In an ideal world, there'd be no reason to turn this off, because there should be no material impact on execution when using this flag, and we should be able to parse out json-formatted log lines and display them in dagit better than just reading raw text.

However, it turns out that due to a bug in the dbt-bigquery adapter, this log format flag can actually cause dbt to produce unhelpful errors in cases where it should emit helpful ones: dbt-labs/dbt-bigquery#206

I submitted a PR to correct this behavior on dbt's side, but it's possible that there are other places where this could crop up (and it's unclear exactly when my PR would get merged in), so it's now a config option. I tried to keep the dagit experience relatively consistent when this option is turned off, but some sacrifices had to be made. For example, we can't tell what log level each message was originally at, so all logs are set to the INFO level when you do this. There's also a dbt-produced timestamp at the start of each of these lines, which I decided not to attempt to remove:

Screen Shot 2022-06-29 at 3 59 18 PM

Finally, this PR adds a new target argument to load_assets_from_dbt_manifest. This is due to an unfortunate git mishap, but it's a tiny low risk change so it's here now.

How I Tested These Changes

unit tests, dagit workflow

@vercel
Copy link

vercel bot commented Jun 29, 2022

The latest updates on your projects. Learn more about Vercel for Git ↗︎

3 Ignored Deployments
Name Status Preview Updated
dagit-storybook ⬜️ Ignored (Inspect) Jun 30, 2022 at 3:45PM (UTC)
dagster ⬜️ Ignored (Inspect) Jun 30, 2022 at 3:45PM (UTC)
dagster-oss-cloud-consolidated ⬜️ Ignored (Inspect) Jun 30, 2022 at 3:45PM (UTC)

@OwenKephart OwenKephart requested a review from sryza June 29, 2022 23:01
@OwenKephart OwenKephart changed the title [dagster-dbt] rework dagster dbt logging, cleanup, add target param [dagster-dbt] rework dagster dbt logging, cleanup Jun 30, 2022
@OwenKephart OwenKephart merged commit eeb23e6 into master Jun 30, 2022
@OwenKephart OwenKephart deleted the owen/dagster-dbt-json-log-bug branch June 30, 2022 16:23
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

silence (or enable silencing) dbt command outputs when loading models into assets
2 participants