Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[Bug] status and rowcount not reported beginning with version 1.5.0 #785

Open
2 tasks done
soroushmalek67 opened this issue Apr 24, 2024 · 1 comment
Open
2 tasks done
Labels
bug Something isn't working

Comments

@soroushmalek67
Copy link

soroushmalek67 commented Apr 24, 2024

Is this a new bug in dbt-redshift?

  • I believe this is a new bug in dbt-redshift
  • I have searched the existing issues, and I could not find an existing issue for this bug

Current Behavior

Hello Everyone!
After updating to dbt 1.7 from 1.3, all of a sudden, for our jobs on our models (incremental, snapshots, tables) and for both dbt run and dbt build, dbt only returns this syntax: SUCCESS in XX.XXs, once the run is done for the model.

It used to be like this before: INSERT 0 14330 in 16.87s (first snapshot), which was very useful, especially for our incremental/snapshot models, as we wanted to check how many rows had been upserted in them.

Now, we don't see anything for our models in that way (second snapshot). note that we're using Registered adapter: redshift=1.7.5.

excerpts from dbt Labs response in community Slack thread in #dbt-cloud-cli-and-ide:

the message there is determined by the adapter_response field in the run_results.json object, and the message is determined entirely by whatever the adapter chooses to send back when you perform a given query (i.e the message for CREATE TABLE may be different in postgres vs snowflake!)

I believe this is a side effect of dbt-redshift changing how it interacts with Redshift, it switched to redshift-connector from psycopg2 in 1.5? Don't remember the exact adapter version.
The purpose of the change was ongoing compatibility as Redshift changes to a more Snowflake-y model (decoupling storage from compute, taking advantage of ra3 nodes, etc) and was driven by the AWS engineers who contribute to the adapter.
All this to say I think your best bet is to open an issue against dbt-redshift and ask for that metadata to be restored, but I assume it is not currently possible (otherwise it likely would have been preserved in the migration) and may require enhancements in an upstream project like redshift-connector itself.

Screenshot 2024-04-19 at 8 39 04 AM (1)
Screenshot 2024-04-19 at 8 39 42 AM (1)

Expected Behavior

To simply show how many rows were inserted in the incremental and snapshot jobs.

Steps To Reproduce

Based on dbt lab recommendation for troubleshooting:

  • The metadata should be restored, which might not be currently possible.
  • It may require enhancements in an upstream project, like redshift-connector itself

Relevant log output

No response

Environment

- OS: Mac
- Python: 3.9.18
- dbt-core: 1.7.13
- dbt-redshift: 1.7.7
@soroushmalek67 soroushmalek67 added bug Something isn't working triage labels Apr 24, 2024
@soroushmalek67 soroushmalek67 changed the title [Bug] <title> [Bug] <Restoring metadata in dbt-redshift adaptor 1.7.5> Apr 24, 2024
@dataders dataders changed the title [Bug] <Restoring metadata in dbt-redshift adaptor 1.7.5> [Bug] status and rowcount not reported beginning with version 1.5.0 Apr 25, 2024
@dataders
Copy link
Contributor

dataders commented Apr 25, 2024

@soroushmalek67 thanks for reporting this! This is an on-going regression due to a limitation of the redshift-connector Python driver. I've opened aws/amazon-redshift-python-driver#220 to request that SQL State be provided as a Cursor attribute.

Indeed this change began in #251, which was released initially in dbt-redshift 1.5.0. As @dave-connors-3 pointed out the status and rows modified comes from the driver's Cursor. there's a note on RedshiftConnectionManager.get_response()

redshift_connector.Cursor doesn't have a status message attribute but [because] this function is only used for successful run[s], so we can just return a dummy "SUCCESS"

def get_response(cls, cursor: redshift_connector.Cursor) -> AdapterResponse:
# redshift_connector.Cursor doesn't have a status message attribute but
# this function is only used for successful run, so we can just return a dummy
rows = cursor.rowcount
message = "SUCCESS"
return AdapterResponse(_message=message, rows_affected=rows)

Other data platforms similarly do not provide this information, accordingly the respective dbt adapters lack this information, namely:

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug Something isn't working
Projects
None yet
Development

No branches or pull requests

3 participants