truncate table seed to avoid append logic in unmanaged schemas #917

Closed
wants to merge 13 commits into from
7 changes: 7 additions & 0 deletions .changes/unreleased/Fixes-20231013-120628.yaml
@@ -0,0 +1,7 @@
kind: Fixes
body: Overwrite existing rows on existing seed tables. For unmanaged databases (no location specified), the current seed command in
dbt-spark appends to existing seeded tables instead of overwriting them.
time: 2023-10-13T12:06:28.078483-06:00
custom:
Author: mv1742
Issue: "112"
7 changes: 7 additions & 0 deletions dbt/include/spark/macros/adapters.sql
@@ -342,6 +342,13 @@
{%- endcall %}
{% endmacro %}


{% macro spark__truncate_relation(relation) -%}
{% call statement('truncate_relation', auto_begin=False) -%}
truncate {{ relation.type }} if exists {{ relation }}
{%- endcall %}
{% endmacro %}

{% macro spark__drop_relation(relation) -%}
{% call statement('drop_relation', auto_begin=False) -%}
drop {{ relation.type }} if exists {{ relation }}
@@ -66,6 +66,10 @@
re: python models and temporary views.

Also, why do neither drop_relation or adapter.drop_relation work here?!
'unmanaged' tables in Spark require manually deleting the database;
otherwise the drop statement does not delete the underlying data.
TODO: add a warning that this feature does not work for unmanaged tables.
Managed tables are fine.
--#}
{% call statement('drop_relation') -%}
drop table if exists {{ tmp_relation }}
5 changes: 4 additions & 1 deletion dbt/include/spark/macros/materializations/seed.sql
@@ -5,7 +5,10 @@

{% macro spark__reset_csv_table(model, full_refresh, old_relation, agate_table) %}
{% if old_relation %}
{{ adapter.truncate_relation(old_relation) }}
{{ adapter.drop_relation(old_relation) }}

{{ return(sql) }}
{% endif %}
{% set sql = create_csv_table(model, agate_table) %}
{{ return(sql) }}
@@ -27,7 +30,7 @@
{% endfor %}

{% set sql %}
- insert into {{ this.render() }} values
+ insert {% if loop.index0 == 0 -%} overwrite {% else -%} into {% endif -%} {{ this.render() }} values
{% for row in chunk -%}
({%- for col_name in agate_table.column_names -%}
{%- set inferred_type = adapter.convert_type(agate_table, loop.index0) -%}
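The `loop.index0 == 0` check in the last hunk appears intended to make only the first chunk of seed rows replace the table contents, with every later chunk appended. The following is a minimal Python sketch of that idea, not the adapter's actual code: `chunked`, `seed_statements`, the table name, and the naive `repr`-based literal rendering are all hypothetical, and it assumes `loop` in the macro refers to the enclosing per-chunk loop, which the hunk does not show.

```python
from typing import Iterator, List, Sequence, Tuple

Row = Tuple[object, ...]


def chunked(rows: Sequence[Row], size: int) -> Iterator[Sequence[Row]]:
    """Yield consecutive slices of at most `size` rows."""
    for start in range(0, len(rows), size):
        yield rows[start:start + size]


def seed_statements(table: str, rows: Sequence[Row], batch_size: int) -> List[str]:
    """Build one INSERT statement per chunk.

    Only the first chunk uses INSERT OVERWRITE, so rows from a previous
    seed run are replaced; later chunks use INSERT INTO so they append
    to what the first chunk wrote.
    """
    statements = []
    for index, chunk in enumerate(chunked(rows, batch_size)):
        verb = "insert overwrite" if index == 0 else "insert into"
        # Illustrative literal rendering only; not safe for real SQL.
        values = ", ".join(
            "(" + ", ".join(repr(value) for value in row) + ")" for row in chunk
        )
        statements.append(f"{verb} {table} values {values}")
    return statements


if __name__ == "__main__":
    # Hypothetical seed data: three rows, two rows per batch.
    for stmt in seed_statements("my_schema.my_seed", [(1, "a"), (2, "b"), (3, "c")], 2):
        print(stmt)
```

With this scheme a re-run of the seed starts from the first chunk again, so the overwrite happens once per run rather than once per chunk.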