
add native split_part to the cross db utils #299

Closed · dave-connors-3 opened this issue Mar 20, 2023 · 2 comments
Labels: enhancement (New feature or request), Stale

@dave-connors-3 (Contributor)

Describe the feature

Right now, dbt offers native support for many cross-db macros, including split_part. The implementation in dbt-spark is fairly complex, given that there is no native support for split_part in Spark. It gets especially complex for negative arguments to the function, relying on some heavy string parsing to translate a negative argument into the appropriate index in the array. Given that Databricks natively supports this function, including negative part_numbers, it may be a good idea to implement it here to avoid that complex logic.
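
For illustration (a hypothetical example, not taken from this issue; expected output per Databricks SQL's documented behavior), the built-in function already counts from the end of the string when the part number is negative:

    -- assumes current Databricks SQL semantics for split_part
    SELECT split_part('11.12.13', '.', 3);   -- '13'
    SELECT split_part('11.12.13', '.', -1);  -- '13' (counts from the end)
    SELECT split_part('11.12.13', '.', -3);  -- '11'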

Describe alternatives you've considered

Do nothing, inherit from dbt-spark

Additional context

Please include any other relevant context here.

Who will this benefit?

package maintainers like me!

Are you interested in contributing this feature?

For sure

dave-connors-3 added the enhancement (New feature or request) label on Mar 20, 2023
@dbeatty10 (Contributor)

dbt-labs/dbt-spark #689 added support for a negative part_number argument to dbt-spark. As @dave-connors-3 mentioned, dbt-databricks will just inherit this logic, so it is not strictly necessary to override it.

But providing a native implementation might be as simple as copy-pasting the logic from the default implementation:

dbt/include/databricks/macros/utils/split_part.sql

{% macro databricks__split_part(string_text, delimiter_text, part_number) %}

    split_part(
        {{ string_text }},
        {{ delimiter_text }},
        {{ part_number }}
        )

{%- endmacro %}

Or just inherit the default from dbt-core:

{% macro databricks__split_part(string_text, delimiter_text, part_number) %}

    {{ dbt.default__split_part(string_text, delimiter_text, part_number) }}

{%- endmacro %}
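
Either way, models keep calling the cross-db macro and dbt dispatches to the databricks__ implementation when one exists. A hypothetical usage (the model, column, and ref() names below are made up for illustration):

    -- hypothetical model using the cross-db macro
    select
        {{ dbt.split_part(string_text="email", delimiter_text="'@'", part_number=2) }} as email_domain
    from {{ ref('users') }}

With the native version above, that call should compile to something like split_part(email, '@', 2) on Databricks.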

⚠️ Caveat

I didn't check one way or the other whether the semantics of split_part in Databricks would necessitate bifurcated logic like the dbt-spark implementation.

This inherited test in dbt-databricks includes a negative test case as of dbt-core v1.6, so it should catch cases where any new implementation is off.


github-actions bot commented Jan 8, 2024

This issue has been marked as Stale because it has been open for 180 days with no activity. If you would like the issue to remain open, please remove the stale label or comment on the issue.

github-actions bot added the Stale label on Jan 8, 2024
benc-db closed this as not planned (Won't fix, can't repro, duplicate, stale) on Jan 12, 2024