!Expression <TransformKeys> transform_keys(value#160, lambdafunction(split(lambda k#254, code, 2)[0], lambda k#254, lambda v#255, false)) cannot run on GPU because LAST_WIN is not supported for config setting spark.sql.mapKeyDedupPolicy
andygrove changed the title from "[FEA] Support spark.sql.mapKeyDedupPolicy=LAST_WIN" to "[FEA] Support spark.sql.mapKeyDedupPolicy=LAST_WIN for TransformKeys" on Apr 28, 2022
I updated the title to make it clear that this issue asks for LAST_WIN support in transform_keys. We already support LAST_WIN for create_map, so we should have everything we need from cuDF.
For create_map we call dropListDuplicatesWithKeysValues to remove duplicates, and this already has the same semantics as LAST_WIN:
// Apache Spark desires to keep the last duplicate element.
auto [out_keys, out_vals] =
cudf::lists::drop_list_duplicates(keys, vals, cudf::duplicate_keep_option::KEEP_LAST);
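To make the keep-last semantics concrete, here is a minimal host-side sketch in plain C++. It is not the cuDF or plugin implementation: `Entry` and `dedup_last_win` are hypothetical names used only for illustration, and keeping each key at the position of its first occurrence is an assumption of this sketch rather than something the snippet above guarantees.

```cpp
#include <cstddef>
#include <string>
#include <unordered_map>
#include <utility>
#include <vector>

// One map entry: (key, value). Hypothetical type for this sketch.
using Entry = std::pair<std::string, int>;

// Hypothetical helper: removes duplicate keys, keeping the value of the
// LAST occurrence of each key. The key stays at the position where it was
// first seen (an illustrative choice, not a documented guarantee).
std::vector<Entry> dedup_last_win(const std::vector<Entry>& entries) {
  std::vector<Entry> out;
  std::unordered_map<std::string, std::size_t> index_of;  // key -> slot in out
  for (const auto& [key, value] : entries) {
    auto it = index_of.find(key);
    if (it == index_of.end()) {
      // First time this key is seen: append a new entry.
      index_of.emplace(key, out.size());
      out.emplace_back(key, value);
    } else {
      // Duplicate key: the later value overwrites the earlier one.
      out[it->second].second = value;
    }
  }
  return out;
}
```

For transform_keys the duplicates arise after the lambda runs: two distinct input keys can map to the same output key, and under LAST_WIN the later entry's value is the one that survives.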
I would like us to support spark.sql.mapKeyDedupPolicy=LAST_WIN for deduplicating map keys.
For example:
Not-supported-messages: