StreamingDeduplicationStrategy Execution Planning Strategy for Deduplicate Logical Operator

StreamingDeduplicationStrategy is an execution planning strategy (i.e. Strategy) that IncrementalExecution uses to plan Deduplicate logical operators in streaming Datasets.

Note

The Deduplicate logical operator is what the Dataset.dropDuplicates operator creates.
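The relationship between dropDuplicates and Deduplicate can be checked in spark-shell. The following is a minimal sketch, assuming Spark 2.2 or later on the classpath and the built-in rate source as a stand-in for any streaming source; the names rates, deduped, isDeduplicate and isStreaming are illustrative only.

```scala
import org.apache.spark.sql.SparkSession
import org.apache.spark.sql.catalyst.plans.logical.Deduplicate

// In spark-shell this returns the existing session
val spark = SparkSession.builder.master("local[*]").appName("dedup-demo").getOrCreate()

// Assumption: the "rate" source (columns: timestamp, value) stands in for a real streaming source
val rates = spark.readStream.format("rate").load

// dropDuplicates wraps the streaming plan in a Deduplicate logical operator
val deduped = rates.dropDuplicates("value")

// Inspect the (unanalyzed) logical plan of the resulting Dataset
val isDeduplicate = deduped.queryExecution.logical.isInstanceOf[Deduplicate]
val isStreaming = deduped.isStreaming
```

Both flags are expected to be true for a streaming Dataset; no streaming query has to be started for this check.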

StreamingDeduplicationStrategy is accessible through the query planner of SessionState.

spark.sessionState.planner.StreamingDeduplicationStrategy

StreamingDeduplicationStrategy resolves streaming Deduplicate unary logical operators to StreamingDeduplicateExec physical operators.
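The resolution rule above can be sketched as a pattern match over a plan tree. The following is a simplified model, not Spark's source code: every type in it is a hypothetical stand-in for the Catalyst class of the same name, keys are plain strings rather than attributes, and StreamingDeduplicationStrategyModel is an illustrative name. It assumes that, like other planning strategies, the strategy returns no candidate plans for operators it does not handle and defers planning of the child operator.

```scala
// Hypothetical stand-ins for Catalyst's logical operators (not Spark's classes)
sealed trait LogicalPlan { def isStreaming: Boolean }
case class Source(name: String, isStreaming: Boolean) extends LogicalPlan
case class Deduplicate(keys: Seq[String], child: LogicalPlan) extends LogicalPlan {
  // A unary operator is streaming when its child is
  def isStreaming: Boolean = child.isStreaming
}

// Hypothetical stand-ins for physical operators
sealed trait PhysicalPlan
// Placeholder for a child that is planned in a later step
case class PlanLater(plan: LogicalPlan) extends PhysicalPlan
case class StreamingDeduplicateExec(keys: Seq[String], child: PhysicalPlan) extends PhysicalPlan

object StreamingDeduplicationStrategyModel {
  def apply(plan: LogicalPlan): Seq[PhysicalPlan] = plan match {
    // Only a Deduplicate over a streaming child is handled
    case Deduplicate(keys, child) if child.isStreaming =>
      StreamingDeduplicateExec(keys, PlanLater(child)) :: Nil
    // Everything else is left to other strategies
    case _ => Nil
  }
}

val streaming = Deduplicate(Seq("value"), Source("rate", isStreaming = true))
val batch = Deduplicate(Seq("value"), Source("parquet", isStreaming = false))

val plannedStreaming = StreamingDeduplicationStrategyModel(streaming)
val plannedBatch = StreamingDeduplicationStrategyModel(batch)
```

In this model plannedStreaming holds a single StreamingDeduplicateExec whose child is still a PlanLater placeholder, while plannedBatch is empty because the batch Deduplicate is not the strategy's concern.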

FIXME