spark.sessionState.planner.StreamingDeduplicationStrategy
StreamingDeduplicationStrategy Execution Planning Strategy for Deduplicate Logical Operator
StreamingDeduplicationStrategy is an execution planning strategy that plans Deduplicate logical operators over streaming queries as StreamingDeduplicateExec physical operators.
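The effect of the strategy can be observed in the physical plan of a streaming query. The following is a minimal sketch, assuming a Spark release in which the built-in rate source is available and meant to be pasted into spark-shell or run as a script with Spark on the classpath; the names rates and uniqueRates and the application name are illustrative only. The exact plan text varies across Spark versions, so no output is shown.

```scala
import org.apache.spark.sql.SparkSession

// Local session for the demo (in spark-shell, getOrCreate reuses the existing one)
val spark = SparkSession.builder
  .master("local[*]")
  .appName("streaming-dedup-explain") // illustrative name
  .getOrCreate()

// Streaming Dataset from the built-in rate source (columns: timestamp, value)
val rates = spark.readStream.format("rate").load()

// dropDuplicates puts a Deduplicate logical operator on top of the streaming source
val uniqueRates = rates.dropDuplicates("value")

// Print the physical plan; the deduplication step should appear as a
// StreamingDeduplicate node rather than an aggregation
uniqueRates.explain()
```

No streaming query is started here; explain only plans the query, which is enough to see which physical operator was chosen.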
Tip: Read up on Execution Planning Strategies in The Internals of Spark SQL book.
Note: The Deduplicate logical operator represents the Dataset.dropDuplicates operator in a logical query plan.
StreamingDeduplicationStrategy is available using SessionState, as spark.sessionState.planner.StreamingDeduplicationStrategy.
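As an illustration, the strategy can be looked up on the session's planner and applied by hand to the analyzed logical plan of a streaming query. This is a sketch under assumptions: it relies on sessionState being publicly accessible (it is marked unstable and was not public in early 2.x releases), and the names uniqueRates, logicalPlan, hasDeduplicate, and physicalPlans are made up for the example.

```scala
import org.apache.spark.sql.SparkSession
import org.apache.spark.sql.catalyst.plans.logical.Deduplicate

val spark = SparkSession.builder
  .master("local[*]")
  .appName("streaming-dedup-strategy") // illustrative name
  .getOrCreate()

// Streaming query with dropDuplicates over the built-in rate source
val uniqueRates = spark.readStream.format("rate").load().dropDuplicates("value")

// The analyzed logical plan should contain a Deduplicate operator
val logicalPlan = uniqueRates.queryExecution.analyzed
val hasDeduplicate = logicalPlan.collect { case d: Deduplicate => d }.nonEmpty
println(s"Deduplicate in logical plan: $hasDeduplicate")

// Look up the strategy on the planner and apply it directly.
// A strategy returns the physical plans it can produce, or an empty Seq
// when the logical operator is not one it handles.
val strategy = spark.sessionState.planner.StreamingDeduplicationStrategy
val physicalPlans = strategy(logicalPlan)

// Print the node names of the candidate physical operators
physicalPlans.map(_.nodeName).foreach(println)
```

Applying the same strategy to a batch query (for example spark.range(5).dropDuplicates()) should yield no physical plans, since the strategy only handles Deduplicate over a streaming child.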