StreamingRelationStrategy Execution Planning Strategy for StreamingRelation and StreamingExecutionRelation Logical Operators
StreamingRelationStrategy is an execution planning strategy that can plan streaming queries with StreamingRelation, StreamingExecutionRelation, and StreamingRelationV2 logical operators to StreamingRelationExec physical operators.
Figure 1. StreamingRelationStrategy, StreamingRelation, StreamingExecutionRelation and StreamingRelationExec Operators
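The mapping described above can be sketched with plain Scala pattern matching. This is only an illustrative model, not Spark's code: the case classes below are simplified stand-ins that reuse the operator names from this page, their fields (sourceName, output) are assumptions made for the sketch, and Project and StreamingRelationStrategySketch are hypothetical names introduced here. The one behavior it relies on is the general contract of an execution planning strategy: return physical operators for the logical operators it handles and an empty sequence for everything else.

```scala
// Simplified stand-ins for illustration only. These are NOT the real Spark
// classes, which carry more state than a name and a list of column names.
sealed trait LogicalPlan
case class StreamingRelation(sourceName: String, output: Seq[String]) extends LogicalPlan
case class StreamingExecutionRelation(sourceName: String, output: Seq[String]) extends LogicalPlan
case class StreamingRelationV2(sourceName: String, output: Seq[String]) extends LogicalPlan
// Hypothetical placeholder for any operator this strategy does not handle
case class Project(child: LogicalPlan) extends LogicalPlan

sealed trait SparkPlan
case class StreamingRelationExec(sourceName: String, output: Seq[String]) extends SparkPlan

// Hypothetical sketch of the strategy: each of the three streaming logical
// operators becomes a StreamingRelationExec; anything else yields no plan.
object StreamingRelationStrategySketch {
  def apply(plan: LogicalPlan): Seq[SparkPlan] = plan match {
    case StreamingRelation(name, out)          => Seq(StreamingRelationExec(name, out))
    case StreamingExecutionRelation(name, out) => Seq(StreamingRelationExec(name, out))
    case StreamingRelationV2(name, out)        => Seq(StreamingRelationExec(name, out))
    case _                                     => Nil
  }
}

object StreamingRelationStrategySketchDemo {
  def main(args: Array[String]): Unit = {
    val rate = StreamingRelation("rate", Seq("timestamp", "value"))
    // A handled operator: a single StreamingRelationExec is returned
    println(StreamingRelationStrategySketch(rate))
    // An operator outside this strategy's scope: the result is empty
    println(StreamingRelationStrategySketch(Project(rate)))
  }
}
```

Returning an empty sequence rather than failing is what lets a query planner try several strategies in turn for the same logical operator.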
Tip
Read up on Execution Planning Strategies in The Internals of Spark SQL book.
StreamingRelationStrategy is used exclusively when IncrementalExecution is requested to plan a streaming query.
StreamingRelationStrategy is available using SessionState (of a SparkSession).
spark.sessionState.planner.StreamingRelationStrategy
Demo: Using StreamingRelationStrategy
val rates = spark.
  readStream.
  format("rate").
  load // <-- gives a streaming Dataset with a logical plan with StreamingRelation logical operator
// StreamingRelation logical operator for the rate streaming source
scala> println(rates.queryExecution.logical.numberedTreeString)
00 StreamingRelation DataSource(org.apache.spark.sql.SparkSession@31ba0af0,rate,List(),None,List(),None,Map(),None), rate, [timestamp#0, value#1L]
// StreamingRelationExec physical operator (shown without "Exec" suffix)
scala> rates.explain
== Physical Plan ==
StreamingRelation rate, [timestamp#0, value#1L]
// Let's do the planning manually
import spark.sessionState.planner.StreamingRelationStrategy
val physicalPlan = StreamingRelationStrategy.apply(rates.queryExecution.logical).head
scala> println(physicalPlan.numberedTreeString)
00 StreamingRelation rate, [timestamp#0, value#1L]