StreamingRelationStrategy Execution Planning Strategy for StreamingRelation and StreamingExecutionRelation Logical Operators

StreamingRelationStrategy is an execution planning strategy that can plan streaming queries with StreamingRelation, StreamingExecutionRelation, and StreamingRelationV2 logical operators to StreamingRelationExec physical operators.

StreamingRelationStrategy apply.png
Figure 1. StreamingRelationStrategy, StreamingRelation, StreamingExecutionRelation and StreamingRelationExec Operators

StreamingRelationStrategy is used exclusively when IncrementalExecution is requested to plan a streaming query.

StreamingRelationStrategy is available using SessionState (of a SparkSession).

spark.sessionState.planner.StreamingRelationStrategy

Demo: Using StreamingRelationStrategy

val rates = spark.
  readStream.
  format("rate").
  load // <-- gives a streaming Dataset with a logical plan with StreamingRelation logical operator

// StreamingRelation logical operator for the rate streaming source
scala> println(rates.queryExecution.logical.numberedTreeString)
00 StreamingRelation DataSource(org.apache.spark.sql.SparkSession@31ba0af0,rate,List(),None,List(),None,Map(),None), rate, [timestamp#0, value#1L]

// StreamingRelationExec physical operator (shown without "Exec" suffix)
scala> rates.explain
== Physical Plan ==
StreamingRelation rate, [timestamp#0, value#1L]

// Let's do the planning manually
import spark.sessionState.planner.StreamingRelationStrategy
val physicalPlan = StreamingRelationStrategy.apply(rates.queryExecution.logical).head
scala> println(physicalPlan.numberedTreeString)
00 StreamingRelation rate, [timestamp#0, value#1L]

results matching ""

    No results matching ""