scala> val emp = spark.emptyDataset[Seq[String]]
emp: org.apache.spark.sql.Dataset[Seq[String]] = [value: array<string>]
scala> emp.select(explode($"value")).show
+---+
|col|
+---+
+---+
scala> emp.select(explode($"value")).explain(true)
== Parsed Logical Plan ==
'Project [explode('value) AS List()]
+- LocalRelation <empty>, [value#77]
== Analyzed Logical Plan ==
col: string
Project [col#89]
+- Generate explode(value#77), false, false, [col#89]
+- LocalRelation <empty>, [value#77]
== Optimized Logical Plan ==
LocalRelation <empty>, [col#89]
== Physical Plan ==
LocalTableScan <empty>, [col#89]
PropagateEmptyRelation Logical Optimization
PropagateEmptyRelation
is a base logical optimization that collapses plans with empty LocalRelation logical operators, e.g. explode or join.
PropagateEmptyRelation
is part of the LocalRelation fixed-point batch in the standard batches of the Catalyst Optimizer.
PropagateEmptyRelation
is simply a Catalyst rule for transforming logical plans, i.e. Rule[LogicalPlan]
.
Explode
Join
scala> spark.emptyDataset[Int].join(spark.range(1)).explain(extended = true)
...
TRACE SparkOptimizer:
=== Applying Rule org.apache.spark.sql.catalyst.optimizer.PropagateEmptyRelation ===
!Join Inner LocalRelation <empty>, [value#40, id#42L]
!:- LocalRelation <empty>, [value#40]
!+- Range (0, 1, step=1, splits=Some(8))
TRACE SparkOptimizer: Fixed point reached for batch LocalRelation after 2 iterations.
DEBUG SparkOptimizer:
=== Result of Batch LocalRelation ===
!Join Inner LocalRelation <empty>, [value#40, id#42L]
!:- LocalRelation <empty>, [value#40]
!+- Range (0, 1, step=1, splits=Some(8))
...
== Parsed Logical Plan ==
Join Inner
:- LocalRelation <empty>, [value#40]
+- Range (0, 1, step=1, splits=Some(8))
== Analyzed Logical Plan ==
value: int, id: bigint
Join Inner
:- LocalRelation <empty>, [value#40]
+- Range (0, 1, step=1, splits=Some(8))
== Optimized Logical Plan ==
LocalRelation <empty>, [value#40, id#42L]
== Physical Plan ==
LocalTableScan <empty>, [value#40, id#42L]
Executing Rule — apply
Method
apply(plan: LogicalPlan): LogicalPlan
Note
|
apply is part of the Rule Contract to execute (apply) a rule on a TreeNode (e.g. LogicalPlan).
|
apply
…FIXME