scala> :type spark
org.apache.spark.sql.SparkSession
val q = spark.range(1).select(struct($"id"))
val logicalPlan = q.queryExecution.logical
scala> println(logicalPlan.numberedTreeString)
00 'Project [unresolvedalias(named_struct(NamePlaceholder, 'id), None)]
01 +- AnalysisBarrier
02 +- Range (0, 1, step=1, splits=Some(8))
// Let's resolve references first
import spark.sessionState.analyzer.ResolveReferences
val planWithRefsResolved = ResolveReferences(logicalPlan)
import org.apache.spark.sql.catalyst.analysis.ResolveCreateNamedStruct
val afterResolveCreateNamedStruct = ResolveCreateNamedStruct(planWithRefsResolved)
scala> println(afterResolveCreateNamedStruct.numberedTreeString)
00 'Project [unresolvedalias(named_struct(id, id#4L), None)]
01 +- AnalysisBarrier
02 +- Range (0, 1, step=1, splits=Some(8))
ResolveCreateNamedStruct Logical Resolution Rule — Resolving NamePlaceholders In CreateNamedStruct Expressions
ResolveCreateNamedStruct is a logical resolution rule that replaces NamePlaceholders with Literals for the names in CreateNamedStruct expressions in an entire logical query plan.
ResolveCreateNamedStruct is part of the Resolution fixed-point batch in the standard batches of the Analyzer.
ResolveCreateNamedStruct is simply a Catalyst rule for transforming logical plans, i.e. Rule[LogicalPlan].
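To make "fixed-point batch" concrete, here is a minimal sketch in plain Scala. It is not Spark's RuleExecutor: FixedPointSketch, ToyRules, the string-based "plans" and maxIterations are hypothetical names chosen for illustration only. The idea it tries to show is that every rule in the batch is applied in order, and the whole pass repeats until the plan stops changing (or an iteration limit is hit), so a rule that depends on another rule's output gets its chance on a later pass.

```scala
// Simplified, hypothetical model of a fixed-point batch (NOT Spark's RuleExecutor).
object FixedPointSketch {
  // A rule is just a function from a plan to a (possibly) rewritten plan.
  type Rule[T] = T => T

  // Applies all rules in order, then repeats the pass until the plan
  // stops changing or maxIterations passes have been executed.
  def run[T](rules: Seq[Rule[T]], maxIterations: Int)(plan: T): T = {
    @annotation.tailrec
    def loop(current: T, iteration: Int): T = {
      val next = rules.foldLeft(current)((p, rule) => rule(p))
      if (next == current || iteration >= maxIterations) next
      else loop(next, iteration + 1)
    }
    loop(plan, 1)
  }
}

// Toy "plans" are plain strings; the rules are string rewrites (assumptions for the demo).
object ToyRules {
  val plan = "named_struct(NamePlaceholder, 'id)"

  // Fills in the name, but only once the reference has been resolved.
  val resolveNames: String => String =
    _.replace("NamePlaceholder, id#4L", "id, id#4L")

  // Resolves the reference 'id to the attribute id#4L.
  val resolveRefs: String => String =
    _.replace("'id", "id#4L")

  // resolveNames is deliberately listed first to show that order does not matter.
  val batch: Seq[String => String] = Seq(resolveNames, resolveRefs)
}
```

With these toy rules, FixedPointSketch.run(ToyRules.batch, maxIterations = 10)(ToyRules.plan) needs two passes: the first resolves 'id and the second fills in the name. That mirrors why the example above applies ResolveReferences before ResolveCreateNamedStruct when running the rules by hand.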
Executing Rule — apply Method
apply(plan: LogicalPlan): LogicalPlan
Note: apply is part of the Rule Contract to execute (apply) a rule on a TreeNode (e.g. LogicalPlan).
apply traverses all Catalyst expressions in the input LogicalPlan, finds the CreateNamedStruct expressions that are not yet resolved, and replaces their NamePlaceholders with Literal expressions.
In other words, apply finds unresolved CreateNamedStruct expressions that have NamePlaceholder expressions among their children and replaces each placeholder with a Literal holding the name of the corresponding NamedExpression, but only if that NamedExpression is resolved.
In the end, apply creates a new CreateNamedStruct with the updated children.
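The steps above can be sketched with a small toy model in plain Scala. This is an illustration under stated assumptions, not Spark's source: Expr, Named and ToyResolveCreateNamedStruct are hypothetical stand-ins, and the toy NamePlaceholder, Literal and CreateNamedStruct only borrow the names of their Catalyst counterparts. The sketch assumes that the children of a CreateNamedStruct alternate between a name and a value, and it handles a single expression rather than traversing a whole logical plan.

```scala
// Toy stand-ins for Catalyst expressions (NOT Spark's classes).
sealed trait Expr { def resolved: Boolean }

// Marks a struct field whose name is not known yet.
case object NamePlaceholder extends Expr { val resolved = false }

// A constant; here it only ever holds a field name.
final case class Literal(value: String) extends Expr { val resolved = true }

// A named expression; `resolved` says whether analysis has bound it already.
final case class Named(name: String, resolved: Boolean) extends Expr

// Children alternate: name, value, name, value, ...
final case class CreateNamedStruct(children: List[Expr]) extends Expr {
  def resolved: Boolean = children.forall(_.resolved)
}

object ToyResolveCreateNamedStruct {
  def apply(e: Expr): Expr = e match {
    // Only unresolved structs are rewritten.
    case s: CreateNamedStruct if !s.resolved =>
      val newChildren = s.children.grouped(2).flatMap {
        // Placeholder followed by a resolved named expression: use its name.
        case List(NamePlaceholder, n: Named) if n.resolved =>
          List[Expr](Literal(n.name), n)
        // Anything else (explicit name, or unresolved value) is kept as-is.
        case pair => pair
      }
      // Build a struct with the (possibly) updated children.
      CreateNamedStruct(newChildren.toList)
    case other => other
  }
}
```

In this model, a struct built from List(NamePlaceholder, Named("id", resolved = false)) comes back unchanged, while the same struct with resolved = true comes back with Literal("id") in place of the placeholder, which is the "only if the NamedExpression is resolved" condition.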