val q = spark.range(5).select("*")
val plan = q.queryExecution.logical
scala> println(plan.numberedTreeString)
00 'Project [*]
01 +- AnalysisBarrier
02       +- Range (0, 5, step=1, splits=Some(8))
import org.apache.spark.sql.catalyst.analysis.UnresolvedStar
val starExpr = plan.expressions.head.asInstanceOf[UnresolvedStar]
val namedExprs = starExpr.expand(input = q.queryExecution.analyzed, spark.sessionState.analyzer.resolver)
scala> println(namedExprs.head.numberedTreeString)
00 id#0: bigint
UnresolvedStar Expression
UnresolvedStar is a Star expression that represents a star (i.e. all) expression in a logical query plan.
UnresolvedStar is created when:
UnresolvedStar can never be resolved, and is expanded at analysis (when the ResolveReferences logical resolution rule is executed).
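As a minimal sketch of that behaviour, the snippet below compares the parsed and the analyzed plan of the same query. It assumes a Spark 2.x session (in spark-shell the `spark` value already exists, so the builder line only returns it); the names `q`, `star`, `starsBefore` and `starsAfter` are illustrative only.

```scala
import org.apache.spark.sql.SparkSession
import org.apache.spark.sql.catalyst.analysis.UnresolvedStar

// In spark-shell `spark` already exists; getOrCreate simply returns it.
val spark = SparkSession.builder().master("local[*]").getOrCreate()

val q = spark.range(5).select("*")

// A star expression always reports itself as unresolved.
val star = UnresolvedStar(None)
val starResolved = star.resolved

// Before analysis the projection list of the top operator holds the star...
val starsBefore = q.queryExecution.logical.expressions.collect { case s: UnresolvedStar => s }
// ...and after analysis the star has been replaced by the expanded attributes.
val starsAfter = q.queryExecution.analyzed.expressions.collect { case s: UnresolvedStar => s }

println(s"resolved=$starResolved, before=${starsBefore.size}, after=${starsAfter.size}")
```

Only the expressions of the top-level operator are inspected here, which is enough for a single `select("*")`.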
Note: UnresolvedStar can only be used in Project, Aggregate or ScriptTransformation logical operators.
Given that UnresolvedStar can never be resolved, it should not come as a surprise that it cannot be evaluated either (i.e. produce a value given an internal row). When requested to evaluate, UnresolvedStar simply reports an UnsupportedOperationException.
Cannot evaluate expression: [this]
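The failure can be observed without running a query. The sketch below assumes only that the Catalyst classes are on the classpath (as in spark-shell) and wraps the call in `Try` so the session keeps going; the exact exception class may differ between Spark versions, so the code does not depend on it.

```scala
import scala.util.Try
import org.apache.spark.sql.catalyst.analysis.UnresolvedStar

val star = UnresolvedStar(None)

// eval takes an optional InternalRow; the star fails before reading it.
val attempt = Try(star.eval())

// Print the message of the reported exception, if any.
attempt.failed.foreach(e => println(e.getMessage))
```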
When created, UnresolvedStar takes optional name parts that, once concatenated, form the target of the star expansion.
import org.apache.spark.sql.catalyst.analysis.UnresolvedStar
scala> val us = UnresolvedStar(None)
us: org.apache.spark.sql.catalyst.analysis.UnresolvedStar = *
scala> val ab = UnresolvedStar(Some("a" :: "b" :: Nil))
ab: org.apache.spark.sql.catalyst.analysis.UnresolvedStar = List(a, b).*
Star Expansion: expand Method
expand(input: LogicalPlan, resolver: Resolver): Seq[NamedExpression]
Note: expand is part of Star Contract to…FIXME.
expand first expands to named expressions per target:
- For an unspecified target, expand gives the output schema of the input logical query plan (that assumes that the star refers to a relation / table)
- For a target with one element, expand gives the table (attribute) in the output schema of the input logical query plan (using qualifiers) if available
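The two cases above can be tried side by side. This is a sketch under the assumption of a Spark 2.x session where `spark.sessionState` is accessible; the alias `t` and the value names are made up for the illustration.

```scala
import org.apache.spark.sql.SparkSession
import org.apache.spark.sql.catalyst.analysis.UnresolvedStar

// In spark-shell `spark` already exists; getOrCreate simply returns it.
val spark = SparkSession.builder().master("local[*]").getOrCreate()
val resolver = spark.sessionState.analyzer.resolver

// Alias the dataset so that its output attributes carry the qualifier "t".
val analyzed = spark.range(3).as("t").queryExecution.analyzed

// No target: every output attribute of the input plan.
val all = UnresolvedStar(None).expand(analyzed, resolver)
// One-element target: only the attributes qualified by "t".
val qualified = UnresolvedStar(Some(Seq("t"))).expand(analyzed, resolver)

println(all.map(_.name))
println(qualified.map(_.name))
```

Both calls return the single `id` column here because the aliased range is the only relation in the plan.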
With no result earlier, expand then requests the input logical query plan to resolve the target name parts to a named expression.
For a named expression of StructType data type, expand creates one Alias expression per struct field, each with a GetStructField unary expression (with the resolved named expression and the field index).
val q = Seq((0, "zero")).toDF("id", "name").select(struct("id", "name") as "s")
val analyzedPlan = q.queryExecution.analyzed
import org.apache.spark.sql.catalyst.analysis.UnresolvedStar
import org.apache.spark.sql.catalyst.dsl.expressions._
val s = star("s").asInstanceOf[UnresolvedStar]
val exprs = s.expand(input = analyzedPlan, spark.sessionState.analyzer.resolver)
// star("s") should expand to two Alias(GetStructField) expressions
// s is a struct of id and name in the query
import org.apache.spark.sql.catalyst.expressions.{Alias, GetStructField}
val getStructFields = exprs.collect { case Alias(g: GetStructField, _) => g }.map(_.sql)
scala> getStructFields.foreach(println)
`s`.`id`
`s`.`name`
expand reports an AnalysisException when:
- The data type of the named expression (when the input logical plan was requested to resolve the target) is not a StructType.
  Can only star expand struct data types. Attribute: `[target]`
- Earlier attempts gave no results (i.e. the target could not be resolved to a named expression).
  cannot resolve '[target].*' given input columns '[from]'
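Both failure modes can be provoked on a one-column dataset. The following is a sketch that assumes a Spark 2.x session; the target `nope` is a deliberately non-existent name, and the exact wording of the messages may vary across Spark versions, so they are printed rather than compared.

```scala
import scala.util.Try
import org.apache.spark.sql.SparkSession
import org.apache.spark.sql.catalyst.analysis.UnresolvedStar

// In spark-shell `spark` already exists; getOrCreate simply returns it.
val spark = SparkSession.builder().master("local[*]").getOrCreate()
val resolver = spark.sessionState.analyzer.resolver

// The input plan has a single bigint column named id and no qualifier.
val analyzed = spark.range(1).queryExecution.analyzed

// id resolves to a column, but the column is a bigint rather than a struct.
val notStruct = Try(UnresolvedStar(Some(Seq("id"))).expand(analyzed, resolver))
// nope matches neither a qualifier nor a column of the input plan.
val unresolvable = Try(UnresolvedStar(Some(Seq("nope"))).expand(analyzed, resolver))

// Print whatever message each failed expansion reported.
Seq(notStruct, unresolvable).foreach(_.failed.foreach(e => println(e.getMessage)))
```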