UnresolvedStar Expression

UnresolvedStar is a Star expression that represents a star (i.e. all) expression in a logical query plan.

UnresolvedStar is created when, for example, a query selects all columns with "*":

val q = spark.range(5).select("*")
val plan = q.queryExecution.logical
scala> println(plan.numberedTreeString)
00 'Project [*]
01 +- AnalysisBarrier
02       +- Range (0, 5, step=1, splits=Some(8))

import org.apache.spark.sql.catalyst.analysis.UnresolvedStar
val starExpr = plan.expressions.head.asInstanceOf[UnresolvedStar]

val namedExprs = starExpr.expand(input = q.queryExecution.analyzed, spark.sessionState.analyzer.resolver)
scala> println(namedExprs.head.numberedTreeString)
00 id#0: bigint

UnresolvedStar can never be resolved, and is expanded at analysis (when the ResolveReferences logical resolution rule is executed).

Note
UnresolvedStar can only be used in Project, Aggregate or ScriptTransformation logical operators.
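The expansion at analysis can be observed by comparing the logical plan before and after the analyzer has run. The following is a minimal sketch, assuming a Spark 2.x or 3.x classpath and a local session created just for the example (in spark-shell, `spark` already exists and that line can be dropped); the names `before` and `after` are illustrative only.

```scala
import org.apache.spark.sql.SparkSession
import org.apache.spark.sql.catalyst.analysis.UnresolvedStar

// Assumption: a throwaway local session (spark-shell provides `spark` already).
val spark = SparkSession.builder().master("local[*]").appName("star-expansion").getOrCreate()

val q = spark.range(5).select("*")

// Before analysis: the expressions of the top operator include the UnresolvedStar.
val before = q.queryExecution.logical.expressions.collect { case s: UnresolvedStar => s }

// After analysis: the star has been replaced by the attributes of the child operator.
val after = q.queryExecution.analyzed.expressions.collect { case s: UnresolvedStar => s }

assert(before.size == 1)
assert(after.isEmpty)
assert(q.queryExecution.analyzed.resolved)
```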

Since UnresolvedStar can never be resolved, it cannot be evaluated either (i.e. produce a value given an internal row). When requested to evaluate, UnresolvedStar reports an UnsupportedOperationException:

Cannot evaluate expression: [this]
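The behaviour can be checked directly on an UnresolvedStar instance, without a query. This is a sketch that only needs the Catalyst classes on the classpath; it checks that evaluation fails rather than asserting the exact exception class, since that is an assumption that may not hold across Spark versions.

```scala
import org.apache.spark.sql.catalyst.analysis.UnresolvedStar
import scala.util.Try

val unresolved = UnresolvedStar(None)

// An UnresolvedStar never reports itself as resolved.
assert(!unresolved.resolved)

// eval() with no arguments uses the default (null) input row;
// the call fails before any row would be inspected.
val outcome = Try(unresolved.eval())
assert(outcome.isFailure)

// Print the reported error for inspection.
outcome.failed.foreach(e => println(s"${e.getClass.getName}: ${e.getMessage}"))
```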

When created, UnresolvedStar takes optional name parts that, once concatenated, are the target of the star expansion.

import org.apache.spark.sql.catalyst.analysis.UnresolvedStar
scala> val us = UnresolvedStar(None)
us: org.apache.spark.sql.catalyst.analysis.UnresolvedStar = *

scala> val ab = UnresolvedStar(Some("a" :: "b" :: Nil))
ab: org.apache.spark.sql.catalyst.analysis.UnresolvedStar = List(a, b).*
Tip

Use the star operator from Catalyst DSL’s expressions to create an UnresolvedStar.

import org.apache.spark.sql.catalyst.dsl.expressions._
val s = star()
scala> :type s
org.apache.spark.sql.catalyst.expressions.Expression

import org.apache.spark.sql.catalyst.analysis.UnresolvedStar
assert(s.isInstanceOf[UnresolvedStar])

val s = star("a", "b")
scala> println(s)
WrappedArray(a, b).*

You could also use $"*" or '* to create an UnresolvedStar, but that requires sbt console (with the Spark libraries defined in build.sbt), because the Catalyst DSL expression implicits conflict with the Spark implicits that create columns.

Note

AstBuilder replaces count(*) (with no DISTINCT keyword) with count(1).

val q = sql("SELECT COUNT(*) FROM RANGE(1,2,3)")
scala> println(q.queryExecution.logical.numberedTreeString)
00 'Project [unresolvedalias('count(1), None)]
01 +- 'UnresolvedTableValuedFunction range, [1, 2, 3]

val q = sql("SELECT COUNT(DISTINCT *) FROM RANGE(1,2,3)")
scala> println(q.queryExecution.logical.numberedTreeString)
00 'Project [unresolvedalias('COUNT(*), None)]
01 +- 'UnresolvedTableValuedFunction RANGE, [1, 2, 3]

Star Expansion — expand Method

expand(input: LogicalPlan, resolver: Resolver): Seq[NamedExpression]
Note
expand is part of Star Contract to…​FIXME.

expand first expands to named expressions per target:

  • For an unspecified target, expand gives the output schema of the input logical query plan (which assumes that the star refers to a relation / table)

  • For a target with a single name part, expand gives the attributes in the output schema of the input logical query plan whose qualifier matches the target (i.e. the target is taken to be a table name), if there are any

If neither step gave a result, expand then requests the input logical query plan to resolve the target name parts to a named expression.

For a named expression of StructType data type, expand creates, for every field of the struct, an Alias expression (named after the field) with a GetStructField unary expression (with the resolved named expression and the field index).

val q = Seq((0, "zero")).toDF("id", "name").select(struct("id", "name") as "s")
val analyzedPlan = q.queryExecution.analyzed

import org.apache.spark.sql.catalyst.analysis.UnresolvedStar
import org.apache.spark.sql.catalyst.dsl.expressions._
val s = star("s").asInstanceOf[UnresolvedStar]
val exprs = s.expand(input = analyzedPlan, spark.sessionState.analyzer.resolver)

// star("s") should expand to two Alias(GetStructField) expressions
// s is a struct of id and name in the query

import org.apache.spark.sql.catalyst.expressions.{Alias, GetStructField}
val getStructFields = exprs.collect { case Alias(g: GetStructField, _) => g }.map(_.sql)
scala> getStructFields.foreach(println)
`s`.`id`
`s`.`name`

expand reports an AnalysisException when:

  • The data type of the named expression (that the input logical plan resolved the target to) is not a StructType.

    Can only star expand struct data types. Attribute: `[target]`

  • The input logical plan could not resolve the target to any named expression.

    cannot resolve '[target].*' given input columns '[from]'
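The order of checks described in this section can be summarized in a small model that does not depend on Spark. It is only a sketch: `Column`, `Scalar`, `Struct` and `StarExpansionSketch` are hypothetical stand-ins rather than Catalyst classes, names are compared exactly (Spark delegates the comparison to the resolver), only single-part column targets are looked up, and errors are returned as `Left` values instead of being thrown as AnalysisException.

```scala
// Hypothetical stand-ins for Catalyst attributes and data types (not Spark classes).
sealed trait FieldType
case object Scalar extends FieldType
final case class Struct(fieldNames: Seq[String]) extends FieldType

final case class Column(name: String, qualifier: Option[String], fieldType: FieldType)

object StarExpansionSketch {
  // Returns the expanded column names, or the error message as Left.
  def expand(target: Option[Seq[String]], output: Seq[Column]): Either[String, Seq[String]] =
    target match {
      // 1. No target: the star stands for every output column.
      case None => Right(output.map(_.name))
      case Some(parts) =>
        // 2. Single-part target: try it as a table qualifier first.
        val byQualifier =
          if (parts.size == 1) output.filter(_.qualifier.contains(parts.head)) else Seq.empty
        if (byQualifier.nonEmpty) Right(byQualifier.map(_.name))
        else
          // 3. Otherwise resolve the target as a column, which must be a struct.
          output.find(c => parts == Seq(c.name)) match {
            case Some(Column(name, _, Struct(fields))) =>
              Right(fields.map(f => s"$name.$f"))
            case Some(_) =>
              Left(s"Can only star expand struct data types. Attribute: `$parts`")
            case None =>
              val from = output.map(_.name).mkString(", ")
              Left(s"cannot resolve '${parts.mkString(".")}.*' given input columns '$from'")
          }
    }
}
```

With two columns qualified by `t`, a scalar `id` and a struct `s` of `id` and `name`, a target of `t` selects both columns through the qualifier, a target of `s` falls through to the struct expansion, and a target of `id` ends in the "Can only star expand struct data types" branch.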
