QueryPlan — Structured Query Plan

QueryPlan is the part of Catalyst that models a structured query as a tree of relational operators.

In Scala terms, QueryPlan is an abstract class that is the base class of LogicalPlan and SparkPlan (for logical and physical plans, respectively).

A QueryPlan has output attributes (which serve as the base for the schema), a collection of expressions, and a schema.

QueryPlan has a statePrefix that is used when displaying a plan: ! indicates an invalid plan and ' indicates an unresolved plan.

A QueryPlan is invalid if it has missing input attributes and its children are non-empty.

A QueryPlan is unresolved if the column names have not been verified and column types have not been looked up in the Catalog.

A QueryPlan has zero, one or more Catalyst expressions.

Note
QueryPlan is a tree of operators that have a tree of expressions.

QueryPlan has a references property that holds the attributes that appear in the expressions of this operator.
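As a rough illustration, the expressions and references of an operator can be inspected on the analyzed plan of a small query. The snippet is a sketch that assumes a Spark 2.x or 3.x classpath and a local SparkSession (in spark-shell, getOrCreate simply returns the existing spark); the query itself is made up for the example.

```scala
import org.apache.spark.sql.SparkSession

// Assumption: local session for the example; spark-shell already provides `spark`.
val spark = SparkSession.builder()
  .master("local[1]")
  .appName("queryplan-references")
  .getOrCreate()

// The analyzed plan is a Filter operator on top of a Range operator.
val plan = spark.range(3).where("id > 1").queryExecution.analyzed

// Expressions held directly by the top operator (here, the filter condition).
val exprs = plan.expressions

// Attributes that appear in those expressions.
val refs = plan.references

println(exprs)
println(refs)
```

Since the filter only uses the id column of its child, every referenced attribute here is also part of the operator's inputSet.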

QueryPlan Contract

abstract class QueryPlan[T] extends TreeNode[T] {
  def output: Seq[Attribute]
  def validConstraints: Set[Expression]
  // FIXME
}

Table 1. QueryPlan Contract

Method              Description
validConstraints
output              Attribute expressions

Transforming Expressions — transformExpressions Method

transformExpressions(rule: PartialFunction[Expression, Expression]): this.type

transformExpressions simply executes transformExpressionsDown with the input rule.

Note
transformExpressions is used when…​FIXME

Transforming Expressions — transformExpressionsDown Method

transformExpressionsDown(rule: PartialFunction[Expression, Expression]): this.type

transformExpressionsDown applies the rule to each expression in the query operator.
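The idea of applying a partial-function rule top-down to every expression of an operator can be sketched without Spark. Everything below (Expr, Lit, Attr, Add, Operator) is a hypothetical miniature model, not Catalyst's classes; it assumes a pre-order traversal in which the rule is tried on a node before its children.

```scala
// Hypothetical miniature expression tree; NOT Catalyst's Expression hierarchy.
sealed trait Expr {
  def children: Seq[Expr]
  def withChildren(newChildren: Seq[Expr]): Expr

  // Pre-order: try the rule on this node first, then recurse into the
  // children of whatever node the rule produced (or of this node if the
  // rule is not defined for it).
  def transformDown(rule: PartialFunction[Expr, Expr]): Expr = {
    val afterRule = rule.applyOrElse(this, identity[Expr])
    afterRule.withChildren(afterRule.children.map(_.transformDown(rule)))
  }
}

final case class Lit(value: Int) extends Expr {
  def children: Seq[Expr] = Nil
  def withChildren(newChildren: Seq[Expr]): Expr = this
}

final case class Attr(name: String) extends Expr {
  def children: Seq[Expr] = Nil
  def withChildren(newChildren: Seq[Expr]): Expr = this
}

final case class Add(left: Expr, right: Expr) extends Expr {
  def children: Seq[Expr] = Seq(left, right)
  def withChildren(newChildren: Seq[Expr]): Expr = Add(newChildren(0), newChildren(1))
}

// Hypothetical operator that owns a collection of expressions.
final case class Operator(expressions: Seq[Expr]) {
  def transformExpressionsDown(rule: PartialFunction[Expr, Expr]): Operator =
    copy(expressions = expressions.map(_.transformDown(rule)))
}

val op = Operator(Seq(Add(Attr("id"), Add(Lit(1), Lit(2))), Lit(1)))

// Rewrite every literal 1 to 10; nodes the rule does not match stay as they are.
val rewritten = op.transformExpressionsDown { case Lit(1) => Lit(10) }

println(rewritten)
```

The rule is a PartialFunction so that callers only describe the nodes they want to change; everything else is carried over unchanged.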

Note
transformExpressionsDown is used when…​FIXME

Applying Transformation Function to Each Expression in Query Operator — mapExpressions Method

mapExpressions(f: Expression => Expression): this.type

mapExpressions…​FIXME

Note
mapExpressions is used when…​FIXME

Output Schema Attribute Set — outputSet Property

outputSet: AttributeSet

outputSet simply returns an AttributeSet for the output schema attributes.
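A quick way to see this relationship is to build an AttributeSet from output by hand and compare it with outputSet. This is a sketch that assumes a Spark classpath and a local SparkSession; the range query is only an example.

```scala
import org.apache.spark.sql.SparkSession
import org.apache.spark.sql.catalyst.expressions.AttributeSet

// Assumption: local session for the example; spark-shell already provides `spark`.
val spark = SparkSession.builder()
  .master("local[1]")
  .appName("queryplan-outputset")
  .getOrCreate()

val plan = spark.range(3).queryExecution.analyzed

// Build the set manually from the output attributes...
val fromOutput = AttributeSet(plan.output)

// ...and compare it with what the plan reports.
println(plan.outputSet)
println(plan.outputSet == fromOutput)
```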

Note
outputSet is used when…​FIXME

producedAttributes Property

Caution
FIXME

Missing Input Attributes — missingInput Property

def missingInput: AttributeSet

missingInput are attributes that are referenced in expressions but not provided by this node’s children (as inputSet) and are not produced by this node (as producedAttributes).
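The definition above amounts to two set subtractions. The following sketch models it with plain strings instead of Catalyst attributes; Node and its fields are hypothetical stand-ins named after the properties mentioned in the text.

```scala
// Hypothetical stand-in for a plan node: attributes are plain strings here,
// not Catalyst Attribute objects.
final case class Node(
    output: Set[String],
    references: Set[String] = Set.empty,         // attributes used in this node's expressions
    producedAttributes: Set[String] = Set.empty, // attributes this node creates itself
    children: Seq[Node] = Nil) {

  // Everything the children make available to this node.
  def inputSet: Set[String] = children.flatMap(_.output).toSet

  // Referenced, but neither provided by the children nor produced by this node.
  def missingInput: Set[String] = references -- inputSet -- producedAttributes
}

val scan = Node(output = Set("id", "name"), producedAttributes = Set("id", "name"))

// References only what the child provides: nothing is missing.
val valid = Node(output = Set("id"), references = Set("id"), children = Seq(scan))

// References "age", which no child provides and the node does not produce.
val invalid = Node(output = Set("id"), references = Set("id", "age"), children = Seq(scan))

println(valid.missingInput)
println(invalid.missingInput)
```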

Output Schema — schema Property

You can request the schema of a QueryPlan using the schema property, which builds a StructType from the output attributes.

// the query
val dataset = spark.range(3)

scala> dataset.queryExecution.analyzed.schema
res6: org.apache.spark.sql.types.StructType = StructType(StructField(id,LongType,false))

Output Schema Attributes — output Property

output: Seq[Attribute]

output is a collection of Catalyst attribute expressions that represents the result of a projection in a query and is later used to build the output schema.

Note
The output property is also called the output schema or result schema.

val q = spark.range(3)

scala> q.queryExecution.analyzed.output
res0: Seq[org.apache.spark.sql.catalyst.expressions.Attribute] = List(id#0L)

scala> q.queryExecution.withCachedData.output
res1: Seq[org.apache.spark.sql.catalyst.expressions.Attribute] = List(id#0L)

scala> q.queryExecution.optimizedPlan.output
res2: Seq[org.apache.spark.sql.catalyst.expressions.Attribute] = List(id#0L)

scala> q.queryExecution.sparkPlan.output
res3: Seq[org.apache.spark.sql.catalyst.expressions.Attribute] = List(id#0L)

scala> q.queryExecution.executedPlan.output
res4: Seq[org.apache.spark.sql.catalyst.expressions.Attribute] = List(id#0L)

Tip
You can build a StructType from the output collection of attributes using the toStructType method (available through the implicit class AttributeSeq).

scala> q.queryExecution.analyzed.output.toStructType
res5: org.apache.spark.sql.types.StructType = StructType(StructField(id,LongType,false))

Simple (Basic) Description with State Prefix — simpleString Method

simpleString: String

Note
simpleString is part of TreeNode Contract for the simple text description of a tree node.

simpleString adds a state prefix to the node’s simple text description.

State Prefix — statePrefix Method

statePrefix: String

Internally, statePrefix gives ! (exclamation mark) when the node is invalid, i.e. missingInput is not empty and the node has children (is a parent node). Otherwise, statePrefix gives an empty string.
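The rule can be sketched in a few lines. Node below is a hypothetical toy class, not QueryPlan itself; it only covers the ! prefix described here and assumes that the simple description is the prefix followed by the node name.

```scala
// Hypothetical toy node; NOT Spark's QueryPlan.
final case class Node(
    name: String,
    missingInput: Set[String] = Set.empty,
    children: Seq[Node] = Nil) {

  // "!" only when attributes are missing AND the node has children.
  def statePrefix: String =
    if (missingInput.nonEmpty && children.nonEmpty) "!" else ""

  // Assumption: the simple description is just the prefix plus the name.
  def simpleString: String = statePrefix + name
}

val leaf = Node("Range")
val ok = Node("Project", children = Seq(leaf))
val broken = Node("Project", missingInput = Set("age"), children = Seq(leaf))
// A leaf with missing attributes is not flagged, because it has no children.
val leafWithMissing = Node("Scan", missingInput = Set("age"))

Seq(ok, broken, leafWithMissing).foreach(n => println(n.simpleString))
```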

Note
statePrefix is used exclusively when QueryPlan is requested for the simple text node description.

Transforming All Expressions — transformAllExpressions Method

transformAllExpressions(rule: PartialFunction[Expression, Expression]): this.type

transformAllExpressions…​FIXME

Note
transformAllExpressions is used when…​FIXME

Simple (Basic) Description with State Prefix — verboseString Method

verboseString: String

Note
verboseString is part of TreeNode Contract to…​FIXME.

verboseString simply returns the simple (basic) description with state prefix.

innerChildren Method

innerChildren: Seq[QueryPlan[_]]

Note
innerChildren is part of TreeNode Contract to…​FIXME.

innerChildren simply returns the subqueries.

subqueries Method

subqueries: Seq[PlanType]

subqueries…​FIXME

Note
subqueries is used when…​FIXME

Canonicalizing Query Plan — doCanonicalize Method

doCanonicalize(): PlanType

doCanonicalize…​FIXME

Note
doCanonicalize is used when…​FIXME
