val q = Seq((0, "zero"), (1, "one")).toDF("id", "name").sort('id)
val qe = q.queryExecution
val logicalPlan = qe.analyzed
scala> println(logicalPlan.numberedTreeString)
00 Sort [id#72 ASC NULLS FIRST], true
01 +- Project [_1#69 AS id#72, _2#70 AS name#73]
02    +- LocalRelation [_1#69, _2#70]
// BasicOperators does the conversion of Sort logical operator to SortExec
val sparkPlan = qe.sparkPlan
scala> println(sparkPlan.numberedTreeString)
00 Sort [id#72 ASC NULLS FIRST], true, 0
01 +- LocalTableScan [id#72, name#73]
// SortExec supports Whole-Stage Code Generation
val executedPlan = qe.executedPlan
scala> println(executedPlan.numberedTreeString)
00 *(1) Sort [id#72 ASC NULLS FIRST], true, 0
01 +- Exchange rangepartitioning(id#72 ASC NULLS FIRST, 200)
02    +- LocalTableScan [id#72, name#73]
import org.apache.spark.sql.execution.SortExec
val sortExec = executedPlan.collect { case se: SortExec => se }.head
assert(sortExec.isInstanceOf[SortExec])
SortExec Unary Physical Operator
SortExec is a unary physical operator that is created when:
- BasicOperators execution planning strategy is requested to plan a Sort logical operator
- FileFormatWriter helper object is requested to write the result of a structured query
- EnsureRequirements physical query optimization is executed (and enforces partition requirements for data distribution and ordering of a physical operator)
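The EnsureRequirements case can be observed with a sort-merge join. The following is a minimal sketch, assuming Spark SQL is on the classpath (or that it is pasted into spark-shell); the DataFrames and the two configuration overrides are assumptions chosen only to make the plan predictable (no broadcast join, no adaptive query execution). The sorts do not appear in the plan produced by the planning strategies and show up only after the physical preparation rules have run.

```scala
import org.apache.spark.sql.SparkSession
import org.apache.spark.sql.execution.SortExec

val spark = SparkSession.builder()
  .master("local[*]")
  .appName("sortexec-ensure-requirements")
  .getOrCreate()
import spark.implicits._

// Assumptions for a predictable plan: disable broadcast joins and adaptive execution
spark.conf.set("spark.sql.autoBroadcastJoinThreshold", "-1")
spark.conf.set("spark.sql.adaptive.enabled", "false")

val left = Seq((0, "zero"), (1, "one")).toDF("id", "name")
val right = Seq((0, "nil"), (1, "uno")).toDF("id", "other")
val joined = left.join(right, "id")

// sparkPlan is the plan before the physical preparation rules (incl. EnsureRequirements)
val sortsBefore = joined.queryExecution.sparkPlan.collect { case s: SortExec => s }
// executedPlan is the plan after the preparation rules
val sortsAfter = joined.queryExecution.executedPlan.collect { case s: SortExec => s }

println(s"SortExec operators before: ${sortsBefore.size}, after: ${sortsAfter.size}")
```

Note that the sketch changes two session-level settings; restore them if the session is reused.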
SortExec supports Java code generation (aka codegen).
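A quick way to check the codegen support is to test the operator against the CodegenSupport trait. This is a small sketch, assuming Spark SQL is on the classpath; the query mirrors the one used in the example session, and the SortExec instance is taken from sparkPlan so the lookup does not depend on how the final plan is wrapped.

```scala
import org.apache.spark.sql.SparkSession
import org.apache.spark.sql.execution.{CodegenSupport, SortExec}

val spark = SparkSession.builder()
  .master("local[*]")
  .appName("sortexec-codegen-support")
  .getOrCreate()
import spark.implicits._

val q = Seq((0, "zero"), (1, "one")).toDF("id", "name").sort("id")

// The Sort logical operator is planned as SortExec
val sortExec = q.queryExecution.sparkPlan.collectFirst { case s: SortExec => s }.get

// SortExec mixes in CodegenSupport and does not opt out of code generation
val isCodegenCapable = sortExec.isInstanceOf[CodegenSupport]
val codegenEnabledForOperator = sortExec.supportCodegen
```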
When requested for the output attributes, SortExec simply gives whatever the child operator uses.
SortExec uses the sorting order expressions for the output data ordering.
When requested for the output data partitioning, SortExec simply gives whatever the child operator uses.
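The three properties above can be checked directly on an operator instance. A minimal sketch, assuming Spark SQL is on the classpath; the query is illustrative only.

```scala
import org.apache.spark.sql.SparkSession
import org.apache.spark.sql.execution.SortExec

val spark = SparkSession.builder()
  .master("local[*]")
  .appName("sortexec-output")
  .getOrCreate()
import spark.implicits._

val q = Seq((0, "zero"), (1, "one")).toDF("id", "name").sort("id")
val sortExec = q.queryExecution.sparkPlan.collectFirst { case s: SortExec => s }.get

// Output attributes are the child's output attributes
val outputSameAsChild = sortExec.output == sortExec.child.output

// Output ordering is given by the sorting order expressions
val orderingIsSortOrder = sortExec.outputOrdering == sortExec.sortOrder

// Output partitioning is the child's output partitioning
val partitioningSameAsChild =
  sortExec.outputPartitioning == sortExec.child.outputPartitioning
```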
When requested for the required child distribution, SortExec gives the OrderedDistribution (with the sorting order expressions for the ordering) when the global flag is enabled (true), or the UnspecifiedDistribution otherwise.
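The effect of the global flag can be compared using sort (a global sort) and sortWithinPartitions (a per-partition sort). This is a sketch under the assumption that both Dataset operators are planned as SortExec with the global flag on and off, respectively; the data is illustrative.

```scala
import org.apache.spark.sql.SparkSession
import org.apache.spark.sql.catalyst.plans.physical.{OrderedDistribution, UnspecifiedDistribution}
import org.apache.spark.sql.execution.SortExec

val spark = SparkSession.builder()
  .master("local[*]")
  .appName("sortexec-required-distribution")
  .getOrCreate()
import spark.implicits._

val df = Seq((0, "zero"), (1, "one")).toDF("id", "name")

// Global sort: the global flag is enabled
val globalSort = df.sort("id")
  .queryExecution.sparkPlan.collectFirst { case s: SortExec => s }.get

// Per-partition sort: the global flag is disabled
val localSort = df.sortWithinPartitions("id")
  .queryExecution.sparkPlan.collectFirst { case s: SortExec => s }.get

val globalDistribution = globalSort.requiredChildDistribution
val localDistribution = localSort.requiredChildDistribution
```

The OrderedDistribution requirement is what leads to the Exchange rangepartitioning operator under the Sort in an executed plan of a globally sorted query.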
SortExec operator uses the spark.sql.sort.enableRadixSort internal configuration property (enabled by default) to control…FIXME
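The property can be read and changed per session like any other runtime SQL configuration property. The sketch below only shows how to inspect and toggle it, assuming Spark SQL is on the classpath; it restores the previous value at the end.

```scala
import org.apache.spark.sql.SparkSession

val spark = SparkSession.builder()
  .master("local[*]")
  .appName("sortexec-radix-sort-conf")
  .getOrCreate()

val key = "spark.sql.sort.enableRadixSort"

// Current value (the default applies when the property was never set)
val current = spark.conf.get(key)
println(s"$key = $current")

// Disable for this session, then read it back
spark.conf.set(key, "false")
val afterDisable = spark.conf.get(key)

// Restore the previous value
spark.conf.set(key, current)
val restored = spark.conf.get(key)
```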
Key | Name (in web UI) | Description |
---|---|---|
peakMemory | peak memory | |
sortTime | sort time | |
spillSize | spill size | |
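The performance metrics are registered under their keys in the metrics map of the physical operator. A minimal sketch to list them, assuming Spark SQL is on the classpath; the operator here comes from sparkPlan and has not been executed, so only the keys (not the values) are meaningful.

```scala
import org.apache.spark.sql.SparkSession
import org.apache.spark.sql.execution.SortExec

val spark = SparkSession.builder()
  .master("local[*]")
  .appName("sortexec-metrics")
  .getOrCreate()
import spark.implicits._

val q = Seq((0, "zero"), (1, "one")).toDF("id", "name").sort("id")
val sortExec = q.queryExecution.sparkPlan.collectFirst { case s: SortExec => s }.get

// Keys under which the SQL metrics of the operator are registered
val metricKeys = sortExec.metrics.keySet
metricKeys.toSeq.sorted.foreach(println)
```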
Generating Java Source Code for Produce Path in Whole-Stage Code Generation: doProduce Method
doProduce(ctx: CodegenContext): String
Note: doProduce is part of the CodegenSupport Contract to generate the Java source code for the produce path in Whole-Stage Code Generation.
doProduce …FIXME
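While the internals of doProduce are not described here, the Java source code generated for a query with SortExec can be inspected with the helpers in the org.apache.spark.sql.execution.debug package. A sketch, assuming Spark SQL is on the classpath; adaptive query execution is switched off as an assumption so that the whole-stage codegen subtrees are present in the plan before execution.

```scala
import org.apache.spark.sql.SparkSession
import org.apache.spark.sql.execution.debug._

val spark = SparkSession.builder()
  .master("local[*]")
  .appName("sortexec-generated-code")
  .getOrCreate()
import spark.implicits._

// Assumption: keep the final plan static so codegen subtrees are visible up front
spark.conf.set("spark.sql.adaptive.enabled", "false")

val q = Seq((0, "zero"), (1, "one")).toDF("id", "name").sort("id")

// Generated Java source for every whole-stage codegen subtree of the plan
val generated = codegenString(q.queryExecution.executedPlan)
println(generated)
```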
Creating SortExec Instance
SortExec takes the following when created:
- Sorting order expressions (Seq[SortOrder])
- Child physical plan
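An instance can also be created by hand from the parts of an already planned operator. The sketch below assumes a Spark version in which the SortExec constructor additionally takes the global flag (discussed earlier) and a test-only spill frequency with a default value; the reused sorting order and child come from an illustrative query.

```scala
import org.apache.spark.sql.SparkSession
import org.apache.spark.sql.execution.SortExec

val spark = SparkSession.builder()
  .master("local[*]")
  .appName("sortexec-create")
  .getOrCreate()
import spark.implicits._

val q = Seq((0, "zero"), (1, "one")).toDF("id", "name").sort("id")
val planned = q.queryExecution.sparkPlan.collectFirst { case s: SortExec => s }.get

// Reuse the sorting order expressions and the child physical plan,
// but request a per-partition (non-global) sort
val localSort = SortExec(planned.sortOrder, global = false, child = planned.child)

println(localSort.numberedTreeString)
```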