CoalesceExec Unary Physical Operator

CoalesceExec is a unary physical operator (i.e. with one child physical operator) to…FIXME…with numPartitions number of partitions and a child physical plan (SparkPlan).

CoalesceExec represents the Repartition logical operator at execution time when shuffle is disabled (see the BasicOperators execution planning strategy). When executed, it executes the child operator and calls coalesce on the resulting RDD (with shuffle disabled).
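The RDD-level behaviour that CoalesceExec relies on can be tried directly. The following is a minimal sketch, not the operator's own code: the application name, the sample data and the partition counts (8, 2, 16) are assumptions chosen for illustration. It is meant to be pasted into spark-shell (or run as a script with Spark on the classpath).

```scala
import org.apache.spark.sql.SparkSession

// Reuses the active session in spark-shell; otherwise starts a local one.
val spark = SparkSession.builder()
  .master("local[*]")
  .appName("rdd-coalesce-sketch") // hypothetical name
  .getOrCreate()

// An RDD with exactly 8 partitions (numSlices is set explicitly).
val rdd = spark.sparkContext.parallelize(1 to 100, numSlices = 8)

// Without a shuffle, coalesce merges existing partitions into fewer ones.
val narrowed = rdd.coalesce(2, shuffle = false)
println(narrowed.getNumPartitions)

// Without a shuffle, coalesce cannot increase the number of partitions,
// so the RDD keeps its current partition count.
val widened = rdd.coalesce(16, shuffle = false)
println(widened.getNumPartitions)
```

Since no shuffle is involved, asking for more partitions than the RDD already has is a no-op, which is why coalesce is only useful for reducing the number of partitions.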

Physical operators are displayed without the Exec suffix, so CoalesceExec appears as Coalesce in the Physical Plan section of the following example:

scala> df.rdd.getNumPartitions
res6: Int = 8

scala> df.coalesce(1).rdd.getNumPartitions
res7: Int = 1

scala> df.coalesce(1).explain(extended = true)
== Parsed Logical Plan ==
Repartition 1, false
+- LocalRelation [value#1]

== Analyzed Logical Plan ==
value: int
Repartition 1, false
+- LocalRelation [value#1]

== Optimized Logical Plan ==
Repartition 1, false
+- LocalRelation [value#1]

== Physical Plan ==
Coalesce 1
+- LocalTableScan [value#1]
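The same check can be done programmatically instead of reading the explain output. The following is a sketch under a few assumptions: it requires a Spark version in which the operator class is named CoalesceExec in the org.apache.spark.sql.execution package, and coalesceOps is a hypothetical helper introduced here for illustration only.

```scala
import org.apache.spark.sql.{Dataset, SparkSession}
import org.apache.spark.sql.execution.CoalesceExec

// Reuses the active session in spark-shell; otherwise starts a local one.
val spark = SparkSession.builder()
  .master("local[*]")
  .appName("coalesce-exec-sketch") // hypothetical name
  .getOrCreate()
import spark.implicits._

val df = (1 to 100).toDF("value")

// Hypothetical helper: collects every CoalesceExec in the physical plan.
def coalesceOps(ds: Dataset[_]): Seq[CoalesceExec] =
  ds.queryExecution.sparkPlan.collect { case c: CoalesceExec => c }

// coalesce gives Repartition with shuffle disabled, planned as CoalesceExec.
val fromCoalesce = coalesceOps(df.coalesce(1))
println(fromCoalesce.size)
println(fromCoalesce.map(_.numPartitions).mkString(", "))

// repartition gives Repartition with shuffle enabled, so the plan
// contains no CoalesceExec.
val fromRepartition = coalesceOps(df.repartition(1))
println(fromRepartition.isEmpty)
```

The helper inspects sparkPlan (the physical plan before preparation rules are applied), so the result does not depend on whether the executed plan is wrapped by other operators.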

The output collection of Attributes matches the child's, since CoalesceExec changes only the number of partitions, not the internal representation of the rows.

outputPartitioning returns SinglePartition when numPartitions is 1, and an UnknownPartitioning partitioning scheme in all other cases.
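Both properties (output and outputPartitioning) can be observed on a planned query. The following is a sketch that assumes the root of the physical plan of df.coalesce(n) is the CoalesceExec operator, as in the explain output earlier; the sample data and the partition counts (1 and 3) are arbitrary.

```scala
import org.apache.spark.sql.SparkSession
import org.apache.spark.sql.catalyst.plans.physical.{SinglePartition, UnknownPartitioning}

// Reuses the active session in spark-shell; otherwise starts a local one.
val spark = SparkSession.builder()
  .master("local[*]")
  .appName("coalesce-partitioning-sketch") // hypothetical name
  .getOrCreate()
import spark.implicits._

val df = (1 to 100).toDF("value")

// Physical plans whose root is assumed to be CoalesceExec.
val one = df.coalesce(1).queryExecution.sparkPlan
val three = df.coalesce(3).queryExecution.sparkPlan

// The output attributes are the child's output attributes.
val sameOutput = one.output == one.children.head.output
println(sameOutput)

// numPartitions == 1 gives SinglePartition.
val oneIsSingle = one.outputPartitioning == SinglePartition
println(oneIsSingle)

// Any other numPartitions gives UnknownPartitioning.
val threeIsUnknown = three.outputPartitioning.isInstanceOf[UnknownPartitioning]
println(threeIsUnknown)
```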
