createPartitioning(numPartitions: Int): Partitioning
HashClusteredDistribution
HashClusteredDistribution is a Distribution that creates a HashPartitioning for the hash expressions and a requested number of partitions.
HashClusteredDistribution specifies None for the required number of partitions.
Note: None for the required number of partitions means that any number of partitions can be used (possibly the value of the spark.sql.shuffle.partitions configuration property, which is 200 partitions by default).
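
The fallback described in the note can be sketched as follows. This is a minimal illustration, not Spark's code: the names NumPartitionsSketch, resolveNumPartitions and defaultShufflePartitions are hypothetical, and the value 200 is only assumed to mirror the default of spark.sql.shuffle.partitions.

```scala
// Hypothetical helper (not part of Spark's API): resolves the number of
// partitions from an optional requirement, falling back to a default.
object NumPartitionsSketch {
  // Assumption: mirrors the default of spark.sql.shuffle.partitions.
  val defaultShufflePartitions: Int = 200

  // None means "any number of partitions", so the default is used.
  def resolveNumPartitions(required: Option[Int]): Int =
    required.getOrElse(defaultShufflePartitions)
}
```

With this sketch, resolveNumPartitions(None) falls back to the default, while resolveNumPartitions(Some(8)) keeps the explicit requirement.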
HashClusteredDistribution is created when physical operators such as CoGroupExec, ShuffledHashJoinExec, SortMergeJoinExec and Spark Structured Streaming's StreamingSymmetricHashJoinExec are requested for the partitioning requirements of their child operator(s).
HashClusteredDistribution takes hash expressions when created.
HashClusteredDistribution requires that the hash expressions are not empty (i.e. not Nil).
HashClusteredDistribution is used when:
- EnsureRequirements is requested to add an ExchangeCoordinator for Adaptive Query Execution
- HashPartitioning is requested whether it satisfies a distribution (satisfies)
createPartitioning Method
Note: createPartitioning is part of the Distribution contract to create a Partitioning for a given number of partitions.
createPartitioning creates a HashPartitioning for the hash expressions and the input numPartitions.
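
The behavior described on this page can be modeled with a small, self-contained sketch. It is a simplified stand-in and not Spark's source code: plain strings are assumed in place of Catalyst expressions, and the traits below only imitate the parts of Partitioning and Distribution mentioned in the text.

```scala
// Simplified stand-ins for Spark's classes (assumption: strings play the
// role of Catalyst expressions; this is not Spark's source code).
sealed trait Partitioning { def numPartitions: Int }

final case class HashPartitioning(expressions: Seq[String], numPartitions: Int)
  extends Partitioning

trait Distribution {
  def requiredNumPartitions: Option[Int]
  def createPartitioning(numPartitions: Int): Partitioning
}

final case class HashClusteredDistribution(expressions: Seq[String])
  extends Distribution {
  // The hash expressions must not be empty (i.e. not Nil).
  require(expressions.nonEmpty, "hash expressions must not be empty")

  // None: any number of partitions is acceptable.
  override val requiredNumPartitions: Option[Int] = None

  // Creates a HashPartitioning for the hash expressions and numPartitions.
  override def createPartitioning(numPartitions: Int): Partitioning =
    HashPartitioning(expressions, numPartitions)
}
```

In this model, HashClusteredDistribution(Seq("id")).createPartitioning(4) gives HashPartitioning(Seq("id"), 4), while HashClusteredDistribution(Nil) fails the require check with an IllegalArgumentException.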