PushPredicateThroughJoin Logical Optimization

PushPredicateThroughJoin is a Catalyst rule for transforming logical plans (i.e. Rule[LogicalPlan]).

When executed, PushPredicateThroughJoin…​FIXME

PushPredicateThroughJoin is part of two fixed-point rule batches of the base Catalyst Optimizer: Operator Optimization before Inferring Filters and Operator Optimization after Inferring Filters.

Demo: PushPredicateThroughJoin
import org.apache.spark.sql.catalyst.dsl.expressions._
import org.apache.spark.sql.catalyst.dsl.plans._

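// Hack for spark-shell: shadow the two implicits it imports automatically
// (symbolToColumn and StringToColumn) so they do not clash with the Catalyst DSL
trait ThatWasABadIdea
implicit def symbolToColumn(ack: ThatWasABadIdea) = ack
implicit class StringToColumn(val sc: StringContext) {}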

import org.apache.spark.sql.catalyst.plans.logical.LocalRelation
val t1 = LocalRelation('a.int, 'b.int)
val t2 = LocalRelation('C.int, 'D.int).where('C > 10)

val plan = t1.join(t2)...FIXME

import org.apache.spark.sql.catalyst.optimizer.PushPredicateThroughJoin
val optimizedPlan = PushPredicateThroughJoin(plan)
println(optimizedPlan.numberedTreeString)

Executing Rule — apply Method

apply(
  plan: LogicalPlan): LogicalPlan

apply is part of the Rule contract to execute (apply) a rule on a TreeNode (e.g. LogicalPlan).


split Internal Method

split(
  condition: Seq[Expression],
  left: LogicalPlan,
  right: LogicalPlan): (Seq[Expression], Seq[Expression], Seq[Expression])

split partitions the given condition expressions into deterministic expressions (pushDownCandidates) and non-deterministic ones.

split then partitions the pushDownCandidates into expressions that reference only the output attributes of the left logical operator (leftEvaluateCondition) and the remaining expressions (rest).

split then partitions the rest into expressions that reference only the output attributes of the right logical operator (rightEvaluateCondition) and the remaining expressions (commonCondition).

In the end, split returns a triple of leftEvaluateCondition, rightEvaluateCondition, and commonCondition followed by the non-deterministic condition expressions.

split is used when PushPredicateThroughJoin is executed.
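
The three-way partitioning described above can be sketched in plain Scala without Spark on the classpath. Expr, Plan, and SplitSketch below are hypothetical stand-ins, not Catalyst classes: an Expr carries only a label, the set of attribute names it references, and a deterministic flag, and a Plan carries only the set of attribute names it outputs. The sketch assumes that an expression belongs to one side of the join when all of its references are contained in that side's output.

```scala
// Hypothetical stand-ins for Catalyst's Expression and LogicalPlan (NOT Spark classes)
final case class Expr(name: String, references: Set[String], deterministic: Boolean = true)
final case class Plan(outputSet: Set[String])

object SplitSketch {
  // Returns (left-only conditions, right-only conditions, everything else)
  def split(
      condition: Seq[Expr],
      left: Plan,
      right: Plan): (Seq[Expr], Seq[Expr], Seq[Expr]) = {
    // Step 1: only deterministic expressions are candidates for push-down
    val (pushDownCandidates, nonDeterministic) = condition.partition(_.deterministic)
    // Step 2: candidates whose references all come from the left side
    val (leftEvaluateCondition, rest) =
      pushDownCandidates.partition(_.references.subsetOf(left.outputSet))
    // Step 3: of the rest, candidates whose references all come from the right side
    val (rightEvaluateCondition, commonCondition) =
      rest.partition(_.references.subsetOf(right.outputSet))
    // Non-deterministic expressions stay with the conditions that need both sides
    (leftEvaluateCondition, rightEvaluateCondition, commonCondition ++ nonDeterministic)
  }
}
```

With a left side that outputs a and b and a right side that outputs c and d, a condition such as a > 1 would land in the first group, c > 10 in the second, and a = c in the third. A non-deterministic condition that references only b would also land in the third group, because it is set aside before the left and right checks run.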
