// Use Catalyst DSL to define a logical plan
// HACK: Disable symbolToColumn implicit conversion
// It is imported automatically in spark-shell (and makes demos impossible)
// implicit def symbolToColumn(s: Symbol): org.apache.spark.sql.ColumnName
trait ThatWasABadIdea
implicit def symbolToColumn(ack: ThatWasABadIdea) = ack
import org.apache.spark.sql.catalyst.dsl.expressions._
import org.apache.spark.sql.catalyst.dsl.plans._
import org.apache.spark.sql.catalyst.plans.logical.LocalRelation
val rel = LocalRelation('a.int, 'b.int, 'c.int)
import org.apache.spark.sql.catalyst.expressions.{In, Literal}
val plan = rel
.where(In('a, Seq[Literal](1, 2, 3)))
.analyze
scala> println(plan.numberedTreeString)
00 Filter a#6 IN (1,2,3)
01 +- LocalRelation <empty>, [a#6, b#7, c#8]
// In --> InSet
spark.conf.set("spark.sql.optimizer.inSetConversionThreshold", 0)
import org.apache.spark.sql.catalyst.optimizer.OptimizeIn
val optimizedPlan = OptimizeIn(plan)
scala> println(optimizedPlan.numberedTreeString)
00 Filter a#6 INSET (1,2,3)
01 +- LocalRelation <empty>, [a#6, b#7, c#8]
OptimizeIn Logical Optimization
OptimizeIn
is a base logical optimization that transforms logical plans with In predicate expressions as follows:
-
Replaces an
In
expression that has an empty list and the value expression not nullable tofalse
-
Eliminates duplicates of Literal expressions in an In predicate expression that is inSetConvertible
-
Replaces an
In
predicate expression that is inSetConvertible with InSet expressions when the number of literal expressions in the list expression is greater than spark.sql.optimizer.inSetConversionThreshold internal configuration property (default:10
)
OptimizeIn
is part of the Operator Optimization before Inferring Filters fixed-point batch in the standard batches of the Catalyst Optimizer.
OptimizeIn
is simply a Catalyst rule for transforming logical plans, i.e. Rule[LogicalPlan]
.
Executing Rule — apply
Method
apply(plan: LogicalPlan): LogicalPlan
Note
|
apply is part of the Rule Contract to execute (apply) a rule on a TreeNode (e.g. LogicalPlan).
|
apply
…FIXME