QueryPlanner — From Logical to Physical Plans

QueryPlanner transforms a logical query plan into a physical execution plan by applying a chain of GenericStrategy objects. In Spark SQL the physical plan type is SparkPlan, produced by SparkPlanner or its Hive-specific variant.

QueryPlanner Contract

QueryPlanner contract defines the following operations: strategies, plan, collectPlaceholders, and prunePlans.

Note
The protected collectPlaceholders and prunePlans methods are meant to be defined by subclasses and are used by the concrete plan method.

strategies Method

strategies: Seq[GenericStrategy[PhysicalPlan]]

strategies is an abstract method that returns the collection of GenericStrategy objects used by the plan method.
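The shape of a strategy can be sketched with toy stand-ins rather than Spark's own classes. Everything named below (`StrategySketch`, `Logical`, `Physical`, `ToyStrategy`, and the two example strategies) is hypothetical; the only thing carried over is the idea that a strategy takes a logical plan and returns zero or more candidate physical plans, with an empty result meaning it does not apply.

```scala
object StrategySketch {
  // Toy logical operators (hypothetical, not Spark's LogicalPlan).
  sealed trait Logical
  case class Scan(table: String) extends Logical
  case class Limit(n: Int, child: Logical) extends Logical

  // Toy physical operators (hypothetical, not Spark's SparkPlan).
  sealed trait Physical
  case class TableScanExec(table: String) extends Physical
  case class TakeExec(n: Int, table: String) extends Physical

  // A logical plan goes in, candidate physical plans come out.
  abstract class ToyStrategy {
    def apply(plan: Logical): Seq[Physical]
  }

  // Handles only a Limit placed directly over a Scan.
  object LimitOverScan extends ToyStrategy {
    def apply(plan: Logical): Seq[Physical] = plan match {
      case Limit(n, Scan(table)) => Seq(TakeExec(n, table))
      case _                     => Nil // strategy does not apply
    }
  }

  // Handles only a bare Scan.
  object PlainScan extends ToyStrategy {
    def apply(plan: Logical): Seq[Physical] = plan match {
      case Scan(table) => Seq(TableScanExec(table))
      case _           => Nil // strategy does not apply
    }
  }

  // What a planner's strategies member could return; order is the order of application.
  val strategies: Seq[ToyStrategy] = Seq(LimitOverScan, PlainScan)
}
```

Returning an empty Seq instead of throwing lets a planner try every strategy in turn and keep whatever candidates come back.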

plan Method

plan(plan: LogicalPlan): Iterator[PhysicalPlan]

plan returns an Iterator[PhysicalPlan] whose elements are the results of applying each GenericStrategy object in the strategies collection to the input logical plan.
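A minimal sketch of that flow, assuming toy plan types in place of Spark's. `PlanSketch`, `ToyPlanner`, `ToyStrategy`, and the operator case classes are hypothetical names, and the sketch deliberately leaves out placeholder handling: it only applies each strategy lazily, passes the candidates through prunePlans, and fails when no strategy produced a plan.

```scala
object PlanSketch {
  // Toy logical operators (hypothetical stand-ins for LogicalPlan nodes).
  sealed trait Logical
  case class Scan(table: String) extends Logical
  case class Filter(cond: String, child: Logical) extends Logical

  // Toy physical operators (hypothetical stand-ins for PhysicalPlan nodes).
  sealed trait Physical
  case class ScanExec(table: String) extends Physical
  case class FilteredScanExec(cond: String, table: String) extends Physical

  // A strategy maps a logical plan to zero or more candidate physical plans.
  type ToyStrategy = Logical => Seq[Physical]

  val filterPushdown: ToyStrategy = {
    case Filter(cond, Scan(table)) => Seq(FilteredScanExec(cond, table))
    case _                         => Nil
  }

  val plainScan: ToyStrategy = {
    case Scan(table) => Seq(ScanExec(table))
    case _           => Nil
  }

  abstract class ToyPlanner {
    def strategies: Seq[ToyStrategy]
    protected def prunePlans(plans: Iterator[Physical]): Iterator[Physical]

    def plan(logical: Logical): Iterator[Physical] = {
      // Apply every strategy in order; strategies that do not match contribute nothing.
      val candidates = strategies.iterator.flatMap(strategy => strategy(logical))
      val pruned = prunePlans(candidates)
      if (!pruned.hasNext) throw new IllegalStateException(s"No plan for $logical")
      pruned
    }
  }

  object Planner extends ToyPlanner {
    val strategies: Seq[ToyStrategy] = Seq(filterPushdown, plainScan)
    // This toy planner keeps every candidate.
    protected def prunePlans(plans: Iterator[Physical]): Iterator[Physical] = plans
  }
}
```

Calling `PlanSketch.Planner.plan(PlanSketch.Scan("t"))` yields an iterator over the candidates, so a caller that only needs the first plan never forces the later strategies to run.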

collectPlaceholders Method

collectPlaceholders(plan: PhysicalPlan): Seq[(PhysicalPlan, LogicalPlan)]

collectPlaceholders returns the placeholders found in a given physical plan, each paired with the logical plan it stands for.
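One way to picture this is a tree walk over a toy physical plan. The types below (`PlaceholderSketch`, `Placeholder`, `JoinExec`, and so on) are hypothetical and only illustrate the idea, assuming a placeholder is a physical node that wraps a logical plan still waiting to be planned.

```scala
object PlaceholderSketch {
  // Toy logical operator (hypothetical).
  sealed trait Logical
  case class Scan(table: String) extends Logical

  // Toy physical operators (hypothetical); each node exposes its children.
  sealed trait Physical { def children: Seq[Physical] }
  case class ScanExec(table: String) extends Physical {
    def children: Seq[Physical] = Nil
  }
  case class JoinExec(left: Physical, right: Physical) extends Physical {
    def children: Seq[Physical] = Seq(left, right)
  }
  // Marks a subtree whose logical plan has not been planned yet.
  case class Placeholder(logical: Logical) extends Physical {
    def children: Seq[Physical] = Nil
  }

  // Walk the tree and pair every placeholder with the logical plan it wraps.
  def collectPlaceholders(plan: Physical): Seq[(Physical, Logical)] = plan match {
    case p @ Placeholder(logical) => Seq(p -> logical)
    case other                    => other.children.flatMap(collectPlaceholders)
  }
}
```

A planner can then plan each returned logical plan and substitute the result for its placeholder in the candidate.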

prunePlans Method

prunePlans(plans: Iterator[PhysicalPlan]): Iterator[PhysicalPlan]

prunePlans prunes bad physical plans to prevent a combinatorial explosion of candidates.

SparkStrategies — Container of SparkStrategy Strategies

SparkStrategies is an abstract base QueryPlanner (of SparkPlan) that serves as a "container" (or a namespace) of the concrete SparkStrategy objects:

  1. SpecialLimits

  2. JoinSelection

  3. StatefulAggregationStrategy

  4. Aggregation

  5. InMemoryScans

  6. StreamingRelationStrategy

  7. BasicOperators

  8. DDLStrategy

Note
Strategy is a type alias of SparkStrategy that is defined in the org.apache.spark.sql package object.

Note
SparkPlanner is the one and only concrete implementation of SparkStrategies.

Caution
FIXME What is singleRowRdd for?
