DataSourceV2Strategy Execution Planning Strategy
DataSourceV2Strategy is an execution planning strategy that Spark Planner uses to plan logical operators (from the Data Source API V2).
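Logical Operator | Physical Operator |
---|---|
DataSourceV2Relation | DataSourceV2ScanExec |
WriteToDataSourceV2 | WriteToDataSourceV2Exec |
AppendData with a DataSourceV2Relation | WriteToDataSourceV2Exec |
Repartition with a … | … |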
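Tip
Enable INFO logging level for the org.apache.spark.sql.execution.datasources.v2.DataSourceV2Strategy logger to see what happens inside.
Add the following line to conf/log4j.properties:
log4j.logger.org.apache.spark.sql.execution.datasources.v2.DataSourceV2Strategy=INFO
Refer to Logging.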
Applying DataSourceV2Strategy Strategy to Logical Plan (Executing DataSourceV2Strategy): apply Method
apply(plan: LogicalPlan): Seq[SparkPlan]
Note
apply is part of the GenericStrategy Contract to generate a collection of SparkPlans for a given logical plan.
apply branches off per the given logical operator.
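The branching can be pictured as a single pattern match over the logical operator. The following is a minimal, self-contained sketch and not Spark's code: every `Toy*` type is a hypothetical stand-in for the corresponding Spark class, and returning `Nil` for an operator the strategy does not handle is an assumption based on how Spark planning strategies usually behave.

```scala
object ApplyDispatchSketch {
  // Hypothetical stand-ins for logical operators.
  sealed trait ToyLogicalPlan
  case class ToyRelation(name: String) extends ToyLogicalPlan
  case class ToyWrite(target: String, query: ToyLogicalPlan) extends ToyLogicalPlan
  case class ToyOther(name: String) extends ToyLogicalPlan

  // Hypothetical stand-ins for physical operators.
  sealed trait ToyPhysicalPlan
  case class ToyScanExec(name: String) extends ToyPhysicalPlan
  case class ToyWriteExec(target: String) extends ToyPhysicalPlan

  // One branch per supported logical operator; the result is a collection
  // of physical plans, as in apply(plan: LogicalPlan): Seq[SparkPlan].
  def apply(plan: ToyLogicalPlan): Seq[ToyPhysicalPlan] = plan match {
    case ToyRelation(name)   => ToyScanExec(name) :: Nil
    case ToyWrite(target, _) => ToyWriteExec(target) :: Nil
    case _                   => Nil // assumption: unsupported operators yield no plan
  }
}
```

Calling `ApplyDispatchSketch(ApplyDispatchSketch.ToyRelation("t"))` gives a one-element collection with the toy scan operator, while a `ToyOther` gives an empty collection.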
DataSourceV2Relation Logical Operator
For a DataSourceV2Relation logical operator, apply requests the DataSourceV2Relation for the DataSourceReader.
apply then calls pushFilters followed by pruneColumns.
apply prints out the following INFO message to the logs:
Pushing operators to [ClassName of DataSourceV2]
Pushed Filters: [pushedFilters]
Post-Scan Filters: [postScanFilters]
Output: [output]
apply uses the DataSourceV2Relation to create a DataSourceV2ScanExec physical operator.
If there are any postScanFilters, apply creates a FilterExec physical operator with the DataSourceV2ScanExec physical operator as the child.
In the end, apply creates a ProjectExec physical operator whose child is either the FilterExec (over the DataSourceV2ScanExec) or, when there are no postScanFilters, the DataSourceV2ScanExec physical operator directly.
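The resulting operator tree (project over an optional filter over the scan) can be sketched as follows. This is an illustrative model under stated assumptions, not Spark's implementation: the `Toy*` operators, the `planScan` helper and the use of plain strings for columns and filters are all hypothetical, and joining several post-scan filters into one condition with AND is an assumption.

```scala
object ScanPlanningSketch {
  // Hypothetical, simplified physical operators.
  sealed trait ToyPhysicalPlan
  case class ToyScanExec(output: Seq[String], pushedFilters: Seq[String])
      extends ToyPhysicalPlan
  case class ToyFilterExec(condition: String, child: ToyPhysicalPlan)
      extends ToyPhysicalPlan
  case class ToyProjectExec(projectList: Seq[String], child: ToyPhysicalPlan)
      extends ToyPhysicalPlan

  def planScan(
      project: Seq[String],
      output: Seq[String],
      pushedFilters: Seq[String],
      postScanFilters: Seq[String]): ToyPhysicalPlan = {
    // The scan carries the pruned output and the filters the source accepted.
    val scan = ToyScanExec(output, pushedFilters)
    // Assumption: multiple post-scan filters are combined into one AND condition.
    val condition = postScanFilters.reduceLeftOption((l, r) => s"($l AND $r)")
    // Add a filter operator only when there is something left to evaluate.
    val withFilter = condition.fold[ToyPhysicalPlan](scan)(c => ToyFilterExec(c, scan))
    // The projection always sits on top.
    ToyProjectExec(project, withFilter)
  }
}
```

With no post-scan filters the projection wraps the scan directly; with at least one, a single filter operator sits between the two.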
StreamingDataSourceV2Relation Logical Operator
For a StreamingDataSourceV2Relation logical operator, apply …FIXME
WriteToDataSourceV2 Logical Operator
For a WriteToDataSourceV2 logical operator, apply simply creates a WriteToDataSourceV2Exec physical operator.
AppendData Logical Operator
For an AppendData logical operator with a DataSourceV2Relation, apply requests the DataSourceV2Relation to create a DataSourceWriter that is used to create a WriteToDataSourceV2Exec physical operator.
WriteToContinuousDataSource Logical Operator
For a WriteToContinuousDataSource logical operator, apply …FIXME
Repartition Logical Operator
For a Repartition logical operator, apply …FIXME
pushFilters Internal Method
pushFilters(
reader: DataSourceReader,
filters: Seq[Expression]): (Seq[Expression], Seq[Expression])
Note
pushFilters handles DataSourceReaders with SupportsPushDownFilters support only.
For the given DataSourceReader with SupportsPushDownFilters support, pushFilters uses the DataSourceStrategy object to translate every filter in the given filters.
pushFilters requests the SupportsPushDownFilters reader to pushFilters first and then for the pushedFilters.
In the end, pushFilters returns a pair of the filters that were pushed down and the filters that were not.
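The translate, push, then collect sequence can be modelled with a few toy types. This is a sketch under assumptions rather than Spark's code: all `Toy*` names are hypothetical, expressions are plain strings, the `translate` function stands in for the translation done by the DataSourceStrategy object, and the sketch assumes that the reader's `pushFilters` returns the filters it still needs evaluated after the scan and that untranslatable filters end up in the not-pushed half of the pair.

```scala
object PushFiltersSketch {
  // Hypothetical stand-ins: an expression is a String, and a translated
  // source filter is a thin wrapper around it.
  type ToyExpression = String
  case class ToySourceFilter(sql: String)

  // Models a reader with SupportsPushDownFilters support.
  trait ToyPushDownReader {
    // Assumption: returns the filters that must still be evaluated after the scan.
    def pushFilters(filters: Seq[ToySourceFilter]): Seq[ToySourceFilter]
    // The filters the reader accepted.
    def pushedFilters: Seq[ToySourceFilter]
  }

  // Toy reader that accepts only filters whose text starts with a prefix.
  class ToyPrefixReader(prefix: String) extends ToyPushDownReader {
    private var accepted = Seq.empty[ToySourceFilter]
    def pushFilters(filters: Seq[ToySourceFilter]): Seq[ToySourceFilter] = {
      val (ok, rest) = filters.partition(_.sql.startsWith(prefix))
      accepted = ok
      rest
    }
    def pushedFilters: Seq[ToySourceFilter] = accepted
  }

  // Returns (pushed filters, filters not pushed).
  def pushFilters(
      reader: Option[ToyPushDownReader], // None models a reader without pushdown support
      filters: Seq[ToyExpression],
      translate: ToyExpression => Option[ToySourceFilter]
  ): (Seq[ToyExpression], Seq[ToyExpression]) = reader match {
    case None => (Nil, filters) // assumption: nothing is pushed without support
    case Some(r) =>
      // Keep each translated filter together with its original expression.
      val translated = for {
        e <- filters
        f <- translate(e).toList
      } yield (f, e)
      val untranslatable = filters.filter(e => translate(e).isEmpty)
      val backToExpr = translated.toMap
      // Push first, then ask the reader which filters it accepted.
      val postScan = r.pushFilters(translated.map(_._1)).map(backToExpr)
      val pushed = r.pushedFilters.map(backToExpr)
      (pushed, untranslatable ++ postScan)
  }
}
```

For example, with a `ToyPrefixReader("a")` and a `translate` function that rejects anything containing `udf`, the filters `a > 1`, `b < 2` and `udf(c)` split into one pushed filter (`a > 1`) and two that are not pushed (`udf(c)` and `b < 2`).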
Note
pushFilters is used exclusively when DataSourceV2Strategy execution planning strategy is executed (applied to a DataSourceV2Relation logical operator).