TableScan Contract: Relations That Scan All Tuples

TableScan is the contract of BaseRelations that can produce all of their tuples as an RDD of Row objects. Unlike PrunedScan, a TableScan relation does no column pruning: it always returns every column of every row.

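package org.apache.spark.sql.sources

import org.apache.spark.rdd.RDD
import org.apache.spark.sql.Row

trait TableScan {
  // Produces all tuples of the relation as an RDD of Row objects
  def buildScan(): RDD[Row]
}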
Table 1. TableScan Contract
Property Description

buildScan

Building a distributed data scan over all tuples of the relation (no column pruning)

In other words, buildScan creates an RDD[Row] to represent a distributed data scan (i.e. scanning over data in a relation).

Used exclusively when the DataSourceStrategy execution planning strategy is requested to plan a LogicalRelation with a TableScan relation.
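
To make the contract concrete, the following is a minimal sketch of a custom relation that mixes TableScan into BaseRelation. LettersRelation, TableScanDemo, and the in-memory data are hypothetical names made up for illustration and are not part of Spark. The sketch assumes the Spark 2.x/3.x API and a local Spark session.

```scala
import org.apache.spark.rdd.RDD
import org.apache.spark.sql.{Row, SparkSession, SQLContext}
import org.apache.spark.sql.sources.{BaseRelation, TableScan}
import org.apache.spark.sql.types.{IntegerType, StringType, StructField, StructType}

// Hypothetical relation (not part of Spark) that serves a fixed in-memory data set.
class LettersRelation(override val sqlContext: SQLContext)
    extends BaseRelation with TableScan {

  // Schema of the rows that buildScan produces.
  override def schema: StructType = StructType(Seq(
    StructField("id", IntegerType, nullable = false),
    StructField("letter", StringType, nullable = false)))

  // TableScan receives no required columns and no filters,
  // so every column of every row is returned.
  override def buildScan(): RDD[Row] =
    sqlContext.sparkContext.parallelize(Seq(Row(1, "a"), Row(2, "b"), Row(3, "c")))
}

object TableScanDemo {
  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .master("local[*]")
      .appName("TableScanDemo")
      .getOrCreate()
    try {
      // Wrap the relation in a DataFrame so that queries over it get planned.
      val letters = spark.baseRelationToDataFrame(new LettersRelation(spark.sqlContext))
      // Only "letter" is selected, yet buildScan still produces both columns;
      // the unneeded column is projected away after the scan.
      letters.select("letter").show()
    } finally {
      spark.stop()
    }
  }
}
```

A relation that can cheaply skip columns or rows at the source would typically implement PrunedScan or PrunedFilteredScan instead; TableScan fits sources where a full scan is the only option.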

Note
KafkaRelation is the one and only known implementation of the TableScan Contract in Spark SQL.
