package org.apache.spark.sql.sources

abstract class Filter {
  // only required methods that have no implementation
  // the others follow
  def references: Array[String]
}
Data Source Filter Predicate (For Filter Pushdown)
Filter is the contract for filter predicates that can be pushed down to a relation (aka data source).
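As an illustration of the contract, the following minimal sketch builds a small filter tree from the built-in filters in `org.apache.spark.sql.sources` and asks it for its column references. It assumes the `spark-sql` library is on the classpath; the object name `FilterReferencesDemo` and the column names are made up for the example.

```scala
import org.apache.spark.sql.sources.{And, EqualTo, Filter, GreaterThan, IsNotNull}

object FilterReferencesDemo {
  // age IS NOT NULL AND (age > 18 AND country = 'PL')
  // (hypothetical columns, chosen only for illustration)
  val predicate: Filter =
    And(IsNotNull("age"), And(GreaterThan("age", 18), EqualTo("country", "PL")))

  // Column names referenced anywhere in the filter tree, duplicates removed
  val columns: Seq[String] = predicate.references.toSeq.distinct

  def main(args: Array[String]): Unit =
    println(columns.mkString(", "))
}
```

A data source can use the referenced columns to decide, for example, which columns it must read in order to evaluate a pushed-down predicate.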
Filter is used when:
- (Data Source API V1) BaseRelation is requested for unhandled filter predicates (and hence BaseRelation implementations, e.g. JDBCRelation)
- (Data Source API V1) PrunedFilteredScan is requested to build a scan (and hence PrunedFilteredScan implementations, e.g. JDBCRelation)
- FileFormat is requested to buildReader (and hence FileFormat implementations, e.g. OrcFileFormat, CSVFileFormat, JsonFileFormat, TextFileFormat and Spark MLlib's LibSVMFileFormat)
- FileFormat is requested to build a Data Reader with partition column values appended (and hence FileFormat implementations, e.g. OrcFileFormat, ParquetFileFormat)
- RowDataSourceScanExec is created (for a simple text representation in a query plan tree)
- DataSourceStrategy execution planning strategy is requested to pruneFilterProject (when executed for LogicalRelation logical operators with a PrunedFilteredScan or a PrunedScan)
- DataSourceStrategy execution planning strategy is requested to selectFilters
- (Data Source API V2) SupportsPushDownFilters is requested to pushFilters and for pushedFilters
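To make the Data Source API V1 usages above more concrete, here is a hedged sketch of how a relation might translate pushed-down filters into a SQL `WHERE` clause and report the ones it cannot handle. It is not JDBCRelation's actual code: the `FilterToSql` object and its `compile`, `unhandled` and `whereClause` helpers are hypothetical names, only a handful of filters are supported, and `spark-sql` is assumed to be on the classpath.

```scala
import org.apache.spark.sql.sources._

object FilterToSql {
  // Render a literal for SQL; this sketch only handles strings and numbers
  private def literal(value: Any): Option[String] = value match {
    case s: String => Some("'" + s.replace("'", "''") + "'")
    case n: Number => Some(n.toString)
    case _         => None
  }

  // Translate a filter to a SQL predicate, or None if this sketch cannot handle it
  def compile(filter: Filter): Option[String] = filter match {
    case EqualTo(attr, value)     => literal(value).map(v => s"$attr = $v")
    case GreaterThan(attr, value) => literal(value).map(v => s"$attr > $v")
    case LessThan(attr, value)    => literal(value).map(v => s"$attr < $v")
    case IsNull(attr)             => Some(s"$attr IS NULL")
    case IsNotNull(attr)          => Some(s"$attr IS NOT NULL")
    case And(left, right) =>
      for (l <- compile(left); r <- compile(right)) yield s"($l) AND ($r)"
    case Or(left, right) =>
      for (l <- compile(left); r <- compile(right)) yield s"($l) OR ($r)"
    case Not(child) => compile(child).map(c => s"NOT ($c)")
    case _          => None
  }

  // What a BaseRelation could return from unhandledFilters:
  // the filters that Spark has to evaluate itself after the scan
  def unhandled(filters: Array[Filter]): Array[Filter] =
    filters.filter(f => compile(f).isEmpty)

  // What a PrunedFilteredScan could append to its query in buildScan
  def whereClause(filters: Array[Filter]): String =
    filters.flatMap(f => compile(f)).map(c => s"($c)").mkString(" AND ")
}
```

Returning a filter from `unhandled` is always safe, since Spark then applies it on top of the scan; the sketch therefore falls back to `None` for anything it does not recognize rather than guessing a translation.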
| Method | Description |
|---|---|
| references | Names of the columns that the filter references |
Finding Column References in Any Value — findReferences Method
findReferences(value: Any): Array[String]
findReferences returns the references of the given value if the value is itself a Filter, or an empty array otherwise.
Note: findReferences is used when EqualTo, EqualNullSafe, GreaterThan, GreaterThanOrEqual, LessThan, LessThanOrEqual and In filters are requested for their column references.
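The behaviour described above can be sketched with a pair of simplified stand-in classes. These are not Spark's own classes: the `FindReferencesSketch` object and its nested `Filter`, `EqualTo` and `IsNull` types are assumptions made so that the example runs without Spark on the classpath.

```scala
object FindReferencesSketch {
  // Simplified stand-ins for illustration only; not Spark's own classes
  sealed abstract class Filter {
    def references: Array[String]

    // References of `value` if it is itself a Filter, otherwise an empty array
    protected def findReferences(value: Any): Array[String] = value match {
      case f: Filter => f.references
      case _         => Array.empty
    }
  }

  // A filter with a value: its references are its own column
  // plus whatever the value references (if the value is a Filter)
  final case class EqualTo(attribute: String, value: Any) extends Filter {
    def references: Array[String] = Array(attribute) ++ findReferences(value)
  }

  // A filter without a value: it only references its own column
  final case class IsNull(attribute: String) extends Filter {
    def references: Array[String] = Array(attribute)
  }

  def main(args: Array[String]): Unit = {
    // Plain value: only the attribute is referenced
    println(EqualTo("id", 42).references.mkString(", "))
    // Filter used as a value: its references are included as well
    println(EqualTo("flag", IsNull("name")).references.mkString(", "))
  }
}
```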