List<InputPartition<InternalRow>> planInputPartitions()
DataSourceReader Contract
DataSourceReader is the abstraction of data source readers in Data Source API V2 that can plan InputPartitions and know the schema for reading.
DataSourceReader is created to scan the data from a data source when:
- DataSourceV2Relation is requested to create a new reader
- ReadSupport is requested to create a reader
DataSourceReader is used to create StreamingDataSourceV2Relation and the DataSourceV2ScanExec physical operator.
Note: It appears that all concrete data source readers are used in Spark Structured Streaming only.
Method | Description |
---|---|
planInputPartitions() | Used exclusively when |
readSchema() | Schema for reading (loading) data from a data source. Used when: |
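To make the two contract methods more concrete, here is a minimal sketch of a reader that reports a schema and plans its input partitions. It deliberately does not use Spark's classes: `InputPartition` and `DataSourceReader` below are simplified stand-ins that only mirror the shape of the contract, and `RangeReader`, `read()` and the single `id` column are hypothetical names introduced for illustration. In Spark the schema is a `StructType` and the partitions carry `InternalRow`s; the sketch uses plain Java lists instead.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

// Simplified stand-ins, NOT Spark's classes: they only mirror the shape of the contract.
final class RangeReaderSketch {

    /** Stand-in for an input partition: one independently readable slice of the data. */
    interface InputPartition<T> {
        List<T> read(); // hypothetical shortcut for illustration only
    }

    /** Stand-in for the contract: a schema for reading plus a plan of input partitions. */
    interface DataSourceReader<T> {
        List<String> readSchema();                      // column names only (assumption)
        List<InputPartition<T>> planInputPartitions();
    }

    /** Hypothetical reader over the numbers [start, end), split into numPartitions slices. */
    static final class RangeReader implements DataSourceReader<Long> {
        private final long start;
        private final long end;
        private final int numPartitions;

        RangeReader(long start, long end, int numPartitions) {
            if (numPartitions <= 0 || end < start) {
                throw new IllegalArgumentException("need numPartitions > 0 and start <= end");
            }
            this.start = start;
            this.end = end;
            this.numPartitions = numPartitions;
        }

        @Override
        public List<String> readSchema() {
            return Collections.singletonList("id");
        }

        @Override
        public List<InputPartition<Long>> planInputPartitions() {
            List<InputPartition<Long>> partitions = new ArrayList<>();
            long total = end - start;
            for (int i = 0; i < numPartitions; i++) {
                // Integer division spreads any remainder across the slices.
                long from = start + total * i / numPartitions;
                long to = start + total * (i + 1) / numPartitions;
                partitions.add(() -> {
                    List<Long> rows = new ArrayList<>();
                    for (long v = from; v < to; v++) {
                        rows.add(v);
                    }
                    return rows;
                });
            }
            return partitions;
        }
    }

    public static void main(String[] args) {
        DataSourceReader<Long> reader = new RangeReader(0, 10, 3);
        System.out.println("schema: " + reader.readSchema());
        // A scan asks the reader for its partitions, then reads each one independently.
        for (InputPartition<Long> partition : reader.planInputPartitions()) {
            System.out.println(partition.read());
        }
    }
}
```

The point of the split is that planning happens once, on the reader, while each planned partition can then be read on its own. A real implementation would be written against Spark's interfaces rather than these stand-ins.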
Note: In other words, using the contract is like "treading on thin ice".
DataSourceReader | Description |
---|---|
ContinuousReader | |
MicroBatchReader | |