List<InputPartition<InternalRow>> planInputPartitions()
DataSourceReader Contract
DataSourceReader is the abstraction of data source readers in Data Source API V2 that can plan InputPartitions and knows the schema for reading.
DataSourceReader is created to scan the data from a data source when:
- DataSourceV2Relation is requested to create a new reader
- ReadSupport is requested to create a reader
DataSourceReader is used to create StreamingDataSourceV2Relation and the DataSourceV2ScanExec physical operator.
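The create-then-scan flow described above can be illustrated with a minimal, self-contained sketch. It does not use Spark's classes: `ReaderSketch`, `ReadSupportSketch`, `PathSource`, `ScanSketch`, and the `paths` option are all hypothetical stand-ins, and plain strings take the place of InputPartitions and the schema type.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Simplified stand-in for DataSourceReader (hypothetical; not Spark's interface).
interface ReaderSketch {
    List<String> planInputPartitions(); // one entry per partition, here just a path
    List<String> readSchema();          // column names only
}

// Simplified stand-in for ReadSupport: the side that creates a reader on request.
interface ReadSupportSketch {
    ReaderSketch createReader(Map<String, String> options);
}

// Hypothetical source that plans one partition per comma-separated path.
final class PathSource implements ReadSupportSketch {
    @Override
    public ReaderSketch createReader(Map<String, String> options) {
        final String paths = options.getOrDefault("paths", "");
        return new ReaderSketch() {
            @Override
            public List<String> planInputPartitions() {
                return paths.isEmpty()
                        ? Collections.<String>emptyList()
                        : Arrays.asList(paths.split(","));
            }

            @Override
            public List<String> readSchema() {
                return Collections.singletonList("line");
            }
        };
    }
}

// Stand-in for the scan side: asks for a reader, then queries it.
final class ScanSketch {
    static String describe(ReadSupportSketch source, Map<String, String> options) {
        ReaderSketch reader = source.createReader(options);
        return "schema=" + reader.readSchema()
                + ", partitions=" + reader.planInputPartitions().size();
    }

    public static void main(String[] args) {
        Map<String, String> options = new HashMap<>();
        options.put("paths", "a.txt,b.txt");
        System.out.println(describe(new PathSource(), options));
        // prints: schema=[line], partitions=2
    }
}
```

The point of the sketch is the division of labour: the source only knows how to build a reader from options, while the scan side only talks to the reader.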
Note: It appears that all concrete data source readers are used in Spark Structured Streaming only.
| Method | Description |
|---|---|
| planInputPartitions | Plans the InputPartitions. Used exclusively when |
| readSchema | Schema for reading (loading) data from a data source. Used when: |
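As a rough illustration of the two-method contract, the following sketch mirrors it with simplified types. Everything here is an assumption made for the example: `SimpleDataSourceReader`, `RangePartition`, `RangeReader`, and `RangeReaderDemo` are hypothetical names, and the real interface returns `List<InputPartition<InternalRow>>` and a Spark schema type rather than the plain Java types used below.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

// Hypothetical stand-in for an InputPartition: one independently readable slice.
final class RangePartition {
    final int start; // inclusive
    final int end;   // exclusive

    RangePartition(int start, int end) {
        this.start = start;
        this.end = end;
    }
}

// Simplified mirror of the contract (not Spark's interface).
interface SimpleDataSourceReader {
    List<RangePartition> planInputPartitions();
    List<String> readSchema(); // column names only, for illustration
}

// Hypothetical reader over the integers [0, total), split into at most numPartitions slices.
final class RangeReader implements SimpleDataSourceReader {
    private final int total;
    private final int numPartitions;

    RangeReader(int total, int numPartitions) {
        if (total < 0 || numPartitions <= 0) {
            throw new IllegalArgumentException("total must be >= 0 and numPartitions > 0");
        }
        this.total = total;
        this.numPartitions = numPartitions;
    }

    @Override
    public List<RangePartition> planInputPartitions() {
        List<RangePartition> partitions = new ArrayList<>();
        int chunk = (total + numPartitions - 1) / numPartitions; // ceiling division
        for (int start = 0; start < total; start += chunk) {
            partitions.add(new RangePartition(start, Math.min(start + chunk, total)));
        }
        return partitions;
    }

    @Override
    public List<String> readSchema() {
        return Collections.singletonList("value");
    }
}

final class RangeReaderDemo {
    public static void main(String[] args) {
        SimpleDataSourceReader reader = new RangeReader(10, 3);
        System.out.println(reader.readSchema()); // prints: [value]
        for (RangePartition p : reader.planInputPartitions()) {
            System.out.println("[" + p.start + ", " + p.end + ")");
        }
        // prints: [0, 4) then [4, 8) then [8, 10)
    }
}
```

Planning is kept separate from reading in this sketch: `planInputPartitions` only describes the slices, so a caller could hand each slice to a different worker.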
Note: Using the contract is like "treading on thin ice".
| DataSourceReader | Description |
|---|---|
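| ContinuousReader | Data source reader for continuous stream processing in Spark Structured Streaming |
| MicroBatchReader | Data source reader for micro-batch stream processing in Spark Structured Streaming |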