sourceSchema(): SourceInfo
DataSource — Pluggable Data Provider Framework
| 
 Tip 
 | 
Read up on DataSource — Pluggable Data Provider Framework in The Internals of Spark SQL online book. | 
Creating DataSource Instance
DataSource takes the following to be created:
DataSource initializes the internal properties.
 Generating Metadata of Streaming Source (Data Source API V1) — sourceSchema Internal Method
sourceSchema creates a new instance of the data source class and branches off per the type, e.g. StreamSourceProvider, FileFormat and other types.
| 
 Note 
 | 
sourceSchema is used exclusively when DataSource is requested for the SourceInfo.
 | 
StreamSourceProvider
For a StreamSourceProvider, sourceSchema requests the StreamSourceProvider for the name and schema (of the streaming source).
In the end, sourceSchema returns the name and the schema as part of SourceInfo (with partition columns unspecified).
 Creating Streaming Source (Micro-Batch Stream Processing / Data Source API V1) — createSource Method
createSource(
  metadataPath: String): Source
createSource creates a new instance of the data source class and branches off per the type, e.g. StreamSourceProvider, FileFormat and other types.
| 
 Note 
 | 
createSource is used exclusively when MicroBatchExecution is requested to initialize the analyzed logical plan.
 | 
StreamSourceProvider
For a StreamSourceProvider, createSource requests the StreamSourceProvider to create a source.
FileFormat
For a FileFormat, createSource creates a new FileStreamSource.
createSource throws an IllegalArgumentException when path option was not specified for a FileFormat data source:
'path' is not specified
 Creating Streaming Sink — createSink Method
createSink(
  outputMode: OutputMode): Sink
createSink creates a streaming sink for StreamSinkProvider or FileFormat data sources.
| 
 Tip 
 | 
Read up on FileFormat Data Source in The Internals of Spark SQL book. | 
Internally, createSink creates a new instance of the providingClass and branches off per type:
- 
For a StreamSinkProvider,
createSinksimply delegates the call and requests it to create a streaming sink - 
For a
FileFormat,createSinkcreates a FileStreamSink whenpathoption is specified and the output mode is Append 
createSink throws a IllegalArgumentException when path option is not specified for a FileFormat data source:
'path' is not specified
createSink throws an AnalysisException when the given OutputMode is different from Append for a FileFormat data source:
Data source [className] does not support [outputMode] output mode
createSink throws an UnsupportedOperationException for unsupported data source formats:
Data source [className] does not support streamed writing
| 
 Note 
 | 
createSink is used exclusively when DataStreamWriter is requested to start a streaming query.
 | 
Internal Properties
| Name | Description | 
|---|---|
  | 
java.lang.Class for the className (that can be a fully-qualified class name or an alias of the data source)  | 
  | 
Metadata of a Source with the alias (short name), the schema, and optional partitioning columns 
 Used when: 
  |