DataSourceWriter Contract

DataSourceWriter is the abstraction of data source writers in Data Source API V2 that can abort or commit a writing Spark job, create a DataWriterFactory to be shared among writing Spark tasks and optionally handle a commit message and use a CommitCoordinator for writing Spark tasks.

Note	The terms Spark job and Spark task are really about the low-level Spark jobs and tasks (that you can monitor using web UI for example).

DataSourceWriter is used to create a logical WriteToDataSourceV2 and physical WriteToDataSourceV2Exec operators.

DataSourceWriter is created when:

Table 1. DataSourceWriter Contract
Method	Description
`abort`	`void abort(WriterCommitMessage[] messages)` Aborts a writing Spark job Used exclusively when `WriteToDataSourceV2Exec` physical operator is requested to execute (and an exception was reported)
`commit`	`void commit(WriterCommitMessage[] messages)` Commits a writing Spark job Used exclusively when `WriteToDataSourceV2Exec` physical operator is requested to execute (and writing tasks all completed successfully)
`createWriterFactory`	`DataWriterFactory<InternalRow> createWriterFactory()` Creates a DataWriterFactory Used when: `WriteToDataSourceV2Exec` physical operator is requested to execute Spark Structured Streaming’s `WriteToContinuousDataSourceExec` physical operator is requested to execute Spark Structured Streaming’s `MicroBatchWriter` is requested to create a `DataWriterFactory`
`onDataWriterCommit`	`void onDataWriterCommit(WriterCommitMessage message)` Handles `WriterCommitMessage` commit message for a single successful writing Spark task Defaults to do nothing Used exclusively when `WriteToDataSourceV2Exec` physical operator is requested to execute (and runs a Spark job with partition writing tasks)
`useCommitCoordinator`	`boolean useCommitCoordinator()` Controls whether to use a Spark Core `OutputCommitCoordinator` (`true`) or not (`false`) for data writing (to make sure that at most one task for a partition commits) Default: `true` Used exclusively when `WriteToDataSourceV2Exec` physical operator is requested to execute

Table 2. DataSourceWriters (Direct Implementations and Extensions)
DataSourceWriter	Description
`MicroBatchWriter`	Used in Spark Structured Streaming only for Micro-Batch Stream Processing
`StreamWriter`	Used in Spark Structured Streaming only (to support epochs)

DataSourceWriter