CheckpointFileManager Contract
CheckpointFileManager is the abstraction of checkpoint managers that manage checkpoint files (metadata of streaming batches) on Hadoop DFS-compatible file systems.
CheckpointFileManager is created according to the spark.sql.streaming.checkpointFileManagerClass configuration property, if defined. Otherwise, one of the available checkpoint managers is used.
CheckpointFileManager is used exclusively by HDFSMetadataLog, StreamMetadata and HDFSBackedStateStoreProvider.

| Method | Description |
|---|---|
| createAtomic(path: Path, overwriteIfPossible: Boolean): CancellableFSDataOutputStream | Used when: |
| | Deletes the given path recursively (if it exists). Used when: |
| | Used when: |
| | Does not seem to be used. |
| | Lists all files in the given path. Used when: |
| | Used when: |
| | Opens a file (by the given path) for reading. Used when: |
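
The createAtomic signature and the rename-related warning quoted later on this page both point at a write-then-publish pattern. The following is a minimal, self-contained sketch of that pattern on the local file system using java.nio. It is not Spark's implementation: AtomicOutput, its commit and cancel methods, and the .tmp naming scheme are hypothetical names chosen for illustration only.

```scala
import java.nio.charset.StandardCharsets.UTF_8
import java.nio.file.{FileAlreadyExistsException, Files, Path, StandardCopyOption}

// Hypothetical helper (not a Spark class): bytes go to a temporary sibling file
// that is renamed to the final path on commit, or deleted on cancel.
final class AtomicOutput(finalPath: Path, overwriteIfPossible: Boolean) {
  private val tempPath: Path =
    finalPath.resolveSibling(s".${finalPath.getFileName}.tmp")
  private val out = Files.newOutputStream(tempPath)

  def write(bytes: Array[Byte]): Unit = out.write(bytes)

  // Publishes the file under its final name. Whether the rename is truly
  // atomic depends on the underlying file system.
  def commit(): Unit = {
    out.close()
    if (overwriteIfPossible) {
      Files.move(tempPath, finalPath, StandardCopyOption.REPLACE_EXISTING)
    } else {
      // Without REPLACE_EXISTING, move fails when the target already exists.
      try Files.move(tempPath, finalPath)
      catch {
        case e: FileAlreadyExistsException =>
          Files.deleteIfExists(tempPath)
          throw e
      }
    }
  }

  // Abandons the write: the final path is never touched.
  def cancel(): Unit = {
    out.close()
    Files.deleteIfExists(tempPath)
  }
}

object AtomicOutputDemo {
  def main(args: Array[String]): Unit = {
    val dir = Files.createTempDirectory("checkpoint-demo")
    val target = dir.resolve("0")

    val out = new AtomicOutput(target, overwriteIfPossible = false)
    out.write("v1".getBytes(UTF_8))
    assert(!Files.exists(target)) // nothing is visible before commit
    out.commit()
    assert(new String(Files.readAllBytes(target), UTF_8) == "v1")

    val aborted = new AtomicOutput(dir.resolve("1"), overwriteIfPossible = false)
    aborted.write("partial".getBytes(UTF_8))
    aborted.cancel()
    assert(!Files.exists(dir.resolve("1"))) // a cancelled write leaves no file
  }
}
```

The point of the design is that readers of the final path see either the complete file or no file at all, never a partially written one.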
| CheckpointFileManager | Description |
|---|---|
| FileContextBasedCheckpointFileManager | Default |
| FileSystemBasedCheckpointFileManager | Basic |
Creating CheckpointFileManager Instance: create Object Method
create(
path: Path,
hadoopConf: Configuration): CheckpointFileManager
create looks up the spark.sql.streaming.checkpointFileManagerClass configuration property in the given hadoopConf configuration.
If the property is found, create simply instantiates the CheckpointFileManager implementation that it names.
If the property is not found, create creates a FileContextBasedCheckpointFileManager.
In case of an UnsupportedFileSystemException, create prints out the following WARN message to the logs and falls back on a FileSystemBasedCheckpointFileManager.
Could not use FileContext API for managing Structured Streaming checkpoint files at [path]. Using FileSystem API instead for managing log files. If the implementation of FileSystem.rename() is not atomic, then the correctness and fault-tolerance of your Structured Streaming is not guaranteed.
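
The selection logic described above can be sketched as a small, self-contained model. Only the configuration key comes from this page. Everything else is a hypothetical stand-in: FileManager, ContextBasedManager, SystemBasedManager, UnsupportedFileSystem, the legacy:// scheme that triggers the fallback, and the registry of constructors that replaces class instantiation by name. It is not Spark's code and does not use the Hadoop API.

```scala
// Hypothetical stand-ins for the real classes, for illustration only.
trait FileManager { def name: String }

final class UnsupportedFileSystem(msg: String) extends Exception(msg)

final class ContextBasedManager(path: String) extends FileManager {
  // Simulates a file system that the preferred API cannot handle.
  if (path.startsWith("legacy://")) throw new UnsupportedFileSystem(path)
  val name = "context-based"
}

final class SystemBasedManager(path: String) extends FileManager {
  val name = "system-based"
}

object FileManagerFactory {
  val ConfKey = "spark.sql.streaming.checkpointFileManagerClass"

  // `custom` maps a configured class name to a constructor function
  // (a simplification of instantiating the configured class by name).
  def create(
      path: String,
      conf: Map[String, String],
      custom: Map[String, String => FileManager]): FileManager =
    conf.get(ConfKey) match {
      // 1. An explicitly configured implementation always wins.
      case Some(className) => custom(className)(path)
      case None =>
        // 2. Otherwise try the default implementation first...
        try new ContextBasedManager(path)
        catch {
          // 3. ...and fall back to the basic one, with a warning.
          case _: UnsupportedFileSystem =>
            Console.err.println(s"WARN Could not use the preferred API at $path. Falling back.")
            new SystemBasedManager(path)
        }
    }
}

object FileManagerFactoryDemo {
  def main(args: Array[String]): Unit = {
    import FileManagerFactory._
    val none = Map.empty[String, String => FileManager]

    assert(create("hdfs://nn/ckpt", Map.empty, none).name == "context-based")
    assert(create("legacy://ckpt", Map.empty, none).name == "system-based")

    val custom: Map[String, String => FileManager] =
      Map("com.example.MyManager" -> (_ => new FileManager { val name = "custom" }))
    val conf = Map(ConfKey -> "com.example.MyManager")
    assert(create("legacy://ckpt", conf, custom).name == "custom")
  }
}
```

Note that in this model, as in the description above, the fallback applies only when no implementation is configured: a configured class is used as-is.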