HDFSBackedStateStore — State Store on HDFS-Compatible File System

HDFSBackedStateStore is a concrete StateStore that uses a HDFS-compatible file system for versioned state persistence.

HDFSBackedStateStore is created exclusively when HDFSBackedStateStoreProvider is requested for a state store for a given version (when StateStore helper object is requested to retrieve the StateStore for a given ID and version).

HDFSBackedStateStore can be in the following states:

  • UPDATING

  • COMMITTED

  • ABORTED

HDFSBackedStateStore uses the StateStoreId of the owning HDFSBackedStateStoreProvider.

When requested for the textual representation, HDFSBackedStateStore gives HDFSStateStore[id=(op=[operatorId],part=[partitionId]),dir=[baseDir]].

HDFSBackedStateStore takes the following to be created:

Table 1. HDFSBackedStateStore’s Internal Registries
Name Description

compressedStream

compressedStream: DataOutputStream

The compressed java.io.DataOutputStream for the deltaFileStream

deltaFileStream

deltaFileStream: CheckpointFileManager.CancellableFSDataOutputStream

finalDeltaFile

finalDeltaFile: Path

The Hadoop Path of the deltaFile for the version

newVersion

newVersion: Long

Used exclusively when HDFSBackedStateStore is requested for the finalDeltaFile, to commit and abort

state

state: STATE
Tip

HDFSBackedStateStore is an internal class of HDFSBackedStateStoreProvider and uses its logger.

writeUpdateToDeltaFile Internal Method

writeUpdateToDeltaFile(
  output: DataOutputStream,
  key: UnsafeRow,
  value: UnsafeRow): Unit
Caution
FIXME

put Method

put(key: UnsafeRow, value: UnsafeRow): Unit
Note
put is a part of StateStore Contract to…​FIXME

put stores the copies of the key and value in mapToUpdate internal registry followed by writing them to a delta file (using tempDeltaFileStream).

Note

put can only be used when HDFSBackedStateStore is in UPDATING state and reports a IllegalStateException otherwise.

Cannot put after already committed or aborted

Committing State Changes — commit Method

commit(): Long
Note
commit is part of the StateStore Contract to commit state changes.

commit firstly commitUpdates (with the newVersion, the mapToUpdate and the compressed stream).

commit sets the state to COMMITTED.

commit prints out the following INFO message to the logs:

Committed version [newVersion] for [this] to file [finalDeltaFile]

commit returns a newVersion.

commit throws a IllegalStateException when executed in any state but UPDATING state:

Cannot commit after already committed or aborted

commit throws a IllegalStateException for any NonFatal exception:

Error committing version [newVersion] into [this]

Aborting State Changes — abort Method

abort(): Unit
Note
abort is part of the StateStore Contract to abort the state changes.

abort…​FIXME

commitUpdates Internal Method

commitUpdates(newVersion: Long, map: MapType, output: DataOutputStream): Unit

commitUpdates…​FIXME

Note
commitUpdates is used exclusively when HDFSBackedStateStore is requested to commit state changes.

metrics Method

metrics: StateStoreMetrics
Note
metrics is part of the StateStore Contract to get the StateStoreMetrics.

metrics…​FIXME

results matching ""

    No results matching ""