StreamMetadata is the metadata associated with a StreamingQuery (indirectly, through StreamExecution).

StreamMetadata takes an ID to be created.

StreamMetadata is created exclusively when StreamExecution is created (with a randomly generated 128-bit universally unique identifier, or UUID, as the ID).
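The ID generation described above can be sketched with the JDK alone. This is an illustration under stated assumptions, not Spark's code: DemoMetadata and newMetadata are hypothetical names that stand in for StreamMetadata and for the place where StreamExecution creates it.

```scala
import java.util.UUID

// Hypothetical stand-in for StreamMetadata: a value that carries only an ID.
final case class DemoMetadata(id: String)

// Create metadata whose ID is a freshly generated random (version 4) UUID.
// A UUID is 128 bits wide; its canonical string form is 36 characters long.
def newMetadata(): DemoMetadata = DemoMetadata(UUID.randomUUID.toString)
```

Every call yields a different ID, which is why the ID has to be persisted if a restarted query is to keep it.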

StreamMetadata can be persisted to and unpersisted from a JSON file. StreamMetadata uses the json4s-jackson library for JSON persistence.
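To show what JSON persistence with json4s-jackson looks like in general, here is a minimal round-trip sketch. It assumes json4s-jackson is on the classpath. MetadataRecord and JsonRoundTrip are hypothetical names introduced for illustration; they are not part of Spark.

```scala
import org.json4s.{Formats, NoTypeHints}
import org.json4s.jackson.Serialization

// Hypothetical stand-in for StreamMetadata: a case class with a single id field.
case class MetadataRecord(id: String)

object JsonRoundTrip {
  // Plain formats without type hints, so only the case class fields are written.
  implicit val formats: Formats = Serialization.formats(NoTypeHints)

  // Serialize the record to a compact JSON string.
  def toJson(record: MetadataRecord): String = Serialization.write(record)

  // Deserialize a JSON string back into a record.
  def fromJson(json: String): MetadataRecord = Serialization.read[MetadataRecord](json)
}
```

With this sketch a record with the ID abc serializes to the single-field JSON object {"id":"abc"}, and reading that string back gives an equal record.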

import org.apache.spark.sql.execution.streaming.StreamMetadata
import org.apache.hadoop.fs.Path
val metadataPath = new Path("metadata")

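scala> :type spark
org.apache.spark.sql.SparkSession

val hadoopConf = spark.sessionState.newHadoopConf()
val sm = StreamMetadata.read(metadataPath, hadoopConf)

scala> :type sm
Option[org.apache.spark.sql.execution.streaming.StreamMetadata]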

Unpersisting StreamMetadata (from JSON File) — read Factory Method

read(
  metadataFile: Path,
  hadoopConf: Configuration): Option[StreamMetadata]

read unpersists StreamMetadata from the given metadataFile file if available.

read returns a StreamMetadata (wrapped in Some) if the metadata file is available and its content can be read as JSON. If the metadata file is not available, read returns None.

read uses org.json4s.jackson.Serialization.read for JSON deserialization.
read is used exclusively when StreamExecution is created (and tries to read the metadata checkpoint file).
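Because read returns an Option, callers have to handle the case of a missing metadata file. The following sketch assumes Spark SQL and Hadoop are on the classpath (as in spark-shell) and that read has the signature shown above. The temporary directory and the description value are made up for the example.

```scala
import java.nio.file.Files
import org.apache.hadoop.conf.Configuration
import org.apache.hadoop.fs.Path
import org.apache.spark.sql.execution.streaming.StreamMetadata

// A fresh, empty temporary directory, so no metadata file exists in it yet.
val emptyDir = Files.createTempDirectory("stream-metadata-demo")
val missingFile = new Path(emptyDir.toUri.toString, "metadata")

// A plain Hadoop Configuration is enough to reach the local file system.
val conf = new Configuration()

// Pattern match on the Option to cover both outcomes.
val description = StreamMetadata.read(missingFile, conf) match {
  case Some(metadata) => s"found stream metadata with id ${metadata.id}"
  case None           => "no metadata file available"
}
```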

Persisting StreamMetadata (to JSON File) — write Object Method

write(
  metadata: StreamMetadata,
  metadataFile: Path,
  hadoopConf: Configuration): Unit

write persists the given StreamMetadata to the given metadataFile file in JSON format.

write uses org.json4s.jackson.Serialization.write for JSON serialization.
write is used exclusively when StreamExecution is created (and the metadata checkpoint file is not available).
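Taken together, read and write give a "load or create" pattern: reuse the persisted ID if the metadata file exists, otherwise generate a new ID and persist it. The sketch below illustrates that pattern under the assumption that Spark SQL and Hadoop are on the classpath and that read and write have the signatures shown above. loadOrCreate is a hypothetical helper, not a Spark method, and the temporary directory stands in for a checkpoint location.

```scala
import java.nio.file.Files
import java.util.UUID
import org.apache.hadoop.conf.Configuration
import org.apache.hadoop.fs.Path
import org.apache.spark.sql.execution.streaming.StreamMetadata

// Hypothetical helper: reuse the persisted metadata if present,
// otherwise create metadata with a random UUID and persist it.
def loadOrCreate(metadataFile: Path, hadoopConf: Configuration): StreamMetadata =
  StreamMetadata.read(metadataFile, hadoopConf).getOrElse {
    val fresh = StreamMetadata(UUID.randomUUID.toString)
    StreamMetadata.write(fresh, metadataFile, hadoopConf)
    fresh
  }

// A temporary directory that plays the role of the checkpoint location.
val checkpointDir = Files.createTempDirectory("checkpoint-demo")
val checkpointMetadata = new Path(checkpointDir.toUri.toString, "metadata")
val localConf = new Configuration()

val first = loadOrCreate(checkpointMetadata, localConf)  // file missing: a new ID is written
val second = loadOrCreate(checkpointMetadata, localConf) // file present: the same ID is read back
```

The second call finds the file written by the first one, so both values carry the same ID. This is how a query restarted from the same checkpoint location can keep its ID.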
