```scala
import org.apache.spark.sql.execution.streaming.StreamMetadata
import org.apache.hadoop.fs.Path

val metadataPath = new Path("metadata")

scala> :type spark
org.apache.spark.sql.SparkSession

val hadoopConf = spark.sessionState.newHadoopConf()
val sm = StreamMetadata.read(metadataPath, hadoopConf)

scala> :type sm
Option[org.apache.spark.sql.execution.streaming.StreamMetadata]
```
StreamMetadata
StreamMetadata is the metadata associated with a StreamingQuery (indirectly, through StreamExecution).

StreamMetadata takes a single id to be created, and is created exclusively when StreamExecution is created (with a randomly-generated 128-bit universally unique identifier (UUID)).

StreamMetadata can be persisted to and unpersisted from a JSON file. StreamMetadata uses the json4s-jackson library for JSON persistence.
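The JSON round trip can be sketched with json4s-jackson directly. The case class below is a local stand-in for the real StreamMetadata (which lives in a Spark-internal package); the write/read pair mirrors what the persistence methods do with the metadata file content:

```scala
// Sketch of the JSON round trip, using a stand-in case class
// (the real StreamMetadata is Spark-internal).
import org.json4s.NoTypeHints
import org.json4s.jackson.Serialization

case class StreamMetadata(id: String)

implicit val formats = Serialization.formats(NoTypeHints)

val meta = StreamMetadata(java.util.UUID.randomUUID.toString)

// Serialization.write produces the file content, e.g. {"id":"..."}
val json = Serialization.write(meta)

// Serialization.read restores the metadata from that JSON
val back = Serialization.read[StreamMetadata](json)
```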
Unpersisting StreamMetadata (from JSON File) — read Object Method
```scala
read(
  metadataFile: Path,
  hadoopConf: Configuration): Option[StreamMetadata]
```
read unpersists StreamMetadata from the given metadataFile, if available.

read returns a StreamMetadata if the metadata file was available and its content could be read as JSON. Otherwise, read returns None.
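Since read returns an Option, callers can simply pattern-match on the result. A small spark-shell sketch, reusing the metadataPath and hadoopConf values from the session above:

```scala
// Pattern-match on the optional result of StreamMetadata.read
StreamMetadata.read(metadataPath, hadoopConf) match {
  case Some(metadata) => println(s"Streaming query id: ${metadata.id}")
  case None           => println("No metadata checkpoint file available")
}
```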
NOTE: read uses org.json4s.jackson.Serialization.read for JSON deserialization.

NOTE: read is used exclusively when StreamExecution is created (and tries to read the metadata checkpoint file).
Persisting Metadata — write Object Method
```scala
write(
  metadata: StreamMetadata,
  metadataFile: Path,
  hadoopConf: Configuration): Unit
```
write persists the given StreamMetadata to the given metadataFile in JSON format.
NOTE: write uses org.json4s.jackson.Serialization.write for JSON serialization.

NOTE: write is used exclusively when StreamExecution is created (and the metadata checkpoint file is not available).
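As write is an object method, it can also be exercised from spark-shell. A hedged sketch; the target path /tmp/stream-metadata-demo is made up, and write is expected to fail if a file already exists at that path:

```scala
import org.apache.spark.sql.execution.streaming.StreamMetadata
import org.apache.hadoop.fs.Path

// Hypothetical location for the metadata file
val metadataPath = new Path("/tmp/stream-metadata-demo")
val hadoopConf = spark.sessionState.newHadoopConf()

// Persist a metadata with a fresh UUID, then read it back
val metadata = StreamMetadata(java.util.UUID.randomUUID.toString)
StreamMetadata.write(metadata, metadataPath, hadoopConf)

assert(StreamMetadata.read(metadataPath, hadoopConf) == Some(metadata))
```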