```scala
import org.apache.spark.sql.execution.streaming.StreamMetadata
import org.apache.hadoop.fs.Path
val metadataPath = new Path("metadata")

scala> :type spark
org.apache.spark.sql.SparkSession

val hadoopConf = spark.sessionState.newHadoopConf()
val sm = StreamMetadata.read(metadataPath, hadoopConf)

scala> :type sm
Option[org.apache.spark.sql.execution.streaming.StreamMetadata]
```
StreamMetadata

`StreamMetadata` is metadata associated with a `StreamingQuery` (indirectly through `StreamExecution`). `StreamMetadata` takes a single ID to be created, and is created exclusively when `StreamExecution` is created (with a randomly-generated 128-bit universally unique identifier (UUID)).

`StreamMetadata` can be persisted to and unpersisted from a JSON file. `StreamMetadata` uses the json4s-jackson library for JSON persistence.
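As a simplified sketch of the shape described above (the real `StreamMetadata` is a Spark-internal case class in `org.apache.spark.sql.execution.streaming`; this stand-in only mirrors its single-`id` constructor):

```scala
import java.util.UUID

// Stand-in for the Spark-internal case class: a single String ID.
case class StreamMetadata(id: String)

// StreamExecution passes a randomly-generated UUID as the ID.
val metadata = StreamMetadata(UUID.randomUUID().toString)
println(metadata.id)
```

A random UUID is 128 bits, rendered as a 36-character string, which is what ends up in the `id` field of the persisted JSON.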
Unpersisting StreamMetadata (from JSON File) — read Object Method

```scala
read(
  metadataFile: Path,
  hadoopConf: Configuration): Option[StreamMetadata]
```
`read` unpersists `StreamMetadata` from the given `metadataFile`, if available.

`read` returns the `StreamMetadata` if the metadata file was available and its content could be read in JSON format. Otherwise, `read` returns `None`.
NOTE: `read` uses `org.json4s.jackson.Serialization.read` for JSON deserialization.

NOTE: `read` is used exclusively when `StreamExecution` is created (and tries to read the metadata checkpoint file).
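The read path can be sketched in plain Scala. This is a hedged approximation only: `java.nio.file` stands in for the Hadoop `FileSystem` API, and a hand-rolled string parse stands in for `org.json4s.jackson.Serialization.read`:

```scala
import java.nio.file.{Files, Paths}
import scala.util.Try

// Stand-in for the Spark-internal case class.
case class StreamMetadata(id: String)

def read(metadataFile: String): Option[StreamMetadata] = {
  val path = Paths.get(metadataFile)
  if (!Files.exists(path)) None          // no metadata file yet => None
  else Try {
    val json = new String(Files.readAllBytes(path), "UTF-8")
    // Real code: Serialization.read[StreamMetadata](json)
    val id = json.split("\"id\"\\s*:\\s*\"")(1).takeWhile(_ != '"')
    StreamMetadata(id)
  }.toOption                             // unreadable content => None
}
```

Both failure modes (missing file, unparseable content) collapse to `None`, which matches the `Option[StreamMetadata]` return type of the real method.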
Persisting Metadata — write Object Method

```scala
write(
  metadata: StreamMetadata,
  metadataFile: Path,
  hadoopConf: Configuration): Unit
```
`write` persists the given `StreamMetadata` to the given `metadataFile` in JSON format.
NOTE: `write` uses `org.json4s.jackson.Serialization.write` for JSON serialization.

NOTE: `write` is used exclusively when `StreamExecution` is created (and the metadata checkpoint file is not available).
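The write path admits a similarly hedged sketch: a literal JSON string stands in for `org.json4s.jackson.Serialization.write`, and `java.nio.file` stands in for the Hadoop `FileSystem` output stream the real method uses:

```scala
import java.nio.file.{Files, Paths, StandardOpenOption}

// Stand-in for the Spark-internal case class.
case class StreamMetadata(id: String)

def write(metadata: StreamMetadata, metadataFile: String): Unit = {
  // Real code: Serialization.write(metadata) via json4s-jackson.
  val json = s"""{"id":"${metadata.id}"}"""
  Files.write(Paths.get(metadataFile), json.getBytes("UTF-8"),
    StandardOpenOption.CREATE, StandardOpenOption.TRUNCATE_EXISTING)
}
```

Since `write` only runs when the metadata checkpoint file is not yet available, the persisted JSON is written once per query and then only read back on recovery.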