import org.apache.spark.sql.SparkSession
val spark = SparkSession
  .builder
  .config("spark.sql.streaming.metricsEnabled", true)
  .getOrCreate
Configuration Properties
Configuration properties are used to fine-tune Spark Structured Streaming applications.
You can set them on a SparkSession when it is created, using the config method.
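As a minimal sketch of the alternative to the builder, a non-static property can presumably also be read back and changed on a live session through the runtime configuration (spark.conf). The local master and the application name below are assumptions, added only to make the snippet self-contained.

```scala
import org.apache.spark.sql.SparkSession

// Assumption: a local master and an arbitrary app name, only to make the sketch runnable.
val spark = SparkSession
  .builder
  .master("local[*]")
  .appName("config-properties-demo")
  .config("spark.sql.streaming.metricsEnabled", true) // set at creation time
  .getOrCreate()

// Read the effective value back (runtime configuration values are returned as strings).
val metricsEnabled = spark.conf.get("spark.sql.streaming.metricsEnabled")
println(s"spark.sql.streaming.metricsEnabled = $metricsEnabled")

// Change the property on the live session.
spark.conf.set("spark.sql.streaming.metricsEnabled", false)
val metricsEnabledAfter = spark.conf.get("spark.sql.streaming.metricsEnabled")
```

A value changed this way would be expected to apply to streaming queries started afterwards, not necessarily to queries that are already running.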
Tip: Read up on SparkSession in The Internals of Spark SQL book.
Name | Description |
---|---|
|
Default: Supported values:
Used when the StatefulAggregationStrategy execution planning strategy is executed and plans a streaming query with an aggregate, which boils down to creating a StateStoreRestoreExec with the proper implementation version of StreamingAggregationStateManager. One of the checkpointed properties that are not supposed to be overridden once a streaming query has been started (and could later recover from a checkpoint after being restarted). |
|
(internal) CheckpointFileManager used to write checkpoint files atomically. Default: FileContextBasedCheckpointFileManager (with FileSystemBasedCheckpointFileManager in case of an unsupported file system used for storing metadata files). |
|
(internal) A comma-separated list of fully-qualified class names of data source providers for which MicroBatchReadSupport is disabled. Reads from these sources will fall back to the V1 Sources. Default: Use SQLConf.disabledV2StreamingMicroBatchReaders to get the current value. |
|
Default: Use SQLConf.fileSourceLogCleanupDelay to get the current value. |
|
(internal) Number of log files after which all the previous files are compacted into the next log file. Default: Must be a positive value. Use SQLConf.fileSourceLogCompactInterval to get the current value. |
|
Default: Use SQLConf.fileSourceLogDeletion to get the current value. |
|
(internal) State format version used to create a StateManager for the FlatMapGroupsWithStateExec physical operator. Default: Supported values:
One of the checkpointed properties that are not supposed to be overridden once a streaming query has been started (and could later recover from a checkpoint after being restarted). |
|
(internal) The maximum number of batches that will be retained in memory to avoid loading from files. Default: Maximum count of versions a State Store implementation should retain in memory. The value adjusts the trade-off between memory usage and cache misses:
Used exclusively when |
spark.sql.streaming.metricsEnabled |
Flag that controls whether Dropwizard (Codahale) metrics are reported for active streaming queries. Default: Use SQLConf.streamingMetricsEnabled to get the current value. |
|
Default: Use SQLConf.minBatchesToRetain to get the current value |
|
Global watermark policy, i.e. the policy used to calculate the global watermark value when a streaming query has multiple watermark operators. Default: Supported values:
Cannot be changed between query restarts from the same checkpoint location. |
|
Flag that controls whether the streaming micro-batch engine should execute batches with no data to process, for eager state management in stateful streaming queries. Default: Use SQLConf.streamingNoDataMicroBatchesEnabled to get the current value. |
|
(internal) How long to wait between two progress events when there is no data (in millis) when Default: Use SQLConf.streamingNoDataProgressEventInterval to get the current value |
|
Number of StreamingQueryProgresses to retain in progressBuffer internal registry when Default: Use SQLConf.streamingProgressRetention to get the current value |
|
(internal) How long (in millis) to delay Default: |
|
The initial delay and how often to execute StateStore’s maintenance task. Default: |
|
(internal) Minimum number of state store delta files that need to be generated before HDFSBackedStateStore will consider generating a snapshot (consolidate the deltas into a snapshot) Default: Use SQLConf.stateStoreMinDeltasForSnapshot to get the current value. |
|
(internal) The fully-qualified class name of the StateStoreProvider implementation that manages state data in stateful streaming queries. This class must have a zero-arg constructor. Default: HDFSBackedStateStoreProvider. Use SQLConf.stateStoreProviderClass to get the current value. |
|
(internal) When enabled ( Default: |
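
Several rows above point at SQLConf accessors for reading the current value. A minimal sketch of what that could look like from application code follows. It assumes the session's SQLConf is reachable through spark.sessionState.conf, which is an unstable, internal-facing API that may change between Spark versions. The local master and the application name are also assumptions, added only for self-containment.

```scala
import org.apache.spark.sql.SparkSession

// Assumption: a local master and an arbitrary app name, only to make the sketch runnable.
val spark = SparkSession
  .builder
  .master("local[*]")
  .appName("sqlconf-accessors-demo")
  .getOrCreate()

// Assumption: sessionState.conf exposes the SQLConf of this session (unstable API).
val sqlConf = spark.sessionState.conf

// Accessors named in the table above; each returns the value currently in effect.
println(s"streamingMetricsEnabled        = ${sqlConf.streamingMetricsEnabled}")
println(s"minBatchesToRetain             = ${sqlConf.minBatchesToRetain}")
println(s"stateStoreMinDeltasForSnapshot = ${sqlConf.stateStoreMinDeltasForSnapshot}")
println(s"stateStoreProviderClass        = ${sqlConf.stateStoreProviderClass}")
```

Application code that only needs the raw string value can use spark.conf.get with the property name instead, which avoids depending on SQLConf.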