Settings
The following list summarizes the settings used to configure Spark Streaming applications.
Caution: FIXME Describe how to set them in streaming applications.
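All of the settings below are regular Spark properties, so one common way to set them is on a SparkConf that is then passed to the StreamingContext constructor. The sketch below uses an illustrative application name, master, and property values; it is not a recommendation for any particular workload.

```scala
import org.apache.spark.SparkConf
import org.apache.spark.streaming.{Seconds, StreamingContext}

// Spark Streaming settings are plain Spark properties, so they can be set
// on SparkConf before the StreamingContext is created.
val conf = new SparkConf()
  .setAppName("StreamingSettingsDemo")                // illustrative name
  .setMaster("local[2]")                              // illustrative master
  .set("spark.streaming.concurrentJobs", "1")         // threads in streaming-job-executor
  .set("spark.streaming.ui.retainedBatches", "1000")  // completed batches kept for the web UI

// A 5-second batch interval, chosen arbitrarily for the example.
val ssc = new StreamingContext(conf, Seconds(5))
```

The same properties can also be passed on the command line, e.g. `--conf spark.streaming.concurrentJobs=1` with `spark-submit`.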
- `spark.streaming.kafka.maxRetries` (default: `1`) sets the number of connection attempts to Kafka brokers.
- `spark.streaming.receiver.writeAheadLog.enable` (default: `false`) controls which ReceivedBlockHandler to use: `WriteAheadLogBasedBlockHandler` or `BlockManagerBasedBlockHandler`.
- `spark.streaming.receiver.blockStoreTimeout` (default: `30`) is the time in seconds to wait until both the write to a write-ahead log and the write to BlockManager complete successfully.
- `spark.streaming.clock` (default: `org.apache.spark.util.SystemClock`) specifies the fully-qualified name of a class that extends `org.apache.spark.util.Clock` to represent time. It is used in JobGenerator (see the clock sketch after this list).
- `spark.streaming.ui.retainedBatches` (default: `1000`) controls the number of `BatchUIData` elements about completed batches kept in a first-in-first-out (FIFO) queue and used to display statistics in the Streaming page in web UI.
- `spark.streaming.receiverRestartDelay` (default: `2000`) is the time interval between stopping a receiver and starting it again.
- `spark.streaming.concurrentJobs` (default: `1`) is the number of concurrent jobs, i.e. threads in the streaming-job-executor thread pool.
- `spark.streaming.stopSparkContextByDefault` (default: `true`) controls whether (`true`) or not (`false`) to stop the underlying SparkContext (regardless of whether this `StreamingContext` has been started). See the stop sketch after this list.
- `spark.streaming.kafka.maxRatePerPartition` (default: `0`), if non-`0`, sets the maximum number of messages per partition.
- `spark.streaming.manualClock.jump` (default: `0`) offsets (aka jumps) the system time, i.e. adds its value to the checkpoint time, when the clock is a subclass of `org.apache.spark.util.ManualClock`. It is used when JobGenerator is restarted from a checkpoint.
- `spark.streaming.unpersist` (default: `true`) is a flag that controls whether output streams should unpersist old RDDs.
- `spark.streaming.gracefulStopTimeout` (default: 10 * batch interval)
- `spark.streaming.stopGracefullyOnShutdown` (default: `false`) controls whether to stop StreamingContext gracefully or not and is used by the stopOnShutdown shutdown hook (see the stop sketch after this list).
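As a rough illustration of the two stop-related settings, the sketch below stops a StreamingContext explicitly; the two boolean flags of `StreamingContext.stop` correspond to what `spark.streaming.stopSparkContextByDefault` and `spark.streaming.stopGracefullyOnShutdown` govern when the context is stopped implicitly (e.g. by the shutdown hook). `ssc` is the StreamingContext from the earlier sketch.

```scala
// Stop the streaming computation, also stopping the underlying SparkContext
// (cf. spark.streaming.stopSparkContextByDefault) and letting already-received
// data be processed first (cf. spark.streaming.stopGracefullyOnShutdown).
ssc.stop(stopSparkContext = true, stopGracefully = true)
```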
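For `spark.streaming.clock` and `spark.streaming.manualClock.jump`, a common pattern (used in Spark's own streaming tests) is to switch to a manual clock so that batch time does not advance on its own. Note that `org.apache.spark.util.ManualClock` is internal to Spark, so advancing it from application code is not part of the public API; the snippet only shows how the properties themselves would be set, with illustrative values.

```scala
import org.apache.spark.SparkConf

// Test-style setup, not a production configuration.
val testConf = new SparkConf()
  .set("spark.streaming.clock", "org.apache.spark.util.ManualClock") // Clock subclass by FQCN
  .set("spark.streaming.manualClock.jump", "60000")                  // illustrative offset added to checkpoint time
```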
Checkpointing
- `spark.streaming.checkpoint.directory` - when the setting is set and a StreamingContext is created, its value is passed on to the `StreamingContext.checkpoint` method (see the sketch below).
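A minimal sketch of what that amounts to, assuming an illustrative checkpoint directory (any HDFS-compatible path works):

```scala
import org.apache.spark.SparkConf
import org.apache.spark.streaming.{Seconds, StreamingContext}

val conf = new SparkConf()
  .setAppName("CheckpointingDemo")   // illustrative name
  .setMaster("local[2]")             // illustrative master
  // With this property set, the new StreamingContext calls checkpoint with its value.
  .set("spark.streaming.checkpoint.directory", "/tmp/streaming-checkpoints") // illustrative path

val ssc = new StreamingContext(conf, Seconds(5))

// The explicit equivalent, had the property not been set:
// ssc.checkpoint("/tmp/streaming-checkpoints")
```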
Back Pressure
- `spark.streaming.backpressure.enabled` (default: `false`) enables (`true`) or disables (`false`) back pressure in input streams with receivers or DirectKafkaInputDStream (see the sketch after this list).
- `spark.streaming.backpressure.rateEstimator` (default: `pid`) is the RateEstimator to use.
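A minimal sketch of turning back pressure on; the application name, master, batch interval, and the Kafka rate cap are placeholders:

```scala
import org.apache.spark.SparkConf
import org.apache.spark.streaming.{Seconds, StreamingContext}

val conf = new SparkConf()
  .setAppName("BackPressureDemo")  // illustrative name
  .setMaster("local[2]")           // illustrative master
  // Let receiver-based and direct Kafka input streams adapt their rate to processing speed.
  .set("spark.streaming.backpressure.enabled", "true")
  // Optional cap on messages per partition for direct Kafka streams (0, the default, means no cap).
  .set("spark.streaming.kafka.maxRatePerPartition", "1000") // illustrative value

val ssc = new StreamingContext(conf, Seconds(5))
```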