SQLConf

SQLConf is a key-value configuration store for parameters and hints used in Spark SQL.

SQLConf offers methods to get, set, unset, or clear parameters and hints, as well as accessor methods that read the current value of a specific parameter or hint.

You can access the current SQLConf (indirectly as RuntimeConfig) using SparkSession:

import org.apache.spark.sql.{RuntimeConfig, SparkSession}
val spark: SparkSession = ...
val conf: RuntimeConfig = spark.conf
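
As a minimal sketch of working with the configuration through RuntimeConfig, the following assumes a local SparkSession (the local[*] master and the application name are illustrative choices, not part of the original text). It uses the get, set, getOption, and unset methods of RuntimeConfig. The key spark.sql.hypothetical.key is made up, purely to show what happens for an undefined key.

```scala
import org.apache.spark.sql.{RuntimeConfig, SparkSession}

object RuntimeConfigSketch {
  def main(args: Array[String]): Unit = {
    // Assumption: a local session, for illustration only.
    val spark: SparkSession = SparkSession.builder()
      .master("local[*]")
      .appName("runtime-config-sketch")
      .getOrCreate()
    val conf: RuntimeConfig = spark.conf

    // A parameter that was never set explicitly resolves to its default value.
    println(conf.get("spark.sql.shuffle.partitions"))

    // Override the parameter for this session, then read it back.
    conf.set("spark.sql.shuffle.partitions", "8")
    println(conf.get("spark.sql.shuffle.partitions"))

    // getOption returns None instead of throwing for an undefined (hypothetical) key.
    println(conf.getOption("spark.sql.hypothetical.key"))

    // Remove the override so that the default applies again.
    conf.unset("spark.sql.shuffle.partitions")
    println(conf.get("spark.sql.shuffle.partitions"))

    spark.stop()
  }
}
```

Values are exchanged as strings here; the typed accessor methods listed in Table 1 belong to SQLConf itself.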

Table 1. SQLConf’s Accessor Methods (in alphabetical order)

| Name | Parameter / Hint | Description |
|---|---|---|
| adaptiveExecutionEnabled | spark.sql.adaptive.enabled | Used exclusively in EnsureRequirements to add an ExchangeCoordinator (when adaptive query execution is enabled). |
| broadcastTimeout | spark.sql.broadcastTimeout | Used exclusively in BroadcastExchangeExec (for broadcasting a table to executors). |
| dataFramePivotMaxValues | spark.sql.pivotMaxValues | Used exclusively in the pivot operator. |
| dataFrameRetainGroupColumns | spark.sql.retainGroupColumns | Used exclusively in RelationalGroupedDataset when creating the result Dataset (after the agg, count, mean, max, avg, min, and sum operators). |
| numShufflePartitions | spark.sql.shuffle.partitions | Used in: |
| joinReorderEnabled | spark.sql.cbo.joinReorder.enabled | Used exclusively in the CostBasedJoinReorder logical plan optimization. |
| starSchemaDetection | spark.sql.cbo.starSchemaDetection | Used exclusively in the ReorderJoin logical plan optimization (and indirectly in StarSchemaDetection). |
| useObjectHashAggregation | spark.sql.execution.useObjectHashAggregateExec | Used exclusively in the Aggregation execution planning strategy when selecting a physical plan. |

Table 2. Parameters and Hints (in alphabetical order)

| Name | Default Value | Description |
|---|---|---|
| spark.sql.adaptive.enabled | false | Enables adaptive query execution. |
| spark.sql.broadcastTimeout | 5 * 60 | Timeout in seconds for the broadcast wait time in broadcast joins. When negative, it is assumed infinite (i.e. Duration.Inf). Used through SQLConf.broadcastTimeout. |
| spark.sql.cbo.enabled | false | Enables cost-based optimization (CBO) for estimation of plan statistics when enabled (i.e. true). Used (through SQLConf.cboEnabled method) in: |
| spark.sql.cbo.joinReorder.enabled | false | Enables join reorder for cost-based optimization (CBO). Use the joinReorderEnabled method to access the current value. |
| spark.sql.cbo.starSchemaDetection | false | Enables join reordering based on star schema detection for cost-based optimization (CBO) in the ReorderJoin logical plan optimization. Use the starSchemaDetection method to access the current value. |
| spark.sql.execution.useObjectHashAggregateExec | true | Decides whether ObjectHashAggregateExec is used (in the Aggregation execution planning strategy). Use the useObjectHashAggregation method to access the current value. |
| spark.sql.optimizer.maxIterations | 100 | Maximum number of iterations for Analyzer and Optimizer. |
| spark.sql.pivotMaxValues | 10000 | Maximum number of (distinct) values that will be collected without error (when doing a pivot without specifying the values for the pivot column). Use the dataFramePivotMaxValues method to access the current value. |
| spark.sql.retainGroupColumns | true | Controls whether to retain columns used for aggregation (in RelationalGroupedDataset operators). Use the dataFrameRetainGroupColumns method to access the current value. |
| spark.sql.selfJoinAutoResolveAmbiguity | true | Controls whether to resolve ambiguity in join conditions for self-joins automatically. |
| spark.sql.shuffle.partitions | 200 | Default number of partitions to use when shuffling data for joins or aggregations. Corresponds to Apache Hive’s mapred.reduce.tasks property, which Spark considers deprecated. |
| spark.sql.streaming.fileSink.log.deletion | true | Controls whether to delete the expired log files in the file stream sink. |
| spark.sql.streaming.fileSink.log.cleanupDelay | FIXME | FIXME |
| spark.sql.streaming.schemaInference | FIXME | FIXME |
| spark.sql.streaming.fileSink.log.compactInterval | FIXME | FIXME |
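
To illustrate how a parameter from Table 2 affects query execution, the sketch below sets spark.sql.shuffle.partitions and counts the partitions of an aggregation result. It is an illustration under stated assumptions, not a description of Spark internals: the local session, the ShufflePartitionsSketch object, and the partitionsAfterGroupBy helper are hypothetical names introduced here. Adaptive query execution is switched off explicitly, because with it enabled Spark may change the number of shuffle partitions at runtime.

```scala
import org.apache.spark.sql.SparkSession
import org.apache.spark.sql.functions.col

object ShufflePartitionsSketch {
  // Hypothetical helper: runs a small aggregation (which requires a shuffle)
  // and returns the number of partitions of the result.
  def partitionsAfterGroupBy(spark: SparkSession, numPartitions: Int): Int = {
    // Keep the partition count fixed by disabling adaptive query execution.
    spark.conf.set("spark.sql.adaptive.enabled", "false")
    spark.conf.set("spark.sql.shuffle.partitions", numPartitions.toString)
    spark.range(1000)
      .groupBy((col("id") % 10).as("bucket"))
      .count()
      .rdd
      .getNumPartitions
  }

  def main(args: Array[String]): Unit = {
    // Assumption: a local session, for illustration only.
    val spark = SparkSession.builder()
      .master("local[*]")
      .appName("shuffle-partitions-sketch")
      .getOrCreate()
    println(partitionsAfterGroupBy(spark, 8))
    spark.stop()
  }
}
```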

Note: SQLConf is a private[sql] serializable class in the org.apache.spark.sql.internal package.

Getting Parameters and Hints

You can get the current parameters and hints using the following family of get methods.

getConfString(key: String): String
getConf[T](entry: ConfigEntry[T], defaultValue: T): T
getConf[T](entry: ConfigEntry[T]): T
getConf[T](entry: OptionalConfigEntry[T]): Option[T]
getConfString(key: String, defaultValue: String): String
getAllConfs: immutable.Map[String, String]
getAllDefinedConfs: Seq[(String, String, String)]
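
The get methods can be sketched as follows. This assumes a Spark version in which SQLConf can be instantiated from user code; where it is private[sql], as the note above states, the code would have to be compiled into the org.apache.spark.sql package. SQLConf.SHUFFLE_PARTITIONS is used as an example of a ConfigEntry, and spark.sql.hypothetical.key is a made-up key that shows the fallback to a caller-supplied default.

```scala
import org.apache.spark.sql.internal.SQLConf

object SQLConfGettersSketch {
  def main(args: Array[String]): Unit = {
    // A fresh SQLConf with no explicitly set parameters.
    val conf = new SQLConf

    // String-based lookup: a registered key that is not set resolves to its default.
    val asString: String = conf.getConfString("spark.sql.shuffle.partitions")

    // Typed lookup through a ConfigEntry.
    val asInt: Int = conf.getConf(SQLConf.SHUFFLE_PARTITIONS)

    // String-based lookup with a caller-supplied default for an unknown (hypothetical) key.
    val fallback: String = conf.getConfString("spark.sql.hypothetical.key", "fallback")

    // Only explicitly set parameters are reported by getAllConfs.
    val explicitlySet: Map[String, String] = conf.getAllConfs

    println(s"$asString $asInt $fallback ${explicitlySet.size}")
  }
}
```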

Setting Parameters and Hints

You can set parameters and hints using the following family of set methods.

setConf(props: Properties): Unit
setConfString(key: String, value: String): Unit
setConf[T](entry: ConfigEntry[T], value: T): Unit
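
A short sketch of the three set variants follows, under the same assumption as before that SQLConf is accessible from the calling code (otherwise the code has to live in the org.apache.spark.sql package). The chosen keys and values are arbitrary examples.

```scala
import java.util.Properties

import org.apache.spark.sql.internal.SQLConf

object SQLConfSettersSketch {
  def main(args: Array[String]): Unit = {
    val conf = new SQLConf

    // String-based: the value is given as a string.
    conf.setConfString("spark.sql.pivotMaxValues", "500")

    // Typed: the value has the type of the ConfigEntry (Int here).
    conf.setConf(SQLConf.SHUFFLE_PARTITIONS, 16)

    // Bulk: every property is applied as if set through setConfString.
    val props = new Properties()
    props.setProperty("spark.sql.retainGroupColumns", "false")
    conf.setConf(props)

    // Read the values back.
    println(conf.getConfString("spark.sql.pivotMaxValues"))
    println(conf.getConf(SQLConf.SHUFFLE_PARTITIONS))
    println(conf.getConfString("spark.sql.retainGroupColumns"))
  }
}
```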

Unsetting Parameters and Hints

You can unset parameters and hints using the following family of unset methods.

unsetConf(key: String): Unit
unsetConf(entry: ConfigEntry[_]): Unit

Clearing All Parameters and Hints

clear(): Unit

You can use clear to remove all the parameters and hints in SQLConf.
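
The difference between unsetting a single parameter and clearing everything can be sketched as follows, again assuming that SQLConf is accessible from the calling code. The keys and values are arbitrary examples.

```scala
import org.apache.spark.sql.internal.SQLConf

object SQLConfUnsetAndClearSketch {
  def main(args: Array[String]): Unit = {
    val conf = new SQLConf
    conf.setConfString("spark.sql.shuffle.partitions", "8")
    conf.setConfString("spark.sql.pivotMaxValues", "500")

    // unsetConf removes one parameter; the other override stays in place.
    conf.unsetConf("spark.sql.shuffle.partitions")
    println(conf.getConfString("spark.sql.shuffle.partitions")) // back to the default value
    println(conf.getConfString("spark.sql.pivotMaxValues"))     // still the override

    // clear removes the remaining override as well.
    conf.clear()
    println(conf.getAllConfs.contains("spark.sql.pivotMaxValues"))
  }
}
```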
