SQLConf — Internal Configuration Store

SQLConf is an internal key-value configuration store for parameters and hints used in Spark SQL.

Note: Spark SQL configuration is also available through RuntimeConfig, the user-facing configuration management interface, which you can access through SparkSession.

scala> :type spark
org.apache.spark.sql.SparkSession

scala> :type spark.conf
org.apache.spark.sql.RuntimeConfig
You can access a SQLConf using:

- SQLConf.get (preferred) - the SQLConf of the current active SparkSession

- SessionState - direct access through the SessionState of the SparkSession of your choice (this gives you more flexibility over which SparkSession is used, which can be different from the current active SparkSession)
import org.apache.spark.sql.internal.SQLConf
// Use type-safe access to configuration properties
// using SQLConf.get.getConf
val parallelFileListingInStatsComputation = SQLConf.get.getConf(SQLConf.PARALLEL_FILE_LISTING_IN_STATS_COMPUTATION)
// or even simpler
SQLConf.get.parallelFileListingInStatsComputation
SQLConf offers methods to get, set, unset, or clear values of configuration properties. It also has accessor methods to read the current value of a configuration property or hint.
scala> :type spark
org.apache.spark.sql.SparkSession
// Direct access to the session SQLConf
val sqlConf = spark.sessionState.conf
scala> :type sqlConf
org.apache.spark.sql.internal.SQLConf
scala> println(sqlConf.offHeapColumnVectorEnabled)
false
// Or simply import the conf value
import spark.sessionState.conf
// accessing properties through accessor methods
scala> conf.numShufflePartitions
res1: Int = 200
// Prefer SQLConf.get (over direct access)
import org.apache.spark.sql.internal.SQLConf
val cc = SQLConf.get
scala> cc == conf
res4: Boolean = true
// setting properties using aliases
import org.apache.spark.sql.internal.SQLConf.SHUFFLE_PARTITIONS
conf.setConf(SHUFFLE_PARTITIONS, 2)
scala> conf.numShufflePartitions
res2: Int = 2
// unset aka reset properties to the default value
conf.unsetConf(SHUFFLE_PARTITIONS)
scala> conf.numShufflePartitions
res3: Int = 200
SQLConf defines configuration properties as type-safe ConfigEntry values, each paired with an accessor method, e.g.:

Name | Default | Accessor | Description
---|---|---|---
spark.sql.shuffle.partitions | 200 | numShufflePartitions | The default number of partitions to use when shuffling data (e.g. for joins and aggregations)
spark.sql.columnVector.offheap.enabled | false | offHeapColumnVectorEnabled | Enables OffHeapColumnVector (off-heap memory) for columnar data
spark.sql.statistics.parallelFileListingInStatsComputation.enabled | true | parallelFileListingInStatsComputation | Enables parallel file listing when computing table statistics (e.g. ANALYZE TABLE)
Getting Parameters and Hints
You can get the current parameters and hints using the following family of get methods.
getConf[T](entry: ConfigEntry[T], defaultValue: T): T
getConf[T](entry: ConfigEntry[T]): T
getConf[T](entry: OptionalConfigEntry[T]): Option[T]
getConfString(key: String): String
getConfString(key: String, defaultValue: String): String
getAllConfs: immutable.Map[String, String]
getAllDefinedConfs: Seq[(String, String, String)]
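For example (a minimal spark-shell sketch, assuming the default values are in effect; the second getConfString call uses a made-up key to show the fallback):

import org.apache.spark.sql.internal.SQLConf
import org.apache.spark.sql.internal.SQLConf.SHUFFLE_PARTITIONS

// Type-safe access through a ConfigEntry
scala> SQLConf.get.getConf(SHUFFLE_PARTITIONS)
res0: Int = 200

// String-based access, with and without a default value
scala> SQLConf.get.getConfString("spark.sql.shuffle.partitions")
res1: String = 200

scala> SQLConf.get.getConfString("spark.sql.made.up.key", "n/a")
res2: String = n/a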
Setting Parameters and Hints
You can set parameters and hints using the following family of set methods.
setConf(props: Properties): Unit
setConfString(key: String, value: String): Unit
setConf[T](entry: ConfigEntry[T], value: T): Unit
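For example (a short sketch; CASE_SENSITIVE is just a sample ConfigEntry):

import java.util.Properties
import org.apache.spark.sql.internal.SQLConf
import org.apache.spark.sql.internal.SQLConf.CASE_SENSITIVE

// String-based
SQLConf.get.setConfString("spark.sql.shuffle.partitions", "8")

// Type-safe through a ConfigEntry
SQLConf.get.setConf(CASE_SENSITIVE, true)

// Bulk-set from Java Properties
val props = new Properties()
props.setProperty("spark.sql.shuffle.partitions", "16")
SQLConf.get.setConf(props)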
Unsetting Parameters and Hints
You can unset parameters and hints using the following family of unset methods.
unsetConf(key: String): Unit
unsetConf(entry: ConfigEntry[_]): Unit
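For example (continuing the sketch above):

import org.apache.spark.sql.internal.SQLConf
import org.apache.spark.sql.internal.SQLConf.CASE_SENSITIVE

// By key...
SQLConf.get.unsetConf("spark.sql.shuffle.partitions")

// ...or by ConfigEntry
SQLConf.get.unsetConf(CASE_SENSITIVE)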
Clearing All Parameters and Hints
clear(): Unit
You can use clear to remove all the parameters and hints in SQLConf.
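For example (a spark-shell sketch, assuming spark.sql.shuffle.partitions was not set in the SparkConf):

import spark.sessionState.conf

conf.setConfString("spark.sql.shuffle.partitions", "8")

scala> conf.numShufflePartitions
res0: Int = 8

// clear removes every explicitly-set parameter and hint...
conf.clear()

// ...so reads fall back to the defaults
scala> conf.numShufflePartitions
res1: Int = 200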
Redacting Data Source Options with Sensitive Information — redactOptions Method
redactOptions(options: Map[String, String]): Map[String, String]
redactOptions takes the values of the spark.sql.redaction.options.regex and spark.redaction.regex configuration properties.

For every regular expression (in that order), redactOptions redacts sensitive information: it finds the first match of the regular expression pattern in every option key or value and, if either matches, replaces the value with ***(redacted).
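For example (a rough sketch, assuming the default redaction patterns, under which spark.sql.redaction.options.regex matches option keys such as url):

import org.apache.spark.sql.internal.SQLConf

val options = Map(
  "url" -> "jdbc:postgresql://example.com/db?user=admin&password=sekret",
  "dbtable" -> "people")

// The value of the url option comes back as ***(redacted)
val redacted = SQLConf.get.redactOptions(options)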
Note: redactOptions is used exclusively when the SaveIntoDataSourceCommand logical command is requested for the simple description.