SessionState — State Separation Layer Between SparkSessions

SessionState is the state separation layer between Spark SQL sessions, including SQL configuration, tables, functions, UDFs, SQL parser, and everything else that depends on a SQLConf.

SessionState is available as the sessionState property of a SparkSession.

scala> :type spark
org.apache.spark.sql.SparkSession

scala> :type spark.sessionState
org.apache.spark.sql.internal.SessionState
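
To make the separation concrete, here is a minimal sketch that uses only the public SparkSession API. It assumes a local master, and the view name nums is made up for illustration. SparkSession.newSession creates a second session that shares the SparkContext but has its own session-scoped state.

```scala
import org.apache.spark.sql.SparkSession

object SessionIsolationDemo {
  // Returns (shuffle partitions in first, shuffle partitions in second,
  //          view visible in first, view visible in second).
  def run(first: SparkSession): (String, String, Boolean, Boolean) = {
    // newSession shares the SparkContext but has its own session-scoped state
    val second = first.newSession()

    // SQL configuration is kept per session
    first.conf.set("spark.sql.shuffle.partitions", "4")
    second.conf.set("spark.sql.shuffle.partitions", "8")

    // Temporary views are kept per session, too ("nums" is a made-up name)
    first.range(3).createOrReplaceTempView("nums")

    val result = (
      first.conf.get("spark.sql.shuffle.partitions"),
      second.conf.get("spark.sql.shuffle.partitions"),
      first.catalog.tableExists("nums"),
      second.catalog.tableExists("nums")
    )
    result
  }

  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .master("local[*]")
      .appName("session-state-isolation")
      .getOrCreate()
    try println(run(spark)) finally spark.stop()
  }
}
```

Each session should report its own value of spark.sql.shuffle.partitions, and the temporary view should be visible only in the session that registered it.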

SessionState is created when SparkSession is requested to instantiateSessionState, i.e. when the SessionState is requested for the first time and built according to the spark.sql.catalogImplementation configuration property.

Figure 1. Creating SessionState
Note

When requested for the SessionState, SparkSession uses spark.sql.catalogImplementation configuration property to load and create a BaseSessionStateBuilder that is then requested to create a SessionState instance.

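There are two BaseSessionStateBuilders available: SessionStateBuilder (the default, for the in-memory catalog) and HiveSessionStateBuilder (for the hive catalog).

The hive catalog is set when the SparkSession was created with Hive support enabled (using Builder.enableHiveSupport).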
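
The selection can be pictured as a lookup from the value of spark.sql.catalogImplementation to a builder class name. The sketch below is a plain-Scala model, not Spark's own code: the object and method names are hypothetical, and the two fully-qualified class names are the ones I believe the Spark sources use, so treat them as an assumption to verify against your Spark version.

```scala
object SessionStateBuilderChoice {
  // Assumed builder class names (check against your Spark version)
  val HiveBuilder = "org.apache.spark.sql.hive.HiveSessionStateBuilder"
  val InMemoryBuilder = "org.apache.spark.sql.internal.SessionStateBuilder"

  // Hypothetical helper: maps the catalog implementation to a builder class name
  def builderClassFor(catalogImplementation: String): String =
    catalogImplementation match {
      case "hive"      => HiveBuilder
      case "in-memory" => InMemoryBuilder
      case other =>
        throw new IllegalArgumentException(s"Unsupported catalog implementation: $other")
    }

  def main(args: Array[String]): Unit = {
    println(builderClassFor("in-memory"))
    println(builderClassFor("hive"))
  }
}
```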

Table 1. SessionState’s (Lazily-Initialized) Attributes
Name | Type | Description
analyzer | Analyzer | Spark Analyzer. Initialized lazily (i.e. only when requested the first time) using the analyzerBuilder factory function. Used when…FIXME
catalog | SessionCatalog | Metastore of tables and databases. Used when…FIXME
conf | SQLConf | FIXME. Used when…FIXME
experimentalMethods | ExperimentalMethods | FIXME. Used when…FIXME
functionRegistry | FunctionRegistry | FIXME. Used when…FIXME
listenerManager | ExecutionListenerManager | FIXME. Used when…FIXME
optimizer | Optimizer | Logical query plan optimizer. Used exclusively when QueryExecution creates an optimized logical plan.
resourceLoader | SessionResourceLoader | FIXME. Used when…FIXME
sqlParser | ParserInterface | FIXME. Used when…FIXME
streamingQueryManager | StreamingQueryManager | Used to manage streaming queries in Spark Structured Streaming
udfRegistration | UDFRegistration | Interface to register user-defined functions. Used when…FIXME
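
Lazy initialization through a builder function can be illustrated without Spark. The following is a simplified plain-Scala model: MiniSessionState and the Analyzer case class are hypothetical stand-ins, and only the pattern (a lazy val backed by a factory function) mirrors what the analyzer entry describes.

```scala
object LazyAttributesDemo {
  // Hypothetical stand-in for Spark's Analyzer
  final case class Analyzer(name: String)

  // Simplified model: the attribute is built by a factory function on first access
  final class MiniSessionState(analyzerBuilder: () => Analyzer) {
    lazy val analyzer: Analyzer = analyzerBuilder()
  }

  // Counts how many times the builder ran after `accesses` reads of the attribute
  def buildCountAfter(accesses: Int): Int = {
    var builds = 0
    val state = new MiniSessionState(() => { builds += 1; Analyzer("demo") })
    (1 to accesses).foreach(_ => state.analyzer)
    builds
  }

  def main(args: Array[String]): Unit = {
    println(buildCountAfter(0)) // builder has not run yet
    println(buildCountAfter(3)) // builder ran once; later reads reuse the value
  }
}
```

The builder does not run until the attribute is first read, and repeated reads reuse the same instance.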

Note
SessionState is a private[sql] class and, given the package org.apache.spark.sql.internal, SessionState should be considered internal.

Creating SessionState Instance

SessionState takes the following when created:

clone Method

clone(newSparkSession: SparkSession): SessionState

clone…​FIXME

Note
clone is used when…​

"Executing" Logical Plan (Creating QueryExecution For LogicalPlan) — executePlan Method

executePlan(plan: LogicalPlan): QueryExecution

executePlan executes the createQueryExecution function on the input logical plan, which creates a QueryExecution with the current SparkSession and the input logical plan.
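
As a rough illustration, the sketch below takes the logical plan of a small Dataset and hands it to executePlan. It assumes a local master and that the internal sessionState property is reachable from user code the same way it is in spark-shell; since SessionState is internal, this may differ between Spark versions.

```scala
import org.apache.spark.sql.SparkSession
import org.apache.spark.sql.execution.QueryExecution

object ExecutePlanDemo {
  // Wraps the logical plan of a small Dataset in a new QueryExecution
  def run(spark: SparkSession): QueryExecution = {
    val logicalPlan = spark.range(5).queryExecution.logical
    spark.sessionState.executePlan(logicalPlan)
  }

  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .master("local[*]")
      .appName("execute-plan-demo")
      .getOrCreate()
    try {
      val qe = run(spark)
      println(qe.analyzed)      // analyzed logical plan
      println(qe.optimizedPlan) // plan after the logical optimizer
    } finally spark.stop()
  }
}
```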

refreshTable Method

refreshTable(tableName: String): Unit

refreshTable…​FIXME

Note
refreshTable is used…​FIXME

Creating New Hadoop Configuration — newHadoopConf Method

newHadoopConf(): Configuration

newHadoopConf returns a Hadoop Configuration that combines SparkContext.hadoopConfiguration with all the configuration properties of the SQLConf.

Note
newHadoopConf is used by ScriptTransformation, ParquetRelation, StateStoreRDD, SessionState itself, and a few other places.
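
A quick way to observe what newHadoopConf returns is to set one property on each side and look both up in the result. This is a sketch under assumptions: the keys demo.hadoop.key and demo.sql.key are made up, a local master is used, and the internal sessionState property is assumed to be reachable from user code as in spark-shell.

```scala
import org.apache.hadoop.conf.Configuration
import org.apache.spark.sql.SparkSession

object NewHadoopConfDemo {
  // Sets one property on each side, then asks SessionState for a Hadoop Configuration
  def run(spark: SparkSession): Configuration = {
    // Property on the SparkContext-level Hadoop configuration (made-up key)
    spark.sparkContext.hadoopConfiguration.set("demo.hadoop.key", "from-spark-context")
    // Property in the session's SQLConf (made-up key)
    spark.conf.set("demo.sql.key", "from-sql-conf")
    spark.sessionState.newHadoopConf()
  }

  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .master("local[*]")
      .appName("new-hadoop-conf-demo")
      .getOrCreate()
    try {
      val hadoopConf = run(spark)
      println(hadoopConf.get("demo.hadoop.key"))
      println(hadoopConf.get("demo.sql.key"))
    } finally spark.stop()
  }
}
```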

Creating New Hadoop Configuration With Extra Options — newHadoopConfWithOptions Method

newHadoopConfWithOptions(options: Map[String, String]): Configuration

newHadoopConfWithOptions creates a new Hadoop Configuration with the input options set (except the path and paths options, which are skipped).
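
The skipping rule can be modelled in a few lines of plain Scala. This is a simplified sketch, not Spark's implementation: a Map stands in for the Hadoop Configuration, and the object and method names are hypothetical.

```scala
object HadoopOptionsDemo {
  // Simplified stand-in: a Map plays the role of the Hadoop Configuration
  def withOptions(
      base: Map[String, String],
      options: Map[String, String]): Map[String, String] = {
    // Copy every option onto the base configuration except path and paths
    val applicable = options.filter { case (key, _) => key != "path" && key != "paths" }
    base ++ applicable
  }

  def main(args: Array[String]): Unit = {
    val base = Map("fs.defaultFS" -> "file:///")
    val options = Map("path" -> "/tmp/data", "compression" -> "gzip")
    // The compression option is applied; path is left out
    println(withOptions(base, options))
  }
}
```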

Note

newHadoopConfWithOptions is used when:
