Builder — Building SparkSession using Fluent API

Builder is the fluent API to build a fully-configured SparkSession.

Table 1. Builder Methods
Method               Description
getOrCreate          Gets the current SparkSession or creates a new one.
enableHiveSupport    Enables Hive support.

import org.apache.spark.sql.SparkSession
val spark: SparkSession = SparkSession.builder
  .appName("My Spark Application")  // optional and will be autogenerated if not specified
  .master("local[*]")               // avoid hardcoding the deployment environment
  .enableHiveSupport()              // requires the Hive classes on the CLASSPATH
  .getOrCreate()                    // without it the expression is a Builder, not a SparkSession

Builder follows the fluent design pattern: every setter returns the Builder itself, so you can chain calls to set the properties of the SparkSession that opens a session to Spark SQL.

You can have multiple SparkSessions in a single Spark application, e.g. to work with different data catalogs (and so with different sets of relational entities).
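The following is a minimal sketch of two sessions living side by side in one application. It assumes a local master and uses SparkSession.newSession; the view name numbers and the value names are made up for the illustration.

```scala
import org.apache.spark.sql.SparkSession
import scala.util.Try

val spark = SparkSession.builder
  .appName("Multiple sessions sketch") // hypothetical application name
  .master("local[*]")
  .getOrCreate()

// newSession shares the SparkContext but keeps its own temporary views and SQL configuration
val other = spark.newSession()

// Register a temporary view in the first session only
spark.range(3).createOrReplaceTempView("numbers")

// Looking up the view fails with an AnalysisException in a session that does not know it
val visibleInFirst = Try(spark.table("numbers")).isSuccess
val visibleInOther = Try(other.table("numbers")).isSuccess
val sharedContext = spark.sparkContext eq other.sparkContext

println(s"first: $visibleInFirst, other: $visibleInOther, shared SparkContext: $sharedContext")
```

Temporary views are scoped to a session, which is why the lookup is expected to succeed only in the session that registered the view.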

getOrCreate Method
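
As a small illustration of "gets the current SparkSession or creates a new one", the sketch below calls getOrCreate twice and compares the results. It assumes a local master; the application name and value names are placeholders.

```scala
import org.apache.spark.sql.SparkSession

// The first call has no session to reuse, so it creates one
val first = SparkSession.builder
  .appName("getOrCreate sketch") // hypothetical application name
  .master("local[*]")
  .getOrCreate()

// The second call finds the session created above and returns it
val second = SparkSession.builder.getOrCreate()

// Reference equality: both values point at the same SparkSession instance
val sameSession = first eq second
println(s"Same session instance: $sameSession")
```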


config Method
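
A hedged sketch of the config variants follows: one taking a SparkConf and two taking a key with a typed value. The chosen properties and values are only examples, not recommendations.

```scala
import org.apache.spark.SparkConf
import org.apache.spark.sql.SparkSession

// Options can be collected in a SparkConf first (example setting only)
val sparkConf = new SparkConf().set("spark.ui.enabled", "false")

val spark = SparkSession.builder
  .appName("config sketch") // hypothetical application name
  .master("local[*]")
  .config(sparkConf)                            // all entries of a SparkConf
  .config("spark.sql.shuffle.partitions", 8L)   // key with a Long value
  .config("spark.sql.session.timeZone", "UTC")  // key with a String value
  .getOrCreate()

// Values are stored as strings regardless of the overload used
val shufflePartitions = spark.conf.get("spark.sql.shuffle.partitions")
val timeZone = spark.conf.get("spark.sql.session.timeZone")
println(s"shuffle partitions: $shufflePartitions, time zone: $timeZone")
```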


Enabling Hive Support — enableHiveSupport Method

When creating a SparkSession, you can optionally enable Hive support using the enableHiveSupport method.

enableHiveSupport(): Builder

enableHiveSupport enables Hive support (with connectivity to a persistent Hive metastore, support for Hive serdes, and Hive user-defined functions).


You do not need an existing Hive installation to use Spark’s Hive support. SparkSession automatically creates metastore_db in the current directory of the Spark application, together with the directory configured by spark.sql.warehouse.dir.
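
The sketch below points spark.sql.warehouse.dir at a temporary directory before enabling Hive support. It assumes the spark-hive module is on the CLASSPATH and that no session exists yet in the JVM; the directory prefix and names are placeholders.

```scala
import java.nio.file.Files
import org.apache.spark.sql.SparkSession

// A throwaway warehouse location (assumption: a temporary directory is good enough for a demo)
val warehouseDir = Files.createTempDirectory("spark-warehouse").toString

val spark = SparkSession.builder
  .appName("Hive warehouse sketch") // hypothetical application name
  .master("local[*]")
  .config("spark.sql.warehouse.dir", warehouseDir)
  .enableHiveSupport() // throws if the Hive classes are missing
  .getOrCreate()

// Touching the catalog initializes the metastore (metastore_db in the current directory)
spark.sql("SHOW DATABASES").show()

// Depending on the Spark version the reported path may carry a scheme prefix such as file:
val effectiveWarehouseDir = spark.conf.get("spark.sql.warehouse.dir")
println(effectiveWarehouseDir)
```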

Refer to SharedState.

Internally, enableHiveSupport makes sure that the Hive classes are on the CLASSPATH (e.g. org.apache.hadoop.hive.conf.HiveConf) and sets the spark.sql.catalogImplementation property to hive. If the classes cannot be found, it throws an IllegalArgumentException.
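
To observe the effect from the outside, the following sketch enables Hive support only when the call succeeds and then reads the catalog implementation back. It assumes a fresh JVM, since an already running session would be reused with its own catalog; the value names are illustrative.

```scala
import org.apache.spark.sql.SparkSession
import scala.util.Try

val builder = SparkSession.builder
  .appName("catalogImplementation sketch") // hypothetical application name
  .master("local[*]")

// enableHiveSupport throws when the Hive classes are missing, so fall back to the plain builder
val spark = Try(builder.enableHiveSupport()).getOrElse(builder).getOrCreate()

// Reports hive with Hive support enabled, in-memory otherwise
val catalogImpl = spark.conf.get("spark.sql.catalogImplementation")
println(s"Catalog implementation: $catalogImpl")
```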

