Builder — Building SparkSession using Fluent API

Builder is the fluent API to build a fully-configured SparkSession.

Table 1. Builder Methods
Method                Description
getOrCreate           Gets the current SparkSession or creates a new one.
enableHiveSupport     Enables Hive support

import org.apache.spark.sql.SparkSession
val spark: SparkSession = SparkSession.builder
  .appName("My Spark Application")  // optional and will be autogenerated if not specified
  .master("local[*]")               // avoid hardcoding the deployment environment
  .enableHiveSupport()              // self-explanatory, isn't it?
  .getOrCreate()                    // gets the current SparkSession or creates a new one

The fluent design pattern lets you chain method calls to set the various properties of a SparkSession, which opens a session to Spark SQL.
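
As a minimal sketch of setting a property through the chain, the snippet below uses the builder's config method with spark.sql.shuffle.partitions. The property and its value are chosen only for illustration, and the snippet assumes that no SparkSession is already active in the JVM.

```scala
import org.apache.spark.sql.SparkSession

// Chain builder calls; each one returns the Builder until getOrCreate() is called.
val spark: SparkSession = SparkSession.builder
  .appName("Builder config sketch")             // illustrative name
  .master("local[*]")                           // local mode, for the sketch only
  .config("spark.sql.shuffle.partitions", "4")  // illustrative property and value
  .getOrCreate()

// Read the property back from the session's runtime configuration.
val shufflePartitions: String = spark.conf.get("spark.sql.shuffle.partitions")
println(shufflePartitions)

spark.stop()
```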

You can have multiple SparkSessions in a single Spark application for different data catalogs (through relational entities).
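
One way to see this in practice is SparkSession.newSession, which is not covered above. It creates a session that shares the SparkContext but keeps its own temporary views and SQL configuration. The sketch below is illustrative only; the view name numbers is made up for the example.

```scala
import org.apache.spark.sql.SparkSession

val spark: SparkSession = SparkSession.builder
  .appName("Multiple sessions sketch")  // illustrative name
  .master("local[*]")
  .getOrCreate()

// A second session on top of the same SparkContext.
val another: SparkSession = spark.newSession()
val sameContext: Boolean = spark.sparkContext eq another.sparkContext

// Temporary views are scoped to the session that registered them.
spark.range(3).createOrReplaceTempView("numbers")  // hypothetical view name
val visibleInFirst: Boolean = spark.catalog.tableExists("numbers")
val visibleInSecond: Boolean = another.catalog.tableExists("numbers")

println(s"same context: $sameContext, first: $visibleInFirst, second: $visibleInSecond")

spark.stop()  // stops the shared SparkContext, and so both sessions
```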

getOrCreate Method


config Method


Enabling Hive Support — enableHiveSupport Method

When creating a SparkSession, you can optionally enable Hive support using the enableHiveSupport method.

enableHiveSupport(): Builder

enableHiveSupport enables Hive support (with connectivity to a persistent Hive metastore, support for Hive serdes, and Hive user-defined functions).


You do not need an existing Hive installation to use Spark’s Hive support. SparkSession automatically creates metastore_db in the current directory of a Spark application, along with the directory configured by spark.sql.warehouse.dir.
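
The following sketch shows how the warehouse location could be set explicitly when Hive support is enabled. It assumes that the spark-hive module is on the classpath and that no SparkSession is already active in the JVM. The temporary directory is used only to keep the example self-contained.

```scala
import java.nio.file.Files
import org.apache.spark.sql.SparkSession

// A throwaway directory for the warehouse (illustrative; any writable path would do).
val warehouseDir: String = Files.createTempDirectory("spark-warehouse-sketch").toString

val spark: SparkSession = SparkSession.builder
  .appName("Hive warehouse sketch")                 // illustrative name
  .master("local[*]")
  .config("spark.sql.warehouse.dir", warehouseDir)  // set before the session is created
  .enableHiveSupport()
  .getOrCreate()

// The reported value may be qualified with a scheme (for example a file: prefix).
val configuredWarehouse: String = spark.conf.get("spark.sql.warehouse.dir")
println(configuredWarehouse)

spark.stop()
```

The metastore_db directory is still created in the current working directory, because only the warehouse location is changed here.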

Refer to SharedState.

Internally, enableHiveSupport makes sure that the Hive classes are on the CLASSPATH (Spark SQL’s own Hive integration classes as well as org.apache.hadoop.hive.conf.HiveConf) and sets the spark.sql.catalogImplementation property to hive.
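
A quick way to check the effect described above is to read the property back from a running session. This is a sketch that assumes the spark-hive module is on the classpath and that no SparkSession without Hive support is already active in the JVM.

```scala
import org.apache.spark.sql.SparkSession

val spark: SparkSession = SparkSession.builder
  .appName("Catalog implementation sketch")  // illustrative name
  .master("local[*]")
  .enableHiveSupport()
  .getOrCreate()

// With Hive support enabled, the property is expected to read "hive".
val catalogImpl: String = spark.conf.get("spark.sql.catalogImplementation")
println(catalogImpl)

spark.stop()
```

If the Hive classes cannot be found, enableHiveSupport fails with an IllegalArgumentException instead of returning the Builder.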
