Builder — Building SparkSession using Fluent API

Builder is the fluent API to create a SparkSession.

Table 1. Builder API

Method             Description

appName            appName(name: String): Builder
                   Sets a name for the application

config             config(conf: SparkConf): Builder
                   config(key: String, value: Boolean): Builder
                   config(key: String, value: Double): Builder
                   config(key: String, value: Long): Builder
                   config(key: String, value: String): Builder
                   Sets a configuration option (or all the options of a SparkConf)

enableHiveSupport  enableHiveSupport(): Builder
                   Enables Hive support

getOrCreate        getOrCreate(): SparkSession
                   Gets the current SparkSession or creates a new one

master             master(master: String): Builder
                   Sets the Spark master URL

withExtensions     withExtensions(f: SparkSessionExtensions => Unit): Builder
                   Access to the SparkSessionExtensions

Builder is available using the builder method on the SparkSession companion object.

import org.apache.spark.sql.SparkSession
val spark = SparkSession.builder
  .appName("My Spark Application")  // optional and will be autogenerated if not specified
  .master("local[*]")               // only for demo and testing purposes, use spark-submit instead
  .enableHiveSupport()              // self-explanatory, isn't it?
  .config("spark.sql.warehouse.dir", "target/spark-warehouse")
  .withExtensions { extensions =>
    extensions.injectResolutionRule { session =>
      ...
    }
    extensions.injectOptimizerRule { session =>
      ...
    }
  }
  .getOrCreate
Note
You can have multiple SparkSessions in a single Spark application. They share the SparkContext, but each keeps its own session state (SQL configuration, temporary views, and registered functions) and so can work against different data catalogs (through relational entities).
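A minimal sketch of two sessions sharing one SparkContext (the application name is for illustration only):

import org.apache.spark.sql.SparkSession

val spark = SparkSession.builder
  .appName("Multi-Session Demo")
  .master("local[*]")
  .getOrCreate
val anotherSession = spark.newSession()

spark.range(3).createOrReplaceTempView("nums")
assert(spark.catalog.tableExists("nums"))
assert(!anotherSession.catalog.tableExists("nums"))       // temporary views are session-scoped
assert(spark.sparkContext eq anotherSession.sparkContext) // one shared SparkContext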
Table 2. Builder’s Internal Properties (e.g. Registries, Counters and Flags)

Name          Description

extensions    SparkSessionExtensions
              Used when withExtensions is executed (to be modified by the input f function) and when getOrCreate creates a SparkSession

options       Configuration options (key-value pairs) collected by appName, master, config and enableHiveSupport
              Used when getOrCreate gets or creates a SparkSession (the options are then applied to its configuration)

Getting Or Creating SparkSession Instance — getOrCreate Method

getOrCreate(): SparkSession

getOrCreate returns the active SparkSession of the current thread, if any, or the global default SparkSession. Only when neither is available does getOrCreate create a new SparkSession (with the options and extensions collected by the Builder).
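A minimal sketch of the get-or-create behaviour: a second builder returns the session that already exists instead of creating a new one.

import org.apache.spark.sql.SparkSession

val s1 = SparkSession.builder.master("local[*]").getOrCreate
val s2 = SparkSession.builder.getOrCreate
assert(s1 eq s2)  // the existing session is reused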

Enabling Hive Support — enableHiveSupport Method

enableHiveSupport(): Builder

enableHiveSupport enables Hive support, i.e. running structured queries on Hive tables (with a persistent Hive metastore, support for Hive serdes, and Hive user-defined functions).

Note

You do not need any existing Hive installation to use Spark’s Hive support. A SparkSession automatically creates metastore_db in the current directory of a Spark application and uses the warehouse directory configured by spark.sql.warehouse.dir.

Refer to SharedState.

Internally, enableHiveSupport makes sure that the Hive classes are on the CLASSPATH (i.e. that org.apache.hadoop.hive.conf.HiveConf can be loaded) and sets the spark.sql.catalogImplementation internal configuration property to hive.
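A minimal sketch to observe the effect (it assumes the Hive classes are on the CLASSPATH, e.g. via the spark-hive dependency):

import org.apache.spark.sql.SparkSession

val spark = SparkSession.builder
  .master("local[*]")
  .enableHiveSupport()
  .getOrCreate

// the catalog implementation is now Hive-aware
assert(spark.conf.get("spark.sql.catalogImplementation") == "hive")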

withExtensions Method

withExtensions(f: SparkSessionExtensions => Unit): Builder

withExtensions simply executes the input f function with the SparkSessionExtensions.
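A minimal sketch with a hypothetical no-op optimizer rule (NoopRule is for illustration only):

import org.apache.spark.sql.SparkSession
import org.apache.spark.sql.catalyst.plans.logical.LogicalPlan
import org.apache.spark.sql.catalyst.rules.Rule

// A rule that leaves the logical plan unchanged
object NoopRule extends Rule[LogicalPlan] {
  override def apply(plan: LogicalPlan): LogicalPlan = plan
}

val spark = SparkSession.builder
  .master("local[*]")
  .withExtensions { extensions =>
    extensions.injectOptimizerRule { session => NoopRule }
  }
  .getOrCreate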

appName Method

appName(name: String): Builder

appName sets a name for the Spark application (as the spark.app.name configuration property). The name is shown in the web UI.
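A minimal sketch: the name ends up as the appName of the underlying SparkContext.

import org.apache.spark.sql.SparkSession

val spark = SparkSession.builder
  .appName("My Spark Application")
  .master("local[*]")
  .getOrCreate

assert(spark.sparkContext.appName == "My Spark Application")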

config Method

config(conf: SparkConf): Builder
config(key: String, value: Boolean): Builder
config(key: String, value: Double): Builder
config(key: String, value: Long): Builder
config(key: String, value: String): Builder

config sets a configuration option as a key-value pair (or all the options of a given SparkConf at once). The options are applied when getOrCreate gets or creates a SparkSession.
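A minimal sketch: the SparkConf variant and the key-value variants both end up as configuration options of the session.

import org.apache.spark.SparkConf
import org.apache.spark.sql.SparkSession

val conf = new SparkConf().set("spark.sql.shuffle.partitions", "4")

val spark = SparkSession.builder
  .master("local[*]")
  .config(conf)                             // copies all the options of the SparkConf
  .config("spark.sql.caseSensitive", true)  // sets a single option
  .getOrCreate

assert(spark.conf.get("spark.sql.shuffle.partitions") == "4")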

master Method

master(master: String): Builder

master sets the Spark master URL (as the spark.master configuration property), e.g. local[*] to run Spark locally with as many worker threads as CPU cores.
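A minimal sketch: the master URL is available from the underlying SparkContext.

import org.apache.spark.sql.SparkSession

val spark = SparkSession.builder
  .master("local[2]")  // run locally with 2 worker threads
  .getOrCreate

assert(spark.sparkContext.master == "local[2]")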
