Spark local is one of the available runtime environments in Apache Spark. It is the only runtime that needs no separate cluster manager, and hence many call it a pseudo-cluster (although that concept does exist in Spark and is slightly different).
local (with exactly 1 CPU core)
local[n] (with exactly n CPU cores)
local[*] (with as many CPU cores as are available on the local machine)
local[n, m] (with exactly n CPU cores and m retries when a task fails)
local[*, m] (with as many CPU cores as are available on the local machine and m retries when a task fails)
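As a minimal sketch (assuming spark-core is on the classpath; the application name is made up), each master URL above is a plain string passed to SparkConf.setMaster:

```scala
import org.apache.spark.SparkConf

val conf = new SparkConf()
  .setAppName("local-mode-demo")  // hypothetical application name
  .setMaster("local[2, 3]")       // 2 threads, tasks retried up to 3 times

// Other valid master values: "local", "local[4]", "local[*]", "local[*, 2]"
```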
The default parallelism is the number of threads specified in the master URL. This is the only mode where the driver is used for task execution (it acts as both the driver and the only executor).
The local mode is very convenient for testing, debugging or demonstration purposes as it requires no prior setup to launch Spark applications.
This mode of operation is also called Spark in-process or (less commonly) a local version of Spark.
sc.isLocal returns true when Spark runs in local mode.
scala> sc.isLocal
res0: Boolean = true

scala> sc.master
res0: String = local[*]
Tasks are not re-executed on failure in local mode (unless local-with-retries master URL is used).
You can run Spark in local mode using local[n] or the most general local[*] for the master URL.
The URL says how many threads can be used in total:
local uses 1 thread only.
local[*] uses as many threads as the number of processors available to the Java virtual machine (it uses Runtime.getRuntime.availableProcessors() to know the number).
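Since local[*] is resolved with a plain JVM call, you can check the number yourself (a minimal sketch; the printed message is made up):

```scala
// local[*] resolves to the number of processors the JVM can see.
val cores = Runtime.getRuntime.availableProcessors()
println(s"local[*] would use $cores threads")
```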
What happens when there are fewer cores than n in local[n]?
local[N, maxFailures] (called local-with-retries) with N being * or the number of threads to use (as explained above) and maxFailures being the value of spark.task.maxFailures configuration property.
When StatusUpdate messages are received, LocalEndpoint places an offer to TaskSchedulerImpl.
If one or more tasks match the offer, they are launched (using executor.launchTask).
The number of tasks that can be launched concurrently is controlled by the number of threads specified in the master URL. The executor uses threads to spawn the tasks.
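The threading model above can be sketched with a plain JVM thread pool (an illustrative sketch only, not Spark's actual source; the task bodies are made up):

```scala
import java.util.concurrent.{Callable, Executors}

// Illustrative sketch: the local executor runs tasks on a thread pool
// whose size comes from the master URL, so at most `threads` tasks
// execute at the same time.
val threads = 2                                  // the n in local[n]
val pool = Executors.newFixedThreadPool(threads)

// Four stand-in "tasks"; with 2 threads, at most 2 run concurrently.
val futures = (1 to 4).map { taskId =>
  pool.submit(new Callable[Int] {
    def call(): Int = taskId * 2                 // stand-in for task work
  })
}
futures.foreach(f => println(f.get()))
pool.shutdown()
```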