LocalSchedulerBackend

LocalSchedulerBackend is a SchedulerBackend and an ExecutorBackend for the Spark local.

LocalSchedulerBackend is created when SparkContext is requested to create the SchedulerBackend with the TaskScheduler for the following master URLs:

While being created, LocalSchedulerBackend requests the LauncherBackend to connect.

When an executor sends task status updates (using ExecutorBackend.statusUpdate), they are passed along as StatusUpdate to LocalEndpoint.

LocalSchedulerBackend LocalEndpoint Executor task status updates.png
Figure 1. Task status updates flow in local mode

When requested for the applicationId, LocalSchedulerBackend uses local-[currentTimeMillis].

When requested for the maxNumConcurrentTasks, LocalSchedulerBackend simply divides the total number of CPU cores by spark.task.cpus configuration (default: 1).

When requested for the defaultParallelism, LocalSchedulerBackend uses spark.default.parallelism configuration (if defined) or the total number of CPU cores.

When created, LocalSchedulerBackend uses the spark.executor.extraClassPath configuration property (in the given SparkConf) for the user-defined class path for executors that is used exclusively when LocalSchedulerBackend is requested to start (and creates a LocalEndpoint that in turn uses it to create the one Executor).

LocalSchedulerBackend takes the following to be created:

Table 1. LocalSchedulerBackend’s Internal Properties (e.g. Registries, Counters and Flags)
Name Description

localEndpoint

RpcEndpointRef to LocalSchedulerBackendEndpoint RPC endpoint (that is LocalEndpoint which LocalSchedulerBackend registers when started)

Used when LocalSchedulerBackend is requested for the following:

launcherBackend

Used when LocalSchedulerBackend is created, started and stopped

listenerBus

LiveListenerBus that is used exclusively when LocalSchedulerBackend is requested to start

Tip

Enable INFO logging level for org.apache.spark.scheduler.local.LocalSchedulerBackend logger to see what happens inside.

Add the following line to conf/log4j.properties:

log4j.logger.org.apache.spark.scheduler.local.LocalSchedulerBackend=INFO

Refer to Logging.

Starting Scheduling Backend — start Method

start(): Unit
Note
start is part of the SchedulerBackend Contract to start the scheduling backend.

start requests the SparkEnv object for the current RpcEnv.

start then creates a LocalEndpoint and requests the RpcEnv to register it as LocalSchedulerBackendEndpoint RPC endpoint.

start requests the LiveListenerBus to post a SparkListenerExecutorAdded event.

In the end, start requests the LauncherBackend to setAppId as the appId and setState as RUNNING.

reviveOffers Method

reviveOffers(): Unit
Note
reviveOffers is part of the SchedulerBackend Contract to…​FIXME.

reviveOffers…​FIXME

killTask Method

killTask(
  taskId: Long,
  executorId: String,
  interruptThread: Boolean,
  reason: String): Unit
Note
killTask is part of the SchedulerBackend Contract to kill a task.

killTask…​FIXME

statusUpdate Method

statusUpdate(
  taskId: Long,
  state: TaskState,
  data: ByteBuffer): Unit
Note
statusUpdate is part of the ExecutorBackend Contract to…​FIXME.

statusUpdate…​FIXME

Stopping Scheduling Backend — stop Method

stop(): Unit
Note
stop is part of the SchedulerBackend Contract to stop a scheduling backend.

stop…​FIXME

User-Defined Class Path for Executors — getUserClasspath Method

getUserClasspath(conf: SparkConf): Seq[URL]

getUserClasspath simply requests the given SparkConf for the spark.executor.extraClassPath configuration property and converts the entries (separated by the system-dependent path separator) to URLs.

Note
getUserClasspath is used exclusively when LocalSchedulerBackend is created.

results matching ""

    No results matching ""