Schedulable Pool

Pool is a Schedulable entity that represents a tree of TaskSetManagers, i.e. it contains a collection of TaskSetManagers or other Pools.

A Pool has a mandatory name, a scheduling mode, and an initial minShare and weight, all of which are defined when it is created.

Note
An instance of Pool is created when TaskSchedulerImpl is initialized.
Note
The TaskScheduler Contract and SchedulableBuilder Contract both require that their entities have a rootPool of type Pool.

increaseRunningTasks Method

Caution
FIXME

decreaseRunningTasks Method

Caution
FIXME

taskSetSchedulingAlgorithm Attribute

Using the scheduling mode (given when a Pool object is created), Pool selects a SchedulingAlgorithm and assigns it to taskSetSchedulingAlgorithm.

It throws an IllegalArgumentException when an unsupported scheduling mode is passed in:

Unsupported spark.scheduler.mode: [schedulingMode]
Tip
Read about the scheduling modes in SchedulingMode.
Note
taskSetSchedulingAlgorithm is used in getSortedTaskSetQueue.
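
The selection described above can be sketched as a pattern match on the scheduling mode. This is a minimal, self-contained illustration and not Spark's actual code: the SchedulingMode objects (including NONE, used here only to trigger the error path), the empty algorithm classes, and the selectAlgorithm helper are all stand-ins assumed for the example. Only the exception type and message come from the text above.

```scala
// Hypothetical stand-ins for Spark's types, for illustration only.
sealed trait SchedulingMode
case object FAIR extends SchedulingMode
case object FIFO extends SchedulingMode
case object NONE extends SchedulingMode // assumed here as an unsupported mode

sealed trait SchedulingAlgorithm
class FairSchedulingAlgorithm extends SchedulingAlgorithm
class FIFOSchedulingAlgorithm extends SchedulingAlgorithm

// Pick the algorithm that matches the mode; reject anything else.
def selectAlgorithm(schedulingMode: SchedulingMode): SchedulingAlgorithm =
  schedulingMode match {
    case FAIR => new FairSchedulingAlgorithm
    case FIFO => new FIFOSchedulingAlgorithm
    case other =>
      throw new IllegalArgumentException(s"Unsupported spark.scheduler.mode: $other")
  }
```

Because the choice is made once from the constructor argument, the resulting algorithm can be kept in an immutable field for the lifetime of the pool.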

Getting TaskSetManagers Sorted — getSortedTaskSetQueue Method

Note
getSortedTaskSetQueue is part of the Schedulable Contract.

getSortedTaskSetQueue sorts all the Schedulables in the schedulableQueue queue using the SchedulingAlgorithm held in the internal taskSetSchedulingAlgorithm attribute.
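
As a rough sketch of that sorting step, the snippet below orders a queue with a pluggable comparison. It assumes the algorithm exposes a two-argument comparator that returns true when the first element should come first. Item, ByPriority and sortedQueue are hypothetical names for this example, not Spark classes.

```scala
// Hypothetical, simplified model; not Spark's actual classes.
final case class Item(name: String, priority: Int)

trait SchedulingAlgorithm {
  // Returns true when s1 should be placed before s2.
  def comparator(s1: Item, s2: Item): Boolean
}

// A toy algorithm that orders by ascending priority value.
object ByPriority extends SchedulingAlgorithm {
  def comparator(s1: Item, s2: Item): Boolean = s1.priority < s2.priority
}

// Sort the queue contents with whatever algorithm the pool holds.
def sortedQueue(queue: Seq[Item], algorithm: SchedulingAlgorithm): Seq[Item] =
  queue.sortWith(algorithm.comparator)

val sorted = sortedQueue(Seq(Item("b", 2), Item("a", 1), Item("c", 3)), ByPriority)
```

Keeping the comparison behind an interface lets the same sorting code serve every scheduling mode.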

Schedulables by Name — schedulableNameToSchedulable Registry

val schedulableNameToSchedulable = new ConcurrentHashMap[String, Schedulable]

schedulableNameToSchedulable is a lookup table of Schedulable objects by their names.

Besides the obvious usage in the housekeeping methods addSchedulable, removeSchedulable, and getSchedulableByName from the Schedulable Contract, it is used only in SparkContext.getPoolForName.

addSchedulable Method

Note
addSchedulable is part of the Schedulable Contract.

addSchedulable adds a Schedulable to the schedulableQueue and schedulableNameToSchedulable.

More importantly, it sets the Schedulable entity’s parent to the Pool itself.

removeSchedulable Method

Note
removeSchedulable is part of the Schedulable Contract.

removeSchedulable removes a Schedulable from the schedulableQueue and schedulableNameToSchedulable.

Note
removeSchedulable is the opposite of the addSchedulable method.
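
The bookkeeping done by addSchedulable and removeSchedulable can be illustrated with a small model. Node and SimplePool are hypothetical classes made up for this sketch, and the use of ConcurrentLinkedQueue for schedulableQueue is an assumption. The text above only fixes the ConcurrentHashMap registry and the fact that the parent is set on add.

```scala
import java.util.concurrent.{ConcurrentHashMap, ConcurrentLinkedQueue}

// Hypothetical, simplified stand-in for a Schedulable.
class Node(val name: String) {
  var parent: Option[Node] = None
}

// Hypothetical, simplified stand-in for a Pool.
class SimplePool(poolName: String) extends Node(poolName) {
  val schedulableQueue = new ConcurrentLinkedQueue[Node]             // assumed queue type
  val schedulableNameToSchedulable = new ConcurrentHashMap[String, Node]

  def addSchedulable(s: Node): Unit = {
    schedulableQueue.add(s)
    schedulableNameToSchedulable.put(s.name, s)
    s.parent = Some(this) // the child now points back at this pool
  }

  def removeSchedulable(s: Node): Unit = {
    schedulableQueue.remove(s)
    schedulableNameToSchedulable.remove(s.name)
  }
}

val root = new SimplePool("root")
val child = new Node("taskset-0")
root.addSchedulable(child)
```

After the call, the child is reachable both by iteration over the queue and by name through the registry. Calling root.removeSchedulable(child) undoes both registrations.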

SchedulingAlgorithm

SchedulingAlgorithm is the interface for a sorting algorithm to sort Schedulables.

There are currently two SchedulingAlgorithms: FIFOSchedulingAlgorithm and FairSchedulingAlgorithm.

FIFOSchedulingAlgorithm

FIFOSchedulingAlgorithm is a scheduling algorithm that compares Schedulables by their priority first and, when equal, by their stageId.

Note
priority and stageId are part of the Schedulable Contract.
Caution
FIXME A picture is worth a thousand words. How to picture the algorithm?
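
In place of a picture, the FIFO comparison can be sketched in a few lines. Entry and fifoComesFirst are hypothetical names for this example. The ascending direction (a lower priority value, then a lower stageId, comes first) is an assumption, since the text above only names the two fields and their order of precedence.

```scala
// Hypothetical record with just the two fields the comparison reads.
final case class Entry(priority: Int, stageId: Int)

// Returns true when s1 should come before s2: compare priority first and
// fall back to stageId only when the priorities are equal.
// Ascending order is an assumption of this sketch.
def fifoComesFirst(s1: Entry, s2: Entry): Boolean =
  if (s1.priority != s2.priority) s1.priority < s2.priority
  else s1.stageId < s2.stageId

val ordered = Seq(Entry(1, 4), Entry(0, 7), Entry(0, 3)).sortWith(fifoComesFirst)
```

In this sample the two entries with priority 0 sort ahead of the entry with priority 1, and between them the one with the smaller stageId goes first.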

FairSchedulingAlgorithm

FairSchedulingAlgorithm is a scheduling algorithm that compares Schedulables by their minShare, runningTasks, and weight.

Note
minShare, runningTasks, and weight are part of the Schedulable Contract.
Figure 1. FairSchedulingAlgorithm

For each input Schedulable, minShareRatio is computed as runningTasks divided by minShare (with minShare treated as at least 1), while taskToWeightRatio is runningTasks divided by weight.
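
The two ratios can be written out as follows. Snapshot and the two helper functions are hypothetical names for this sketch. The floor of 1 is applied to minShare, which is how the "at least 1" condition is read here.

```scala
// Hypothetical snapshot of the three values the ratios are built from.
final case class Snapshot(runningTasks: Int, minShare: Int, weight: Int)

// runningTasks / minShare, with minShare floored at 1 so that a pool
// with minShare = 0 does not cause a division by zero.
def minShareRatio(s: Snapshot): Double =
  s.runningTasks.toDouble / math.max(s.minShare, 1).toDouble

// runningTasks / weight
def taskToWeightRatio(s: Snapshot): Double =
  s.runningTasks.toDouble / s.weight.toDouble

val p = Snapshot(runningTasks = 3, minShare = 0, weight = 2)
val ratios = (minShareRatio(p), taskToWeightRatio(p))
```

For the sample values, the floored minShare gives a minShareRatio of 3.0 and the weight of 2 gives a taskToWeightRatio of 1.5.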

Finding Schedulable by Name — getSchedulableByName Method

getSchedulableByName(schedulableName: String): Schedulable
Note
getSchedulableByName is part of the Schedulable Contract to find a Schedulable by name.

getSchedulableByName…​FIXME
