Exercise: Causing Stage to Fail
The example shows how Spark re-executes a stage in case of stage failure.
Start a Spark cluster, e.g. 1-node Hadoop YARN.
// 2-stage job -- it _appears_ that a stage can be failed only when there is a shuffle sc.parallelize(0 to 3e3.toInt, 2).map(n => (n % 2, n)).groupByKey.count
Use 2 executors at least so you can kill one and keep the application up and running (on one executor).
YARN_CONF_DIR=hadoop-conf ./bin/spark-shell --master yarn \ -c spark.shuffle.service.enabled=true \ --num-executors 2