Run Analytical Workloads with Nomad Spark Integration

Use Dynamic Executors for Spark Jobs

By default, a Spark application uses a fixed number of executors. Setting spark.dynamicAllocation.enabled to true lets Spark add and remove executors during execution, depending on the number of Spark tasks scheduled to run. As described in Dynamic Resource Allocation, dynamic allocation also requires that spark.shuffle.service.enabled be set to true.
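As an illustration of the two settings above, here is a minimal Python sketch that assembles the matching `--conf` arguments for `spark-submit`. The helper name `dynamic_allocation_conf`, the executor bounds, and the `app.jar` placeholder are assumptions made for this example, not part of the Nomad integration. The four property names are standard Spark configuration keys.

```python
def dynamic_allocation_conf(min_executors=1, max_executors=4):
    """Return spark-submit --conf arguments that enable dynamic allocation.

    Hypothetical helper for illustration; the default bounds are arbitrary.
    """
    if min_executors > max_executors:
        raise ValueError("min_executors must not exceed max_executors")

    props = {
        # Let Spark add and remove executors while the job runs.
        "spark.dynamicAllocation.enabled": "true",
        # Dynamic allocation requires the external shuffle service.
        "spark.shuffle.service.enabled": "true",
        # Optional bounds on how far Spark may scale.
        "spark.dynamicAllocation.minExecutors": str(min_executors),
        "spark.dynamicAllocation.maxExecutors": str(max_executors),
    }

    args = []
    for key, value in props.items():
        args += ["--conf", f"{key}={value}"]
    return args


if __name__ == "__main__":
    # "app.jar" is a placeholder for the application being submitted.
    print(" ".join(["spark-submit"] + dynamic_allocation_conf() + ["app.jar"]))
```

The resulting list can be spliced into whatever `spark-submit` invocation the application already uses. The minimum and maximum executor properties are optional and shown only to indicate where scaling limits would go.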

On Nomad, enabling the shuffle service adds a shuffle service task to the executor task group, so each executor is paired with its own shuffle service.

When an executor exits, its shuffle service keeps running so that it can serve any results the executor produced. Because the executor and the shuffle service belong to the same Nomad task group, the resources allocated to the executor task are not freed until the shuffle service (and the application) has finished.

Next steps

Learn how to integrate Spark with HDFS.