import org.apache.spark.streaming.kafka010.LocationStrategies
val preferredHosts = LocationStrategies.PreferConsistent
LocationStrategy — Preferred Hosts per Topic Partitions
LocationStrategy allows a DirectKafkaInputDStream to request Spark executors to execute Kafka consumers as close topic leaders of topic partitions as possible.
LocationStrategy is used when DirectKafkaInputDStream computes a KafkaRDD for a given batch interval and is a means of distributing processing Kafka records across Spark executors.
| Location Strategy | Description |
|---|---|
PreferBrokers |
Use when executors are on the same nodes as your Kafka brokers. |
PreferConsistent |
Use in most cases as it consistently distributes partitions across all executors. |
PreferFixed |
Use to place particular Accepts a collection of topic partition and host pairs. Any topic partition not specified uses a consistent location. |
|
Note
|
A topic partition is described using Kafka’s TopicPartition. |
You can create a LocationStrategy using LocationStrategies factory object.