KafkaOffsetReader

KafkaOffsetReader is used to query a Kafka cluster for partition offsets.

KafkaOffsetReader is created when:

When requested for the human-readable text representation (aka toString), KafkaOffsetReader simply requests the ConsumerStrategy for one.

Table 1. KafkaOffsetReader’s Options
Name Default Value Description

fetchOffset.numRetries

3

fetchOffset.retryIntervalMs

1000

How long to wait before retries.

Table 2. KafkaOffsetReader’s Internal Registries and Counters
Name Description

consumer

Kafka’s Consumer (with keys and values of Array[Byte] type)

Initialized when KafkaOffsetReader is created.

Used when KafkaOffsetReader:

execContext

groupId

kafkaReaderThread

maxOffsetFetchAttempts

nextId

offsetFetchAttemptIntervalMs

Tip

Enable INFO or DEBUG logging levels for org.apache.spark.sql.kafka010.KafkaOffsetReader to see what happens inside.

Add the following line to conf/log4j.properties:

log4j.logger.org.apache.spark.sql.kafka010.KafkaOffsetReader=DEBUG

Refer to Logging.

Creating Kafka Consumer — createConsumer Internal Method

createConsumer(): Consumer[Array[Byte], Array[Byte]]
Note
createConsumer is used when KafkaOffsetReader is created (and initializes consumer) and resetConsumer

Creating KafkaOffsetReader Instance

KafkaOffsetReader takes the following when created:

  • ConsumerStrategy

  • Kafka parameters (as Map[String, Object])

  • Reader options (as Map[String, String])

  • Prefix for the group id

KafkaOffsetReader initializes the internal registries and counters.

close Method

close(): Unit

close…​FIXME

Note
close is used when…​FIXME

fetchEarliestOffsets Method

fetchEarliestOffsets(): Map[TopicPartition, Long]

fetchEarliestOffsets…​FIXME

Note
fetchEarliestOffsets is used when…​FIXME

fetchEarliestOffsets Method

fetchEarliestOffsets(newPartitions: Seq[TopicPartition]): Map[TopicPartition, Long]

fetchEarliestOffsets…​FIXME

Note
fetchEarliestOffsets is used when…​FIXME

fetchLatestOffsets Method

fetchLatestOffsets(): Map[TopicPartition, Long]

fetchLatestOffsets…​FIXME

Note
fetchLatestOffsets is used when…​FIXME

Fetching (and Pausing) Assigned Kafka TopicPartitions — fetchTopicPartitions Method

fetchTopicPartitions(): Set[TopicPartition]

fetchTopicPartitions uses an UninterruptibleThread thread to do the following:

  1. Requests the Kafka Consumer to poll (fetch data) for the topics and partitions (with 0 timeout)

  2. Requests the Kafka Consumer to get the set of partitions currently assigned

  3. Requests the Kafka Consumer to suspend fetching from the partitions assigned

In the end, fetchTopicPartitions returns the TopicPartitions assigned (and paused).

Note
fetchTopicPartitions is used exclusively when KafkaRelation is requested to build a distributed data scan with column pruning (as a TableScan) through getPartitionOffsets.

nextGroupId Internal Method

nextGroupId(): String

nextGroupId…​FIXME

Note
nextGroupId is used when…​FIXME

resetConsumer Internal Method

resetConsumer(): Unit

resetConsumer…​FIXME

Note
resetConsumer is used when…​FIXME

runUninterruptibly Internal Method

runUninterruptibly[T](body: => T): T

runUninterruptibly…​FIXME

Note
runUninterruptibly is used when…​FIXME

withRetriesWithoutInterrupt Internal Method

withRetriesWithoutInterrupt(body: => Map[TopicPartition, Long]): Map[TopicPartition, Long]

withRetriesWithoutInterrupt…​FIXME

Note
withRetriesWithoutInterrupt is used when…​FIXME

results matching ""

    No results matching ""