log4j.logger.org.apache.spark.sql.kafka010.KafkaOffsetReader=DEBUG
KafkaOffsetReader
KafkaOffsetReader
is used to query a Kafka cluster for partition offsets.
KafkaOffsetReader
is created when:
-
KafkaRelation
is requested to build a distributed data scan with column pruning (as a TableScan) (to get the initial partition offsets) -
(Spark Structured Streaming)
KafkaSourceProvider
is requested tocreateSource
andcreateContinuousReader
When requested for the human-readable text representation (aka toString
), KafkaOffsetReader
simply requests the ConsumerStrategy for one.
Name | Default Value | Description |
---|---|---|
|
||
|
How long to wait before retries. |
Name | Description |
---|---|
|
Kafka’s Consumer (with keys and values of Initialized when Used when |
|
|
|
|
|
|
|
|
|
|
|
Tip
|
Enable Add the following line to Refer to Logging. |
Creating Kafka Consumer — createConsumer
Internal Method
createConsumer(): Consumer[Array[Byte], Array[Byte]]
createConsumer
requests the ConsumerStrategy to create a Kafka Consumer with driverKafkaParams and new generated group.id Kafka property.
Note
|
createConsumer is used when KafkaOffsetReader is created (and initializes consumer) and resetConsumer
|
Creating KafkaOffsetReader Instance
KafkaOffsetReader
takes the following when created:
KafkaOffsetReader
initializes the internal registries and counters.
fetchEarliestOffsets
Method
fetchEarliestOffsets(): Map[TopicPartition, Long]
fetchEarliestOffsets
…FIXME
Note
|
fetchEarliestOffsets is used when…FIXME
|
fetchEarliestOffsets
Method
fetchEarliestOffsets(newPartitions: Seq[TopicPartition]): Map[TopicPartition, Long]
fetchEarliestOffsets
…FIXME
Note
|
fetchEarliestOffsets is used when…FIXME
|
fetchLatestOffsets
Method
fetchLatestOffsets(): Map[TopicPartition, Long]
fetchLatestOffsets
…FIXME
Note
|
fetchLatestOffsets is used when…FIXME
|
Fetching (and Pausing) Assigned Kafka TopicPartitions — fetchTopicPartitions
Method
fetchTopicPartitions(): Set[TopicPartition]
fetchTopicPartitions
uses an UninterruptibleThread thread to do the following:
-
Requests the Kafka Consumer to poll (fetch data) for the topics and partitions (with
0
timeout) -
Requests the Kafka Consumer to get the set of partitions currently assigned
-
Requests the Kafka Consumer to suspend fetching from the partitions assigned
In the end, fetchTopicPartitions
returns the TopicPartitions
assigned (and paused).
Note
|
fetchTopicPartitions is used exclusively when KafkaRelation is requested to build a distributed data scan with column pruning (as a TableScan) through getPartitionOffsets.
|
nextGroupId
Internal Method
nextGroupId(): String
nextGroupId
…FIXME
Note
|
nextGroupId is used when…FIXME
|
resetConsumer
Internal Method
resetConsumer(): Unit
resetConsumer
…FIXME
Note
|
resetConsumer is used when…FIXME
|