KafkaContinuousReader — ContinuousReader for Kafka Data Source in Continuous Stream Processing

KafkaContinuousReader is created exclusively when KafkaSourceProvider is requested to create a ContinuousReader.

KafkaContinuousReader uses the kafkaConsumer.pollTimeoutMs configuration parameter (default: 512) for the KafkaContinuousInputPartitions it creates when requested to planInputPartitions.
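
For example, the poll timeout can be overridden through the source options of a Kafka streaming query. A minimal sketch (the broker address, topic name, and timeout value are placeholders):

import org.apache.spark.sql.SparkSession
import org.apache.spark.sql.streaming.Trigger

val spark = SparkSession.builder
  .master("local[*]")
  .appName("KafkaContinuousDemo")
  .getOrCreate()

val records = spark.readStream
  .format("kafka")
  .option("kafka.bootstrap.servers", "localhost:9092") // placeholder broker
  .option("subscribe", "demo-topic")                   // placeholder topic
  .option("kafkaConsumer.pollTimeoutMs", "1024")       // override the 512 ms default
  .load()

val query = records.writeStream
  .format("console")
  .trigger(Trigger.Continuous("1 second")) // continuous processing => KafkaContinuousReader
  .start()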

Tip

Enable INFO or WARN logging levels for the org.apache.spark.sql.kafka010.KafkaContinuousReader logger to see what happens inside.

Add the following line to conf/log4j.properties:

log4j.logger.org.apache.spark.sql.kafka010.KafkaContinuousReader=INFO

Refer to Logging.

Creating KafkaContinuousReader Instance

KafkaContinuousReader takes the following to be created:

  • KafkaOffsetReader

  • Kafka parameters (as java.util.Map[String, Object])

  • Source options (as Map[String, String])

  • Metadata path

  • Initial offsets

  • failOnDataLoss flag
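
The following is a simplified sketch of the constructor shape only; the KafkaOffsetReader and KafkaOffsetRangeLimit traits below are stand-ins for the internal Spark classes, not the real definitions:

import java.{util => ju}

// Stand-ins for the internal Spark classes (hypothetical placeholders)
trait KafkaOffsetReader
trait KafkaOffsetRangeLimit

// A simplified mirror of the KafkaContinuousReader constructor parameters
class KafkaContinuousReaderSketch(
    offsetReader: KafkaOffsetReader,       // reads offsets from Kafka
    kafkaParams: ju.Map[String, Object],   // Kafka consumer parameters
    sourceOptions: Map[String, String],    // source options of the streaming query
    metadataPath: String,                  // checkpoint metadata location
    initialOffsets: KafkaOffsetRangeLimit, // where to start reading
    failOnDataLoss: Boolean)               // fail vs. warn on data loss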

Plan Input Partitions — planInputPartitions Method

planInputPartitions(): java.util.List[InputPartition[InternalRow]]
Note
planInputPartitions is part of the DataSourceReader contract in Spark SQL to create the InputPartitions to use as RDD partitions (when the DataSourceV2ScanExec physical operator is requested for the partitions of the input RDD).

planInputPartitions…​FIXME
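
Conceptually, planInputPartitions creates one input partition per Kafka TopicPartition to read from, each carrying its start offset, the poll timeout, and the failOnDataLoss flag. A minimal sketch of that shape with illustrative names (ContinuousInputPartitionSketch is hypothetical, not the actual KafkaContinuousInputPartition class):

import org.apache.kafka.common.TopicPartition

// One input partition per Kafka TopicPartition (illustrative names)
final case class ContinuousInputPartitionSketch(
    topicPartition: TopicPartition,
    startOffset: Long,
    pollTimeoutMs: Long,
    failOnDataLoss: Boolean)

def planInputPartitionsSketch(
    startOffsets: Map[TopicPartition, Long],
    pollTimeoutMs: Long,
    failOnDataLoss: Boolean): Seq[ContinuousInputPartitionSketch] =
  startOffsets.toSeq.map { case (tp, offset) =>
    ContinuousInputPartitionSketch(tp, offset, pollTimeoutMs, failOnDataLoss)
  }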

setStartOffset Method

setStartOffset(
  start: Optional[Offset]): Unit
Note
setStartOffset is part of the ContinuousReader Contract to…​FIXME.

setStartOffset…​FIXME
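
A minimal sketch of the expected behaviour, assuming an offset variable that holds the reader's start position and a hypothetical fetchInitialOffsets helper standing in for resolving the initial offsets (e.g. earliest or latest) through the KafkaOffsetReader:

import java.util.Optional

// The reader's start position (Long stands in for the internal Offset type)
var offset: Option[Long] = None

def fetchInitialOffsets(): Long = 0L // hypothetical placeholder

def setStartOffsetSketch(start: Optional[Long]): Unit = {
  // Use the given offset when restarting from a checkpoint;
  // otherwise resolve the initial offsets from Kafka.
  offset = Some(if (start.isPresent) start.get else fetchInitialOffsets())
}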

deserializeOffset Method

deserializeOffset(
  json: String): Offset
Note
deserializeOffset is part of the ContinuousReader Contract to…​FIXME.

deserializeOffset…​FIXME
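
Conceptually, deserializeOffset turns the JSON string persisted in the offset log back into a typed offset. A minimal sketch, assuming a simplified flat JSON format such as {"topic-0":10,"topic-1":42} (Spark's actual format nests partition offsets under the topic name):

import org.apache.kafka.common.TopicPartition

// Illustrative parsing only: extract topic, partition, and offset
// from each "topic-partition":offset entry of the simplified format
def deserializeOffsetSketch(json: String): Map[TopicPartition, Long] = {
  val Entry = """"([^"]+)-(\d+)":(\d+)""".r
  Entry.findAllMatchIn(json).map { m =>
    new TopicPartition(m.group(1), m.group(2).toInt) -> m.group(3).toLong
  }.toMap
}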

mergeOffsets Method

mergeOffsets(
  offsets: Array[PartitionOffset]): Offset
Note
mergeOffsets is part of the ContinuousReader Contract to…​FIXME.

mergeOffsets…​FIXME
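
Conceptually, mergeOffsets collects the per-partition offsets reported by the continuous tasks into a single global offset with one entry per TopicPartition. A minimal sketch with illustrative names (PartitionOffsetSketch is hypothetical, not the actual KafkaSourcePartitionOffset class):

import org.apache.kafka.common.TopicPartition

// The offset a single continuous task reports for its TopicPartition
final case class PartitionOffsetSketch(topicPartition: TopicPartition, offset: Long)

// Merge the per-task offsets into one global offset (one entry per partition)
def mergeOffsetsSketch(offsets: Array[PartitionOffsetSketch]): Map[TopicPartition, Long] =
  offsets.map(po => po.topicPartition -> po.offset).toMap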
