KafkaContinuousReader — ContinuousReader for Kafka Data Source in Continuous Stream Processing

KafkaContinuousReader is created exclusively when KafkaSourceProvider is requested to create a ContinuousReader.

KafkaContinuousReader uses the kafkaConsumer.pollTimeoutMs configuration parameter (default: 512) for the KafkaContinuousInputPartitions it creates when requested to planInputPartitions.
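
For example, the poll timeout can be overridden through the source options of a Kafka streaming query. A minimal sketch (the broker address, topic name, and timeout value are placeholders):

import org.apache.spark.sql.SparkSession
import org.apache.spark.sql.streaming.Trigger

val spark = SparkSession.builder
  .master("local[*]")
  .appName("KafkaContinuousDemo")
  .getOrCreate()

val records = spark.readStream
  .format("kafka")
  .option("kafka.bootstrap.servers", "localhost:9092") // placeholder broker
  .option("subscribe", "demo-topic")                   // placeholder topic
  .option("kafkaConsumer.pollTimeoutMs", "1024")       // override the 512 ms default
  .load()

val query = records.writeStream
  .format("console")
  .trigger(Trigger.Continuous("1 second")) // continuous processing => KafkaContinuousReader
  .start()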

Tip

Enable INFO or WARN logging levels for the org.apache.spark.sql.kafka010.KafkaContinuousReader logger to see what happens inside.

Add the following line to conf/log4j.properties:

log4j.logger.org.apache.spark.sql.kafka010.KafkaContinuousReader=INFO

Refer to Logging.

Creating KafkaContinuousReader Instance

KafkaContinuousReader takes the following to be created:

  • KafkaOffsetReader

  • Kafka parameters (as java.util.Map[String, Object])

  • Source options (as Map[String, String])

  • Metadata path

  • Initial offsets

  • failOnDataLoss flag
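
The following is a simplified sketch of the constructor shape only; the KafkaOffsetReader and KafkaOffsetRangeLimit traits below are stand-ins for the internal Spark classes, not the real definitions:

import java.{util => ju}

// Stand-ins for the internal Spark classes (hypothetical placeholders)
trait KafkaOffsetReader
trait KafkaOffsetRangeLimit

// A simplified mirror of the KafkaContinuousReader constructor parameters
class KafkaContinuousReaderSketch(
    offsetReader: KafkaOffsetReader,       // reads offsets from Kafka
    kafkaParams: ju.Map[String, Object],   // Kafka consumer parameters
    sourceOptions: Map[String, String],    // source options of the streaming query
    metadataPath: String,                  // checkpoint metadata location
    initialOffsets: KafkaOffsetRangeLimit, // where to start reading
    failOnDataLoss: Boolean)               // fail vs. warn on data loss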

Plan Input Partitions — planInputPartitions Method

planInputPartitions(): java.util.List[InputPartition[InternalRow]]
Note
planInputPartitions is part of the DataSourceReader contract in Spark SQL to create the InputPartitions to use as RDD partitions (when the DataSourceV2ScanExec physical operator is requested for the partitions of the input RDD).

planInputPartitions…​FIXME
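
Conceptually, planInputPartitions creates one input partition per Kafka TopicPartition to read from, each carrying its start offset, the poll timeout, and the failOnDataLoss flag. A minimal sketch of that shape with illustrative names (ContinuousInputPartitionSketch is hypothetical, not the actual KafkaContinuousInputPartition class):

import org.apache.kafka.common.TopicPartition

// One input partition per Kafka TopicPartition (illustrative names)
final case class ContinuousInputPartitionSketch(
    topicPartition: TopicPartition,
    startOffset: Long,
    pollTimeoutMs: Long,
    failOnDataLoss: Boolean)

def planInputPartitionsSketch(
    startOffsets: Map[TopicPartition, Long],
    pollTimeoutMs: Long,
    failOnDataLoss: Boolean): Seq[ContinuousInputPartitionSketch] =
  startOffsets.toSeq.map { case (tp, offset) =>
    ContinuousInputPartitionSketch(tp, offset, pollTimeoutMs, failOnDataLoss)
  }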

setStartOffset Method

setStartOffset(
  start: Optional[Offset]): Unit
Note
setStartOffset is part of the ContinuousReader Contract to…​FIXME.

setStartOffset…​FIXME
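
A minimal sketch of the expected behaviour, assuming an offset variable that holds the reader's start position and a hypothetical fetchInitialOffsets helper standing in for resolving the initial offsets (e.g. earliest or latest) through the KafkaOffsetReader:

import java.util.Optional

// The reader's start position (Long stands in for the internal Offset type)
var offset: Option[Long] = None

def fetchInitialOffsets(): Long = 0L // hypothetical placeholder

def setStartOffsetSketch(start: Optional[Long]): Unit = {
  // Use the given offset when restarting from a checkpoint;
  // otherwise resolve the initial offsets from Kafka.
  offset = Some(if (start.isPresent) start.get else fetchInitialOffsets())
}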

deserializeOffset Method

deserializeOffset(
  json: String): Offset
Note
deserializeOffset is part of the ContinuousReader Contract to…​FIXME.

deserializeOffset…​FIXME
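
Conceptually, deserializeOffset turns the JSON string persisted in the offset log back into a typed offset. A minimal sketch, assuming a simplified flat JSON format such as {"topic-0":10,"topic-1":42} (Spark's actual format nests partition offsets under the topic name):

import org.apache.kafka.common.TopicPartition

// Illustrative parsing only: extract topic, partition, and offset
// from each "topic-partition":offset entry of the simplified format
def deserializeOffsetSketch(json: String): Map[TopicPartition, Long] = {
  val Entry = """"([^"]+)-(\d+)":(\d+)""".r
  Entry.findAllMatchIn(json).map { m =>
    new TopicPartition(m.group(1), m.group(2).toInt) -> m.group(3).toLong
  }.toMap
}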

mergeOffsets Method

mergeOffsets(
  offsets: Array[PartitionOffset]): Offset
Note
mergeOffsets is part of the ContinuousReader Contract to…​FIXME.

mergeOffsets…​FIXME
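
Conceptually, mergeOffsets collects the per-partition offsets reported by the continuous tasks into a single global offset with one entry per TopicPartition. A minimal sketch with illustrative names (PartitionOffsetSketch is hypothetical, not the actual KafkaSourcePartitionOffset class):

import org.apache.kafka.common.TopicPartition

// The offset a single continuous task reports for its TopicPartition
final case class PartitionOffsetSketch(topicPartition: TopicPartition, offset: Long)

// Merge the per-task offsets into one global offset (one entry per partition)
def mergeOffsetsSketch(offsets: Array[PartitionOffsetSketch]): Map[TopicPartition, Long] =
  offsets.map(po => po.topicPartition -> po.offset).toMap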
