EventTimeStatsAccum Accumulator — Event-Time Column Statistics for EventTimeWatermarkExec Physical Operator

EventTimeStatsAccum is a Spark accumulator that is used for the statistics of the event-time column (that EventTimeWatermarkExec physical operator uses for event-time watermark):

  • Maximum value

  • Minimum value

  • Average value

  • Number of updates (count)

EventTimeStatsAccum is created and registered exclusively for EventTimeWatermarkExec physical operator.

Note

When EventTimeWatermarkExec physical operator is requested to execute and generate a recipe for a distributed computation (as a RDD[InternalRow]), every task simply adds the values of the event-time watermark column to the EventTimeStatsAccum accumulator.

As per design of Spark accumulators in Apache Spark, accumulator updates are automatically sent out (propagated) from tasks to the driver every heartbeat and then they are accumulated together.

EventTimeStatsAccum takes a single EventTimeStats to be created (default: zero).

Accumulating Value — add Method

add(v: Long): Unit
Note
add is part of the AccumulatorV2 Contract to add (accumulate) a given value.

add simply requests the EventTimeStats to add the given v value.

Note
add is used exclusively when EventTimeWatermarkExec physical operator is requested to execute and generate a recipe for a distributed computation (as a RDD[InternalRow]).

EventTimeStats

EventTimeStats is a Scala case class for the event-time column statistics.

EventTimeStats defines a special value zero with the following values:

  • Long.MinValue for the max

  • Long.MaxValue for the min

  • 0.0 for the avg

  • 0L for the count

EventTimeStats.add Method

add(eventTime: Long): Unit

add simply updates the event-time column statistics per given eventTime.

Note
add is used exclusively when EventTimeStatsAccum is requested to accumulate the value of an event-time column.

EventTimeStats.merge Method

merge(that: EventTimeStats): Unit

merge…​FIXME

Note
merge is used when…​FIXME

results matching ""

    No results matching ""