GroupState — State Per Group in Stateful Streaming Aggregation

GroupState is the contract for working with a state (of type S) per group for arbitrary stateful aggregation (using mapGroupsWithState or flatMapGroupsWithState operators).

Note
GroupStateImpl is the one and only implementation of GroupState available.

GroupState Contract

package org.apache.spark.sql.streaming

trait GroupState[S] extends LogicalGroupState[S] {
  def exists: Boolean
  def get: S
  def getOption: Option[S]
  def update(newState: S): Unit
  def remove(): Unit
  def hasTimedOut: Boolean
  def setTimeoutDuration(durationMs: Long): Unit
  def setTimeoutDuration(duration: String): Unit
  def setTimeoutTimestamp(timestampMs: Long): Unit
  def setTimeoutTimestamp(timestampMs: Long, additionalDuration: String): Unit
  def setTimeoutTimestamp(timestamp: java.sql.Date): Unit
  def setTimeoutTimestamp(timestamp: java.sql.Date, additionalDuration: String): Unit
}
Table 1. GroupState Contract
Method Description

exists

get

Gives the state

getOption

Gives the state as Some(…​) if available or None

update

Replaces the state with a new state (per group)

remove

hasTimedOut

results matching ""

    No results matching ""