Batch Processing Time
Batch Processing Time (aka Batch Timeout Threshold) is the processing time (processing timestamp) of the current streaming batch.
The following standard functions (and their Catalyst expressions) allow accessing the batch processing time in Micro-Batch Stream Processing:
- now, current_timestamp, and unix_timestamp functions (CurrentTimestamp)
- current_date function (CurrentDate)
Note
CurrentTimestamp or CurrentDate expressions are not supported in Continuous Stream Processing.
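As an illustration of the functions listed above, the following is a minimal sketch that adds the batch processing time to the rows of a streaming query. It assumes a local Spark session and the built-in rate source. The object name, the helper withBatchTime, and the column names batch_time and batch_date are made up for this example. Every row printed for a given micro-batch should carry the same batch_time value.

```scala
import org.apache.spark.sql.{DataFrame, SparkSession}
import org.apache.spark.sql.functions.{current_date, current_timestamp}
import org.apache.spark.sql.streaming.Trigger

// Hypothetical demo object; the names below are illustrative only.
object BatchProcessingTimeDemo {

  // Adds the batch processing time (and date) as extra columns.
  def withBatchTime(df: DataFrame): DataFrame =
    df.withColumn("batch_time", current_timestamp())
      .withColumn("batch_date", current_date())

  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .master("local[*]")
      .appName("batch-processing-time-demo")
      .getOrCreate()

    // The rate source generates (timestamp, value) rows continuously.
    val rates = spark.readStream
      .format("rate")
      .option("rowsPerSecond", "5")
      .load()

    // Micro-batch execution with a fixed processing-time trigger.
    val query = withBatchTime(rates).writeStream
      .format("console")
      .option("truncate", "false")
      .trigger(Trigger.ProcessingTime("5 seconds"))
      .start()

    // Let a few micro-batches run, then shut down.
    query.awaitTermination(20000L)
    query.stop()
    spark.stop()
  }
}
```

Comparing the timestamp column of the rate source (the event generation time) with batch_time in the console output makes the difference between per-row time and per-batch time visible.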
Internals
GroupStateImpl is given the batch processing time when created for a streaming query (that is actually the batch processing time of the FlatMapGroupsWithStateExec physical operator).
When created, the FlatMapGroupsWithStateExec physical operator has the processing time undefined. The state preparation rule then sets it to the current timestamp for every streaming batch.
The current timestamp (and other batch-specific configurations) is given as the OffsetSeqMetadata (as part of the query planning phase) when a stream execution engine does the following:
- MicroBatchExecution is requested to construct a next streaming micro-batch in Micro-Batch Stream Processing
- In Continuous Stream Processing the base StreamExecution is requested to run stream processing and initializes OffsetSeqMetadata to 0s.
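The flow described in this section can be modelled with a few lines of plain Scala. This is a simplified sketch and not Spark's code: BatchMetadata, StatefulOperator, prepareForBatch, and nextMicroBatchMetadata are hypothetical stand-ins for OffsetSeqMetadata, FlatMapGroupsWithStateExec, the state preparation rule, and the micro-batch construction step, respectively.

```scala
// Simplified stand-in for OffsetSeqMetadata (hypothetical, not Spark's class).
// Both fields default to 0, which mirrors the continuous-processing case.
final case class BatchMetadata(batchWatermarkMs: Long = 0L, batchTimestampMs: Long = 0L)

// Simplified stand-in for a stateful physical operator.
// The batch timestamp is undefined (None) when the operator is created.
final case class StatefulOperator(name: String, batchTimestampMs: Option[Long] = None)

object BatchTimestampModel {

  // Plays the role of the state preparation rule: copies the batch
  // timestamp from the batch metadata into the operator.
  def prepareForBatch(op: StatefulOperator, metadata: BatchMetadata): StatefulOperator =
    op.copy(batchTimestampMs = Some(metadata.batchTimestampMs))

  // Plays the role of constructing the next micro-batch: the engine
  // records the given clock reading in the metadata.
  def nextMicroBatchMetadata(previous: BatchMetadata, nowMs: Long): BatchMetadata =
    previous.copy(batchTimestampMs = nowMs)

  def main(args: Array[String]): Unit = {
    val op = StatefulOperator("flatMapGroupsWithState")
    val initial = BatchMetadata()

    // Micro-batch case: the timestamp is refreshed for every batch.
    val batch0 = nextMicroBatchMetadata(initial, System.currentTimeMillis())
    println(prepareForBatch(op, batch0))

    // Continuous case: the metadata stays at its initial zeros.
    println(prepareForBatch(op, initial))
  }
}
```

Keeping the timestamp as an Option in the operator makes the "undefined until planned for a batch" state explicit, and the operator is copied per batch rather than mutated.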