checkForStreaming(
plan: LogicalPlan,
outputMode: OutputMode): Unit
UnsupportedOperationChecker
UnsupportedOperationChecker checks whether the logical plan of a streaming query uses supported operations only.
Note: UnsupportedOperationChecker is used only when the internal spark.sql.streaming.unsupportedOperationCheck Spark property is enabled (which it is by default).
Note: The Spark Structured Streaming gitbook focuses solely on the checkForStreaming method.
checkForStreaming Method
checkForStreaming asserts that the following requirements hold:
checkForStreaming…FIXME
checkForStreaming finds all streaming aggregates (i.e. Aggregate logical operators with streaming sources).
Note: The Aggregate logical operator represents the Dataset.groupBy and Dataset.groupByKey operators (and SQL's GROUP BY clause) in a logical query plan.
checkForStreaming asserts that there is at most one streaming aggregation in a streaming query.
Otherwise, checkForStreaming reports an AnalysisException:
Multiple streaming aggregations are not supported with streaming DataFrames/Datasets
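The check above can be modeled without Spark on the classpath. The following is a minimal sketch, not Spark's implementation: the Plan, Source, Aggregate and Project types and the streamingAggregates and check helpers are hypothetical stand-ins invented for illustration. It only shows the idea of collecting aggregates that sit on top of a streaming source and rejecting a plan that has more than one.

```scala
// Toy model only: these types are hypothetical stand-ins, not Spark's classes.
object StreamingAggregateCheckSketch {
  sealed trait Plan {
    def children: Seq[Plan]
    // A node is streaming when any node below it is streaming.
    def isStreaming: Boolean = children.exists(_.isStreaming)
  }
  case class Source(streaming: Boolean) extends Plan {
    val children: Seq[Plan] = Nil
    override def isStreaming: Boolean = streaming
  }
  case class Aggregate(child: Plan) extends Plan {
    val children: Seq[Plan] = Seq(child)
  }
  case class Project(child: Plan) extends Plan {
    val children: Seq[Plan] = Seq(child)
  }

  // Collects every Aggregate node that has a streaming source underneath it.
  def streamingAggregates(plan: Plan): Seq[Aggregate] = {
    val here = plan match {
      case a: Aggregate if a.isStreaming => Seq(a)
      case _                             => Nil
    }
    here ++ plan.children.flatMap(streamingAggregates)
  }

  // Returns Left(message) when the plan has more than one streaming aggregate.
  def check(plan: Plan): Either[String, Unit] =
    if (streamingAggregates(plan).size > 1)
      Left("Multiple streaming aggregations are not supported with streaming DataFrames/Datasets")
    else Right(())
}
```

In this model, a single Aggregate over a streaming Source passes, two stacked Aggregate nodes over a streaming Source are rejected, and any number of aggregates over a batch (non-streaming) Source pass because none of them counts as a streaming aggregate.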
checkForStreaming asserts that a watermark is defined for a streaming aggregation in Append output mode (on at least one of the grouping expressions).
Otherwise, checkForStreaming reports an AnalysisException:
Append output mode not supported when there are streaming aggregations on streaming DataFrames/DataSets without watermark
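The watermark requirement can be sketched in the same Spark-free style. This is an illustrative model under stated assumptions, not Spark's code: OutputMode here is a local toy type, and GroupingColumn, StreamingAggregate and the hasWatermark flag are hypothetical names standing in for a grouping expression that carries an event-time watermark (for example, a column that went through Dataset.withWatermark).

```scala
// Toy model only: all types below are hypothetical, not Spark's classes.
object AppendWatermarkCheckSketch {
  sealed trait OutputMode
  case object Append extends OutputMode
  case object Update extends OutputMode
  case object Complete extends OutputMode

  // hasWatermark stands in for "this grouping expression has a watermark defined".
  case class GroupingColumn(name: String, hasWatermark: Boolean)
  case class StreamingAggregate(groupingColumns: Seq[GroupingColumn])

  // Rejects Append mode when the aggregation has no watermarked grouping column.
  def check(aggregate: Option[StreamingAggregate], mode: OutputMode): Either[String, Unit] =
    (mode, aggregate) match {
      case (Append, Some(agg)) if !agg.groupingColumns.exists(_.hasWatermark) =>
        Left("Append output mode not supported when there are streaming aggregations on " +
          "streaming DataFrames/DataSets without watermark")
      case _ => Right(())
    }
}
```

One watermarked grouping column is enough to pass in this model, which mirrors the "at least one of the grouping expressions" wording; other output modes and queries without a streaming aggregation are not affected by this particular check.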
Caution: FIXME
checkForStreaming counts all FlatMapGroupsWithState logical operators (on streaming Datasets with the isMapGroupsWithState flag disabled).
Note: The FlatMapGroupsWithState logical operator represents the KeyValueGroupedDataset.mapGroupsWithState and KeyValueGroupedDataset.flatMapGroupsWithState operators in a logical query plan.
Note: The FlatMapGroupsWithState.isMapGroupsWithState flag is disabled when…FIXME
checkForStreaming asserts that multiple FlatMapGroupsWithState logical operators are only used when:
- outputMode is Append output mode
- outputMode of the FlatMapGroupsWithState logical operators is also Append output mode
Caution: FIXME Reference to an example in flatMapGroupsWithState
Otherwise, checkForStreaming reports an AnalysisException:
Multiple flatMapGroupsWithStates are not supported when they are not all in append mode or the output mode is not append on a streaming DataFrames/Datasets
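The two conditions for multiple FlatMapGroupsWithState operators can be illustrated with a small Spark-free model. This is a sketch under assumptions, not Spark's implementation: the OutputMode and FlatMapGroupsWithState types below are local, hypothetical stand-ins that keep only the two fields the text mentions, and check is an invented helper.

```scala
// Toy model only: these types are hypothetical stand-ins, not Spark's classes.
object FlatMapGroupsWithStateCheckSketch {
  sealed trait OutputMode
  case object Append extends OutputMode
  case object Update extends OutputMode

  case class FlatMapGroupsWithState(outputMode: OutputMode, isMapGroupsWithState: Boolean)

  private val message =
    "Multiple flatMapGroupsWithStates are not supported when they are not all in append mode " +
      "or the output mode is not append on a streaming DataFrames/Datasets"

  // operators: the FlatMapGroupsWithState nodes found in a streaming plan.
  // queryMode: the output mode of the streaming query itself.
  def check(operators: Seq[FlatMapGroupsWithState], queryMode: OutputMode): Either[String, Unit] = {
    // Only operators with the isMapGroupsWithState flag disabled are counted.
    val counted = operators.filterNot(_.isMapGroupsWithState)
    val allAppend = queryMode == Append && counted.forall(_.outputMode == Append)
    if (counted.size >= 2 && !allAppend) Left(message) else Right(())
  }
}
```

In this model, two operators in Append mode inside an Append query pass, while the same pair is rejected as soon as either the query or one of the operators uses a different output mode. A single counted operator is never rejected by this particular check.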
Caution: FIXME
Note: checkForStreaming is used only when StreamingQueryManager is requested to create a StreamingQueryWrapper (for starting a streaming query), and only when the internal spark.sql.streaming.unsupportedOperationCheck Spark property is enabled (which it is by default).