DataWritingCommandExec Physical Operator

DataWritingCommandExec is a physical operator that acts as the execution environment of a DataWritingCommand logical command at execution time.

DataWritingCommandExec is created exclusively when the BasicOperators execution planning strategy is requested to plan a DataWritingCommand logical command.

When requested for performance metrics, DataWritingCommandExec simply requests the DataWritingCommand for them.
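The delegation can be sketched in plain Scala. Command, Exec, and the metric key below are illustrative stand-ins, not Spark's actual types (the real ones are DataWritingCommand and SQLMetric):

```scala
// Minimal sketch of metrics delegation, with simplified stand-ins for
// Spark's DataWritingCommand and SQLMetric types (names are illustrative).
object MetricsDelegationSketch {
  final class Command {
    // a command declares its own write metrics (here just a counter)
    def metrics: Map[String, Long] = Map("numOutputRows" -> 0L)
  }
  final class Exec(cmd: Command) {
    // the physical operator exposes the command's metrics as-is
    lazy val metrics: Map[String, Long] = cmd.metrics
  }
}
```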

Table 1. DataWritingCommandExec's Internal Properties (e.g. Registries, Counters and Flags)

sideEffectResult

Collection of InternalRows (Seq[InternalRow]) that is the result of executing the DataWritingCommand (with the child SparkPlan).

Used when DataWritingCommandExec is requested to executeCollect, executeToIterator, executeTake and doExecute.
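The point of sideEffectResult is that the write happens at most once, with all the execute* methods reading the cached rows. A minimal sketch of the pattern, using simplified stand-ins for InternalRow and DataWritingCommand (all names are illustrative, not Spark's actual code):

```scala
// Sketch of the side-effect-result pattern: a lazy val runs the write
// command once and caches the rows for every later execute* call.
object SideEffectResultSketch {
  type InternalRow = Seq[Any]

  // hypothetical stand-in for a DataWritingCommand
  final class WriteCommand {
    var runs = 0 // counts how many times the write actually executes
    def run(): Seq[InternalRow] = { runs += 1; Seq(Seq("written")) }
  }

  final class WritingExec(cmd: WriteCommand) {
    // lazy val guarantees the side-effecting write happens at most once,
    // no matter how many of the execute* methods are called afterwards
    lazy val sideEffectResult: Seq[InternalRow] = cmd.run()

    def executeCollect(): Array[InternalRow] = sideEffectResult.toArray
    def executeToIterator: Iterator[InternalRow] = sideEffectResult.iterator
    def executeTake(limit: Int): Array[InternalRow] =
      sideEffectResult.take(limit).toArray
  }
}
```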

Creating DataWritingCommandExec Instance

DataWritingCommandExec takes the following when created:

DataWritingCommand logical command

Child physical operator (SparkPlan)

Executing Physical Operator and Collecting Results — executeCollect Method

executeCollect(): Array[InternalRow]
Note
executeCollect is part of the SparkPlan Contract to execute the physical operator and collect results.

executeCollect simply returns the sideEffectResult (as an Array[InternalRow]).

executeToIterator Method

executeToIterator: Iterator[InternalRow]
Note
executeToIterator is part of the SparkPlan Contract to execute the physical operator and return an iterator over the result rows.

executeToIterator simply returns an iterator over the sideEffectResult.

Taking First N UnsafeRows — executeTake Method

executeTake(limit: Int): Array[InternalRow]
Note
executeTake is part of the SparkPlan Contract to take the first n UnsafeRows.

executeTake simply takes the first limit rows of the sideEffectResult (as an Array[InternalRow]).

Executing Physical Operator (Generating RDD[InternalRow]) — doExecute Method

doExecute(): RDD[InternalRow]
Note
doExecute is part of the SparkPlan Contract to generate the runtime representation of a structured query as a distributed computation over internal binary rows on Apache Spark (i.e. RDD[InternalRow]).

doExecute simply requests the SQLContext for the SparkContext, which is then requested to distribute (parallelize) the sideEffectResult (over 1 partition).
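Since the rows are already computed by the time doExecute runs, "execution" amounts to wrapping them into a one-partition distributed collection. The effect can be mimicked in plain Scala without Spark (FakeRDD stands in for RDD[InternalRow]; all names are illustrative):

```scala
// Mimic of doExecute: the rows are already computed (sideEffectResult),
// so execution just wraps them into a single partition.
object DoExecuteSketch {
  type InternalRow = Seq[Any]
  // stand-in for RDD[InternalRow]: a sequence of partitions
  type FakeRDD = Seq[Seq[InternalRow]]

  // like parallelize(rows, numSlices = 1): all rows land in one partition
  def doExecute(sideEffectResult: Seq[InternalRow]): FakeRDD =
    Seq(sideEffectResult)
}
```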
