doConsume(ctx: CodegenContext, input: Seq[ExprCode], row: ExprCode): String
SerializeFromObjectExec Unary Physical Operator
SerializeFromObjectExec is a unary physical operator (i.e. with one child physical operator) that supports Java code generation with the doProduce, doConsume and inputRDDs methods.
SerializeFromObjectExec is an ObjectConsumerExec.
SerializeFromObjectExec is created exclusively when the BasicOperators execution planning strategy is requested to plan a SerializeFromObject logical operator.
SerializeFromObjectExec uses the child physical operator when requested for the input RDDs and the outputPartitioning.
SerializeFromObjectExec uses the serializer for the output schema attributes.
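One way to see the operator in a plan is to run a typed transformation and search the physical plan for it. The following is a minimal sketch, assuming Spark SQL is on the classpath and a local SparkSession is acceptable; the object name, the sample Dataset and the printed properties are illustrative choices, not part of the operator itself.

```scala
import org.apache.spark.sql.SparkSession
import org.apache.spark.sql.execution.SerializeFromObjectExec

// Hypothetical demo object (the name is an assumption)
object FindSerializeFromObjectExec {

  // A typed map over a Dataset is planned with a SerializeFromObject
  // logical operator, so the physical plan should contain the operator
  def serializeOperators(spark: SparkSession): Seq[SerializeFromObjectExec] = {
    import spark.implicits._
    val ds = Seq(1, 2, 3).toDS().map(_ + 1)
    ds.queryExecution.executedPlan.collect {
      case s: SerializeFromObjectExec => s
    }
  }

  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .master("local[1]")
      .appName("serialize-from-object-demo")
      .getOrCreate()
    try {
      serializeOperators(spark).foreach { op =>
        println(op.child.nodeName)     // the single child physical operator
        println(op.serializer)         // serializer expressions
        println(op.output)             // output attributes derived from the serializer
        println(op.outputPartitioning) // taken from the child
      }
    } finally {
      spark.stop()
    }
  }
}
```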
Creating SerializeFromObjectExec Instance
SerializeFromObjectExec takes the following when created:
- Serializer (as Seq[NamedExpression])
- Child physical operator (that supports Java code generation)
Generating Java Source Code for Consume Path in Whole-Stage Code Generation — doConsume
Method
Note: doConsume is part of the CodegenSupport contract to generate the Java source code for the consume path in Whole-Stage Code Generation.
doConsume …FIXME
Generating Java Source Code for Produce Path in Whole-Stage Code Generation — doProduce
Method
doProduce(ctx: CodegenContext): String
Note: doProduce is part of the CodegenSupport contract to generate the Java source code for the produce path in Whole-Stage Code Generation.
doProduce …FIXME
Executing Physical Operator (Generating RDD[InternalRow]) — doExecute
Method
doExecute(): RDD[InternalRow]
Note: doExecute is part of the SparkPlan contract to generate the runtime representation of a structured query as a distributed computation over internal binary rows on Apache Spark (i.e. RDD[InternalRow]).
doExecute requests the child physical operator to execute (that triggers physical query planning and generates an RDD[InternalRow]) and transforms it by executing the following function on the internal rows of every partition, together with the partition index (using RDD.mapPartitionsWithIndexInternal, which creates another RDD):
- Creates an UnsafeProjection for the serializer
- Requests the UnsafeProjection to initialize (for the partition index)
- Executes the UnsafeProjection on all internal binary rows in the partition
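The shape of that per-partition function can be modelled without Spark. The sketch below is a plain-Scala analogy only: ToyProjection is a hypothetical stand-in for UnsafeProjection, partitions are ordinary sequences, and mapPartitionsWithIndex is a local helper written for this example rather than the RDD method. It shows the three steps in order: create one projection per partition, initialize it with the partition index, then map it over the partition's rows.

```scala
// Hypothetical stand-in for UnsafeProjection (NOT the Spark class):
// it remembers the partition index and converts every input row.
final class ToyProjection(serializer: Int => String) extends (Int => String) {
  private var partitionIndex: Int = -1
  def initialize(index: Int): Unit = partitionIndex = index
  def apply(row: Int): String = s"p$partitionIndex:${serializer(row)}"
}

object DoExecuteSketch {

  // Local model of "map every partition together with its index";
  // a partition is just a Seq here, not an RDD partition.
  def mapPartitionsWithIndex[A, B](partitions: Seq[Seq[A]])(
      f: (Int, Iterator[A]) => Iterator[B]): Seq[Seq[B]] =
    partitions.zipWithIndex.map { case (rows, index) =>
      f(index, rows.iterator).toList
    }

  // Mirrors the three steps described above for every partition
  def execute(childPartitions: Seq[Seq[Int]]): Seq[Seq[String]] =
    mapPartitionsWithIndex(childPartitions) { (index, rows) =>
      val projection = new ToyProjection(n => s"row($n)") // 1. create the projection
      projection.initialize(index)                        // 2. initialize with the partition index
      rows.map(projection)                                // 3. apply it to every row
    }

  def main(args: Array[String]): Unit =
    execute(Seq(Seq(1, 2), Seq(3))).foreach(println)
}
```

Creating the projection inside the function, rather than once outside it, gives every partition its own instance, which matters when the projection carries per-partition state such as the index.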
Note: doExecute (by RDD.mapPartitionsWithIndexInternal) adds a new MapPartitionsRDD to the RDD lineage. Use RDD.toDebugString to see the additional MapPartitionsRDD.
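The lineage can be inspected as follows. This is a hedged sketch, assuming Spark SQL is on the classpath; the object name and the sample Dataset are illustrative. It executes the SerializeFromObjectExec node directly instead of the whole query, because with whole-stage code generation enabled the top-level plan usually runs through generated code rather than through this operator's doExecute.

```scala
import org.apache.spark.sql.SparkSession
import org.apache.spark.sql.execution.SerializeFromObjectExec

// Hypothetical demo object (the name is an assumption)
object SerializeFromObjectLineage {

  // Returns the RDD lineage produced by executing the operator itself
  def lineage(spark: SparkSession): String = {
    import spark.implicits._
    val plan = Seq(1, 2, 3).toDS().map(_ + 1).queryExecution.executedPlan
    val op = plan.collectFirst { case s: SerializeFromObjectExec => s }
      .getOrElse(sys.error("No SerializeFromObjectExec in the physical plan"))
    // execute() calls doExecute, which wraps the child's RDD
    op.execute().toDebugString
  }

  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .master("local[1]")
      .appName("serialize-from-object-lineage")
      .getOrCreate()
    try println(lineage(spark))
    finally spark.stop()
  }
}
```

The printed lineage lists one RDD per line; the exact RDD ids and the number of MapPartitionsRDD entries depend on the child operators and the Spark version.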