SerializeFromObjectExec Unary Physical Operator

SerializeFromObjectExec is a unary physical operator (i.e. with one child physical operator) that supports Java code generation through the doProduce, doConsume and inputRDDs methods.
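One way to check the code generation support is to look the operator up in a physical plan and print the generated Java code. The following is a minimal sketch that assumes a local SparkSession; the names spark, q and exec are made up for illustration only.

```scala
import org.apache.spark.sql.SparkSession
import org.apache.spark.sql.execution.SerializeFromObjectExec
import org.apache.spark.sql.execution.debug._

// Assumption: a throwaway local session (spark-shell already provides `spark`)
val spark = SparkSession.builder().master("local[*]").appName("codegen-demo").getOrCreate()
import spark.implicits._

// A typed transformation that is planned with SerializeFromObjectExec
val q = Seq(1, 2, 3).toDS.map((n: Int) => n + 1)

// Find the operator in the physical plan (before preparation rules are applied)
val exec = q.queryExecution.sparkPlan.collectFirst {
  case s: SerializeFromObjectExec => s
}.get

// supportCodegen comes from CodegenSupport
println(exec.supportCodegen)

// Prints the Java source code generated for the whole-stage codegen subtrees
q.debugCodegen()
```

The printed Java code covers the whole codegen stage, so the part contributed by SerializeFromObjectExec is interleaved with the code of the other operators in the same stage.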

SerializeFromObjectExec is an ObjectConsumerExec.

SerializeFromObjectExec is created exclusively when BasicOperators execution planning strategy is requested to plan a SerializeFromObject logical operator.

SerializeFromObjectExec uses the child physical operator when requested for the input RDDs and the outputPartitioning.

SerializeFromObjectExec uses the serializer for the output schema attributes.
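The properties above can be inspected on a live query. The following is a minimal sketch, assuming a local SparkSession and a Dataset.map query (any typed transformation that adds a SerializeFromObject logical operator should work); spark, q and exec are illustrative names.

```scala
import org.apache.spark.sql.SparkSession
import org.apache.spark.sql.execution.SerializeFromObjectExec

// Assumption: a throwaway local session (spark-shell already provides `spark`)
val spark = SparkSession.builder().master("local[*]").appName("serialize-demo").getOrCreate()
import spark.implicits._

// Dataset.map adds a SerializeFromObject logical operator to the logical plan
val q = Seq(1, 2, 3).toDS.map((n: Int) => n + 1)

// BasicOperators plans it as a SerializeFromObjectExec physical operator
val exec = q.queryExecution.sparkPlan.collectFirst {
  case s: SerializeFromObjectExec => s
}.get

// The serializer expressions and the output attributes derived from them
println(exec.serializer)
println(exec.output)

// The output partitioning is taken from the child physical operator
println(exec.outputPartitioning == exec.child.outputPartitioning)
```

The sketch uses QueryExecution.sparkPlan rather than executedPlan so that the operator is not yet wrapped by whole-stage code generation.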

Creating SerializeFromObjectExec Instance

SerializeFromObjectExec takes the following when created:

  1. Serializer (as Seq[NamedExpression])

  2. Child physical operator

Generating Java Source Code for Consume Path in Whole-Stage Code Generation — doConsume Method

doConsume(ctx: CodegenContext, input: Seq[ExprCode], row: ExprCode): String
Note
doConsume is part of the CodegenSupport Contract to generate the Java source code for the consume path in Whole-Stage Code Generation.

doConsume…​FIXME

Generating Java Source Code for Produce Path in Whole-Stage Code Generation — doProduce Method

doProduce(ctx: CodegenContext): String
Note
doProduce is part of the CodegenSupport Contract to generate the Java source code for the produce path in Whole-Stage Code Generation.

doProduce…​FIXME

Executing Physical Operator (Generating RDD[InternalRow]) — doExecute Method

doExecute(): RDD[InternalRow]
Note
doExecute is part of the SparkPlan Contract to generate the runtime representation of a structured query as a distributed computation over internal binary rows on Apache Spark (i.e. RDD[InternalRow]).

doExecute requests the child physical operator to execute (which generates an RDD[InternalRow]) and transforms the result using RDD.mapPartitionsWithIndexInternal (which creates another RDD). The following function is executed on the internal rows of every partition, together with the partition index:

  1. Creates an UnsafeProjection for the serializer

  2. Requests the UnsafeProjection to initialize (for the partition index)

  3. Executes the UnsafeProjection on all internal binary rows in the partition
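The three steps above can be sketched outside the operator. This is an illustration under stated assumptions, not the operator's own code: it uses the public RDD.mapPartitionsWithIndex (RDD.mapPartitionsWithIndexInternal is private to Spark), a hand-built RDD[InternalRow] in place of the child operator's output, and a single bound reference standing in for the serializer.

```scala
import org.apache.spark.rdd.RDD
import org.apache.spark.sql.SparkSession
import org.apache.spark.sql.catalyst.InternalRow
import org.apache.spark.sql.catalyst.expressions.{BoundReference, Expression, UnsafeProjection}
import org.apache.spark.sql.types.IntegerType

// Assumption: a throwaway local session (spark-shell already provides `spark`)
val spark = SparkSession.builder().master("local[*]").appName("projection-demo").getOrCreate()

// Stand-in for the RDD[InternalRow] that the child physical operator would produce
val childRdd: RDD[InternalRow] =
  spark.sparkContext.parallelize(Seq(1, 2, 3), numSlices = 2).map(i => InternalRow(i))

// Stand-in for the serializer: a reference to the first (int) field of the input row
val serializer: Seq[Expression] = Seq(BoundReference(0, IntegerType, nullable = false))

val projected: RDD[InternalRow] = childRdd.mapPartitionsWithIndex { (index, iter) =>
  // 1. Create an UnsafeProjection for the serializer expressions
  val projection = UnsafeProjection.create(serializer)
  // 2. Initialize the projection for the partition index
  projection.initialize(index)
  // 3. Apply the projection to every row of the partition
  iter.map(projection)
}

// The projection reuses one output row, so values are read before they are collected
val values = projected.map(_.getInt(0)).collect()
println(values.mkString(", "))
```

The projection is created inside the function so that every partition (and hence every task) gets its own instance, initialized with that partition's index.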

Note
doExecute (by RDD.mapPartitionsWithIndexInternal) adds a new MapPartitionsRDD to the RDD lineage. Use RDD.toDebugString to see the additional MapPartitionsRDD.
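A minimal sketch of how the lineage could be inspected, assuming a local SparkSession; spark, q, exec and lineage are illustrative names. The operator is executed directly (without whole-stage code generation) so that its own doExecute is used.

```scala
import org.apache.spark.sql.SparkSession
import org.apache.spark.sql.execution.SerializeFromObjectExec

// Assumption: a throwaway local session (spark-shell already provides `spark`)
val spark = SparkSession.builder().master("local[*]").appName("lineage-demo").getOrCreate()
import spark.implicits._

val q = Seq(1, 2, 3).toDS.map((n: Int) => n + 1)

val exec = q.queryExecution.sparkPlan.collectFirst {
  case s: SerializeFromObjectExec => s
}.get

// execute() builds the RDD[InternalRow]; no Spark job is run until an action is called
val lineage = exec.execute().toDebugString
println(lineage)
```

The other operators of the query also add RDDs of their own, so the output shows more than one MapPartitionsRDD.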
