UnsafeProjection — Generic Function to Encode InternalRows to UnsafeRows

UnsafeProjection is a Projection function that encodes InternalRows as UnsafeRows.

UnsafeProjection: InternalRow =[apply]=> UnsafeRow
Note

Spark SQL uses UnsafeProjection factory object to create concrete adhoc UnsafeProjection instances.

The base UnsafeProjection has no concrete named implementations and create factory methods delegate all calls to GenerateUnsafeProjection.generate in the end.

Creating UnsafeProjection — create Factory Method

create(schema: StructType): UnsafeProjection      (1)
create(fields: Array[DataType]): UnsafeProjection (2)
create(expr: Expression): UnsafeProjection        (3)
create(exprs: Seq[Expression], inputSchema: Seq[Attribute]): UnsafeProjection (4)
create(exprs: Seq[Expression]): UnsafeProjection  (5)
create(
  exprs: Seq[Expression],
  inputSchema: Seq[Attribute],
  subexpressionEliminationEnabled: Boolean): UnsafeProjection
  1. create takes the DataTypes from schema and calls the 2nd create

  2. create creates a BoundReference per field in fields and calls the 5th create

  3. create calls the 5th create

  4. create calls the 5th create

  5. The main create that does the heavy work

create transforms all CreateNamedStruct expressions to CreateNamedStructUnsafe in every BoundReference in the input exprs.

In the end, create requests GenerateUnsafeProjection to generate a UnsafeProjection.

Note
A variant of create takes subexpressionEliminationEnabled flag (that usually is subexpressionEliminationEnabled flag of SparkPlan).

results matching ""

    No results matching ""