Whole-Stage Code Generation (aka Whole-Stage CodeGen)

Review SPARK-12795 Whole stage codegen to learn about the work to support it.

Whole-Stage Code Generation (aka Whole-Stage CodeGen) fuses multiple operators (as a subtree of plans that support code generation) together into a single Java function that is aimed at improving execution performance. It collapses a query into a single optimized function that eliminates virtual function calls and leverages CPU registers for intermediate data.

Whole-Stage Code Generation is enabled by default (using spark.sql.codegen.wholeStage property).
Whole stage codegen is used by some modern massively parallel processing (MPP) databases to archive great performance. See Efficiently Compiling Efficient Query Plans for Modern Hardware (PDF).
Janino is used to compile a Java source code into a Java class.

Before a query is executed, CollapseCodegenStages physical preparation rule is used to find the plans that support codegen and collapse them together as WholeStageCodegen. It is part of the sequence of rules QueryExecution.preparations that will be applied in order to the physical plan before execution.

BenchmarkWholeStageCodegen — Performance Benchmark

BenchmarkWholeStageCodegen class provides a benchmark to measure whole stage codegen performance.

You can execute it using the command:

build/sbt 'sql/testOnly *BenchmarkWholeStageCodegen'
You need to un-ignore tests in BenchmarkWholeStageCodegen by replacing ignore with test.
$ build/sbt 'sql/testOnly *BenchmarkWholeStageCodegen'
Running benchmark: range/limit/sum
  Running case: range/limit/sum codegen=false
22:55:23.028 WARN org.apache.hadoop.util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
  Running case: range/limit/sum codegen=true

Java HotSpot(TM) 64-Bit Server VM 1.8.0_77-b03 on Mac OS X 10.10.5
Intel(R) Core(TM) i7-4870HQ CPU @ 2.50GHz

range/limit/sum:                    Best/Avg Time(ms)    Rate(M/s)   Per Row(ns)   Relative
range/limit/sum codegen=false             376 /  433       1394.5           0.7       1.0X
range/limit/sum codegen=true              332 /  388       1581.3           0.6       1.1X

[info] - range/limit/sum (10 seconds, 74 milliseconds)

results matching ""

    No results matching ""