Coalesce Expression

Coalesce is a Catalyst expression to represent coalesce standard function or SQL’s coalesce function in structured queries.

When created, Coalesce takes Catalyst expressions (as the children).

import org.apache.spark.sql.catalyst.expressions.Coalesce

// Use Catalyst DSL
import org.apache.spark.sql.catalyst.dsl.expressions._

import org.apache.spark.sql.functions.lit
val coalesceExpr = Coalesce(children = Seq(lit(null).expr % 1, lit(null).expr, 1d))
scala> println(coalesceExpr.numberedTreeString)
00 coalesce((null % 1), null, 1.0)
01 :- (null % 1)
02 :  :- null
03 :  +- 1
04 :- null
05 +- 1.0
FIXME Describe FunctionArgumentConversion and Coalesce

Spark Optimizer uses NullPropagation logical optimization to remove null literals (in the children expressions). That could result in a static evaluation that gives null value if all children expressions are null literals.

// Demo Coalesce with nulls only
// Demo Coalesce with null and non-null expressions that are optimized to one expression (in NullPropagation)
// Demo Coalesce with non-null expressions after NullPropagation optimization

Coalesce is also created when:

  • Analyzer is requested to commonNaturalJoinProcessing for FullOuter join type

  • RewriteDistinctAggregates logical optimization is requested to rewrite

  • ExtractEquiJoinKeys Scala extractor is requested to destructure a logical plan

  • ColumnStat is requested to statExprs

  • IfNull expression is created

  • Nvl expression is created

  • Whenever Cast expression is used in Catalyst expressions (e.g. Average, Sum)

results matching ""

    No results matching ""