First Aggregate Function Expression

First is a DeclarativeAggregate function expression that is created, among other cases, when a FIRST SQL expression is parsed:

val sqlText = "FIRST (organizationName IGNORE NULLS)"
val e = spark.sessionState.sqlParser.parseExpression(sqlText)
scala> :type e
org.apache.spark.sql.catalyst.expressions.Expression

import org.apache.spark.sql.catalyst.expressions.aggregate.AggregateExpression
val aggExpr = e.asInstanceOf[AggregateExpression]

import org.apache.spark.sql.catalyst.expressions.aggregate.First
val f = aggExpr.aggregateFunction
scala> println(f.simpleString)
first('organizationName) ignore nulls

When requested to evaluate (and return the final value), First simply returns an AttributeReference (named first, with the data type of the child expression).
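
The evaluation behaviour can be checked in spark-shell with a minimal sketch like the one below. It assumes a resolved child attribute (the name orgName and StringType are illustrative only) and that evaluateExpression is the property holding the final-value expression.

```scala
import org.apache.spark.sql.catalyst.expressions.AttributeReference
import org.apache.spark.sql.catalyst.expressions.aggregate.First
import org.apache.spark.sql.types.StringType

// Resolved child attribute; the name and type are chosen for illustration only
val child = AttributeReference("orgName", StringType)()

// Single-argument constructor: null handling stays at its default
val firstAgg = new First(child)

// The expression First uses to return the final value
val result = firstAgg.evaluateExpression.asInstanceOf[AttributeReference]

println(result.name)      // name of the returned attribute
println(result.dataType)  // data type of the returned attribute
```

A resolved attribute is used on purpose: an unresolved child such as 'orgName has no data type yet, so asking for it would fail.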

Tip
Use the first operator from the Catalyst DSL to create a First aggregate function expression, e.g. for testing or exploring Spark SQL internals.

Catalyst DSL — first Operator

first(e: Expression): Expression

first creates a First expression and requests it to convert to an AggregateExpression.

import org.apache.spark.sql.catalyst.dsl.expressions._
val e = first('orgName)

scala> println(e.numberedTreeString)
00 first('orgName, false)
01 +- first('orgName)()
02    :- 'orgName
03    +- false

import org.apache.spark.sql.catalyst.expressions.aggregate.AggregateExpression
val aggExpr = e.asInstanceOf[AggregateExpression]

import org.apache.spark.sql.catalyst.expressions.aggregate.First
val f = aggExpr.aggregateFunction
scala> println(f.simpleString)
first('orgName)()
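
The two steps that the first operator performs can also be reproduced by hand. This is only a sketch of the idea, not necessarily how the DSL is implemented; it assumes the single-argument First constructor and the no-argument toAggregateExpression method, and orgName is an illustrative column name.

```scala
import org.apache.spark.sql.catalyst.analysis.UnresolvedAttribute
import org.apache.spark.sql.catalyst.expressions.aggregate.{AggregateExpression, Complete, First}

// Unresolved reference to a column (illustrative name)
val child = UnresolvedAttribute("orgName")

// Step 1: create the First aggregate function
// Step 2: wrap it in an AggregateExpression
val manual: AggregateExpression = new First(child).toAggregateExpression()

println(manual.aggregateFunction.isInstanceOf[First])  // the wrapped function
println(manual.mode)                                   // aggregation mode of the wrapper
println(manual.isDistinct)                             // whether DISTINCT was requested
```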

Creating First Instance

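First takes the following when created: a child expression and an ignore-nulls flag expression (the two children shown in the numbered tree output above).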
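
As a rough sketch, First can be instantiated directly with and without the ignore-nulls flag. The example assumes that a two-argument constructor accepting a boolean Literal as the flag is available in the Spark version at hand; orgName is an illustrative column name.

```scala
import org.apache.spark.sql.catalyst.analysis.UnresolvedAttribute
import org.apache.spark.sql.catalyst.expressions.Literal
import org.apache.spark.sql.catalyst.expressions.aggregate.First

// Unresolved reference to a column (illustrative name)
val child = UnresolvedAttribute("orgName")

// Default: the flag is left unset
val keepNulls = new First(child)

// Flag given explicitly as a boolean literal expression
val skipNulls = new First(child, Literal(true))

// The text representation shows whether nulls are ignored
println(keepNulls)
println(skipNulls)
```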
