TreeNode — Node in Catalyst Tree

TreeNode is a node in a Catalyst tree with zero or more children. It is the building block of expression trees and structured query plan trees.

TreeNode offers functions that you may know from the Scala Collection API (e.g. map, flatMap, collect, collectFirst, foreach) as well as tree-specific ones (mapChildren, transform, transformDown, transformUp, foreachUp, numberedTreeString, p, asCode, prettyJson, etc.) that are particularly useful for tree manipulation and debugging.
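To make the collection-like and tree-specific functions more concrete, here is a minimal sketch of collect and transformUp on a tiny expression tree. MiniExpr, Lit, Plus and TransformDemo are hypothetical stand-ins written for this example, not Spark classes, and the real TreeNode implementations are considerably more involved.

```scala
// Hypothetical stand-in for a Catalyst expression tree; not Spark's classes.
sealed trait MiniExpr {
  def children: Seq[MiniExpr]

  // Rebuild this node with the given children (leaves ignore the argument).
  def withNewChildren(newChildren: Seq[MiniExpr]): MiniExpr

  // Post-order rewrite: children are rewritten first, then the node itself.
  def transformUp(rule: PartialFunction[MiniExpr, MiniExpr]): MiniExpr = {
    val afterChildren = withNewChildren(children.map(_.transformUp(rule)))
    rule.applyOrElse(afterChildren, identity[MiniExpr])
  }

  // Pre-order collection of every node the partial function is defined for.
  def collect[B](pf: PartialFunction[MiniExpr, B]): Seq[B] =
    pf.lift(this).toList ++ children.flatMap(_.collect(pf))
}

final case class Lit(value: Int) extends MiniExpr {
  def children: Seq[MiniExpr] = Nil
  def withNewChildren(newChildren: Seq[MiniExpr]): MiniExpr = this
}

final case class Plus(left: MiniExpr, right: MiniExpr) extends MiniExpr {
  def children: Seq[MiniExpr] = Seq(left, right)
  def withNewChildren(newChildren: Seq[MiniExpr]): MiniExpr =
    Plus(newChildren(0), newChildren(1))
}

object TransformDemo {
  val tree: MiniExpr = Plus(Lit(1), Plus(Lit(2), Lit(3)))

  // Constant folding: replace a Plus of two literals with a single literal.
  val folded: MiniExpr = tree.transformUp {
    case Plus(Lit(a), Lit(b)) => Lit(a + b)
  }

  // Gather all literal values in pre-order.
  val literals: Seq[Int] = tree.collect { case Lit(v) => v }
}
```

Because the rule is applied bottom-up, the inner Plus is folded before the outer one, so a single pass reduces the whole tree to one literal.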

In Scala terms, TreeNode is an abstract class that is the base class of the Expression and Catalyst’s QueryPlan abstract classes.

The TreeNode abstract type is a fairly advanced Scala type definition (at least compared to the other Scala types in Spark), so understanding its behaviour, even outside Spark, can be worthwhile in itself.

abstract class TreeNode[BaseType <: TreeNode[BaseType]] extends Product {
  self: BaseType =>

  // ...
}
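The combination of a recursive type bound (BaseType <: TreeNode[BaseType]) and a self-type (self: BaseType =>) can be easier to follow on a smaller example. The sketch below reproduces only that pattern; MiniNode, Shape, Group, Leaf and FBoundDemo are hypothetical names, and the find method is a simplified illustration rather than Spark's code.

```scala
// Hypothetical miniature of the pattern; MiniNode is not a Spark class.
abstract class MiniNode[BaseType <: MiniNode[BaseType]] {
  self: BaseType =>

  def children: Seq[BaseType]

  // Thanks to the self-type, `this` can be used wherever a BaseType is expected.
  def find(p: BaseType => Boolean): Option[BaseType] =
    if (p(this)) Some(this)
    else children.foldLeft(Option.empty[BaseType]) { (acc, child) =>
      acc.orElse(child.find(p))
    }
}

// A concrete tree family fixes BaseType to itself.
abstract class Shape extends MiniNode[Shape] {
  def name: String
}

final case class Group(name: String, children: Seq[Shape]) extends Shape

final case class Leaf(name: String) extends Shape {
  def children: Seq[Shape] = Nil
}

object FBoundDemo {
  val tree: Shape = Group("root", Seq(Leaf("a"), Group("g", Seq(Leaf("b")))))

  // The result is typed Option[Shape], so `name` is available without a cast.
  val found: Option[String] = tree.find(_.name == "b").map(_.name)
}
```

The design choice worth noting is that generic methods defined once on the base class return the concrete tree type, which is what lets operations on a tree of one kind keep yielding nodes of that same kind.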

TreeNode Contract

package org.apache.spark.sql.catalyst.trees

abstract class TreeNode[BaseType <: TreeNode[BaseType]] extends Product {
  self: BaseType =>

  // only required methods that have no implementation
  def children: Seq[BaseType]
  def verboseString: String
}
Table 1. (Subset of) TreeNode Contract (in alphabetical order)

Method      Description
children    Child nodes

Text Representation of All Nodes in Tree — treeString Method

treeString: String  (1)
treeString(verbose: Boolean, addSuffix: Boolean = false): String
  1. Turns verbose flag on

treeString gives the string representation of all the nodes in a TreeNode.

treeString is used mainly when TreeNode is requested for the numbered text representation for display purposes (and also for the string representation of a TreeNode object).
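As an illustration of how a text representation of all the nodes can be built recursively, here is a minimal sketch. TextNode, its generate helper and TreeStringDemo are hypothetical, and the connector strings ("+- ", ":- ", ":  ") are an assumption modelled on how Spark plans are commonly printed, not a copy of Spark's implementation.

```scala
// Hypothetical stand-in node; the connector strings are an assumption.
final case class TextNode(label: String, children: Seq[TextNode] = Nil) {

  def treeString: String = generate(Nil, new StringBuilder).toString

  // lastChildren holds one flag per ancestor level (the node's own flag last),
  // recording whether the node on that level was the last child of its parent.
  private def generate(lastChildren: Seq[Boolean], builder: StringBuilder): StringBuilder = {
    if (lastChildren.nonEmpty) {
      // Indentation for the outer levels, then the connector for this node.
      lastChildren.init.foreach(isLast => builder.append(if (isLast) "   " else ":  "))
      builder.append(if (lastChildren.last) "+- " else ":- ")
    }
    builder.append(label).append("\n")
    children.zipWithIndex.foreach { case (child, i) =>
      child.generate(lastChildren :+ (i == children.size - 1), builder)
    }
    builder
  }
}

object TreeStringDemo {
  val tree: TextNode =
    TextNode("Project", Seq(
      TextNode("Join", Seq(TextNode("Scan a"), TextNode("Scan b")))))

  val text: String = tree.treeString
}
```

For the tree above, the text has one line per node: "Project", "+- Join", "   :- Scan a" and "   +- Scan b".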

generateTreeString Method

generateTreeString(
  depth: Int,
  lastChildren: Seq[Boolean],
  builder: StringBuilder,
  verbose: Boolean,
  prefix: String = "",
  addSuffix: Boolean = false): StringBuilder


generateTreeString is used when…​FIXME

withNewChildren Method


Simple Text Node Description — simpleString Method

simpleString: String

simpleString gives a simple one-line description of a TreeNode.

Internally, simpleString is the nodeName followed by argString, separated by a single space.

simpleString is used when TreeNode is requested for argString (of child nodes) and for the tree text representation (with the verbose flag off).
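The "nodeName followed by argString" rule can be sketched as follows. Described, Limit, Rename, Noop and SimpleStringDemo are hypothetical names; using productPrefix for the node name, joining all constructor arguments with ", " and trimming the result are simplifying assumptions for this example.

```scala
// Hypothetical trait; a simplified model of the one-line description.
trait Described extends Product {
  // Assumption: use the case class name as the node name.
  def nodeName: String = productPrefix

  // Assumption: every constructor argument is rendered, separated by ", ".
  def argString: String = productIterator.mkString(", ")

  // Node name, a single space, then the arguments (trimmed when there are none).
  def simpleString: String = s"$nodeName $argString".trim
}

final case class Limit(n: Int) extends Described
final case class Rename(from: String, to: String) extends Described
case object Noop extends Described

object SimpleStringDemo {
  val limit: String = Limit(10).simpleString
  val rename: String = Rename("a", "b").simpleString
  val noop: String = Noop.simpleString
}
```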

Building Numbered Text Representation — numberedTreeString Method

numberedTreeString: String

numberedTreeString adds numbers to the text representation of all the nodes.

numberedTreeString is used primarily for interactive debugging using apply and p methods.
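Adding numbers to a tree's text representation can be sketched on a plain string. NumberedDemo and numbered are hypothetical names, and the zero-based, two-digit, zero-padded prefix is an assumption about the format made for this example.

```scala
object NumberedDemo {
  // Prefix every line with its zero-based line number, padded to two digits.
  def numbered(treeString: String): String =
    treeString.split("\n").zipWithIndex
      .map { case (line, i) => f"$i%02d $line" }
      .mkString("\n")

  val text: String = numbered("Project\n+- Filter\n   +- Range\n")
}
```

Each prefix doubles as a handle for a node, which is what makes the numbered form convenient when picking a node out of a large plan during a debugging session.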

Getting n-th TreeNode in Tree (for Interactive Debugging) — apply Method

apply(number: Int): TreeNode[_]

apply gives the number-th tree node in a tree.

apply can be used for interactive debugging.

Internally, apply gets the node at position number, or null if there is no such node.
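A lookup by position that falls back to null can be sketched like this. NumNode, preOrder and ApplyDemo are hypothetical, and numbering the nodes in pre-order (parent first, then children left to right) is an assumption chosen so that positions line up with the lines of a top-down tree printout.

```scala
// Hypothetical stand-in node; positions are assigned in pre-order (assumption).
final case class NumNode(label: String, children: Seq[NumNode] = Nil) {

  // Pre-order listing: this node first, then each subtree from left to right.
  def preOrder: Seq[NumNode] = this +: children.flatMap(_.preOrder)

  // The node at the given position, or null when the position is out of range.
  def apply(number: Int): NumNode = preOrder.lift(number).orNull
}

object ApplyDemo {
  val tree: NumNode =
    NumNode("Project", Seq(
      NumNode("Join", Seq(NumNode("Scan a"), NumNode("Scan b")))))

  val root: NumNode = tree(0)    // the tree itself
  val third: NumNode = tree(2)   // first leaf under Join
  val missing: NumNode = tree(4) // only positions 0 to 3 exist
}
```

Returning null rather than an Option keeps the call short in an interactive shell, at the cost of a possible NullPointerException when the position does not exist.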

Getting n-th BaseType in Tree (for Interactive Debugging) — p Method

p(number: Int): BaseType

p gives the number-th tree node in a tree as BaseType and is meant for interactive debugging.

BaseType is the base type of a tree and in Spark SQL can be:
