TreeNode — Node in Catalyst Tree

TreeNode is a node in a Catalyst tree with zero or more children. TreeNodes are the building blocks of expression trees and structured query plan trees.

TreeNode offers not only methods that you may know from the Scala Collection API (e.g. map, flatMap, collect, collectFirst, foreach), but also mapChildren, transform, transformDown, transformUp, foreachUp, numberedTreeString, p, asCode, prettyJson and others that are particularly useful for tree manipulation or debugging.
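To make the difference between the traversal methods concrete, here is a minimal sketch on a toy expression tree. It is not Spark's TreeNode: the Expr, Lit and Add types, the TransformDemo object and the simplified withNewChildren are all assumptions made for illustration, and the method bodies only approximate the behaviour that the names suggest.

```scala
// A toy expression tree, NOT Spark's TreeNode; all names here are illustrative.
sealed trait Expr {
  def children: Seq[Expr]
  def withNewChildren(cs: Seq[Expr]): Expr

  // Pre-order: apply the rule to this node first, then to the children of the result.
  def transformDown(rule: PartialFunction[Expr, Expr]): Expr = {
    val after = rule.applyOrElse(this, identity[Expr])
    after.withNewChildren(after.children.map(_.transformDown(rule)))
  }

  // Post-order: rewrite the children first, then apply the rule to the rebuilt node.
  def transformUp(rule: PartialFunction[Expr, Expr]): Expr = {
    val rebuilt = withNewChildren(children.map(_.transformUp(rule)))
    rule.applyOrElse(rebuilt, identity[Expr])
  }

  // Pre-order collection of every value the partial function is defined for.
  def collect[B](pf: PartialFunction[Expr, B]): Seq[B] =
    pf.lift(this).toList ++ children.flatMap(_.collect(pf))
}

case class Lit(value: Int) extends Expr {
  def children: Seq[Expr] = Nil
  def withNewChildren(cs: Seq[Expr]): Expr = this
}

case class Add(left: Expr, right: Expr) extends Expr {
  def children: Seq[Expr] = Seq(left, right)
  def withNewChildren(cs: Seq[Expr]): Expr = Add(cs(0), cs(1))
}

object TransformDemo {
  val tree: Expr = Add(Lit(1), Add(Lit(2), Lit(3)))

  // Constant folding: an Add of two literals becomes a single literal.
  val fold: PartialFunction[Expr, Expr] = { case Add(Lit(a), Lit(b)) => Lit(a + b) }

  // Bottom-up: the inner Add folds first, so the root can fold as well.
  val up: Expr = tree.transformUp(fold)

  // Top-down: the root does not match when it is visited, so only the inner Add folds.
  val down: Expr = tree.transformDown(fold)

  val literals: Seq[Int] = tree.collect { case Lit(v) => v }
}
```

In this sketch a single transformUp pass folds the whole tree into one literal, while a single transformDown pass leaves the root Add in place, which is why the direction of a rule matters.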

Note
In Scala terms, TreeNode is an abstract class that is the base class of the Expression and Catalyst’s QueryPlan abstract classes.

Tip
The TreeNode abstract type is a fairly advanced Scala type definition (at least compared to the other Scala types in Spark), so understanding its behaviour even outside Spark may be worthwhile in itself.

abstract class TreeNode[BaseType <: TreeNode[BaseType]] extends Product {
  self: BaseType =>

  // ...
}
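The definition above combines an F-bounded type parameter with a self-type. The following is a minimal sketch of why that combination is useful, using a simplified model rather than Spark's class: Node, Expr, Lit, Add, find and FBoundedDemo are hypothetical names introduced only for this example.

```scala
// A simplified model of the pattern above, NOT Spark's actual TreeNode.
abstract class Node[T <: Node[T]] { self: T =>
  def children: Seq[T]

  // Thanks to the self-type, `this` can be used wherever a T is expected,
  // so generic methods can hand back precisely typed nodes.
  def find(f: T => Boolean): Option[T] =
    if (f(this)) Some(this)
    else children.foldLeft(Option.empty[T])((found, c) => found.orElse(c.find(f)))
}

// Hypothetical concrete tree type: arithmetic expressions.
sealed abstract class Expr extends Node[Expr]

case class Lit(value: Int) extends Expr {
  def children: Seq[Expr] = Nil
}

case class Add(left: Expr, right: Expr) extends Expr {
  def children: Seq[Expr] = Seq(left, right)
}

object FBoundedDemo {
  val tree: Expr = Add(Lit(1), Add(Lit(2), Lit(3)))

  // The result is typed Option[Expr], not Option[Node[_]], so no cast is needed.
  val firstBig: Option[Expr] = tree.find {
    case Lit(v) => v > 1
    case _      => false
  }
}
```

The bound T <: Node[T] ties every tree to one base type, and the self-type guarantees that each node is itself a T, so methods written once in the base class stay type-safe for every kind of tree.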

TreeNode Contract

package org.apache.spark.sql.catalyst.trees

abstract class TreeNode[BaseType <: TreeNode[BaseType]] extends Product {
  self: BaseType =>

  // abstract methods only (every concrete TreeNode has to implement them)
  def children: Seq[BaseType]
  def verboseString: String
}
Table 1. (Subset of) TreeNode Contract (in alphabetical order)

Method           Description
children         Child nodes
verboseString

Text Representation of All Nodes in Tree — treeString Method

treeString: String  (1)
treeString(verbose: Boolean, addSuffix: Boolean = false): String
  1. Turns the verbose flag on

treeString gives the string representation of all the nodes in a TreeNode.

Note
treeString is used mainly when TreeNode is requested for the numbered text representation for display purposes (and also for the string representation of a TreeNode object).
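As an illustration of what such a text representation can look like, here is a minimal sketch that prints one node per line and marks the nesting with ":- " and "+- " prefixes. It runs on a toy tree, not on Spark's TreeNode: Expr, Lit, Add, label, render and TreeStringDemo are assumptions made for this example, and the exact prefix layout is only meant to resemble Spark's plan output.

```scala
// A toy tree with a one-line label per node; NOT Spark's implementation.
sealed trait Expr {
  def children: Seq[Expr]
  def label: String // hypothetical one-line description of a node

  // lastChildren records, for every ancestor level, whether the node on the
  // path was the last child of its parent; it decides which prefix to draw.
  def render(lastChildren: Seq[Boolean], out: StringBuilder): StringBuilder = {
    if (lastChildren.nonEmpty) {
      lastChildren.init.foreach(isLast => out.append(if (isLast) "   " else ":  "))
      out.append(if (lastChildren.last) "+- " else ":- ")
    }
    out.append(label).append("\n")
    children.zipWithIndex.foreach { case (child, i) =>
      child.render(lastChildren :+ (i == children.size - 1), out)
    }
    out
  }

  def treeString: String = render(Nil, new StringBuilder).toString
}

case class Lit(value: Int) extends Expr {
  def children: Seq[Expr] = Nil
  def label: String = s"Lit $value"
}

case class Add(left: Expr, right: Expr) extends Expr {
  def children: Seq[Expr] = Seq(left, right)
  def label: String = "Add"
}

object TreeStringDemo {
  val text: String = Add(Lit(1), Add(Lit(2), Lit(3))).treeString
  // Add
  // :- Lit 1
  // +- Add
  //    :- Lit 2
  //    +- Lit 3
}
```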

generateTreeString Method

generateTreeString(
  depth: Int,
  lastChildren: Seq[Boolean],
  builder: StringBuilder,
  verbose: Boolean,
  prefix: String = "",
  addSuffix: Boolean = false): StringBuilder

generateTreeString…​FIXME

Note
generateTreeString is used when…​FIXME

withNewChildren Method

Caution
FIXME

Simple Text Node Description — simpleString Method

simpleString: String

simpleString gives a simple one-line description of a TreeNode.

Internally, simpleString is the nodeName followed by argString, separated by a single space.
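The composition of nodeName and argString can be sketched on a toy tree as follows. The definitions of nodeName (the case-class name) and argString (the non-child constructor arguments joined by commas) are simplifying assumptions, as are the final trim and the Expr, Lit, Alias, Add and SimpleStringDemo names.

```scala
// A toy model of simpleString; NOT Spark's implementation.
sealed trait Expr extends Product {
  def children: Seq[Expr]

  // Assumption: the node name is simply the case-class name.
  def nodeName: String = productPrefix

  // Assumption: the arguments are the constructor fields that are not child nodes.
  def argString: String =
    productIterator.filterNot(arg => children.contains(arg)).mkString(", ")

  // nodeName, a single space, then argString; the trim (an assumption)
  // avoids a trailing space for nodes without arguments.
  def simpleString: String = s"$nodeName $argString".trim
}

case class Lit(value: Int) extends Expr {
  def children: Seq[Expr] = Nil
}

case class Alias(child: Expr, name: String) extends Expr {
  def children: Seq[Expr] = Seq(child)
}

case class Add(left: Expr, right: Expr) extends Expr {
  def children: Seq[Expr] = Seq(left, right)
}

object SimpleStringDemo {
  val lit: String = Lit(1).simpleString                  // "Lit 1"
  val alias: String = Alias(Lit(1), "total").simpleString // "Alias total"
  val add: String = Add(Lit(1), Lit(2)).simpleString      // "Add"
}
```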

Note
simpleString is used when TreeNode is requested for argString (of child nodes) and tree text representation (with verbose flag off).

Building Numbered Text Representation — numberedTreeString Method

numberedTreeString: String

numberedTreeString adds numbers to the text representation of all the nodes.

Note
numberedTreeString is used primarily for interactive debugging using apply and p methods.
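Numbering the lines of a tree's text representation can be sketched in a few lines. This is a minimal example under assumptions: the input is a hypothetical multi-line tree text with one node per line, and numbered, treeText and NumberedDemo are names made up for the sketch; the two-digit zero-padded format is also an assumption.

```scala
object NumberedDemo {
  // Hypothetical tree text with one node per line.
  val treeText: String = "Add\n:- Lit 1\n+- Lit 2\n"

  // Prefix every line with its zero-based, zero-padded position.
  def numbered(text: String): String =
    text
      .split("\n")
      .zipWithIndex
      .map { case (line, i) => f"$i%02d $line" }
      .mkString("\n")

  val result: String = numbered(treeText)
  // 00 Add
  // 01 :- Lit 1
  // 02 +- Lit 2
}
```

The numbers give every node a stable position in the printed tree, which is what makes it practical to pick a node by number in an interactive session.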

Getting n-th TreeNode in Tree (for Interactive Debugging) — apply Method

apply(number: Int): TreeNode[_]

apply gives the number-th node of a tree.

Note
apply can be used for interactive debugging.

Internally, apply gets the node at the number position, or null if there is no such node.

Getting n-th BaseType in Tree (for Interactive Debugging) — p Method

p(number: Int): BaseType

p gives the number-th node of a tree as BaseType for interactive debugging.

Note
p can be used for interactive debugging.
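The difference between apply and p is only in the static type of the result, which the following sketch tries to show on a simplified model. It assumes that nodes are numbered in pre-order (root first, then each child subtree from left to right); Node, Expr, Lit, Add, preOrder and ApplyDemo are hypothetical names, and Spark's own lookup may differ in its details.

```scala
// A simplified model of apply and p; NOT Spark's implementation.
abstract class Node[T <: Node[T]] { self: T =>
  def children: Seq[T]

  // Assumption: numbering follows a pre-order walk of the tree.
  def preOrder: Seq[T] = this +: children.flatMap(_.preOrder)

  // Untyped lookup: the node at the given position, or null if out of range.
  def apply(number: Int): Node[_] = {
    val nodes = preOrder
    if (number >= 0 && number < nodes.length) nodes(number) else null
  }

  // Typed lookup: the same node, cast to the tree's base type.
  def p(number: Int): T = apply(number).asInstanceOf[T]
}

sealed abstract class Expr extends Node[Expr]

case class Lit(value: Int) extends Expr {
  def children: Seq[Expr] = Nil
}

case class Add(left: Expr, right: Expr) extends Expr {
  def children: Seq[Expr] = Seq(left, right)
}

object ApplyDemo {
  // Pre-order positions: 0 = root Add, 1 = Lit(1), 2 = inner Add, 3 = Lit(2), 4 = Lit(3)
  val tree: Expr = Add(Lit(1), Add(Lit(2), Lit(3)))

  val untyped: Node[_] = tree(2)  // inner Add, known only as some Node
  val typed: Expr = tree.p(3)     // Lit(2), usable as an Expr without a cast
  val missing: Node[_] = tree(42) // null, no node at that position
}
```

With p the caller can keep working with the node as a member of its own tree type, for example by calling further Expr-specific methods on it, whereas the result of apply has to be cast first.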
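Note
BaseType is the base type of a tree and in Spark SQL can be, for example, Expression (for expression trees) or LogicalPlan and SparkPlan (for logical and physical query plan trees).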
