Catalyst — Tree Manipulation Framework

Catalyst is an implementation-agnostic framework to represent and manipulate a dataflow graph, i.e. trees of relational operators and expressions.
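
To make the idea of "trees of expressions" concrete, the following is a minimal sketch in Python rather than Catalyst's actual Scala API. The names `Literal`, `Attribute`, `Add`, `transform_up` and `fold_constants` are hypothetical and chosen for illustration only. It shows an immutable expression tree and a bottom-up traversal that applies a rewrite rule to every node.

```python
from dataclasses import dataclass
from typing import Callable, Union


# Hypothetical node types; these are NOT Catalyst classes.
@dataclass(frozen=True)
class Literal:
    value: int


@dataclass(frozen=True)
class Attribute:
    name: str


@dataclass(frozen=True)
class Add:
    left: "Expr"
    right: "Expr"


Expr = Union[Literal, Attribute, Add]


def transform_up(node: Expr, rule: Callable[[Expr], Expr]) -> Expr:
    """Rebuild the tree bottom-up, applying `rule` to every node."""
    if isinstance(node, Add):
        # Rewrite the children first, then offer the rebuilt node to the rule.
        node = Add(transform_up(node.left, rule), transform_up(node.right, rule))
    return rule(node)


def fold_constants(node: Expr) -> Expr:
    """Example rule: replace Add(Literal, Literal) with a single Literal."""
    if (
        isinstance(node, Add)
        and isinstance(node.left, Literal)
        and isinstance(node.right, Literal)
    ):
        return Literal(node.left.value + node.right.value)
    return node  # leave every other node unchanged


# x + (1 + 2)
tree = Add(Attribute("x"), Add(Literal(1), Literal(2)))
optimized = transform_up(tree, fold_constants)
print(optimized)  # Add(left=Attribute(name='x'), right=Literal(value=3))
```

Because the nodes are immutable, a rewrite never mutates the original tree; it returns a new tree that shares the untouched subtrees.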

The Catalyst framework was first introduced in SPARK-1251 (Support for optimizing and executing structured queries) and became part of Apache Spark on 20/Mar/14 19:12.

Spark 2.0 uses the Catalyst tree manipulation library to build an extensible query plan optimizer with a number of query optimizations.

Catalyst supports both rule-based and cost-based optimization.
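
The difference between the two styles can be sketched briefly. A rule-based optimization rewrites a plan whenever a pattern matches, regardless of the data; a cost-based optimization compares alternative plans using statistics and keeps the one with the lowest estimate. The toy example below, in Python rather than Catalyst's Scala API, illustrates only the cost-based half. `Relation`, `estimated_cost`, `cheapest_join_order`, the row counts and the fixed selectivity of 0.1 are all assumptions made for the illustration, not Spark's cost model.

```python
from dataclasses import dataclass
from itertools import permutations


@dataclass(frozen=True)
class Relation:
    name: str
    rows: int  # hypothetical row-count statistic


def estimated_cost(order, selectivity=0.1):
    """Toy cost: sum of estimated intermediate sizes of a left-deep join."""
    size = order[0].rows
    cost = 0.0
    for rel in order[1:]:
        # Assume every join keeps a fixed fraction of the cross product.
        size = size * rel.rows * selectivity
        cost += size
    return cost


def cheapest_join_order(relations):
    """Enumerate every join order and keep the one with the lowest estimate."""
    return min(permutations(relations), key=estimated_cost)


relations = [Relation("a", 1000), Relation("b", 10), Relation("c", 100)]
best = cheapest_join_order(relations)
print([rel.name for rel in best])
```

Under these assumed statistics the cheapest orders join the two small relations first and the largest relation last. Exhaustive enumeration is factorial in the number of relations, so this only works as an illustration for a handful of inputs.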
