SparkSqlAstBuilder

SparkSqlAstBuilder is an AstBuilder that converts valid Spark SQL statements into Catalyst expressions, logical plans or table identifiers (using visit callback methods).
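
As a rough illustration of the three kinds of results, the following sketch goes through the session's ParserInterface, which on this page's account is backed by SparkSqlAstBuilder. The local SparkSession, the application name, and the table and column names (t1, db1, id) are assumptions made for the example. Parsing alone does not require the table to exist.

```scala
import org.apache.spark.sql.SparkSession

// Assumption: a throwaway local session, used only to reach the parser
val spark = SparkSession.builder()
  .master("local[*]")
  .appName("ast-builder-demo")
  .getOrCreate()

val parser = spark.sessionState.sqlParser

// A SQL statement becomes an (unresolved) logical plan
val plan = parser.parsePlan("SELECT id FROM t1 WHERE id > 0")
println(plan.numberedTreeString)

// A SQL expression becomes a Catalyst expression
val expression = parser.parseExpression("id + 1")
println(expression.numberedTreeString)

// A (qualified) table name becomes a TableIdentifier
val tableId = parser.parseTableIdentifier("db1.t1")
println(tableId)
```

The plan and the expression are unresolved at this point: names such as t1 and id are only checked later, by the analyzer.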

Note
Spark SQL uses the ANTLR parser generator for parsing structured text.

SparkSqlAstBuilder is created exclusively when SparkSqlParser is created (which is when SparkSession is requested for the lazily-created SessionState).

Figure 1. Creating SparkSqlAstBuilder
scala> :type spark.sessionState.sqlParser
org.apache.spark.sql.catalyst.parser.ParserInterface

import org.apache.spark.sql.execution.SparkSqlParser
val sqlParser = spark.sessionState.sqlParser.asInstanceOf[SparkSqlParser]

scala> :type sqlParser.astBuilder
org.apache.spark.sql.execution.SparkSqlAstBuilder

SparkSqlAstBuilder takes a SQLConf when created.

Note

SparkSqlAstBuilder can also be temporarily created for the expr standard function (to create column expressions).

import org.apache.spark.sql.functions.expr
val c = expr("from_json(value, schema)")
scala> :type c
org.apache.spark.sql.Column

scala> :type c.expr
org.apache.spark.sql.catalyst.expressions.Expression

scala> println(c.expr.numberedTreeString)
00 'from_json('value, 'schema)
01 :- 'value
02 +- 'schema
Table 1. SparkSqlAstBuilder’s Visit Callback Methods
Callback Method ANTLR rule / labeled alternative Spark SQL Entity

visitAnalyze

#analyze

  • AnalyzeColumnCommand logical command for ANALYZE TABLE with FOR COLUMNS clause (but no PARTITION specification)

    val sqlText = "ANALYZE TABLE t1 COMPUTE STATISTICS FOR COLUMNS id, p1"
    val plan = spark.sql(sqlText).queryExecution.logical
    import org.apache.spark.sql.execution.command.AnalyzeColumnCommand
    val cmd = plan.asInstanceOf[AnalyzeColumnCommand]
    scala> println(cmd)
    AnalyzeColumnCommand `t1`, [id, p1]
  • AnalyzePartitionCommand logical command for ANALYZE TABLE with PARTITION specification (but no FOR COLUMNS clause)

    val analyzeTable = "ANALYZE TABLE t1 PARTITION (p1, p2) COMPUTE STATISTICS"
    val plan = spark.sql(analyzeTable).queryExecution.logical
    import org.apache.spark.sql.execution.command.AnalyzePartitionCommand
    val cmd = plan.asInstanceOf[AnalyzePartitionCommand]
    scala> println(cmd)
    AnalyzePartitionCommand `t1`, Map(p1 -> None, p2 -> None), false
  • AnalyzeTableCommand logical command for ANALYZE TABLE with neither PARTITION specification nor FOR COLUMNS clause

    val sqlText = "ANALYZE TABLE t1 COMPUTE STATISTICS NOSCAN"
    val plan = spark.sql(sqlText).queryExecution.logical
    import org.apache.spark.sql.execution.command.AnalyzeTableCommand
    val cmd = plan.asInstanceOf[AnalyzeTableCommand]
    scala> println(cmd)
    AnalyzeTableCommand `t1`, true
Note

visitAnalyze supports NOSCAN as the only identifier allowed after COMPUTE STATISTICS (and reports a ParseException if any other identifier is used).

NOSCAN is used for AnalyzePartitionCommand and AnalyzeTableCommand logical commands only.
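
The NOSCAN check can be tried out with a short sketch like the one below. It assumes a local SparkSession and a table name t1 (parsing does not require the table to exist). FULLSCAN is a made-up identifier chosen only to trigger the error and is not a Spark SQL keyword.

```scala
import org.apache.spark.sql.SparkSession
import org.apache.spark.sql.catalyst.parser.ParseException
import scala.util.{Failure, Success, Try}

// Assumption: a throwaway local session, used only to reach the parser
val spark = SparkSession.builder()
  .master("local[*]")
  .appName("analyze-noscan-demo")
  .getOrCreate()

val parser = spark.sessionState.sqlParser

// NOSCAN is the one identifier the parser accepts here
val accepted = Try(parser.parsePlan("ANALYZE TABLE t1 COMPUTE STATISTICS NOSCAN"))

// FULLSCAN is hypothetical and should be rejected at parse time
val rejected = Try(parser.parsePlan("ANALYZE TABLE t1 COMPUTE STATISTICS FULLSCAN"))

rejected match {
  case Failure(e: ParseException) => println(s"Rejected: ${e.getMessage}")
  case Failure(other)             => println(s"Unexpected failure: $other")
  case Success(p)                 => println(s"Unexpectedly parsed: $p")
}
```

Wrapping the calls in Try keeps the session usable after the failure, which is convenient when experimenting in spark-shell.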

visitBucketSpec

#bucketSpec

visitCacheTable

#cacheTable

visitCreateHiveTable

#createHiveTable

CreateTable

visitCreateTable

#createTable

visitCreateView

#createView

CreateViewCommand for CREATE VIEW AS SQL statement

CREATE [OR REPLACE] [[GLOBAL] TEMPORARY]
VIEW [IF NOT EXISTS] tableIdentifier
[identifierCommentList] [COMMENT STRING]
[PARTITIONED ON identifierList]
[TBLPROPERTIES tablePropertyList] AS query
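
A minimal sketch of one statement that fits the syntax above, parsed without being executed. The view name v1 and the query are assumptions made for the example. The class of the resulting plan is printed rather than asserted, since it may differ from CreateViewCommand in Spark versions other than the one this page describes.

```scala
import org.apache.spark.sql.SparkSession

// Assumption: a throwaway local session, used only to reach the parser
val spark = SparkSession.builder()
  .master("local[*]")
  .appName("create-view-demo")
  .getOrCreate()

// v1 is a hypothetical view name
val createView = "CREATE OR REPLACE TEMPORARY VIEW v1 AS SELECT 1 AS id"

// parsePlan only parses the statement; no view is created
val plan = spark.sessionState.sqlParser.parsePlan(createView)

println(plan.getClass.getSimpleName)
println(plan.numberedTreeString)
```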

visitCreateTempViewUsing

#createTempViewUsing

CreateTempViewUsing for CREATE TEMPORARY VIEW … USING
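
The following sketch parses one such statement and inspects the resulting CreateTempViewUsing. The view name people, the json provider, and the path /tmp/people.json are assumptions made for the example. The file does not need to exist, since the statement is parsed but not executed.

```scala
import org.apache.spark.sql.SparkSession
import org.apache.spark.sql.execution.datasources.CreateTempViewUsing

// Assumption: a throwaway local session, used only to reach the parser
val spark = SparkSession.builder()
  .master("local[*]")
  .appName("create-temp-view-using-demo")
  .getOrCreate()

// The view name and the path are hypothetical
val sqlText =
  "CREATE TEMPORARY VIEW people USING json OPTIONS (path '/tmp/people.json')"

val plan = spark.sessionState.sqlParser.parsePlan(sqlText)
val cmd = plan.asInstanceOf[CreateTempViewUsing]

println(cmd.tableIdent)
println(cmd.provider)
println(cmd.options)
```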

visitDescribeTable

#describeTable

  • DescribeColumnCommand logical command for DESCRIBE TABLE with a single column only (i.e. no PARTITION specification).

    val sqlCmd = "DESC EXTENDED t1 p1"
    val plan = spark.sql(sqlCmd).queryExecution.logical
    import org.apache.spark.sql.execution.command.DescribeColumnCommand
    val cmd = plan.asInstanceOf[DescribeColumnCommand]
    scala> println(cmd)
    DescribeColumnCommand `t1`, [p1], true
  • DescribeTableCommand logical command for all other variants of DESCRIBE TABLE (i.e. no column)

    val sqlCmd = "DESC t1"
    val plan = spark.sql(sqlCmd).queryExecution.logical
    import org.apache.spark.sql.execution.command.DescribeTableCommand
    val cmd = plan.asInstanceOf[DescribeTableCommand]
    scala> println(cmd)
    DescribeTableCommand `t1`, false

visitInsertOverwriteHiveDir

#insertOverwriteHiveDir

visitShowCreateTable

#showCreateTable

ShowCreateTableCommand logical command for SHOW CREATE TABLE SQL statement

SHOW CREATE TABLE tableIdentifier
Table 2. SparkSqlAstBuilder’s Parsing Handlers
Parsing Handler LogicalPlan Added

withRepartitionByExpression
