CreateHiveTableAsSelectCommand Logical Command

CreateHiveTableAsSelectCommand is a logical command that writes the result of executing a structured query to a Hive table (per save mode).

CreateHiveTableAsSelectCommand uses the given CatalogTable for the table name.

CreateHiveTableAsSelectCommand is created when the HiveAnalysis logical resolution rule is executed and resolves a CreateTable logical operator with a child structured query and a Hive table.

When executed, CreateHiveTableAsSelectCommand runs (morphs itself into) an InsertIntoHiveTable logical command.

assert(spark.version == "2.4.5")

val tableName = "create_hive_table_as_select_demo"
val q = sql(s"""CREATE TABLE IF NOT EXISTS $tableName USING hive SELECT 1L AS id""")
scala> q.explain(extended = true)
== Parsed Logical Plan ==
'CreateTable `create_hive_table_as_select_demo`, Ignore
+- Project [1 AS id#74L]
   +- OneRowRelation

== Analyzed Logical Plan ==
CreateHiveTableAsSelectCommand [Database:default, TableName: create_hive_table_as_select_demo, InsertIntoHiveTable]
+- Project [1 AS id#74L]
   +- OneRowRelation

== Optimized Logical Plan ==
CreateHiveTableAsSelectCommand [Database:default, TableName: create_hive_table_as_select_demo, InsertIntoHiveTable]
+- Project [1 AS id#74L]
   +- OneRowRelation

== Physical Plan ==
Execute CreateHiveTableAsSelectCommand CreateHiveTableAsSelectCommand [Database:default, TableName: create_hive_table_as_select_demo, InsertIntoHiveTable]
+- *(1) Project [1 AS id#74L]
   +- Scan OneRowRelation[]

scala> spark.table(tableName).show
+---+
| id|
+---+
|  1|
+---+

Creating CreateHiveTableAsSelectCommand Instance

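CreateHiveTableAsSelectCommand takes the following to be created: a CatalogTable, a structured query (as a LogicalPlan), the names of the output columns, and a SaveMode.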

Executing Data-Writing Logical Command — run Method

run(
  sparkSession: SparkSession,
  child: SparkPlan): Seq[Row]
Note
run is part of the DataWritingCommand contract.

In summary, run runs an InsertIntoHiveTable logical command.

run requests the input SparkSession for the SessionState, which is then requested for the SessionCatalog.

run requests the SessionCatalog to check whether the table exists or not.

With the Hive table available, run validates the save mode and runs an InsertIntoHiveTable logical command (with the overwrite and ifPartitionNotExists flags disabled).

When the Hive table is not available, run asserts that the schema (of the CatalogTable) is not defined and requests the SessionCatalog to create the table (with the ignoreIfExists flag disabled). In the end, run runs an InsertIntoHiveTable logical command (with the overwrite flag enabled and the ifPartitionNotExists flag disabled).
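
The two branches of run can be sketched as a small, Spark-free model. This is only an illustration and not Spark's code: CtasRunSketch, plan, Action, NoOp, CreateTable and Insert are hypothetical names, and the local SaveMode type merely mirrors Spark's save modes. How the save mode is validated for an existing table (Overwrite is asserted away, ErrorIfExists fails, Ignore does nothing) is an assumption based on a reading of the Spark 2.4 sources and should be checked against the version in use.

```scala
// Simplified, Spark-free model of the branching in run (all names are hypothetical).
object CtasRunSketch {
  // Local stand-in that mirrors Spark's save modes.
  sealed trait SaveMode
  case object Append extends SaveMode
  case object Overwrite extends SaveMode
  case object ErrorIfExists extends SaveMode
  case object Ignore extends SaveMode

  // What the command would do, recorded as plain values instead of side effects.
  sealed trait Action
  case object NoOp extends Action
  case object CreateTable extends Action // stands for createTable with ignoreIfExists disabled
  final case class Insert(overwrite: Boolean, ifPartitionNotExists: Boolean) extends Action

  def plan(tableExists: Boolean, mode: SaveMode): Seq[Action] =
    if (tableExists) {
      // Assumed validation of the save mode for an existing table.
      mode match {
        case Overwrite     => throw new AssertionError("Overwrite is not expected for an existing table")
        case ErrorIfExists => throw new IllegalStateException("Table already exists")
        case Ignore        => Seq(NoOp)
        case Append        => Seq(Insert(overwrite = false, ifPartitionNotExists = false))
      }
    } else {
      // Table is missing: create it first, then insert with the overwrite flag enabled.
      Seq(CreateTable, Insert(overwrite = true, ifPartitionNotExists = false))
    }
}
```

In this model, the demo above (CREATE TABLE IF NOT EXISTS, that is the Ignore save mode, against a table that does not exist yet) corresponds to CtasRunSketch.plan(tableExists = false, CtasRunSketch.Ignore), which yields the create step followed by an insert with overwrite enabled.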
