val ctas = """
CREATE TABLE users
USING csv
COMMENT 'users table'
LOCATION '/tmp/users'
AS SELECT * FROM VALUES ((0, "jacek"))
"""
scala> sql(ctas)
... WARN HiveExternalCatalog: Couldn't find corresponding Hive SerDe for data source provider csv. Persisting data source table `default`.`users` into Hive metastore in Spark SQL specific format, which is NOT compatible with Hive.
// Executing the same CTAS again fails since the first sql(ctas) already created the table
val plan = sql(ctas).queryExecution.logical.numberedTreeString
org.apache.spark.sql.AnalysisException: Table default.users already exists. You need to drop it first.;
at org.apache.spark.sql.execution.command.CreateDataSourceTableAsSelectCommand.run(createDataSourceTables.scala:159)
at org.apache.spark.sql.execution.command.DataWritingCommandExec.sideEffectResult$lzycompute(commands.scala:104)
at org.apache.spark.sql.execution.command.DataWritingCommandExec.sideEffectResult(commands.scala:102)
at org.apache.spark.sql.execution.command.DataWritingCommandExec.executeCollect(commands.scala:115)
at org.apache.spark.sql.Dataset.$anonfun$logicalPlan$1(Dataset.scala:194)
at org.apache.spark.sql.Dataset.$anonfun$withAction$2(Dataset.scala:3370)
at org.apache.spark.sql.execution.SQLExecution$.$anonfun$withNewExecutionId$1(SQLExecution.scala:78)
at org.apache.spark.sql.execution.SQLExecution$.withSQLConfPropagated(SQLExecution.scala:125)
at org.apache.spark.sql.execution.SQLExecution$.withNewExecutionId(SQLExecution.scala:73)
at org.apache.spark.sql.Dataset.withAction(Dataset.scala:3370)
at org.apache.spark.sql.Dataset.<init>(Dataset.scala:194)
at org.apache.spark.sql.Dataset$.ofRows(Dataset.scala:79)
at org.apache.spark.sql.SparkSession.sql(SparkSession.scala:642)
... 49 elided
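The exception above comes from executing the command a second time, not from planning it. One way to look at the plan without running the command again is to parse and analyze the statement only. The following is a minimal sketch, assuming Spark 2.4 or later with the `spark-sql` module on the classpath; `sessionState` is an unstable developer API, so its methods may differ between releases. The analyzed plan should contain a `CreateDataSourceTableAsSelectCommand` node.

```scala
import org.apache.spark.sql.SparkSession

// In spark-shell, getOrCreate returns the session that is already running.
val spark = SparkSession.builder().master("local[*]").getOrCreate()

// The same CTAS statement as in the demo above.
val ctas = """
  CREATE TABLE users
  USING csv
  COMMENT 'users table'
  LOCATION '/tmp/users'
  AS SELECT * FROM VALUES ((0, "jacek"))
"""

// Parse and analyze only: the command is not executed here,
// so no data is written and no table is registered.
val parsed = spark.sessionState.sqlParser.parsePlan(ctas)
val analyzed = spark.sessionState.executePlan(parsed).analyzed

println(analyzed.numberedTreeString)
```

Because only analysis runs, this sketch is expected to work whether or not the `users` table already exists.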
CreateDataSourceTableAsSelectCommand Logical Command
CreateDataSourceTableAsSelectCommand is a logical command that creates a DataSource table with the data from a structured query (AS query).
Note: A DataSource table is a Spark SQL native table that uses any data source other than Hive (per the USING clause).
CreateDataSourceTableAsSelectCommand is created when the DataSourceAnalysis post-hoc logical resolution rule is executed (and resolves a CreateTable logical operator for a Spark table with an AS query).
Note: CreateDataSourceTableCommand is used instead when a CreateTable logical operator is used with no AS query.
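To see the difference described in the note, the same parse-and-analyze approach can be applied to a CREATE TABLE statement that has no AS query. This is a sketch under the same assumptions as before (Spark 2.4 or later, unstable `sessionState` API); the table name `users_no_query` and its columns are made up for illustration. The analyzed plan should contain `CreateDataSourceTableCommand` rather than `CreateDataSourceTableAsSelectCommand`.

```scala
import org.apache.spark.sql.SparkSession

// In spark-shell, getOrCreate returns the session that is already running.
val spark = SparkSession.builder().master("local[*]").getOrCreate()

// Hypothetical table definition with an explicit schema and no AS query.
val createOnly = "CREATE TABLE users_no_query (id INT, name STRING) USING csv"

// Parse and analyze only: the table is not actually created.
val parsedCreate = spark.sessionState.sqlParser.parsePlan(createOnly)
val analyzedCreate = spark.sessionState.executePlan(parsedCreate).analyzed

println(analyzedCreate.numberedTreeString)
```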
Creating CreateDataSourceTableAsSelectCommand Instance
CreateDataSourceTableAsSelectCommand takes the following to be created:
- AS query (LogicalPlan)
Executing Data-Writing Logical Command: run Method
run(
sparkSession: SparkSession,
child: SparkPlan): Seq[Row]
Note: run is part of the DataWritingCommand contract.
run …FIXME
run throws an AssertionError when the tableType of the CatalogTable is VIEW or the provider is undefined.
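The two preconditions can be illustrated without Spark. The sketch below is a simplified model, not Spark's code: `TableType`, `TableMeta`, and `checkPreconditions` are hypothetical stand-ins for `CatalogTableType`, `CatalogTable`, and the checks at the start of `run`. It relies only on Scala's `assert`, which throws `java.lang.AssertionError` when the condition is false.

```scala
import scala.util.Try

// Hypothetical, simplified stand-ins for Spark's catalog types.
sealed trait TableType
case object Managed extends TableType
case object External extends TableType
case object View extends TableType

final case class TableMeta(name: String, tableType: TableType, provider: Option[String])

// Models the two preconditions described above.
def checkPreconditions(table: TableMeta): Unit = {
  assert(table.tableType != View, s"${table.name} must not be a view")
  assert(table.provider.isDefined, s"${table.name} must have a provider")
}

// A data source table with a provider passes both checks.
val ok = Try(checkPreconditions(TableMeta("users", External, Some("csv"))))

// A view, or a table without a provider, fails with an AssertionError.
val viewFails = Try(checkPreconditions(TableMeta("v", View, Some("csv"))))
val noProviderFails = Try(checkPreconditions(TableMeta("t", Managed, None)))

println(s"ok=${ok.isSuccess} view=${viewFails.isFailure} noProvider=${noProviderFails.isFailure}")
```

An assertion rather than a user-facing exception suggests these conditions are treated as internal invariants that earlier planning steps are expected to guarantee.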
saveDataIntoTable Internal Method
saveDataIntoTable(
session: SparkSession,
table: CatalogTable,
tableLocation: Option[URI],
physicalPlan: SparkPlan,
mode: SaveMode,
tableExists: Boolean): BaseRelation
saveDataIntoTable creates a BaseRelation for…FIXME
saveDataIntoTable …FIXME
Note: saveDataIntoTable is used when CreateDataSourceTableAsSelectCommand is executed.