/* CREATE [OR REPLACE] [[GLOBAL] TEMPORARY]
VIEW [IF NOT EXISTS] tableIdentifier
[identifierCommentList] [COMMENT STRING]
[PARTITIONED ON identifierList]
[TBLPROPERTIES tablePropertyList] AS query */
// Demo table for "AS query" part
spark.range(10).write.mode("overwrite").saveAsTable("t1")
// The "AS" query
val asQuery = "SELECT * FROM t1"
// The following queries should all work fine
val q1 = "CREATE VIEW v1 AS " + asQuery
sql(q1)
val q2 = "CREATE OR REPLACE VIEW v1 AS " + asQuery
sql(q2)
val q3 = "CREATE OR REPLACE TEMPORARY VIEW v1 " + asQuery
sql(q3)
val q4 = "CREATE OR REPLACE GLOBAL TEMPORARY VIEW v1 " + asQuery
sql(q4)
val q5 = "CREATE VIEW IF NOT EXISTS v1 AS " + asQuery
sql(q5)
// The following queries should all fail
// the number of user-specified columns does not match the schema of the AS query
val qf1 = "CREATE VIEW v1 (c1 COMMENT 'comment', c2) AS " + asQuery
scala> sql(qf1)
org.apache.spark.sql.AnalysisException: The number of columns produced by the SELECT clause (num: `1`) does not match the number of column names specified by CREATE VIEW (num: `2`).;
at org.apache.spark.sql.execution.command.CreateViewCommand.run(views.scala:134)
at org.apache.spark.sql.execution.command.ExecutedCommandExec.sideEffectResult$lzycompute(commands.scala:70)
at org.apache.spark.sql.execution.command.ExecutedCommandExec.sideEffectResult(commands.scala:68)
at org.apache.spark.sql.execution.command.ExecutedCommandExec.executeCollect(commands.scala:79)
at org.apache.spark.sql.Dataset$$anonfun$6.apply(Dataset.scala:190)
at org.apache.spark.sql.Dataset$$anonfun$6.apply(Dataset.scala:190)
at org.apache.spark.sql.Dataset$$anonfun$52.apply(Dataset.scala:3254)
at org.apache.spark.sql.execution.SQLExecution$.withNewExecutionId(SQLExecution.scala:77)
at org.apache.spark.sql.Dataset.withAction(Dataset.scala:3253)
at org.apache.spark.sql.Dataset.<init>(Dataset.scala:190)
at org.apache.spark.sql.Dataset$.ofRows(Dataset.scala:75)
at org.apache.spark.sql.SparkSession.sql(SparkSession.scala:641)
... 49 elided
// CREATE VIEW ... PARTITIONED ON is not allowed
val qf2 = "CREATE VIEW v1 PARTITIONED ON (c1, c2) AS " + asQuery
scala> sql(qf2)
org.apache.spark.sql.catalyst.parser.ParseException:
Operation not allowed: CREATE VIEW ... PARTITIONED ON(line 1, pos 0)
// Use the name of the existing table t1 for a new view
val qf3 = "CREATE VIEW t1 AS " + asQuery
scala> sql(qf3)
org.apache.spark.sql.AnalysisException: `t1` is not a view;
at org.apache.spark.sql.execution.command.CreateViewCommand.run(views.scala:156)
at org.apache.spark.sql.execution.command.ExecutedCommandExec.sideEffectResult$lzycompute(commands.scala:70)
at org.apache.spark.sql.execution.command.ExecutedCommandExec.sideEffectResult(commands.scala:68)
at org.apache.spark.sql.execution.command.ExecutedCommandExec.executeCollect(commands.scala:79)
at org.apache.spark.sql.Dataset$$anonfun$6.apply(Dataset.scala:190)
at org.apache.spark.sql.Dataset$$anonfun$6.apply(Dataset.scala:190)
at org.apache.spark.sql.Dataset$$anonfun$52.apply(Dataset.scala:3254)
at org.apache.spark.sql.execution.SQLExecution$.withNewExecutionId(SQLExecution.scala:77)
at org.apache.spark.sql.Dataset.withAction(Dataset.scala:3253)
at org.apache.spark.sql.Dataset.<init>(Dataset.scala:190)
at org.apache.spark.sql.Dataset$.ofRows(Dataset.scala:75)
at org.apache.spark.sql.SparkSession.sql(SparkSession.scala:641)
... 49 elided
// View already exists
val qf4 = "CREATE VIEW v1 AS " + asQuery
scala> sql(qf4)
org.apache.spark.sql.AnalysisException: View `v1` already exists. If you want to update the view definition, please use ALTER VIEW AS or CREATE OR REPLACE VIEW AS;
at org.apache.spark.sql.execution.command.CreateViewCommand.run(views.scala:169)
at org.apache.spark.sql.execution.command.ExecutedCommandExec.sideEffectResult$lzycompute(commands.scala:70)
at org.apache.spark.sql.execution.command.ExecutedCommandExec.sideEffectResult(commands.scala:68)
at org.apache.spark.sql.execution.command.ExecutedCommandExec.executeCollect(commands.scala:79)
at org.apache.spark.sql.Dataset$$anonfun$6.apply(Dataset.scala:190)
at org.apache.spark.sql.Dataset$$anonfun$6.apply(Dataset.scala:190)
at org.apache.spark.sql.Dataset$$anonfun$52.apply(Dataset.scala:3254)
at org.apache.spark.sql.execution.SQLExecution$.withNewExecutionId(SQLExecution.scala:77)
at org.apache.spark.sql.Dataset.withAction(Dataset.scala:3253)
at org.apache.spark.sql.Dataset.<init>(Dataset.scala:190)
at org.apache.spark.sql.Dataset$.ofRows(Dataset.scala:75)
at org.apache.spark.sql.SparkSession.sql(SparkSession.scala:641)
... 49 elided
CreateViewCommand Logical Command
CreateViewCommand is a logical command for creating or replacing a view.
CreateViewCommand is created to represent the following:

- CREATE VIEW AS SQL statements
- Dataset operators: Dataset.createTempView, Dataset.createOrReplaceTempView, Dataset.createGlobalTempView and Dataset.createOrReplaceGlobalTempView
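The Dataset operators above boil down to the same command; a minimal spark-shell sketch (reusing the t1 table registered earlier; the view names are illustrative only):
// Local temporary view via the Dataset API
spark.table("t1").createOrReplaceTempView("t1_local")
// Global temporary view via the Dataset API (registered in the global_temp database by default)
spark.table("t1").createOrReplaceGlobalTempView("t1_global")
// Both views are queryable with SQL
sql("SELECT count(*) FROM t1_local").show()
sql("SELECT count(*) FROM global_temp.t1_global").show()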
Caution: FIXME What's the difference between CreateViewCommand and CreateTempViewUsing?
CreateViewCommand works with different view types.
| View Type | Description / Side Effect |
|---|---|
| LocalTempView | A session-scoped local temporary view that is available until the session that created it is stopped. When executed, CreateViewCommand requests the SessionCatalog to create or replace a local temporary view. |
| GlobalTempView | A cross-session global temporary view that is available until the Spark application stops. When executed, CreateViewCommand requests the SessionCatalog to create or replace a global temporary view. |
| PersistedView | A cross-session persisted view that is available until dropped from the catalog. When executed, CreateViewCommand creates a CatalogTable and requests the SessionCatalog to persist it. |
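A quick spark-shell sketch of the three view types side by side (reusing asQuery and t1 from above; the v_local, v_global and v_persisted names are illustrative only):
// LocalTempView: visible only in the current session
sql("CREATE OR REPLACE TEMPORARY VIEW v_local AS " + asQuery)
// GlobalTempView: shared across sessions, lives in the global_temp database by default
sql("CREATE OR REPLACE GLOBAL TEMPORARY VIEW v_global AS " + asQuery)
sql("SELECT * FROM global_temp.v_global").show()
// PersistedView: stored in the catalog until dropped
sql("CREATE OR REPLACE VIEW v_persisted AS " + asQuery)
sql("SHOW TABLES").show()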
CreateViewCommand returns the child logical query plan when requested for the inner nodes (that should be shown as an inner nested tree of this node).
val sqlText = "CREATE VIEW v1 AS " + asQuery
val plan = spark.sessionState.sqlParser.parsePlan(sqlText)
scala> println(plan.numberedTreeString)
00 CreateViewCommand `v1`, SELECT * FROM t1, false, false, PersistedView
01 +- 'Project [*]
02 +- 'UnresolvedRelation `t1`
Creating CatalogTable — prepareTable Internal Method
prepareTable(session: SparkSession, analyzedPlan: LogicalPlan): CatalogTable
prepareTable…FIXME
Note: prepareTable is used exclusively when the CreateViewCommand logical command is executed.
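To see the CatalogTable that a persisted view ends up with, you can look it up in the SessionCatalog afterwards; a sketch (assuming the persisted v1 view created at the top of this page):
import org.apache.spark.sql.catalyst.TableIdentifier
// The persisted view is stored as a CatalogTable of VIEW type with the AS query as its viewText
val viewMetadata = spark.sessionState.catalog.getTableMetadata(TableIdentifier("v1"))
viewMetadata.tableType  // VIEW
viewMetadata.viewText   // Some(SELECT * FROM t1)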
Executing Logical Command — run Method
run(sparkSession: SparkSession): Seq[Row]
Note: run is part of the RunnableCommand contract to execute (run) a logical command.
run requests the input SparkSession for the SessionState that is in turn requested to execute the child logical plan (which simply creates a QueryExecution).
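A minimal sketch of that analysis step (child below stands in for the command's child logical plan; it assumes a spark-shell session and the t1 table from above):
import org.apache.spark.sql.catalyst.plans.logical.LogicalPlan
// Stand-in for the child logical plan of CreateViewCommand
val child: LogicalPlan = spark.sessionState.sqlParser.parsePlan("SELECT * FROM t1")
// SessionState.executePlan creates a QueryExecution for the plan
val qe = spark.sessionState.executePlan(child)
qe.assertAnalyzed()
val analyzedPlan = qe.analyzed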
run requests the input SparkSession for the SessionState that is in turn requested for the SessionCatalog.
run then branches off per the ViewType:
- For local temporary views, run aliases the analyzed plan and requests the SessionCatalog to create or replace a local temporary view
- For global temporary views, run also aliases the analyzed plan and requests the SessionCatalog to create or replace a global temporary view
- For persisted views, run asks the SessionCatalog whether the table exists or not (given TableIdentifier):
  - If the table exists and the allowExisting flag is on, run simply does nothing (and exits)
  - If the table exists and the replace flag is on, run requests the SessionCatalog for the table metadata and replaces the table, i.e. run requests the SessionCatalog to drop the table followed by re-creating it (with a new CatalogTable)
  - If however the table does not exist, run simply requests the SessionCatalog to create it (with a new CatalogTable)
run throws an AnalysisException for persisted views when a table of the same name already exists, the allowExisting flag is off and the existing table's type is not a view.
[name] is not a view
run throws an AnalysisException for persisted views when a view of the same name already exists and the allowExisting and replace flags are off.
View [name] already exists. If you want to update the view definition, please use ALTER VIEW AS or CREATE OR REPLACE VIEW AS
run throws an AnalysisException if the userSpecifiedColumns are defined and their number does not match the number of output schema attributes of the analyzed logical plan.
The number of columns produced by the SELECT clause (num: `[output.length]`) does not match the number of column names specified by CREATE VIEW (num: `[userSpecifiedColumns.length]`).
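The catalog lookups behind these checks can be reproduced in spark-shell; a sketch (assuming the t1 table and the persisted v1 view from above):
import org.apache.spark.sql.catalyst.TableIdentifier
val catalog = spark.sessionState.catalog
// t1 exists but is a table (not a VIEW), hence "`t1` is not a view"
catalog.tableExists(TableIdentifier("t1"))                 // true
catalog.getTableMetadata(TableIdentifier("t1")).tableType  // MANAGED
// v1 exists and is a VIEW, hence "View `v1` already exists ..." without OR REPLACE or IF NOT EXISTS
catalog.getTableMetadata(TableIdentifier("v1")).tableType  // VIEW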
Creating CreateViewCommand Instance
CreateViewCommand takes the following when created:
- View name (TableIdentifier)
- User-specified columns (column name and optional comment pairs)
- Optional comment
- Properties (key-value pairs)
- Optional original text of the AS query
- Child logical plan
- allowExisting flag
- replace flag
- ViewType
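One way to inspect those arguments is to pattern on the parsed plan of a CREATE VIEW statement (in Spark 2.x the parser gives a CreateViewCommand directly, as the numberedTreeString above shows); a sketch with an illustrative v2 name:
import org.apache.spark.sql.execution.command.CreateViewCommand
val cmd = spark.sessionState.sqlParser
  .parsePlan("CREATE VIEW v2 (c0 COMMENT 'the id') AS SELECT * FROM t1")
  .asInstanceOf[CreateViewCommand]
cmd.name                  // the view's TableIdentifier, i.e. `v2`
cmd.userSpecifiedColumns  // Seq((c0,Some(the id)))
cmd.allowExisting         // false (no IF NOT EXISTS)
cmd.replace               // false (no OR REPLACE)
cmd.viewType              // PersistedView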
verifyTemporaryObjectsNotExists Internal Method
verifyTemporaryObjectsNotExists(sparkSession: SparkSession): Unit
verifyTemporaryObjectsNotExists…FIXME
Note: verifyTemporaryObjectsNotExists is used exclusively when the CreateViewCommand logical command is executed.
aliasPlan Internal Method
aliasPlan(session: SparkSession, analyzedPlan: LogicalPlan): LogicalPlan
aliasPlan…FIXME
Note: aliasPlan is used when the CreateViewCommand logical command is executed (and prepareTable).