ResolveRelations Logical Resolution Rule — Resolving UnresolvedRelations With Tables in Catalog

ResolveRelations is a logical resolution rule that the logical query plan analyzer uses to resolve UnresolvedRelations (in a logical query plan), i.e. to replace them with the relations they refer to in the catalog.

Technically, ResolveRelations is just a Catalyst rule for transforming logical plans, i.e. Rule[LogicalPlan].

ResolveRelations is part of the Resolution fixed-point batch of rules.
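To make "rule" and "fixed-point batch" concrete, here is a minimal, self-contained sketch. It is a toy model rather than Spark's own classes: Plan, Unresolved, Resolved, Project, resolveKnown and runToFixedPoint are all hypothetical names introduced for illustration. A rule is modelled as a function from plan to plan (the counterpart of Rule[LogicalPlan]), and the batch is re-applied until the plan stops changing or an iteration limit is reached.

```scala
object FixedPointSketch {
  // Toy plan tree (hypothetical, not Spark's LogicalPlan)
  sealed trait Plan
  final case class Unresolved(name: String) extends Plan
  final case class Resolved(name: String) extends Plan
  final case class Project(child: Plan) extends Plan

  // A rule maps a plan to a (possibly) rewritten plan
  type Rule = Plan => Plan

  // Hypothetical rule: resolves every relation whose name is in `known`
  def resolveKnown(known: Set[String])(plan: Plan): Plan = plan match {
    case Unresolved(n) if known(n) => Resolved(n)
    case Project(child)            => Project(resolveKnown(known)(child))
    case other                     => other
  }

  // Applies all rules in order, repeating until the plan no longer changes
  // or the iteration budget is used up
  @annotation.tailrec
  def runToFixedPoint(plan: Plan, rules: Seq[Rule], maxIterations: Int = 100): Plan = {
    val next = rules.foldLeft(plan)((p, rule) => rule(p))
    if (next == plan || maxIterations <= 1) next
    else runToFixedPoint(next, rules, maxIterations - 1)
  }
}
```

For example, FixedPointSketch.runToFixedPoint(Project(Unresolved("t1")), Seq[FixedPointSketch.Rule](p => FixedPointSketch.resolveKnown(Set("t1"))(p))) rewrites the unresolved leaf and then stops because a second pass changes nothing.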

// Example: InsertIntoTable with UnresolvedRelation
import org.apache.spark.sql.catalyst.dsl.plans._
val plan = table("t1").insertInto(tableName = "t2", overwrite = true)
scala> println(plan.numberedTreeString)
00 'InsertIntoTable 'UnresolvedRelation `t2`, true, false
01 +- 'UnresolvedRelation `t1`

// Register the tables so the following resolution works
sql("CREATE TABLE IF NOT EXISTS t1(id long)")
sql("CREATE TABLE IF NOT EXISTS t2(id long)")

// ResolveRelations is a Scala object of the Analyzer class
// We need an instance of the Analyzer class to access it
import spark.sessionState.analyzer.ResolveRelations
val resolvedPlan = ResolveRelations(plan)
scala> println(resolvedPlan.numberedTreeString)
00 'InsertIntoTable 'UnresolvedRelation `t2`, true, false
01 +- 'SubqueryAlias t1
02    +- 'UnresolvedCatalogRelation `default`.`t1`, org.apache.hadoop.hive.ql.io.parquet.serde.ParquetHiveSerDe

// Example: Other uses of UnresolvedRelation
// Use a temporary view
// createOrReplaceTempView returns Unit, so there is nothing useful to assign
spark.range(1).createOrReplaceTempView("v1")
scala> spark.catalog.listTables.filter($"name" === "v1").show
+----+--------+-----------+---------+-----------+
|name|database|description|tableType|isTemporary|
+----+--------+-----------+---------+-----------+
|  v1|    null|       null|TEMPORARY|       true|
+----+--------+-----------+---------+-----------+

import org.apache.spark.sql.catalyst.dsl.expressions._
val plan = table("v1").select(star())
scala> println(plan.numberedTreeString)
00 'Project [*]
01 +- 'UnresolvedRelation `v1`

val resolvedPlan = ResolveRelations(plan)
scala> println(resolvedPlan.numberedTreeString)
00 'Project [*]
01 +- SubqueryAlias v1
02    +- Range (0, 1, step=1, splits=Some(8))

// Example: UnresolvedRelation with a database-qualified table name
import org.apache.spark.sql.catalyst.dsl.plans._
val plan = table(db = "db1", ref = "t1")
scala> println(plan.numberedTreeString)
00 'UnresolvedRelation `db1`.`t1`

// Register the database and the table so the following resolution works
sql("CREATE DATABASE IF NOT EXISTS db1")
sql("CREATE TABLE IF NOT EXISTS db1.t1(id long)")

val resolvedPlan = ResolveRelations(plan)
scala> println(resolvedPlan.numberedTreeString)
00 'SubqueryAlias t1
01 +- 'UnresolvedCatalogRelation `db1`.`t1`, org.apache.hadoop.hive.ql.io.parquet.serde.ParquetHiveSerDe

Applying ResolveRelations to Logical Plan — apply Method

apply(
  plan: LogicalPlan): LogicalPlan
Note
apply is part of the Rule Contract to execute a rule on a logical plan.

For an InsertIntoTable logical operator with an UnresolvedRelation child operator, apply looks up the table in the catalog (using lookupTableFromCatalog) and executes the EliminateSubqueryAliases optimization rule on the result.

If the resolved table is anything other than a View operator, apply substitutes it for the table of the InsertIntoTable operator (so the table is no longer an UnresolvedRelation the next time the rule is executed). For a View operator, apply fails analysis with the exception:

Inserting into a view is not allowed. View: [identifier].

For UnresolvedRelation logical operators, apply simply calls resolveRelation.
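The cases above can be sketched as a pattern match over a toy plan model. This is only an illustration under stated assumptions, not Spark's implementation: every type here (Plan, Table, View, SubqueryAlias, AnalysisError) and the Map-based catalog are hypothetical stand-ins, the catalog lookup stands in for resolveRelation, the "Table or view not found" message is made up, and the insert target and the query are resolved in a single pass for brevity.

```scala
object ApplySketch {
  // Toy plan model (hypothetical stand-ins for Spark's logical operators)
  sealed trait Plan
  final case class UnresolvedRelation(table: String) extends Plan
  final case class Table(name: String) extends Plan
  final case class View(name: String) extends Plan
  final case class SubqueryAlias(alias: String, child: Plan) extends Plan
  final case class InsertIntoTable(table: Plan, query: Plan) extends Plan

  final class AnalysisError(message: String) extends Exception(message)

  // Hypothetical catalog lookup: wraps the catalog entry in an alias
  def lookupTableFromCatalog(catalog: Map[String, Plan], u: UnresolvedRelation): Plan =
    catalog.get(u.table) match {
      case Some(entry) => SubqueryAlias(u.table, entry)
      case None        => throw new AnalysisError(s"Table or view not found: ${u.table}")
    }

  // Strips alias wrappers from the top of a plan
  def eliminateSubqueryAliases(plan: Plan): Plan = plan match {
    case SubqueryAlias(_, child) => eliminateSubqueryAliases(child)
    case other                   => other
  }

  def apply(catalog: Map[String, Plan])(plan: Plan): Plan = plan match {
    // Insert target is still unresolved: look it up and drop the alias
    case i @ InsertIntoTable(u: UnresolvedRelation, query) =>
      eliminateSubqueryAliases(lookupTableFromCatalog(catalog, u)) match {
        case View(name) =>
          throw new AnalysisError(s"Inserting into a view is not allowed. View: $name.")
        case other =>
          i.copy(table = other, query = apply(catalog)(query))
      }
    case InsertIntoTable(table, query) => InsertIntoTable(table, apply(catalog)(query))
    // Standalone unresolved relation (stand-in for resolveRelation)
    case u: UnresolvedRelation         => lookupTableFromCatalog(catalog, u)
    case SubqueryAlias(alias, child)   => SubqueryAlias(alias, apply(catalog)(child))
    case other                         => other
  }
}
```

With a catalog that maps "t1" to Table("t1") and "v1" to View("v1"), inserting into t1 yields an InsertIntoTable over Table("t1"), while inserting into v1 throws the error about views.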

Resolving Relation — resolveRelation Method

resolveRelation(
  plan: LogicalPlan): LogicalPlan

resolveRelation…​FIXME

Note
resolveRelation is used when the ResolveRelations rule is executed (for an UnresolvedRelation logical operator).

isRunningDirectlyOnFiles Internal Method

isRunningDirectlyOnFiles(table: TableIdentifier): Boolean

isRunningDirectlyOnFiles is enabled (i.e. true) when all of the following conditions hold:

Note
isRunningDirectlyOnFiles is used exclusively when ResolveRelations resolves a relation (as an UnresolvedRelation leaf logical operator for a table reference).

Finding Table in Session-Scoped Catalog of Relational Entities — lookupTableFromCatalog Internal Method

lookupTableFromCatalog(
  u: UnresolvedRelation,
  defaultDatabase: Option[String] = None): LogicalPlan

lookupTableFromCatalog simply requests SessionCatalog to find the table in relational catalogs.

Note
lookupTableFromCatalog requests Analyzer for the current SessionCatalog.
Note
The table is described using TableIdentifier of the input UnresolvedRelation.

lookupTableFromCatalog fails the analysis phase (by reporting an AnalysisException) when the table or the table’s database cannot be found.

Note
lookupTableFromCatalog is used when ResolveRelations is executed (for InsertIntoTable with UnresolvedRelation operators) or resolves a relation (for "standalone" UnresolvedRelations).
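The lookup-or-fail behaviour can be illustrated with a small self-contained sketch. It does not use Spark's classes: ToyCatalog, CatalogTable, AnalysisError and the error messages are hypothetical, and the sketch assumes that defaultDatabase is consulted only when the table identifier carries no database of its own.

```scala
object LookupSketch {
  final case class TableIdentifier(table: String, database: Option[String] = None)
  final case class CatalogTable(database: String, table: String)
  final class AnalysisError(message: String) extends Exception(message)

  // Hypothetical stand-in for SessionCatalog: database name -> table names
  final case class ToyCatalog(currentDatabase: String, databases: Map[String, Set[String]]) {
    def lookupRelation(id: TableIdentifier): Either[String, CatalogTable] = {
      val db = id.database.getOrElse(currentDatabase)
      databases.get(db) match {
        case None                            => Left(s"Database not found: $db")
        case Some(tables) if tables(id.table) => Right(CatalogTable(db, id.table))
        case Some(_)                         => Left(s"Table or view not found: $db.${id.table}")
      }
    }
  }

  // Looks the table up and turns a missing table or database into an analysis error
  def lookupTableFromCatalog(
      catalog: ToyCatalog,
      id: TableIdentifier,
      defaultDatabase: Option[String] = None): CatalogTable = {
    // Assumption: the default database only fills in a missing database
    val withDb = id.copy(database = id.database.orElse(defaultDatabase))
    catalog.lookupRelation(withDb) match {
      case Right(table)  => table
      case Left(message) => throw new AnalysisError(message)
    }
  }
}
```

Given ToyCatalog("default", Map("default" -> Set("t1"), "db1" -> Set("t1"))), looking up TableIdentifier("t1") finds the table in the current database, passing Some("db1") as defaultDatabase finds it in db1 instead, and an unknown table or database raises AnalysisError.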
