HiveMetastoreCatalog — Legacy SessionCatalog for Converting Hive Metastore Relations to Data Source Relations

HiveMetastoreCatalog is used by HiveSessionCatalog for the RelationConversions logical evaluation rule.

HiveMetastoreCatalog is created when HiveSessionStateBuilder is requested for a SessionCatalog (and creates a HiveSessionCatalog).

Figure 1. HiveMetastoreCatalog, HiveSessionCatalog and HiveSessionStateBuilder

Creating HiveMetastoreCatalog Instance

HiveMetastoreCatalog takes a SparkSession to be created.

HiveMetastoreCatalog initializes the internal properties.

Converting HiveTableRelation to LogicalRelation — convertToLogicalRelation Method

convertToLogicalRelation(
  relation: HiveTableRelation,
  options: Map[String, String],
  fileFormatClass: Class[_ <: FileFormat],
  fileType: String): LogicalRelation

convertToLogicalRelation branches based on whether the input HiveTableRelation is partitioned or not.

When the HiveTableRelation is partitioned, convertToLogicalRelation uses the spark.sql.hive.manageFilesourcePartitions configuration property to compute the root paths. With the property enabled, the only root path is the table location (locationUri). Otherwise, the root paths are the locations (locationUri) of the table's partitions, which are requested from the shared ExternalCatalog.
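The root-path choice for a partitioned table can be illustrated with a minimal sketch. It does not use Spark's classes: PartitionModel, PartitionedTableModel and RootPathsSketch are hypothetical stand-ins, and the Boolean parameter only models the value of spark.sql.hive.manageFilesourcePartitions.

```scala
// Simplified stand-ins for illustration only; these are NOT Spark's classes.
final case class PartitionModel(locationUri: String)

final case class PartitionedTableModel(
    locationUri: String,
    partitions: Seq[PartitionModel])

object RootPathsSketch {
  // manageFilesourcePartitions models the value of
  // spark.sql.hive.manageFilesourcePartitions (an assumption of this sketch).
  def rootPaths(
      table: PartitionedTableModel,
      manageFilesourcePartitions: Boolean): Seq[String] =
    if (manageFilesourcePartitions) {
      // Property enabled: a single root path, the table location.
      Seq(table.locationUri)
    } else {
      // Property disabled: one root path per partition location.
      table.partitions.map(_.locationUri)
    }
}
```

With a table at /warehouse/t and two partitions, the sketch yields one root path when the flag is true and two (the partition locations) when it is false.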

convertToLogicalRelation creates a new LogicalRelation with a HadoopFsRelation (with no bucketing specification, among other things) unless a LogicalRelation for the table is already in a cache.
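The "reuse the cached relation, otherwise create one" behavior can be sketched as a plain lookup-or-create cache. This is only a model under stated assumptions: QualifiedNameModel, RelationModel and RelationCacheSketch are hypothetical names, the cache is a simple in-memory map, and any validation that may be applied to a cached entry is left out.

```scala
import scala.collection.mutable

// Hypothetical stand-ins for a qualified table name and a converted relation.
final case class QualifiedNameModel(database: String, table: String)
final case class RelationModel(rootPaths: Seq[String], fileFormat: String)

final class RelationCacheSketch {
  private val cache = mutable.Map.empty[QualifiedNameModel, RelationModel]
  private var creations = 0

  // Number of relations that had to be created (cache misses).
  def creationCount: Int = creations

  // Returns the cached relation for the table if there is one;
  // otherwise evaluates `create`, stores the result and returns it.
  def getOrCreate(name: QualifiedNameModel)(create: => RelationModel): RelationModel =
    cache.get(name) match {
      case Some(cached) => cached
      case None =>
        val created = create
        creations += 1
        cache.update(name, created)
        created
    }
}
```

The by-name `create` parameter keeps the potentially expensive construction from running at all when the table is already cached.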

When the HiveTableRelation is not partitioned, convertToLogicalRelation…​FIXME

In the end, convertToLogicalRelation replaces exprIds in the table relation output (schema).
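The final exprId replacement can be pictured with a small model. It assumes that the converted output and the original table relation output are paired by position; AttributeModel and ExprIdSketch are hypothetical names, and a Long stands in for an expression ID.

```scala
// Hypothetical stand-in for an output attribute with an expression ID.
final case class AttributeModel(name: String, exprId: Long) {
  def withExprId(newExprId: Long): AttributeModel = copy(exprId = newExprId)
}

object ExprIdSketch {
  // Pairs the attributes by position (an assumption of this sketch) and gives
  // every converted attribute the exprId of its original counterpart.
  // zip stops at the shorter sequence, so unpaired attributes are dropped.
  def alignExprIds(
      converted: Seq[AttributeModel],
      original: Seq[AttributeModel]): Seq[AttributeModel] =
    converted.zip(original).map { case (c, o) => c.withExprId(o.exprId) }
}
```

Keeping the original exprIds means that expressions elsewhere in a plan that refer to the table's attributes by exprId still match the converted relation's output.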

Note
convertToLogicalRelation is used when the RelationConversions logical evaluation rule is executed (with Hive tables in parquet as well as native and Hive ORC storage formats).

inferIfNeeded Internal Method

inferIfNeeded(
  relation: HiveTableRelation,
  options: Map[String, String],
  fileFormat: FileFormat,
  fileIndexOpt: Option[FileIndex] = None): CatalogTable

inferIfNeeded…​FIXME

Note
inferIfNeeded is used when HiveMetastoreCatalog is requested to convert a HiveTableRelation to a LogicalRelation over a HadoopFsRelation.

getCached Internal Method

getCached(
  tableIdentifier: QualifiedTableName,
  pathsInMetastore: Seq[Path],
  schemaInMetastore: StructType,
  expectedFileFormat: Class[_ <: FileFormat],
  partitionSchema: Option[StructType]): Option[LogicalRelation]

getCached…​FIXME

Note
getCached is used when HiveMetastoreCatalog is requested to convert a HiveTableRelation to a LogicalRelation over a HadoopFsRelation.

Internal Properties

Name          Description

catalogProxy  Used when HiveMetastoreCatalog is requested to getCached and convertToLogicalRelation
