HiveMetastoreCatalog — Legacy SessionCatalog for Converting Hive Metastore Relations to Data Source Relations

HiveMetastoreCatalog is used by HiveSessionCatalog for the RelationConversions logical evaluation rule.

HiveMetastoreCatalog is created when HiveSessionStateBuilder is requested for a SessionCatalog (and creates a HiveSessionCatalog).

Figure 1. HiveMetastoreCatalog, HiveSessionCatalog and HiveSessionStateBuilder

Creating HiveMetastoreCatalog Instance

HiveMetastoreCatalog takes the following to be created:

HiveMetastoreCatalog initializes the internal properties.

Converting HiveTableRelation to LogicalRelation — convertToLogicalRelation Method

convertToLogicalRelation(
  relation: HiveTableRelation,
  options: Map[String, String],
  fileFormatClass: Class[_ <: FileFormat],
  fileType: String): LogicalRelation

convertToLogicalRelation branches based on whether the input HiveTableRelation is partitioned or not.

When the HiveTableRelation is partitioned, convertToLogicalRelation uses the spark.sql.hive.manageFilesourcePartitions configuration property to compute the root paths. With the property enabled, the only root path is the table location (aka locationUri). Otherwise, the root paths are the locationUris of the table's partitions (looked up using the shared ExternalCatalog).
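The root-path choice described above can be sketched in plain Scala. This is a minimal illustration, not Spark's code: Table, Partition and rootPaths are hypothetical stand-ins, and the boolean parameter stands for the value of spark.sql.hive.manageFilesourcePartitions.

```scala
object RootPathsSketch {
  // Hypothetical, simplified stand-ins; these are NOT Spark's classes.
  final case class Partition(locationUri: String)
  final case class Table(locationUri: String, partitions: Seq[Partition])

  /** Picks the root paths of a partitioned table as the text describes. */
  def rootPaths(table: Table, manageFilesourcePartitions: Boolean): Seq[String] =
    if (manageFilesourcePartitions) {
      // Property enabled: the table location is the single root path.
      Seq(table.locationUri)
    } else {
      // Property disabled: one root path per partition location.
      table.partitions.map(_.locationUri)
    }
}
```

Note that partitions may live outside the table directory, which is why the second branch cannot simply reuse the table location.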

convertToLogicalRelation creates a new LogicalRelation with a HadoopFsRelation (with no bucketing specification, among other things) unless a LogicalRelation for the table is already in the cache.
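The "create unless cached" behavior is a get-or-create lookup. The sketch below models it with a mutable map. Relation, convert and the created counter are hypothetical names used only for illustration, and Spark's actual cache is a different structure.

```scala
import scala.collection.mutable

object RelationCacheSketch {
  // Hypothetical stand-in for a LogicalRelation over a HadoopFsRelation.
  final case class Relation(table: String, rootPaths: Seq[String], bucketSpec: Option[String])

  private val cache = mutable.Map.empty[String, Relation]

  // Counts how many relations were actually built (for demonstration only).
  var created: Int = 0

  /** Returns the cached relation for the table, or builds and caches a new one. */
  def convert(table: String, rootPaths: Seq[String]): Relation =
    cache.getOrElseUpdate(table, {
      created += 1
      // The new relation carries no bucketing specification.
      Relation(table, rootPaths, bucketSpec = None)
    })
}
```

Converting the same table twice returns the same instance and builds the relation only once.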

When the HiveTableRelation is not partitioned, convertToLogicalRelation…​FIXME

In the end, convertToLogicalRelation replaces exprIds in the table relation output (schema).
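A rough model of the exprId replacement follows, assuming the converted relation and the table relation expose their output attributes in the same order. Attr and alignExprIds are hypothetical, and Spark's attributes carry more fields than shown here.

```scala
object ExprIdSketch {
  // Hypothetical, simplified attribute; NOT Spark's attribute class.
  final case class Attr(name: String, dataType: String, exprId: Long)

  /** Keeps the converted attributes but gives them the exprIds of the table relation output. */
  def alignExprIds(converted: Seq[Attr], tableOutput: Seq[Attr]): Seq[Attr] = {
    require(converted.length == tableOutput.length, "outputs must have the same number of attributes")
    // Pair attributes by position and copy the exprId over.
    converted.zip(tableOutput).map { case (c, t) => c.copy(exprId = t.exprId) }
  }
}
```

Reusing the original exprIds means the rest of the query plan, which already refers to the table relation's attributes by exprId, still resolves against the converted relation.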

convertToLogicalRelation is used when the RelationConversions logical evaluation rule is executed (with Hive tables in parquet as well as native and hive ORC storage formats).

inferIfNeeded Internal Method

inferIfNeeded(
  relation: HiveTableRelation,
  options: Map[String, String],
  fileFormat: FileFormat,
  fileIndexOpt: Option[FileIndex] = None): CatalogTable


inferIfNeeded is used when HiveMetastoreCatalog is requested to convert a HiveTableRelation to a LogicalRelation over a HadoopFsRelation.

getCached Internal Method

getCached(
  tableIdentifier: QualifiedTableName,
  pathsInMetastore: Seq[Path],
  schemaInMetastore: StructType,
  expectedFileFormat: Class[_ <: FileFormat],
  partitionSchema: Option[StructType]): Option[LogicalRelation]


getCached is used when HiveMetastoreCatalog is requested to convert a HiveTableRelation to a LogicalRelation over a HadoopFsRelation.
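This page does not describe what getCached does. The parameter list suggests that a cached relation is reused only when it still agrees with the metastore. The sketch below is written under that assumption, with strings and pairs standing in for Path, StructType and the FileFormat class. Cached and the map-based cache are hypothetical.

```scala
object GetCachedSketch {
  // Hypothetical cache entry; a schema is modelled as (column name, type) pairs.
  final case class Cached(
      paths: Set[String],
      schema: Seq[(String, String)],
      fileFormat: String,
      partitionSchema: Seq[(String, String)])

  /** Returns the cached entry only if it matches what the metastore reports (assumed behavior). */
  def getCached(
      cache: Map[String, Cached],
      tableIdentifier: String,
      pathsInMetastore: Seq[String],
      schemaInMetastore: Seq[(String, String)],
      expectedFileFormat: String,
      partitionSchema: Option[Seq[(String, String)]]): Option[Cached] =
    cache.get(tableIdentifier).filter { c =>
      c.paths == pathsInMetastore.toSet &&
      c.schema == schemaInMetastore &&
      c.fileFormat == expectedFileFormat &&
      // A missing partition schema is treated as an empty one.
      c.partitionSchema == partitionSchema.getOrElse(Seq.empty)
    }
}
```

A mismatch in any of the checked properties yields None, so the caller falls back to creating a fresh relation.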

Internal Properties

Name Description


Used when HiveMetastoreCatalog is requested to getCached and convertToLogicalRelation
