HiveClientImpl
HiveClientImpl is a HiveClient that uses a Hive metastore client (for metadata and DDL operations that require calls to a Hive metastore).
HiveClientImpl is created exclusively when IsolatedClientLoader is requested to create a new Hive client. When created, HiveClientImpl is given the location of the default database for the Hive metastore warehouse (warehouseDir, i.e. the value of the hive.metastore.warehouse.dir Hive-specific Hadoop configuration property).
Note: The location of the default database for the Hive metastore warehouse is /user/hive/warehouse by default.
Note: The Hadoop configuration is what HiveExternalCatalog was given when created (which is the default Hadoop configuration from Spark Core's SparkContext.hadoopConfiguration with the Spark properties with the spark.hadoop prefix).
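As an illustration of that mechanism, a Hive-specific property can reach the Hadoop configuration through a spark.hadoop-prefixed Spark property (the value below is just an example, not a recommendation):

```
# conf/spark-defaults.conf (illustrative value)
# The spark.hadoop prefix is stripped off, so hive.metastore.warehouse.dir
# ends up in the Hadoop configuration that HiveClientImpl is given.
spark.hadoop.hive.metastore.warehouse.dir  /user/hive/warehouse
```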
Tip: Enable ALL logging level for org.apache.spark.sql.hive.client.HiveClientImpl logger to see what happens inside.

Add the following line to conf/log4j.properties:

log4j.logger.org.apache.spark.sql.hive.client.HiveClientImpl=ALL

Refer to Logging.
Creating HiveClientImpl Instance
HiveClientImpl takes the following to be created:
HiveClientImpl initializes the internal properties.
Hive Metastore Client — client Internal Method
client: Hive
client is a Hive metastore client (for metadata and DDL operations that require calls to the metastore).
Retrieving Table Metadata From Hive Metastore — getTableOption Method
getTableOption(
dbName: String,
tableName: String): Option[CatalogTable]
Note: getTableOption is part of the HiveClient contract.
getTableOption prints out the following DEBUG message to the logs:
Looking up [dbName].[tableName]
getTableOption requests getRawTableOption for the raw Hive table metadata and, if available, converts it to Spark SQL's CatalogTable.
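The lookup flow can be sketched in plain Scala with a hypothetical in-memory stand-in for the Hive metastore client (RawMetastore and the simplified RawTable and CatalogTable below are illustrative types, not Spark's actual classes):

```scala
// Simplified stand-ins (not Spark's actual classes).
case class RawTable(dbName: String, tableName: String, properties: Map[String, String])
case class CatalogTable(identifier: String, properties: Map[String, String])

// Hypothetical in-memory metastore client.
class RawMetastore(tables: Seq[RawTable]) {
  // Mirrors getRawTableOption: the raw Hive metadata, if the table exists.
  def getRawTableOption(dbName: String, tableName: String): Option[RawTable] =
    tables.find(t => t.dbName == dbName && t.tableName == tableName)
}

// Mirrors getTableOption: log the lookup, fetch the raw table, convert it.
def getTableOption(ms: RawMetastore, dbName: String, tableName: String): Option[CatalogTable] = {
  println(s"Looking up $dbName.$tableName")
  ms.getRawTableOption(dbName, tableName)
    .map(raw => CatalogTable(s"${raw.dbName}.${raw.tableName}", raw.properties))
}
```

Factoring the raw lookup out this way is what lets tableExists reuse it and merely check whether the result is defined.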
renamePartitions Method
renamePartitions(
db: String,
table: String,
specs: Seq[TablePartitionSpec],
newSpecs: Seq[TablePartitionSpec]): Unit
Note: renamePartitions is part of the HiveClient contract to…FIXME.
renamePartitions…FIXME
alterPartitions Method
alterPartitions(
db: String,
table: String,
newParts: Seq[CatalogTablePartition]): Unit
Note: alterPartitions is part of the HiveClient contract to…FIXME.
alterPartitions…FIXME
getPartitions Method
getPartitions(
table: CatalogTable,
spec: Option[TablePartitionSpec]): Seq[CatalogTablePartition]
Note: getPartitions is part of the HiveClient contract to…FIXME.
getPartitions…FIXME
getPartitionsByFilter Method
getPartitionsByFilter(
table: CatalogTable,
predicates: Seq[Expression]): Seq[CatalogTablePartition]
Note: getPartitionsByFilter is part of the HiveClient contract to…FIXME.
getPartitionsByFilter…FIXME
getPartitionOption Method
getPartitionOption(
table: CatalogTable,
spec: TablePartitionSpec): Option[CatalogTablePartition]
Note: getPartitionOption is part of the HiveClient contract to…FIXME.
getPartitionOption…FIXME
Creating Table Statistics from Hive’s Table or Partition Parameters — readHiveStats Internal Method
readHiveStats(properties: Map[String, String]): Option[CatalogStatistics]
readHiveStats creates a CatalogStatistics from the input Hive table or partition parameters (if available and greater than 0).
| Hive Parameter | Table Statistics |
|---|---|
| totalSize | sizeInBytes |
| rawDataSize | sizeInBytes |
| numRows | rowCount |
Note: The totalSize Hive parameter takes precedence over rawDataSize for the sizeInBytes table statistic.
Note: readHiveStats is used when HiveClientImpl is requested for the metadata of a table or a table partition.
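The precedence rule can be sketched in plain Scala, assuming the Hive parameter names totalSize, rawDataSize and numRows (CatalogStatistics is simplified here to just the two statistics involved, not Spark's actual class):

```scala
// Simplified stand-in for Spark SQL's CatalogStatistics.
case class CatalogStatistics(sizeInBytes: BigInt, rowCount: Option[BigInt])

def readHiveStats(properties: Map[String, String]): Option[CatalogStatistics] = {
  // A parameter contributes only when present, non-empty and greater than 0.
  def positive(key: String): Option[BigInt] =
    properties.get(key).filter(_.nonEmpty).map(BigInt(_)).filter(_ > 0)

  // totalSize takes precedence over rawDataSize for sizeInBytes;
  // without a positive size, no statistics are created at all.
  positive("totalSize").orElse(positive("rawDataSize")).map { size =>
    CatalogStatistics(sizeInBytes = size, rowCount = positive("numRows"))
  }
}
```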
Retrieving Table Partition Metadata (Converting Table Partition Metadata from Hive Format to Spark SQL Format) — fromHivePartition Method
fromHivePartition(hp: HivePartition): CatalogTablePartition
fromHivePartition simply creates a CatalogTablePartition with the following:
- spec from Hive's Partition.getSpec if available
- storage from the Hive StorageDescriptor of the table partition
- parameters from Hive's Partition.getParameters if available
- stats from Hive's Partition.getParameters if available and converted to table statistics format
Note: fromHivePartition is used when HiveClientImpl is requested for getPartitionOption, getPartitions and getPartitionsByFilter.
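The conversion can be sketched with simplified stand-ins for both sides (HivePartition, CatalogStorage and CatalogTablePartition below are illustrative types, and stats is reduced to a single size value for brevity):

```scala
// Simplified stand-in for Hive's org.apache.hadoop.hive.ql.metadata.Partition.
case class HivePartition(
  spec: Map[String, String],        // Partition.getSpec
  location: String,                 // from the StorageDescriptor
  parameters: Map[String, String])  // Partition.getParameters

// Simplified stand-ins for the Spark SQL side.
case class CatalogStorage(locationUri: Option[String])
case class CatalogTablePartition(
  spec: Map[String, String],
  storage: CatalogStorage,
  parameters: Map[String, String],
  stats: Option[BigInt])            // sizeInBytes only, for brevity

def fromHivePartition(hp: HivePartition): CatalogTablePartition =
  CatalogTablePartition(
    spec = hp.spec,
    storage = CatalogStorage(Option(hp.location)),
    parameters = hp.parameters,
    // Statistics come from the partition parameters, in the spirit of
    // readHiveStats (only totalSize is considered in this sketch).
    stats = hp.parameters.get("totalSize").map(BigInt(_)).filter(_ > 0))
```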
Converting Native Table Metadata to Hive’s Table — toHiveTable Method
toHiveTable(table: CatalogTable, userName: Option[String] = None): HiveTable
toHiveTable simply creates a new Hive Table and copies the properties from the input CatalogTable.
getSparkSQLDataType Internal Utility
getSparkSQLDataType(hc: FieldSchema): DataType
getSparkSQLDataType…FIXME
Note: getSparkSQLDataType is used when…FIXME
Converting CatalogTablePartition to Hive Partition — toHivePartition Utility
toHivePartition(
p: CatalogTablePartition,
ht: Table): Partition
toHivePartition creates a Hive org.apache.hadoop.hive.ql.metadata.Partition for the input CatalogTablePartition and the Hive org.apache.hadoop.hive.ql.metadata.Table.
Creating New HiveClientImpl — newSession Method
newSession(): HiveClientImpl
Note: newSession is part of the HiveClient contract to…FIXME.
newSession…FIXME
getRawTableOption Internal Method
getRawTableOption(
dbName: String,
tableName: String): Option[Table]
getRawTableOption requests the Hive metastore client for the Hive metadata of the input table.
Note: getRawTableOption is used when HiveClientImpl is requested to tableExists and getTableOption.