HiveClientImpl

HiveClientImpl is a HiveClient that uses a Hive metastore client (for metadata/DDL operations using calls to a Hive metastore).
HiveClientImpl is created exclusively when IsolatedClientLoader is requested to create a new Hive client. When created, HiveClientImpl is given the location of the default database for the Hive metastore warehouse (i.e. warehouseDir, which is the value of the hive.metastore.warehouse.dir Hive-specific Hadoop configuration property).
Note: The location of the default database for the Hive metastore warehouse is /user/hive/warehouse by default.
Note: The Hadoop configuration is what HiveExternalCatalog was given when created (which is the default Hadoop configuration from Spark Core’s SparkContext.hadoopConfiguration with the Spark properties with the spark.hadoop prefix).
Tip: Enable ALL logging level for the org.apache.spark.sql.hive.client.HiveClientImpl logger to see what happens inside. Add the following line to conf/log4j.properties:

log4j.logger.org.apache.spark.sql.hive.client.HiveClientImpl=ALL

Refer to Logging.
Creating HiveClientImpl Instance
HiveClientImpl takes the following to be created:

HiveClientImpl initializes the internal properties.
Hive Metastore Client — client Internal Method

client: Hive

client is a Hive metastore client (for metadata/DDL operations using calls to the metastore).
Retrieving Table Metadata From Hive Metastore — getTableOption Method

getTableOption(
  dbName: String,
  tableName: String): Option[CatalogTable]

Note: getTableOption is part of the HiveClient contract.

getTableOption prints out the following DEBUG message to the logs:

Looking up [dbName].[tableName]

getTableOption uses getRawTableOption and converts the Hive table metadata to Spark’s CatalogTable.
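The lookup-then-convert flow can be sketched in plain Scala. This is a hedged sketch, not the Spark sources: RawTable, the simplified CatalogTable, and the in-memory lookup are hypothetical stand-ins for Hive’s Table, Spark’s CatalogTable, and the metastore call.

```scala
// Sketch of getTableOption: look up the raw Hive table metadata and,
// when found, convert it to Spark's table metadata.
// All types here are simplified stand-ins, not the real Spark/Hive classes.
object GetTableOptionSketch {
  final case class RawTable(dbName: String, tableName: String)
  final case class CatalogTable(identifier: String)

  // Stand-in for the Hive metastore lookup (getRawTableOption).
  def getRawTableOption(dbName: String, tableName: String): Option[RawTable] =
    if (tableName == "people") Some(RawTable(dbName, tableName)) else None

  // Stand-in for the Hive-to-Spark metadata conversion.
  def convert(raw: RawTable): CatalogTable =
    CatalogTable(s"${raw.dbName}.${raw.tableName}")

  def getTableOption(dbName: String, tableName: String): Option[CatalogTable] = {
    println(s"Looking up $dbName.$tableName") // the DEBUG message
    getRawTableOption(dbName, tableName).map(convert)
  }
}
```

With the stand-in metastore above, `GetTableOptionSketch.getTableOption("default", "people")` yields `Some(CatalogTable("default.people"))`, while a missing table yields `None`.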
renamePartitions Method

renamePartitions(
  db: String,
  table: String,
  specs: Seq[TablePartitionSpec],
  newSpecs: Seq[TablePartitionSpec]): Unit

Note: renamePartitions is part of the HiveClient contract to…FIXME.

renamePartitions…FIXME
alterPartitions Method

alterPartitions(
  db: String,
  table: String,
  newParts: Seq[CatalogTablePartition]): Unit

Note: alterPartitions is part of the HiveClient contract to…FIXME.

alterPartitions…FIXME
getPartitions Method

getPartitions(
  table: CatalogTable,
  spec: Option[TablePartitionSpec]): Seq[CatalogTablePartition]

Note: getPartitions is part of the HiveClient contract to…FIXME.

getPartitions…FIXME
getPartitionsByFilter Method

getPartitionsByFilter(
  table: CatalogTable,
  predicates: Seq[Expression]): Seq[CatalogTablePartition]

Note: getPartitionsByFilter is part of the HiveClient contract to…FIXME.

getPartitionsByFilter…FIXME
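While the internals are still to be described, the general idea of filtering partitions by predicates can be illustrated with a plain-Scala sketch. Everything here is a hypothetical stand-in: a partition is reduced to its TablePartitionSpec, and the Catalyst Expression predicates are replaced with an ordinary Scala function.

```scala
// Conceptual sketch only: partition pruning by predicate.
// In Spark, predicates are Catalyst Expressions (pushed to the Hive
// metastore where possible); here they are plain Scala functions.
object PartitionFilterSketch {
  type TablePartitionSpec = Map[String, String]

  def getPartitionsByFilter(
      partitions: Seq[TablePartitionSpec],
      predicate: TablePartitionSpec => Boolean): Seq[TablePartitionSpec] =
    partitions.filter(predicate)

  // Example partitions of a table partitioned by (year, month).
  val partitions = Seq(
    Map("year" -> "2019", "month" -> "11"),
    Map("year" -> "2019", "month" -> "12"),
    Map("year" -> "2020", "month" -> "01"))
}
```

For instance, `PartitionFilterSketch.getPartitionsByFilter(PartitionFilterSketch.partitions, _.get("year").contains("2019"))` keeps only the two 2019 partitions.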
getPartitionOption Method

getPartitionOption(
  table: CatalogTable,
  spec: TablePartitionSpec): Option[CatalogTablePartition]

Note: getPartitionOption is part of the HiveClient contract to…FIXME.

getPartitionOption…FIXME
Creating Table Statistics from Hive’s Table or Partition Parameters — readHiveStats Internal Method

readHiveStats(properties: Map[String, String]): Option[CatalogStatistics]

readHiveStats creates a CatalogStatistics from the input Hive table or partition parameters (if available and greater than 0).

Hive Parameter | Table Statistics
---|---
totalSize | sizeInBytes
rawDataSize | sizeInBytes
numRows | rowCount

Note: The totalSize Hive parameter takes precedence over rawDataSize for the sizeInBytes table statistic.

Note: readHiveStats is used when HiveClientImpl is requested for the metadata of a table or a table partition.
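The precedence rule can be sketched in plain Scala. This is a simplified, hedged sketch: CatalogStatistics is a stand-in for Spark’s class, and the sketch returns statistics only when a positive size is available, which glosses over some edge cases.

```scala
// Sketch of readHiveStats: read totalSize (preferred) or rawDataSize
// for sizeInBytes, and numRows for rowCount, ignoring parameters that
// are missing or not greater than 0. CatalogStatistics is a stand-in
// for Spark's org.apache.spark.sql.catalyst.catalog.CatalogStatistics.
object ReadHiveStatsSketch {
  final case class CatalogStatistics(sizeInBytes: BigInt, rowCount: Option[BigInt])

  // A parameter counts only when present and greater than 0.
  private def positive(properties: Map[String, String], key: String): Option[BigInt] =
    properties.get(key).map(BigInt(_)).filter(_ > 0)

  def readHiveStats(properties: Map[String, String]): Option[CatalogStatistics] = {
    // totalSize takes precedence over rawDataSize for sizeInBytes
    val size = positive(properties, "totalSize").orElse(positive(properties, "rawDataSize"))
    val rows = positive(properties, "numRows")
    size.map(CatalogStatistics(_, rows))
  }
}
```

With `Map("totalSize" -> "100", "rawDataSize" -> "999", "numRows" -> "10")`, the sketch picks totalSize (100) for sizeInBytes; with `totalSize = "0"`, it falls back to rawDataSize.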
Retrieving Table Partition Metadata (Converting Table Partition Metadata from Hive Format to Spark SQL Format) — fromHivePartition Method

fromHivePartition(hp: HivePartition): CatalogTablePartition

fromHivePartition simply creates a CatalogTablePartition with the following:

- spec from Hive’s Partition.getSpec if available
- storage from Hive’s StorageDescriptor of the table partition
- parameters from Hive’s Partition.getParameters if available
- stats from Hive’s Partition.getParameters if available and converted to table statistics format

Note: fromHivePartition is used when HiveClientImpl is requested for getPartitionOption, getPartitions and getPartitionsByFilter.
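The "if available" wording reflects that Hive’s getters can return null. The null-safe pattern can be sketched in plain Scala; HivePartitionStub and the simplified CatalogTablePartition below are hypothetical stand-ins for the real Hive and Spark classes.

```scala
// Sketch of the null-safe extraction in fromHivePartition: each Hive
// getter result is wrapped in Option before building the Spark-side
// partition metadata. Types are simplified stand-ins.
object FromHivePartitionSketch {
  // Stand-in for Hive's Partition, whose getters may return null.
  final case class HivePartitionStub(
      getSpec: Map[String, String],
      getParameters: Map[String, String])

  final case class CatalogTablePartition(
      spec: Map[String, String],
      parameters: Map[String, String])

  def fromHivePartition(hp: HivePartitionStub): CatalogTablePartition =
    CatalogTablePartition(
      spec = Option(hp.getSpec).getOrElse(Map.empty),             // "if available"
      parameters = Option(hp.getParameters).getOrElse(Map.empty)) // "if available"
}
```

A partition whose getSpec returns null still converts cleanly, ending up with an empty spec instead of a NullPointerException.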
Converting Native Table Metadata to Hive’s Table — toHiveTable Method

toHiveTable(table: CatalogTable, userName: Option[String] = None): HiveTable

toHiveTable simply creates a new Hive Table and copies the properties from the input CatalogTable.
getSparkSQLDataType Internal Utility

getSparkSQLDataType(hc: FieldSchema): DataType

getSparkSQLDataType…FIXME

Note: getSparkSQLDataType is used when…FIXME
Converting CatalogTablePartition to Hive Partition — toHivePartition Utility

toHivePartition(
  p: CatalogTablePartition,
  ht: Table): Partition

toHivePartition creates a Hive org.apache.hadoop.hive.ql.metadata.Partition for the input CatalogTablePartition and the Hive org.apache.hadoop.hive.ql.metadata.Table.
Creating New HiveClientImpl — newSession Method

newSession(): HiveClientImpl

Note: newSession is part of the HiveClient contract to…FIXME.

newSession…FIXME
getRawTableOption Internal Method

getRawTableOption(
  dbName: String,
  tableName: String): Option[Table]

getRawTableOption requests the Hive metastore client for the Hive metadata of the input table.

Note: getRawTableOption is used when HiveClientImpl is requested to tableExists and getTableOption.
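How tableExists can be expressed on top of getRawTableOption can be sketched in plain Scala: a table exists exactly when the raw lookup returns Some. The in-memory set of (db, table) pairs and the RawTable type are hypothetical stand-ins for the metastore and Hive’s Table.

```scala
// Sketch only: tableExists as a thin wrapper over getRawTableOption.
// The metastore is faked with an in-memory set of (db, table) pairs.
object RawTableLookupSketch {
  final case class RawTable(db: String, name: String)

  private val metastore = Set(("default", "people"))

  def getRawTableOption(dbName: String, tableName: String): Option[RawTable] =
    if (metastore((dbName, tableName))) Some(RawTable(dbName, tableName)) else None

  // A table exists when the raw metastore lookup finds it.
  def tableExists(dbName: String, tableName: String): Boolean =
    getRawTableOption(dbName, tableName).nonEmpty
}
```

Expressing tableExists through the same lookup keeps a single code path to the metastore client, which is presumably why both callers funnel through getRawTableOption.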