RelationConversions Logical PostHoc Evaluation Rule — Converting Hive Tables

A Hive table is a table whose provider (in the table metadata) is hive.
FIXME Show example of a hive table, e.g. spark.table(…​)
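The definition above can be sketched in plain Scala without a Spark session. TableMeta is a hypothetical stand-in for the table metadata, not a Spark class, and the case-insensitive comparison is an assumption of this sketch.

```scala
import java.util.Locale

// Hypothetical stand-in for the table metadata; only the provider is modeled.
final case class TableMeta(name: String, provider: Option[String])

// A table counts as a Hive table when its provider is defined and equals "hive".
// The case-insensitive comparison is an assumption of this sketch.
def isHiveTable(table: TableMeta): Boolean =
  table.provider.exists(_.toLowerCase(Locale.ROOT) == "hive")

val hiveTable = TableMeta("t1", Some("hive"))
val parquetTable = TableMeta("t2", Some("parquet"))
val noProvider = TableMeta("t3", None)
```

Only hiveTable satisfies the check; a table with another provider, or with no provider at all, does not.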

RelationConversions is created exclusively when the Hive-specific logical query plan analyzer is created.

Executing Rule — apply Method

apply(plan: LogicalPlan): LogicalPlan
apply is part of the Rule Contract to execute (apply) a rule on a TreeNode (e.g. LogicalPlan).

apply traverses the input logical plan looking for an InsertIntoTable logical operator with a HiveTableRelation, or a HiveTableRelation logical operator alone.

For an InsertIntoTable with a non-partitioned HiveTableRelation (that can be converted), apply converts the HiveTableRelation to a LogicalRelation.

For a HiveTableRelation logical operator alone apply…​FIXME

Creating RelationConversions Instance

RelationConversions takes the following when created:

Does Table Use Parquet or ORC SerDe? — isConvertible Internal Method

isConvertible(relation: HiveTableRelation): Boolean

isConvertible is true when the input HiveTableRelation is a parquet or ORC table (and the corresponding SQL configuration property is enabled).

Internally, isConvertible takes the Hive SerDe of the table (from table metadata) if available or assumes no SerDe.

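isConvertible is true when either condition holds:

  • The Hive SerDe is parquet and spark.sql.hive.convertMetastoreParquet configuration property is enabled

  • The Hive SerDe is orc and spark.sql.hive.convertMetastoreOrc configuration property is enabled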

isConvertible is used when RelationConversions is executed.
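A minimal sketch of the check described in this section, in plain Scala. Storage, Relation and Conf are hypothetical, simplified stand-ins (not Spark classes); the two flags model the conversion properties for parquet and ORC, and matching the SerDe by substring is an assumption of this sketch.

```scala
import java.util.Locale

// Hypothetical, simplified stand-ins (not the real Spark classes).
final case class Storage(serde: Option[String])
final case class Relation(storage: Storage)
// Flags modeling the "convert parquet" and "convert ORC" configuration properties.
final case class Conf(convertParquet: Boolean, convertOrc: Boolean)

def isConvertible(relation: Relation, conf: Conf): Boolean = {
  // Take the SerDe from the table metadata, or assume no SerDe (empty string).
  val serde = relation.storage.serde.getOrElse("").toLowerCase(Locale.ROOT)
  // Substring matching is an assumption of this sketch.
  (serde.contains("parquet") && conf.convertParquet) ||
    (serde.contains("orc") && conf.convertOrc)
}

val parquet = Relation(Storage(Some("org.apache.hadoop.hive.ql.io.parquet.serde.ParquetHiveSerDe")))
val orc = Relation(Storage(Some("org.apache.hadoop.hive.ql.io.orc.OrcSerde")))
val text = Relation(Storage(Some("org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe")))
val noSerde = Relation(Storage(None))
```

With both flags enabled, parquet and orc are convertible while text and noSerde are not; disabling a flag turns off conversion for the matching format only.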

Converting HiveTableRelation to LogicalRelation — convert Internal Method

convert(relation: HiveTableRelation): LogicalRelation

convert takes the SerDe of (the storage of) the input HiveTableRelation and converts the HiveTableRelation to a LogicalRelation as follows:

  1. For parquet serde, convert adds the mergeSchema option (set to the value of the spark.sql.hive.convertMetastoreParquet.mergeSchema configuration property, disabled by default) and requests HiveMetastoreCatalog to convertToLogicalRelation (with ParquetFileFormat as fileFormatClass).

  2. For non-parquet serde, convert assumes ORC format.

    • When the spark.sql.orc.impl configuration property is native (default), convert requests HiveMetastoreCatalog to convertToLogicalRelation (with org.apache.spark.sql.execution.datasources.orc.OrcFileFormat as fileFormatClass).

    • Otherwise, convert requests HiveMetastoreCatalog to convertToLogicalRelation (with org.apache.spark.sql.hive.orc.OrcFileFormat as fileFormatClass).

convert uses HiveSessionCatalog to access the HiveMetastoreCatalog.
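The format selection described above can be sketched as a small decision function. This is an illustration only: Conversion and chooseFormat are hypothetical names, the function returns the file format class name instead of calling convertToLogicalRelation, and the two configuration values are passed in as plain parameters.

```scala
import java.util.Locale

// Hypothetical result type: the file format class name plus data source options.
final case class Conversion(fileFormatClass: String, options: Map[String, String])

// mergeSchema models spark.sql.hive.convertMetastoreParquet.mergeSchema,
// orcImpl models spark.sql.orc.impl.
def chooseFormat(serde: String, mergeSchema: Boolean, orcImpl: String): Conversion =
  if (serde.toLowerCase(Locale.ROOT).contains("parquet")) {
    // Parquet: pass the mergeSchema option along.
    Conversion(
      "org.apache.spark.sql.execution.datasources.parquet.ParquetFileFormat",
      Map("mergeSchema" -> mergeSchema.toString))
  } else if (orcImpl == "native") {
    // Non-parquet is assumed to be ORC; native implementation.
    Conversion("org.apache.spark.sql.execution.datasources.orc.OrcFileFormat", Map.empty)
  } else {
    // Non-parquet is assumed to be ORC; Hive-based implementation.
    Conversion("org.apache.spark.sql.hive.orc.OrcFileFormat", Map.empty)
  }

val parquetSerde = "org.apache.hadoop.hive.ql.io.parquet.serde.ParquetHiveSerDe"
val orcSerde = "org.apache.hadoop.hive.ql.io.orc.OrcSerde"
```

Note that the sketch, like the description, never checks for ORC explicitly: any SerDe that is not parquet takes the ORC branch, which is safe only because isConvertible has already filtered the relation.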

convert is used when RelationConversions logical evaluation rule does the following transformations:

  • Transforms an InsertIntoTable with a HiveTableRelation for a Hive table (i.e. with hive provider) that is not partitioned and uses parquet or orc data storage format

  • Transforms a HiveTableRelation for a Hive table (i.e. with hive provider) that uses parquet or orc data storage format
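The two transformations above can be modeled as a pattern match over a toy plan tree. The node classes below only borrow the names of the Spark operators; they are hypothetical, heavily simplified models (the convertible flag stands in for the isConvertible check), and recursing into the query of an insert is a simplification of this sketch.

```scala
// Hypothetical, heavily simplified plan nodes (not Spark's logical operators).
sealed trait Plan
final case class HiveTableRelation(
    table: String,
    provider: String,
    partitioned: Boolean,
    convertible: Boolean) extends Plan
final case class LogicalRelation(table: String) extends Plan
final case class InsertIntoTable(target: Plan, query: Plan) extends Plan
final case class Project(child: Plan) extends Plan

def isHive(r: HiveTableRelation): Boolean = r.provider == "hive"

// Stand-in for the convert method: only the table name is carried over.
def convert(r: HiveTableRelation): LogicalRelation = LogicalRelation(r.table)

def applyRule(plan: Plan): Plan = plan match {
  // Write path: only non-partitioned, convertible Hive tables are converted.
  case InsertIntoTable(r: HiveTableRelation, query) =>
    val target = if (isHive(r) && !r.partitioned && r.convertible) convert(r) else r
    InsertIntoTable(target, applyRule(query))
  // Read path: partitioning does not matter here.
  case r: HiveTableRelation if isHive(r) && r.convertible => convert(r)
  // Everything else: keep the node and visit its children.
  case InsertIntoTable(target, query) => InsertIntoTable(target, applyRule(query))
  case Project(child) => Project(applyRule(child))
  case other => other
}

val read = Project(HiveTableRelation("t", "hive", partitioned = true, convertible = true))
val write = InsertIntoTable(
  HiveTableRelation("t", "hive", partitioned = false, convertible = true),
  LogicalRelation("src"))
val partitionedWrite = InsertIntoTable(
  HiveTableRelation("p", "hive", partitioned = true, convertible = true),
  LogicalRelation("src"))
```

In this model, applyRule(read) and applyRule(write) replace the HiveTableRelation with a LogicalRelation, while applyRule(partitionedWrite) leaves the plan unchanged because the insert target is partitioned.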
