RelationConversions PostHoc Logical Evaluation Rule

RelationConversions is a posthoc logical resolution rule that the Hive-specific logical analyzer uses to convert HiveTableRelations with Parquet and ORC storage formats to LogicalRelations (over a HadoopFsRelation).

FIXME Show example of a hive table, e.g. spark.table(…​)

RelationConversions is created when the Hive-specific logical analyzer is created.

Creating RelationConversions Instance

RelationConversions takes the following when created:

  • SQLConf

  • HiveSessionCatalog

Executing Rule — apply Method

  apply(plan: LogicalPlan): LogicalPlan

apply is part of the Rule contract to execute (apply) a rule on a LogicalPlan.

apply traverses the input logical plan looking for InsertIntoTables (over a HiveTableRelation) or HiveTableRelation logical operators:

  • For a HiveTableRelation logical operator alone apply…​FIXME

Does Table Use Parquet or ORC SerDe? — isConvertible Internal Method

  isConvertible(relation: HiveTableRelation): Boolean

isConvertible returns true when the input HiveTableRelation is a parquet or ORC table and the corresponding SQL configuration property is enabled.

Internally, isConvertible takes the Hive SerDe of the table (from table metadata) if available or assumes no SerDe.

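isConvertible returns true when either of the following holds:

  • The SerDe is a parquet SerDe and spark.sql.hive.convertMetastoreParquet configuration property is enabled

  • The SerDe is an ORC SerDe and spark.sql.hive.convertMetastoreOrc configuration property is enabled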

isConvertible is used when RelationConversions is executed.
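The check described above can be modelled in a few lines of plain Scala. This is a minimal sketch and not Spark's own code: the IsConvertibleSketch object and the ConversionFlags class are hypothetical stand-ins for the rule and for the two SQL properties, and matching the SerDe by a case-insensitive substring is an assumption of the sketch.

```scala
import java.util.Locale

object IsConvertibleSketch {
  // Hypothetical stand-in for the two conversion-related SQL properties.
  final case class ConversionFlags(convertParquet: Boolean, convertOrc: Boolean)

  // serde is the SerDe class name from the table metadata, if there is one.
  def isConvertible(serde: Option[String], flags: ConversionFlags): Boolean = {
    // A missing SerDe is treated as an empty name, so it never matches.
    val name = serde.getOrElse("").toLowerCase(Locale.ROOT)
    // Assumption of this sketch: the format is detected by substring match.
    (name.contains("parquet") && flags.convertParquet) ||
      (name.contains("orc") && flags.convertOrc)
  }
}
```

With this model, a parquet SerDe is convertible only while the parquet flag is on, and a table with no SerDe is never convertible regardless of the flags.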

Converting HiveTableRelation to HadoopFsRelation — convert Internal Method

  convert(relation: HiveTableRelation): LogicalRelation

convert branches based on the SerDe of (the storage format of) the input HiveTableRelation logical operator.

For Hive tables in parquet format, convert creates options with an extra mergeSchema option (set to the value of the spark.sql.hive.convertMetastoreParquet.mergeSchema configuration property) and requests the HiveMetastoreCatalog to convert the HiveTableRelation to a LogicalRelation (with ParquetFileFormat).

For non-parquet Hive tables, convert assumes ORC format.

convert uses the HiveSessionCatalog to access the HiveMetastoreCatalog.
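The branching that convert performs on the SerDe can be sketched in plain Scala as follows. This is a simplified model under stated assumptions, not Spark's implementation: ConvertSketch, Conversion and plan are hypothetical names, the table's storage properties are assumed to be the base of the options, and the call to the HiveMetastoreCatalog is replaced by returning the chosen format name.

```scala
import java.util.Locale

object ConvertSketch {
  // Hypothetical result type: the target format name and the options to use.
  final case class Conversion(format: String, options: Map[String, String])

  def plan(
      serde: Option[String],
      storageProperties: Map[String, String],
      mergeSchema: Boolean): Conversion = {
    val name = serde.getOrElse("").toLowerCase(Locale.ROOT)
    if (name.contains("parquet")) {
      // Parquet branch: one extra mergeSchema option on top of the base options.
      Conversion("parquet", storageProperties + ("mergeSchema" -> mergeSchema.toString))
    } else {
      // Any other SerDe is assumed to be ORC; the options are passed as they are.
      Conversion("orc", storageProperties)
    }
  }
}
```

The sketch makes the asymmetry visible: only the parquet branch adds an option, and the ORC branch is a fallback rather than an explicit check.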

convert is used when the RelationConversions logical evaluation rule is executed to do the following transformations:

  • Transforms an InsertIntoTable over a HiveTableRelation with a Hive table (i.e. with hive provider) that is not partitioned and uses parquet or orc data storage format

  • Transforms a HiveTableRelation with a Hive table (i.e. with hive provider) that uses parquet or orc data storage format
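The two transformations can be illustrated with a toy plan tree and a pattern match. This is a sketch only: the case classes below are simplified, hypothetical stand-ins that merely borrow the names of the Spark operators, Values is an invented leaf operator, and the SQL property checks are left out to keep the example short.

```scala
object RelationConversionsSketch {
  sealed trait Plan
  final case class HiveTableRelation(
      table: String, provider: String, format: String, partitioned: Boolean) extends Plan
  final case class LogicalRelation(table: String, format: String) extends Plan
  final case class InsertIntoTable(target: Plan, query: Plan) extends Plan
  final case class Values(rows: Int) extends Plan // hypothetical leaf operator

  // Hive table (hive provider) stored as parquet or orc.
  private def convertible(r: HiveTableRelation): Boolean =
    r.provider == "hive" && (r.format == "parquet" || r.format == "orc")

  private def convert(r: HiveTableRelation): LogicalRelation =
    LogicalRelation(r.table, r.format)

  def apply(plan: Plan): Plan = plan match {
    // Write path: only non-partitioned targets are converted.
    case InsertIntoTable(r: HiveTableRelation, query) if !r.partitioned && convertible(r) =>
      InsertIntoTable(convert(r), apply(query))
    // Other inserts keep their target and only the query is visited.
    case InsertIntoTable(target, query) =>
      InsertIntoTable(target, apply(query))
    // Read path: a convertible relation on its own.
    case r: HiveTableRelation if convertible(r) => convert(r)
    case other => other
  }
}
```

In this model an insert into a partitioned parquet table keeps its HiveTableRelation target, while the same table read in the query part is still converted.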
