RelationConversions PostHoc Logical Resolution Rule
RelationConversions is a posthoc logical resolution rule that the Hive-specific logical analyzer uses to convert HiveTableRelations (of Hive tables in parquet or ORC storage formats) to LogicalRelations.
Caution: FIXME Show example of a hive table, e.g. spark.table(…)
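For instance, a spark-shell sketch (the table name demo_parquet is hypothetical; a Hive table in parquet format with that name, Hive support, and default configuration are assumed):

// Minimal sketch: demo_parquet is a hypothetical Hive table stored as parquet,
// with Hive support enabled and spark.sql.hive.convertMetastoreParquet left at its default.
val q = spark.table("demo_parquet")
// The analyzed plan should show a LogicalRelation (parquet) where a HiveTableRelation
// would otherwise appear, because RelationConversions has converted it.
q.explain(extended = true)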
RelationConversions is created when the Hive-specific logical analyzer is created.
Executing Rule — apply Method
apply(
plan: LogicalPlan): LogicalPlan
Note: apply is part of the Rule contract to execute (apply) a rule on a LogicalPlan.
apply traverses the input logical plan looking for InsertIntoTable (over a HiveTableRelation) or HiveTableRelation logical operators (a rough sketch of the match follows the list):
- For an InsertIntoTable over a HiveTableRelation that is non-partitioned and convertible, apply creates a new InsertIntoTable with the HiveTableRelation converted to a LogicalRelation.
- For a HiveTableRelation logical operator alone, apply…FIXME
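The following Scala sketch only approximates that pattern match; the exact guards and operator fields differ across Spark versions, so treat it as an illustration rather than the actual source:

// Rough sketch only -- guards and field lists are approximate, not the actual Spark source.
plan resolveOperators {
  // Write path: InsertIntoTable over a non-partitioned, convertible HiveTableRelation
  case i @ InsertIntoTable(r: HiveTableRelation, _, query, _, _)
      if query.resolved && r.partitionCols.isEmpty && isConvertible(r) =>
    i.copy(table = convert(r))
  // Read path: a convertible HiveTableRelation on its own
  case r: HiveTableRelation if isConvertible(r) =>
    convert(r)
}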
Does Table Use Parquet or ORC SerDe? — isConvertible Internal Method
isConvertible(
relation: HiveTableRelation): Boolean
isConvertible is positive when the input HiveTableRelation is a parquet or ORC table (and the corresponding configuration properties are enabled).
Internally, isConvertible takes the Hive SerDe of the table (from table metadata) if available or assumes no SerDe.
isConvertible is turned on when either condition holds (a way to inspect both properties is sketched after the list):
- The Hive SerDe is parquet (aka parquet table) and the spark.sql.hive.convertMetastoreParquet configuration property is enabled (it is by default)
- The Hive SerDe is orc (aka orc table) and the spark.sql.hive.convertMetastoreOrc internal configuration property is enabled (it is by default)
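A quick spark-shell sketch for inspecting (and toggling) the two properties; the second argument to get is just a fallback default:

// Inspect the properties isConvertible checks
spark.conf.get("spark.sql.hive.convertMetastoreParquet", "true")
spark.conf.get("spark.sql.hive.convertMetastoreOrc", "true")
// Turning a property off keeps the HiveTableRelation as-is (no conversion) for that format
spark.conf.set("spark.sql.hive.convertMetastoreParquet", "false")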
Note: isConvertible is used when RelationConversions is executed.
Converting HiveTableRelation to HadoopFsRelation — convert Internal Method
convert(
relation: HiveTableRelation): LogicalRelation
convert branches based on the SerDe of (the storage format of) the input HiveTableRelation logical operator.
For Hive tables in parquet format, convert creates options with one extra mergeSchema option (per the spark.sql.hive.convertMetastoreParquet.mergeSchema configuration property) and requests the HiveMetastoreCatalog to convert the HiveTableRelation to a LogicalRelation (with ParquetFileFormat).
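A sketch of what that options map amounts to (the val names are just for illustration; the property default is assumed to be false):

// Sketch: the single parquet option convert passes along
val mergeSchema = spark.conf.get("spark.sql.hive.convertMetastoreParquet.mergeSchema", "false")
val parquetOptions = Map("mergeSchema" -> mergeSchema)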
For non-parquet Hive tables, convert assumes ORC format (see the sketch after the list):
- When the spark.sql.orc.impl configuration property is native (default), convert requests the HiveMetastoreCatalog to convert the HiveTableRelation to a LogicalRelation over a HadoopFsRelation (with org.apache.spark.sql.execution.datasources.orc.OrcFileFormat as fileFormatClass).
- Otherwise, convert requests the HiveMetastoreCatalog to convert the HiveTableRelation to a LogicalRelation over a HadoopFsRelation (with org.apache.spark.sql.hive.orc.OrcFileFormat as fileFormatClass).
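A sketch of how that choice could be expressed (class names as in the list above; the val names are just for this sketch):

// Sketch: which ORC FileFormat class backs the HadoopFsRelation, per spark.sql.orc.impl
val orcImpl = spark.conf.get("spark.sql.orc.impl", "native")
val fileFormatClass =
  if (orcImpl == "native") classOf[org.apache.spark.sql.execution.datasources.orc.OrcFileFormat]
  else classOf[org.apache.spark.sql.hive.orc.OrcFileFormat]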
Note: convert uses the HiveSessionCatalog to access the HiveMetastoreCatalog.