RelationConversions PostHoc Logical Evaluation Rule
RelationConversions is a posthoc logical resolution rule that the Hive-specific logical analyzer uses to convert HiveTableRelations with Parquet and ORC storage formats.
CAUTION: FIXME Show example of a hive table, e.g. spark.table(…)
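As a hedged sketch of what the FIXME above asks for (the table name demo_parquet and the session setup are assumptions, not from the original text), the conversion can be observed on a Hive parquet table with spark.table and an extended explain:

// Requires a SparkSession with Hive support (enableHiveSupport); the table name is hypothetical.
spark.sql("CREATE TABLE demo_parquet (id INT) STORED AS PARQUET")
val q = spark.table("demo_parquet")
// With spark.sql.hive.convertMetastoreParquet enabled (the default), the optimized plan
// should show a LogicalRelation over parquet rather than a HiveTableRelation.
q.explain(extended = true)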
RelationConversions is created when the Hive-specific logical analyzer is created.
Executing Rule — apply Method

apply(plan: LogicalPlan): LogicalPlan
NOTE: apply is part of the Rule contract to execute (apply) a rule on a LogicalPlan.
apply traverses the input logical plan looking for InsertIntoTable (over a HiveTableRelation) or HiveTableRelation logical operators (a simplified sketch follows the list):
- For an InsertIntoTable over a HiveTableRelation that is non-partitioned and is convertible, apply creates a new InsertIntoTable with the HiveTableRelation converted to a LogicalRelation.
- For a HiveTableRelation logical operator alone, apply …FIXME
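A minimal, hypothetical sketch of the first case above (operator constructor arity and field names vary across Spark versions, so treat this as an illustration rather than the actual source; isConvertible and convert are the methods described later on this page):

import org.apache.spark.sql.catalyst.catalog.HiveTableRelation
import org.apache.spark.sql.catalyst.plans.logical.{InsertIntoTable, LogicalPlan}

// Sketch only: replace the Hive relation under InsertIntoTable when it is
// non-partitioned and convertible, as described in the list above.
def applySketch(plan: LogicalPlan): LogicalPlan = plan resolveOperators {
  case insert @ InsertIntoTable(relation: HiveTableRelation, _, _, _, _)
      if !relation.isPartitioned && isConvertible(relation) =>
    // Swap in the LogicalRelation produced by convert
    insert.copy(table = convert(relation))
}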
Does Table Use Parquet or ORC SerDe? — isConvertible Internal Method
isConvertible(relation: HiveTableRelation): Boolean
isConvertible is positive when the input HiveTableRelation is a parquet or ORC table (and the corresponding SQL properties are enabled).
Internally, isConvertible takes the Hive SerDe of the table (from the table metadata) if available or assumes no SerDe.
isConvertible is turned on when either condition holds (see the sketch after this list):

- The Hive SerDe is parquet (aka a parquet table) and the spark.sql.hive.convertMetastoreParquet configuration property is enabled (which it is by default)

- The Hive SerDe is orc (aka an orc table) and the spark.sql.hive.convertMetastoreOrc internal configuration property is enabled (which it is by default)
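A hedged sketch of that check, using the configuration property names quoted above (SQLConf.get is used here only for illustration; the defaults passed to getConfString are assumptions):

import org.apache.spark.sql.catalyst.catalog.HiveTableRelation
import org.apache.spark.sql.internal.SQLConf

// Simplified check: look at the table's Hive SerDe (if any) and the two properties above.
def isConvertibleSketch(relation: HiveTableRelation): Boolean = {
  val serde = relation.tableMeta.storage.serde.getOrElse("").toLowerCase
  val parquetEnabled =
    SQLConf.get.getConfString("spark.sql.hive.convertMetastoreParquet", "true").toBoolean
  val orcEnabled =
    SQLConf.get.getConfString("spark.sql.hive.convertMetastoreOrc", "true").toBoolean
  (serde.contains("parquet") && parquetEnabled) || (serde.contains("orc") && orcEnabled)
}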
NOTE: isConvertible is used when RelationConversions is executed.
Converting HiveTableRelation to HadoopFsRelation — convert Internal Method
convert(relation: HiveTableRelation): LogicalRelation
convert branches based on the SerDe of (the storage format of) the input HiveTableRelation logical operator.
For Hive tables in parquet format, convert creates options with one extra entry, mergeSchema, based on the spark.sql.hive.convertMetastoreParquet.mergeSchema configuration property, and requests the HiveMetastoreCatalog to convert the HiveTableRelation to a LogicalRelation (with ParquetFileFormat).
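For illustration, the extra option could look as follows (a sketch only; the default value shown for the mergeSchema property is an assumption, and the call into HiveMetastoreCatalog is omitted):

import org.apache.spark.sql.internal.SQLConf

// One extra option keyed by "mergeSchema", driven by the configuration property above.
val parquetOptions = Map(
  "mergeSchema" ->
    SQLConf.get.getConfString("spark.sql.hive.convertMetastoreParquet.mergeSchema", "false"))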
For non-parquet Hive tables, convert assumes the ORC format (see the sketch after this list):

- When the spark.sql.orc.impl configuration property is native (the default), convert requests the HiveMetastoreCatalog to convert the HiveTableRelation to a LogicalRelation over a HadoopFsRelation (with org.apache.spark.sql.execution.datasources.orc.OrcFileFormat as the fileFormatClass).

- Otherwise, convert requests the HiveMetastoreCatalog to convert the HiveTableRelation to a LogicalRelation over a HadoopFsRelation (with org.apache.spark.sql.hive.orc.OrcFileFormat as the fileFormatClass).
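A hedged sketch of choosing the fileFormatClass per the spark.sql.orc.impl property (class names as quoted above; the rest of the conversion call is omitted, and the default value passed to getConfString is an assumption):

import org.apache.spark.sql.internal.SQLConf

// "native" selects the built-in ORC file format; anything else falls back to the Hive one.
val orcFileFormatClass =
  if (SQLConf.get.getConfString("spark.sql.orc.impl", "native") == "native") {
    classOf[org.apache.spark.sql.execution.datasources.orc.OrcFileFormat]
  } else {
    classOf[org.apache.spark.sql.hive.orc.OrcFileFormat]
  }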
NOTE: convert uses the HiveSessionCatalog to access the HiveMetastoreCatalog.