Configuration Properties

This page contains the configuration properties of the Hive data source.

Table 1. Hive-Specific Spark SQL Configuration Properties
Configuration Property


spark.sql.hive.convertMetastoreOrc

Controls whether to use the built-in ORC reader and writer for Hive tables with the ORC storage format (instead of Hive SerDe).

Default: true


spark.sql.hive.convertMetastoreParquet

Controls whether to use the built-in Parquet reader and writer for Hive tables with the Parquet storage format (instead of Hive SerDe).

Default: true

Internally, this property enables the RelationConversions logical rule to convert HiveTableRelations to HadoopFsRelations.


spark.sql.hive.convertMetastoreParquet.mergeSchema

Enables merging of possibly different but compatible Parquet schemas found in different Parquet data files.

Default: false

This configuration is only effective when spark.sql.hive.convertMetastoreParquet is enabled.
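As a rough illustration of how the three conversion properties above relate, the following sketch keeps them in a plain dictionary, renders them as `spark-submit --conf` arguments, and checks whether schema merging would actually take effect. The helper names `to_submit_args` and `merge_schema_effective` are hypothetical, and the key `spark.sql.hive.convertMetastoreParquet.mergeSchema` is the name this page's schema-merging property carries in Spark; the sketch does not call Spark itself.

```python
# Hive conversion settings discussed above, with the defaults listed on this page.
conversion_conf = {
    "spark.sql.hive.convertMetastoreOrc": "true",
    "spark.sql.hive.convertMetastoreParquet": "true",
    # Only takes effect while convertMetastoreParquet is enabled.
    "spark.sql.hive.convertMetastoreParquet.mergeSchema": "false",
}


def to_submit_args(conf):
    """Render a dict of properties as spark-submit --conf arguments (hypothetical helper)."""
    args = []
    for key, value in conf.items():
        args.extend(["--conf", f"{key}={value}"])
    return args


def merge_schema_effective(conf):
    """Schema merging applies only when Parquet conversion is also enabled."""
    convert = conf.get("spark.sql.hive.convertMetastoreParquet", "true")
    merge = conf.get("spark.sql.hive.convertMetastoreParquet.mergeSchema", "false")
    return convert == "true" and merge == "true"
```

The same keys can also be set on a running session; the dictionary form is used here only to keep the example independent of a Spark installation.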


spark.sql.hive.manageFilesourcePartitions

Enables metastore partition management for file source tables (filesource partition management). This includes both datasource and converted Hive tables.

Default: true

When enabled (true), datasource tables store partition metadata in the Hive metastore, and use the metastore to prune partitions during query planning.

Use the SQLConf.manageFilesourcePartitions method to access the current value.


spark.sql.hive.metastore.barrierPrefixes

Comma-separated list of class prefixes that should be explicitly reloaded for each version of Hive that Spark SQL communicates with, e.g. Hive UDFs that are declared in a prefix that would typically be shared (i.e. org.apache.spark.*).

Default: (empty)


spark.sql.hive.metastore.jars

Location of the jars that should be used to create a HiveClientImpl.

Default: builtin

Supported locations:

  • builtin - the jars that were used to load Spark SQL (aka Spark classes). Valid only when spark.sql.hive.metastore.version is the execution (built-in) version of Hive

  • maven - download the Hive jars from Maven repositories

  • Classpath in the standard format for both Hive and Hadoop


spark.sql.hive.metastore.sharedPrefixes

Comma-separated list of class prefixes that should be loaded using the classloader that is shared between Spark SQL and a specific version of Hive.

Default: "com.mysql.jdbc", "org.postgresql", "com.microsoft.sqlserver", "oracle.jdbc"

Examples of classes that should be shared:

  • JDBC drivers that are needed to talk to the metastore

  • Other classes that interact with classes that are already shared, e.g. custom appenders that are used by log4j
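The interplay between shared prefixes and the "reload for each Hive version" (barrier) prefixes can be modelled with simple prefix matching. The sketch below is a simplified model, not Spark's classloader: the function `classify`, the three result labels, and the sample class names are hypothetical, and checking barrier prefixes before shared prefixes is an assumption made here so that a barrier prefix can carve an exception out of a broader shared prefix.

```python
def classify(class_name, shared_prefixes, barrier_prefixes):
    """Decide how a class would be loaded in this simplified model.

    Empty prefixes are skipped, because every string starts with "" and
    an empty entry would otherwise match all classes.
    """
    # Assumption: barrier prefixes win over shared prefixes.
    if any(p and class_name.startswith(p) for p in barrier_prefixes):
        return "barrier"   # reloaded for each Hive version
    if any(p and class_name.startswith(p) for p in shared_prefixes):
        return "shared"    # loaded by the classloader shared with Spark SQL
    return "isolated"      # loaded from the jars of the specific Hive version


# Sample prefixes: JDBC drivers used to talk to the metastore.
shared = ("com.mysql.jdbc", "org.postgresql", "oracle.jdbc")
```

For example, `classify("com.mysql.jdbc.Driver", shared, ())` yields `"shared"`, while a class name that matches neither list falls through to `"isolated"`.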


spark.sql.hive.metastore.version

Version of the Hive metastore (and the client classes and jars).

Default: 1.2.1
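The jars location and the metastore version have to agree: the builtin location is only valid for the built-in Hive version. A minimal sketch of that consistency check follows, assuming the built-in version equals the default listed above (1.2.1). The function `check_metastore_jars` is a hypothetical helper, not part of Spark, and `spark.sql.hive.metastore.jars` is the name the jars-location property carries in Spark.

```python
BUILTIN_HIVE_VERSION = "1.2.1"  # assumption: the default version listed on this page


def check_metastore_jars(jars, version, builtin_version=BUILTIN_HIVE_VERSION):
    """Return the kind of jars location, or raise if the combination is invalid.

    jars    -- value of spark.sql.hive.metastore.jars
    version -- value of spark.sql.hive.metastore.version
    """
    if not jars:
        raise ValueError("a jars location is required")
    if jars == "builtin":
        # builtin jars only work with the Hive version Spark itself was built with.
        if version != builtin_version:
            raise ValueError(
                f"builtin jars require metastore version {builtin_version}, got {version}"
            )
        return "builtin"
    if jars == "maven":
        return "maven"      # Hive jars are downloaded from Maven repositories
    return "classpath"      # anything else is treated as a classpath string
```

A call such as `check_metastore_jars("builtin", "2.3.3")` raises `ValueError`, which mirrors the restriction described for the builtin location.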


spark.sql.hive.verifyPartitionPath

When enabled (true), checks all the partition paths under the table's root directory when reading data stored in HDFS. This configuration will be deprecated in future releases and replaced by spark.files.ignoreMissingFiles.

Default: false


spark.sql.hive.metastorePartitionPruning

When enabled (true), some predicates are pushed down into the Hive metastore so that non-matching partitions can be eliminated earlier.

Default: true

This only affects Hive tables that are not converted to filesource relations (based on spark.sql.hive.convertMetastoreParquet and spark.sql.hive.convertMetastoreOrc properties).

Use the SQLConf.metastorePartitionPruning method to access the current value.





