PrunedInMemoryFileIndex

PrunedInMemoryFileIndex is an InMemoryFileIndex for a partitioned table at an HDFS location.

PrunedInMemoryFileIndex can optionally be given the time spent listing the partition metadata.

PrunedInMemoryFileIndex is created when CatalogFileIndex is requested to filter the partitions of a partitioned table.
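The following is a minimal sketch for spark-shell of how that code path could be exercised. It assumes a Spark 2.x session (`spark`) and a hypothetical table name, `pruned_index_demo`, created only for this illustration. The `sizeInBytes` value is a placeholder. In Spark versions that no longer ship PrunedInMemoryFileIndex, the printed class name may differ.

```scala
// Paste into spark-shell (Spark 2.x assumed); `spark` is the shell's SparkSession.
import org.apache.spark.sql.catalyst.TableIdentifier
import org.apache.spark.sql.execution.datasources.CatalogFileIndex

// Hypothetical table name, used only for this illustration.
val tableName = "pruned_index_demo"

// Create a small table partitioned by column p.
spark.range(10)
  .selectExpr("id", "id % 2 AS p")
  .write
  .partitionBy("p")
  .mode("overwrite")
  .saveAsTable(tableName)

// Look up the table metadata in the session catalog.
val table = spark.sessionState.catalog.getTableMetadata(TableIdentifier(tableName))

// sizeInBytes is a placeholder here; it is not a real size estimate.
val catalogIndex = new CatalogFileIndex(spark, table, sizeInBytes = 0L)

// No filters, so every partition of the table is selected.
val index = catalogIndex.filterPartitions(Seq.empty)

// Print the concrete class of the returned file index and its partition paths.
println(index.getClass.getName)
index.partitionSpec().partitions.foreach(p => println(p.path))
```

Passing partition filter expressions instead of `Seq.empty` should narrow the partitions that end up in the returned index.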

Tip

Enable the ALL logging level for the org.apache.spark.sql.execution.datasources.PrunedInMemoryFileIndex logger to see what happens inside.

Add the following line to conf/log4j.properties:

log4j.logger.org.apache.spark.sql.execution.datasources.PrunedInMemoryFileIndex=ALL

Refer to Logging.

Creating PrunedInMemoryFileIndex Instance

PrunedInMemoryFileIndex takes the following to be created:

  • SparkSession

  • Location of the Hive metastore table (as a Hadoop Path)

  • FileStatusCache

  • PartitionSpec (from a Hive metastore)

  • Optional time of the partition metadata listing
