listFiles(
partitionFilters: Seq[Expression],
dataFilters: Seq[Expression]): Seq[PartitionDirectory]
CatalogFileIndex
CatalogFileIndex is a FileIndex that is created when:
- HiveMetastoreCatalog is requested to convert a HiveTableRelation to a LogicalRelation
- DataSource is requested to create a BaseRelation for a FileFormat
Creating CatalogFileIndex Instance
CatalogFileIndex takes the following to be created:
CatalogFileIndex initializes the internal properties.
Partition Files — listFiles Method
Note: listFiles is part of the FileIndex contract.
listFiles lists the partitions for the input partition filters and then requests them for the underlying partition files.
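The two steps above (prune partitions by the partition filters, then collect the files of the surviving partitions) can be illustrated with a minimal, dependency-free sketch. It is not Spark's implementation: CatalogPartition, PartitionDir and PartitionFilter are hypothetical stand-ins for the catalog partitions, PartitionDirectory and Catalyst Expression, and data filters are left out entirely.

```scala
// A simplified model of listFiles; none of these types are Spark classes.
object ListFilesSketch {
  // Hypothetical stand-in for a catalog partition: its partition values and its files.
  final case class CatalogPartition(spec: Map[String, String], files: Seq[String])

  // Hypothetical stand-in for PartitionDirectory.
  final case class PartitionDir(values: Map[String, String], files: Seq[String])

  // A partition filter is modelled as a plain predicate over partition values.
  type PartitionFilter = Map[String, String] => Boolean

  // Step 1: keep only the partitions that satisfy every partition filter.
  def filterPartitions(
      all: Seq[CatalogPartition],
      filters: Seq[PartitionFilter]): Seq[CatalogPartition] =
    all.filter(p => filters.forall(f => f(p.spec)))

  // Step 2: request the surviving partitions for their files.
  def listFiles(
      all: Seq[CatalogPartition],
      filters: Seq[PartitionFilter]): Seq[PartitionDir] =
    filterPartitions(all, filters).map(p => PartitionDir(p.spec, p.files))

  def main(args: Array[String]): Unit = {
    val partitions = Seq(
      CatalogPartition(Map("year" -> "2023"), Seq("/t/year=2023/part-0.parquet")),
      CatalogPartition(Map("year" -> "2024"), Seq("/t/year=2024/part-0.parquet"))
    )
    val only2024: PartitionFilter = _.get("year").contains("2024")
    listFiles(partitions, Seq(only2024)).foreach(println)
  }
}
```

With an empty sequence of filters every partition survives, which is the same shape of call that a full listing of all partitions would make.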
inputFiles Method
inputFiles: Array[String]
Note: inputFiles is part of the FileIndex contract.
inputFiles lists all the partitions and then requests them for the input files.
rootPaths Method
rootPaths: Seq[Path]
Note: rootPaths is part of the FileIndex contract.
rootPaths simply returns the baseLocation converted to a Hadoop Path.
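As a rough illustration of that conversion, the sketch below assumes the base location is optional (a table's storage may not define a locationUri), so the result holds either no path or exactly one. The Path case class here is a hypothetical stand-in for the Hadoop Path, used only to keep the example free of dependencies.

```scala
import java.net.URI

// A simplified model of rootPaths; Path below is NOT org.apache.hadoop.fs.Path.
object RootPathsSketch {
  // Hypothetical stand-in for a Hadoop Path wrapping a URI.
  final case class Path(uri: URI)

  // An absent base location yields no root paths; a present one yields exactly one.
  def rootPaths(baseLocation: Option[URI]): Seq[Path] =
    baseLocation.map(uri => Path(uri)).toSeq

  def main(args: Array[String]): Unit = {
    println(rootPaths(Some(new URI("hdfs://namenode/warehouse/t"))))
    println(rootPaths(None))
  }
}
```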
Listing Partitions By Given Predicate Expressions — filterPartitions Method
filterPartitions(
filters: Seq[Expression]): InMemoryFileIndex
filterPartitions requests the CatalogTable for the partition columns.
For a partitioned table, filterPartitions starts tracking time. filterPartitions requests the SessionCatalog for the partitions by filter and creates a PrunedInMemoryFileIndex (with the partition listing time).
For an unpartitioned table (no partition columns defined), filterPartitions simply returns an InMemoryFileIndex (with the rootPaths and no user-specified schema).
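The branching described above can be sketched as follows. This is a simplified model under stated assumptions, not Spark's code: Table, Pruned and Unpruned are hypothetical types standing in for CatalogTable, PrunedInMemoryFileIndex and InMemoryFileIndex, the filter is a plain predicate on a single partition value, and the in-memory map lookup stands in for the request to the SessionCatalog.

```scala
// A simplified model of filterPartitions; none of these types are Spark classes.
object FilterPartitionsSketch {
  sealed trait FileIndexResult
  // Hypothetical stand-in for an InMemoryFileIndex over the table root paths.
  final case class Unpruned(rootPaths: Seq[String]) extends FileIndexResult
  // Hypothetical stand-in for a PrunedInMemoryFileIndex with the listing time.
  final case class Pruned(partitionPaths: Seq[String], listingTimeNs: Long)
      extends FileIndexResult

  // Hypothetical table metadata: partitions maps a partition value to its location.
  final case class Table(
      partitionColumns: Seq[String],
      rootPaths: Seq[String],
      partitions: Map[String, String])

  def filterPartitions(table: Table, keep: String => Boolean): FileIndexResult =
    if (table.partitionColumns.nonEmpty) {
      // Partitioned table: time the partition lookup and keep matching locations.
      val start = System.nanoTime()
      val selected = table.partitions.collect {
        case (value, location) if keep(value) => location
      }.toSeq
      Pruned(selected, System.nanoTime() - start)
    } else {
      // Unpartitioned table: nothing to prune, fall back to the root paths.
      Unpruned(table.rootPaths)
    }

  def main(args: Array[String]): Unit = {
    val partitioned = Table(
      Seq("year"),
      Seq("/t"),
      Map("2023" -> "/t/year=2023", "2024" -> "/t/year=2024"))
    println(filterPartitions(partitioned, _ == "2024"))
    println(filterPartitions(Table(Nil, Seq("/t"), Map.empty), _ => true))
  }
}
```

Carrying the elapsed time in the pruned result mirrors the "partition listing time" mentioned above, so a caller could report how long the catalog lookup took.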
Internal Properties
| Name | Description |
|---|---|
| baseLocation | Base location (as a Java URI) as defined in the CatalogTable metadata (under the locationUri of the storage). Used when |
| hadoopConf | Hadoop Configuration. Used when |