listFiles(
  partitionFilters: Seq[Expression],
  dataFilters: Seq[Expression]): Seq[PartitionDirectory]
CatalogFileIndex is created when:
- HiveMetastoreCatalog is requested to convert a HiveTableRelation to a LogicalRelation
- DataSource is requested to create a BaseRelation for a FileFormat
Creating CatalogFileIndex Instance
CatalogFileIndex takes the following to be created:
CatalogFileIndex initializes the internal properties.
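As a rough illustration of the creation step, the sketch below builds a CatalogFileIndex by hand. It assumes a Spark 2.4 or 3.x classpath, where the internal constructor accepts a SparkSession, a CatalogTable and a size in bytes; this is an internal API and may differ in other versions. The table name demo_partitioned and the partition column p are hypothetical, as is the choice to fall back to the session default size when the table has no statistics.

```scala
import org.apache.spark.sql.SparkSession
import org.apache.spark.sql.catalyst.TableIdentifier
import org.apache.spark.sql.execution.datasources.CatalogFileIndex

// Local session for experimenting; in spark-shell `spark` already exists
val spark = SparkSession.builder()
  .master("local[*]")
  .appName("CatalogFileIndexDemo")
  .getOrCreate()

// Hypothetical demo table partitioned by column `p` (values 0 and 1)
spark.range(4)
  .selectExpr("id", "id % 2 AS p")
  .write
  .mode("overwrite")
  .partitionBy("p")
  .saveAsTable("demo_partitioned")

// Table metadata as tracked by the session catalog (the identifier is qualified)
val table = spark.sessionState.catalog
  .getTableMetadata(TableIdentifier("demo_partitioned"))

// Assumption: use the session default size when the table has no statistics
val sizeInBytes = table.stats
  .map(_.sizeInBytes.toLong)
  .getOrElse(spark.sessionState.conf.defaultSizeInBytes)

val fileIndex = new CatalogFileIndex(spark, table, sizeInBytes)
println(fileIndex.table.partitionColumnNames)
```

The table metadata is read from the session catalog rather than built by hand, so the table identifier already carries its database name.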
Partition Files: listFiles Method
Note: listFiles is part of the FileIndex contract.
listFiles lists the partitions for the input partition filters and then requests them for the underlying partition files.
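The behaviour described above can be tried out with a minimal sketch like the following. It assumes Spark 2.4 or 3.x internals and a hypothetical table demo_partitioned partitioned by a bigint column p. The partition filter is built as an already-resolved Catalyst expression, since this sketch bypasses the analyzer that would normally resolve column references.

```scala
import org.apache.spark.sql.SparkSession
import org.apache.spark.sql.catalyst.TableIdentifier
import org.apache.spark.sql.catalyst.expressions.{AttributeReference, EqualTo, Literal}
import org.apache.spark.sql.execution.datasources.CatalogFileIndex
import org.apache.spark.sql.types.LongType

val spark = SparkSession.builder()
  .master("local[*]")
  .appName("ListFilesDemo")
  .getOrCreate()

// Hypothetical demo table partitioned by column `p` (values 0 and 1)
spark.range(4)
  .selectExpr("id", "id % 2 AS p")
  .write.mode("overwrite").partitionBy("p")
  .saveAsTable("demo_partitioned")

val table = spark.sessionState.catalog
  .getTableMetadata(TableIdentifier("demo_partitioned"))
val fileIndex =
  new CatalogFileIndex(spark, table, spark.sessionState.conf.defaultSizeInBytes)

// Resolved predicate on the partition column: p = 1
val p = AttributeReference("p", LongType)()
val partitionFilters = Seq(EqualTo(p, Literal(1L)))

// One PartitionDirectory per matching partition; no data filters in this sketch
val dirs = fileIndex.listFiles(partitionFilters, Nil)
dirs.foreach { d =>
  println(s"partition values: ${d.values}, files: ${d.files.size}")
}
```

Only the partition for p = 1 should be reported, while the files under p = 0 are never listed.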
inputFiles Method
inputFiles: Array[String]
Note: inputFiles is part of the FileIndex contract.
inputFiles lists all the partitions and then requests them for the input files.
rootPaths Method
rootPaths: Seq[Path]
Note: rootPaths is part of the FileIndex contract.
rootPaths simply returns the baseLocation converted to a Hadoop Path.
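To compare inputFiles and rootPaths side by side, a small sketch follows. It assumes Spark 2.4 or 3.x internals and a hypothetical table demo_partitioned partitioned by a column p; the exact file names printed depend on the environment.

```scala
import org.apache.spark.sql.SparkSession
import org.apache.spark.sql.catalyst.TableIdentifier
import org.apache.spark.sql.execution.datasources.CatalogFileIndex

val spark = SparkSession.builder()
  .master("local[*]")
  .appName("InputFilesDemo")
  .getOrCreate()

// Hypothetical demo table partitioned by column `p` (values 0 and 1)
spark.range(4)
  .selectExpr("id", "id % 2 AS p")
  .write.mode("overwrite").partitionBy("p")
  .saveAsTable("demo_partitioned")

val table = spark.sessionState.catalog
  .getTableMetadata(TableIdentifier("demo_partitioned"))
val fileIndex =
  new CatalogFileIndex(spark, table, spark.sessionState.conf.defaultSizeInBytes)

// The table's base location as a Hadoop Path
val roots = fileIndex.rootPaths
// Data files of every partition, as path strings
val files = fileIndex.inputFiles

roots.foreach(r => println(s"root: $r"))
files.foreach(f => println(s"file: $f"))
```

rootPaths does not touch the file system, whereas inputFiles has to list every partition first, so it is the more expensive call on tables with many partitions.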
Listing Partitions By Given Predicate Expressions: filterPartitions Method
filterPartitions(
  filters: Seq[Expression]): InMemoryFileIndex
filterPartitions requests the CatalogTable for the partition columns.
For a partitioned table, filterPartitions starts tracking time. filterPartitions requests the SessionCatalog for the partitions by filter and creates a PrunedInMemoryFileIndex (with the partition listing time).
For an unpartitioned table (no partition columns defined), filterPartitions simply returns an InMemoryFileIndex (with the rootPaths and no user-specified schema).
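The two branches described above can be exercised with a sketch along these lines. It assumes Spark 2.4 or 3.x internals; the tables demo_partitioned and demo_flat, the column p and the helper indexFor are all hypothetical names introduced for illustration. Newer Spark 3.x releases may return a plain InMemoryFileIndex in the partitioned case as well, which is why the sketch only relies on the declared return type.

```scala
import org.apache.spark.sql.SparkSession
import org.apache.spark.sql.catalyst.TableIdentifier
import org.apache.spark.sql.catalyst.expressions.{AttributeReference, EqualTo, Literal}
import org.apache.spark.sql.execution.datasources.CatalogFileIndex
import org.apache.spark.sql.types.LongType

val spark = SparkSession.builder()
  .master("local[*]")
  .appName("FilterPartitionsDemo")
  .getOrCreate()

// Hypothetical demo tables: one partitioned by `p`, one without partitions
val data = spark.range(4).selectExpr("id", "id % 2 AS p")
data.write.mode("overwrite").partitionBy("p").saveAsTable("demo_partitioned")
data.write.mode("overwrite").saveAsTable("demo_flat")

// Hypothetical helper: builds a CatalogFileIndex for a table in the current database
def indexFor(name: String): CatalogFileIndex = {
  val table = spark.sessionState.catalog.getTableMetadata(TableIdentifier(name))
  new CatalogFileIndex(spark, table, spark.sessionState.conf.defaultSizeInBytes)
}

// Partitioned table: only the partitions matching p = 1 are listed
val p = AttributeReference("p", LongType)()
val pruned = indexFor("demo_partitioned")
  .filterPartitions(Seq(EqualTo(p, Literal(1L))))

// Unpartitioned table: there is nothing to prune, so the root paths are listed
val flat = indexFor("demo_flat").filterPartitions(Nil)

println(s"pruned files: ${pruned.inputFiles.length}")
println(s"flat files: ${flat.inputFiles.length}")
```

Passing an empty filter list to a partitioned table selects all of its partitions, which matches how inputFiles is described earlier on this page.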
Internal Properties
Name | Description |
---|---|
baseLocation | Base location (as a Java URI) as defined in the CatalogTable metadata (under the locationUri of the storage). Used when |
| Hadoop Configuration. Used when |