leafDirToChildrenFiles: Map[Path, Array[FileStatus]]
PartitioningAwareFileIndex
PartitioningAwareFileIndex is an extension of the FileIndex contract for indices that are aware of partitioned tables.
Method | Description |
---|---|
 | Used when |
 | Used when |
 | Partition specification (partition columns, their directories as Hadoop Paths and partition values). Used when |
PartitioningAwareFileIndex | Description |
---|---|
|
Creating PartitioningAwareFileIndex Instance
PartitioningAwareFileIndex takes the following to be created:
- Optional user-defined schema
PartitioningAwareFileIndex initializes the internal properties.
Note: PartitioningAwareFileIndex is an abstract class and cannot be created directly. It is created indirectly for the concrete PartitioningAwareFileIndices.
listFiles Method
listFiles(
  partitionFilters: Seq[Expression],
  dataFilters: Seq[Expression]): Seq[PartitionDirectory]
Note: listFiles is part of the FileIndex contract.
listFiles …FIXME
partitionSchema Method
partitionSchema: StructType
Note: partitionSchema is part of the FileIndex contract.
partitionSchema returns the partition columns (as a StructType) of the partition specification.
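To make the relationship between the partition specification and partitionSchema concrete, here is a minimal sketch in plain Scala. It does not use Spark's classes: `Column`, `PartitionDir`, `SimplePartitionSpec` and the `partitionSchema` helper are hypothetical stand-ins introduced only for illustration, and the example paths and values are made up.

```scala
// Simplified, hypothetical model (not Spark's actual classes).
object PartitionSpecSketch {
  // Stand-in for a field of a StructType: a column name and a type name.
  final case class Column(name: String, dataType: String)

  // One partition directory together with its partition values.
  final case class PartitionDir(path: String, values: Seq[Any])

  // Stand-in for a partition specification: partition columns,
  // their directories and partition values.
  final case class SimplePartitionSpec(
      partitionColumns: Seq[Column],
      partitions: Seq[PartitionDir])

  // Mirrors the idea described above: the partition schema is
  // the partition columns of the partition specification.
  def partitionSchema(spec: SimplePartitionSpec): Seq[Column] =
    spec.partitionColumns

  def main(args: Array[String]): Unit = {
    val spec = SimplePartitionSpec(
      partitionColumns = Seq(Column("year", "int"), Column("month", "int")),
      partitions = Seq(
        PartitionDir("/data/year=2020/month=1", Seq(2020, 1)),
        PartitionDir("/data/year=2020/month=2", Seq(2020, 2))))
    // Print the names of the partition columns.
    println(partitionSchema(spec).map(_.name).mkString(", "))
  }
}
```

The sketch keeps the directories and values next to the columns only to show that the schema ignores them. How Spark builds the real partition specification is not covered here.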
inputFiles Method
inputFiles: Array[String]
Note: inputFiles is part of the FileIndex contract.
inputFiles returns the locations of all the files.
sizeInBytes Method
sizeInBytes: Long
Note: sizeInBytes is part of the FileIndex contract.
sizeInBytes sums up the lengths (in bytes) of all the files.
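The descriptions of inputFiles and sizeInBytes can be sketched in plain Scala as two small functions over a list of files. This is an illustration under stated assumptions, not Spark's implementation: `FileEntry` is a hypothetical stand-in for Hadoop's FileStatus that carries only a path and a length, and the `allFiles` parameter is assumed to hold every file known to the index.

```scala
// Simplified, hypothetical model (not Spark's or Hadoop's actual classes).
object FileIndexSketch {
  // Stand-in for a file status: location and length in bytes.
  final case class FileEntry(path: String, length: Long)

  // Locations of all the files.
  def inputFiles(allFiles: Seq[FileEntry]): Array[String] =
    allFiles.map(_.path).toArray

  // Sum of the lengths (in bytes) of all the files.
  def sizeInBytes(allFiles: Seq[FileEntry]): Long =
    allFiles.map(_.length).sum

  def main(args: Array[String]): Unit = {
    val files = Seq(
      FileEntry("/data/year=2020/part-0.parquet", 1024L),
      FileEntry("/data/year=2021/part-0.parquet", 2048L))
    inputFiles(files).foreach(println)
    println(sizeInBytes(files))
  }
}
```

Both functions take the same list of files, which reflects how the two methods are described above as working over all the files.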
allFiles Method
allFiles(): Seq[FileStatus]
allFiles …FIXME
inferPartitioning Method
inferPartitioning(): PartitionSpec
inferPartitioning …FIXME
Note: inferPartitioning is used when InMemoryFileIndex and Spark Structured Streaming’s MetadataLogFileIndex are requested for the partitionSpec.
basePaths Internal Method
basePaths: Set[Path]
basePaths …FIXME
Note: basePaths is used when PartitioningAwareFileIndex is requested to inferPartitioning.
Internal Properties
Name | Description |
---|---|
 | Hadoop Configuration |