HadoopTableReader
HadoopTableReader is a TableReader that creates a HadoopRDD for scanning partitioned or unpartitioned tables stored in Hadoop.
HadoopTableReader is used by the HiveTableScanExec physical operator when requested to execute.
Creating HadoopTableReader Instance
HadoopTableReader takes the following to be created:
- Hive TableDesc
- Hadoop Configuration
HadoopTableReader initializes the internal properties.
makeRDDForTable Method
makeRDDForTable(
  hiveTable: HiveTable): RDD[InternalRow]
Note: makeRDDForTable is part of the TableReader contract to…FIXME.
makeRDDForTable simply calls the private makeRDDForTable with…FIXME
makeRDDForPartitionedTable Method
makeRDDForPartitionedTable(
  partitions: Seq[HivePartition]): RDD[InternalRow]
Note: makeRDDForPartitionedTable is part of the TableReader contract to…FIXME.
makeRDDForPartitionedTable simply calls the private makeRDDForPartitionedTable with…FIXME
Creating HadoopRDD: createHadoopRdd Internal Method
createHadoopRdd(
  tableDesc: TableDesc,
  path: String,
  inputFormatClass: Class[InputFormat[Writable, Writable]]): RDD[Writable]
createHadoopRdd applies initializeLocalJobConfFunc to the input path and tableDesc, which gives a function that initializes a JobConf.
createHadoopRdd creates a HadoopRDD (with the broadcast Hadoop Configuration, the input inputFormatClass, and the minimum number of partitions) and maps over its key-value records to keep only the values.
Note: createHadoopRdd adds a HadoopRDD and a MapPartitionsRDD to an RDD lineage.
Note: createHadoopRdd is used when HadoopTableReader is requested to makeRDDForTable and makeRDDForPartitionedTable.
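The "create a HadoopRDD, then keep only the values" pattern can be approximated with Spark's public API. The following is a minimal sketch, not the internal createHadoopRdd: it assumes a plain text file read with TextInputFormat rather than a Hive table, and the names CreateHadoopRddSketch and valuesOf are hypothetical.

```scala
import org.apache.hadoop.io.{LongWritable, Text}
import org.apache.hadoop.mapred.{FileInputFormat, JobConf, TextInputFormat}
import org.apache.spark.rdd.RDD
import org.apache.spark.sql.SparkSession

object CreateHadoopRddSketch {

  // Hypothetical helper (not Spark's internal createHadoopRdd):
  // builds an RDD of (key, value) records with SparkContext.hadoopRDD
  // and keeps the values only.
  def valuesOf(spark: SparkSession, path: String, minPartitions: Int): RDD[String] = {
    val sc = spark.sparkContext

    // Configure the input path on a JobConf derived from the Hadoop Configuration.
    val jobConf = new JobConf(sc.hadoopConfiguration)
    FileInputFormat.setInputPaths(jobConf, path)

    val records = sc.hadoopRDD(
      jobConf,
      classOf[TextInputFormat],
      classOf[LongWritable],
      classOf[Text],
      minPartitions)

    // Hadoop record readers reuse Writable instances,
    // so copy every value out before passing it on.
    records.map { case (_, value) => value.toString }
  }

  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .master("local[*]")
      .appName("createHadoopRdd-sketch")
      .getOrCreate()
    try {
      val rdd = valuesOf(spark, args(0), minPartitions = 1)
      // The lineage should show a MapPartitionsRDD on top of a HadoopRDD.
      println(rdd.toDebugString)
    } finally {
      spark.stop()
    }
  }
}
```

Running main with a path to a text file prints the lineage of the resulting RDD, which makes the two RDDs mentioned in the note above visible.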
initializeLocalJobConfFunc Utility
initializeLocalJobConfFunc(
  path: String,
  tableDesc: TableDesc)(
  jobConf: JobConf): Unit
initializeLocalJobConfFunc…FIXME
Note: initializeLocalJobConfFunc is used when HadoopTableReader is requested to create a HadoopRDD.
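The signature has two parameter lists, so a caller can supply path and tableDesc first and get back a function that only needs a JobConf. The following is a minimal sketch of that shape alone. FakeJobConf, FakeTableDesc, initialize and the property keys are hypothetical stand-ins, and the body does not reflect what initializeLocalJobConfFunc actually does.

```scala
import scala.collection.mutable

object CurriedInitializerSketch {

  // Hypothetical stand-ins for Hadoop's JobConf and Hive's TableDesc.
  final class FakeJobConf {
    val props: mutable.Map[String, String] = mutable.Map.empty
  }
  final case class FakeTableDesc(tableName: String)

  // Same shape as initializeLocalJobConfFunc: (path, tableDesc)(jobConf).
  // The body is illustrative only and the property keys are made up.
  def initialize(path: String, tableDesc: FakeTableDesc)(jobConf: FakeJobConf): Unit = {
    jobConf.props("sketch.input.path") = path
    jobConf.props("sketch.table.name") = tableDesc.tableName
  }

  def main(args: Array[String]): Unit = {
    // Supplying only the first parameter list yields a FakeJobConf => Unit function.
    val init: FakeJobConf => Unit = initialize("/warehouse/t1", FakeTableDesc("t1"))

    // The function can be applied later, wherever a configuration object is available.
    val conf = new FakeJobConf
    init(conf)
    println(conf.props("sketch.input.path")) // prints /warehouse/t1
  }
}
```

Deferring the last parameter list this way lets the initializer be created once and run later against whatever configuration object exists at that point.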
Internal Properties
Name | Description |
---|---|
 | Hadoop Configuration broadcast to executors |
 | Minimum number of partitions for a HadoopRDD: |