HiveTableScanExec Leaf Physical Operator

HiveTableScanExec is a leaf physical operator that represents a HiveTableRelation logical operator at execution time.

HiveTableScanExec is created exclusively when the HiveTableScans execution planning strategy plans a HiveTableRelation logical operator, i.e. when the strategy is executed on a logical query plan that contains a HiveTableRelation logical operator.
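
To see the planning step in practice, the following is a minimal sketch rather than a definitive recipe. It assumes a local Spark build with the spark-hive module on the classpath. The table name demo_hive_text and the helper hiveScans are hypothetical. The operator is matched by class name on the assumption that HiveTableScanExec may not be accessible from user code outside Spark's own packages. The table is stored as TEXTFILE so that it stays a Hive SerDe table instead of being converted to a file-based data source relation.

```scala
import org.apache.spark.sql.SparkSession
import org.apache.spark.sql.execution.SparkPlan

object FindHiveTableScan {
  // Hypothetical helper: collects every HiveTableScanExec node in a physical plan.
  // Matching is done by class name so the sketch does not depend on the
  // operator class being visible to user code.
  def hiveScans(plan: SparkPlan): Seq[SparkPlan] =
    plan.collect { case p if p.getClass.getSimpleName == "HiveTableScanExec" => p }

  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .master("local[*]")
      .appName("find-hive-table-scan")
      .enableHiveSupport() // assumption: spark-hive is available
      .getOrCreate()

    // demo_hive_text is a made-up table used only for this illustration
    spark.sql(
      "CREATE TABLE IF NOT EXISTS demo_hive_text (id INT, name STRING) STORED AS TEXTFILE")

    // sparkPlan is the physical plan produced by the planning strategies
    val plan = spark.table("demo_hive_text").queryExecution.sparkPlan
    hiveScans(plan).foreach(scan => println(scan.nodeName))

    spark.stop()
  }
}
```

If the query reads a Hive table that Spark converts to a built-in data source (for example, some Parquet or ORC tables, depending on configuration), the collected sequence may be empty because a different scan operator is planned.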

Table 1. HiveTableScanExec’s Performance Metrics
Key | Name (in web UI) | Description
 | number of output rows | 
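
The metrics of a physical operator can be inspected programmatically after a query has run. The sketch below is illustrative only: it assumes a local Spark build with Hive support, and the table demo_hive_metrics and the helper operatorMetrics are hypothetical. It prints every metric key and value of every operator in the executed plan, so no particular key name is assumed.

```scala
import org.apache.spark.sql.SparkSession
import org.apache.spark.sql.execution.SparkPlan

object HiveScanMetrics {
  // Hypothetical helper: (operator name, metric key, metric value) for each operator
  def operatorMetrics(plan: SparkPlan): Seq[(String, String, Long)] =
    plan.flatMap { op =>
      op.metrics.toSeq.map { case (key, metric) => (op.nodeName, key, metric.value) }
    }

  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .master("local[*]")
      .appName("hive-scan-metrics")
      .enableHiveSupport() // assumption: spark-hive is available
      .getOrCreate()

    // demo_hive_metrics is a made-up table used only for this illustration
    spark.sql(
      "CREATE TABLE IF NOT EXISTS demo_hive_metrics (id INT, name STRING) STORED AS TEXTFILE")
    spark.sql("INSERT INTO demo_hive_metrics VALUES (1, 'a'), (2, 'b')")

    val query = spark.table("demo_hive_metrics")
    query.collect() // run the query so that the metrics are populated

    // Read the metrics from the same plan instance that was executed
    operatorMetrics(query.queryExecution.executedPlan).foreach {
      case (op, key, value) => println(s"$op: $key = $value")
    }

    spark.stop()
  }
}
```

The same values appear in the SQL tab of the web UI under the metric's display name.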

Table 2. HiveTableScanExec’s Internal Properties (e.g. Registries, Counters and Flags)
Name | Description
 | Hive’s Table metadata (converted from the CatalogTable of the HiveTableRelation). Used when HiveTableScanExec is requested for the tableDesc and rawPartitions, and when it is executed.
 | Hive’s TableDesc

Creating HiveTableScanExec Instance

HiveTableScanExec takes the following when created:

HiveTableScanExec initializes the internal registries and counters.

Executing Physical Operator (Generating RDD[InternalRow]) — doExecute Method

doExecute(): RDD[InternalRow]
doExecute is part of the SparkPlan Contract to generate the runtime representation of a structured query as a distributed computation over internal binary rows on Apache Spark (i.e. RDD[InternalRow]).
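
doExecute is not called directly by user code. A rough way to observe its result is through SparkPlan's public execute method, which delegates to doExecute. The sketch below assumes a local Spark build with Hive support. The table demo_hive_exec and the helper scanRdd are hypothetical.

```scala
import org.apache.spark.rdd.RDD
import org.apache.spark.sql.SparkSession
import org.apache.spark.sql.catalyst.InternalRow

object ExecuteHiveScan {
  // Hypothetical helper: returns the RDD[InternalRow] behind a table query.
  // execute() is the public entry point of SparkPlan that calls doExecute.
  def scanRdd(spark: SparkSession, table: String): RDD[InternalRow] =
    spark.table(table).queryExecution.executedPlan.execute()

  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .master("local[*]")
      .appName("execute-hive-scan")
      .enableHiveSupport() // assumption: spark-hive is available
      .getOrCreate()

    // demo_hive_exec is a made-up table used only for this illustration
    spark.sql(
      "CREATE TABLE IF NOT EXISTS demo_hive_exec (id INT, name STRING) STORED AS TEXTFILE")
    spark.sql("INSERT INTO demo_hive_exec VALUES (1, 'a'), (2, 'b')")

    val rdd = scanRdd(spark, "demo_hive_exec")
    println(s"partitions: ${rdd.getNumPartitions}")
    println(s"rows: ${rdd.count()}") // count() triggers the distributed computation

    spark.stop()
  }
}
```

The rows in this RDD use Spark's internal binary format, so they are generally consumed through the Dataset API rather than read directly.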

