HiveTableScanExec Leaf Physical Operator

HiveTableScanExec is a leaf physical operator that represents a HiveTableRelation logical operator at execution time.

HiveTableScanExec is created exclusively when the HiveTableScans execution planning strategy plans a HiveTableRelation logical operator (i.e. when the strategy is executed on a logical query plan that contains a HiveTableRelation logical operator).
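
One way to observe the planning step described above is to query a Hive table and look for the operator in the executed physical plan. The following is a minimal sketch, not a definitive recipe. It assumes a Spark build with Hive support (spark-hive on the classpath) and a local embedded metastore; the table name demo_hive_t is hypothetical. The table is stored as TEXTFILE on the assumption that Spark may otherwise convert Parquet or ORC Hive tables to its built-in data source scans, in which case HiveTableScanExec would not appear. The operator is matched by class name so the snippet does not depend on the visibility of the class.

```scala
import org.apache.spark.sql.SparkSession

// Assumption: spark-sql and spark-hive are on the classpath.
val spark = SparkSession.builder()
  .master("local[*]")
  .appName("HiveTableScanExecPlanDemo")
  .enableHiveSupport()
  .getOrCreate()

// demo_hive_t is a hypothetical table name used only for illustration.
spark.sql("CREATE TABLE IF NOT EXISTS demo_hive_t (id INT) STORED AS TEXTFILE")

val q = spark.table("demo_hive_t")

// Collect physical operators whose class is HiveTableScanExec.
val scans = q.queryExecution.executedPlan.collect {
  case op if op.getClass.getSimpleName == "HiveTableScanExec" => op
}

// Print the fully-qualified class name of every match.
scans.foreach(op => println(op.getClass.getName))
```

Calling q.explain(true) prints the logical and physical plans side by side, which makes it easier to compare the HiveTableRelation in the logical plan with the scan operator in the physical plan.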

Table 1. HiveTableScanExec’s Performance Metrics

Key              Name (in web UI)         Description
numOutputRows    number of output rows

Table 2. HiveTableScanExec’s Internal Properties (e.g. Registries, Counters and Flags)

Name             Description
hiveQlTable      Hive’s Table metadata (converted from the CatalogTable of the HiveTableRelation). Used when HiveTableScanExec is requested for the tableDesc and rawPartitions, and when it is executed.
rawPartitions
tableDesc        Hive’s TableDesc

Creating HiveTableScanExec Instance

HiveTableScanExec takes the following when created:

HiveTableScanExec initializes the internal registries and counters.

Executing Physical Operator (Generating RDD[InternalRow]) — doExecute Method

doExecute(): RDD[InternalRow]
Note: doExecute is part of the SparkPlan Contract to generate the runtime representation of a structured query as a distributed computation over internal binary rows on Apache Spark (i.e. RDD[InternalRow]).

doExecute…​FIXME
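
The body of doExecute is not described here, but its effect can be observed from the outside. The following is a minimal sketch under the same assumptions as any Hive-enabled session (spark-hive on the classpath, a local embedded metastore); the table name demo_hive_rows is hypothetical. It calls the public execute method of the physical plan, which is the entry point that delegates to doExecute, runs an action on the resulting RDD[InternalRow], and then reads the numOutputRows metric of the leaf operators.

```scala
import org.apache.spark.rdd.RDD
import org.apache.spark.sql.SparkSession
import org.apache.spark.sql.catalyst.InternalRow

// Assumption: spark-sql and spark-hive are on the classpath.
val spark = SparkSession.builder()
  .master("local[*]")
  .appName("HiveTableScanExecExecuteDemo")
  .enableHiveSupport()
  .getOrCreate()

// demo_hive_rows is a hypothetical table name used only for illustration.
spark.sql("CREATE TABLE IF NOT EXISTS demo_hive_rows (id INT) STORED AS TEXTFILE")
spark.sql("INSERT INTO demo_hive_rows VALUES (1), (2), (3)")

val plan = spark.table("demo_hive_rows").queryExecution.executedPlan

// execute() is public on SparkPlan and delegates to doExecute().
val rows: RDD[InternalRow] = plan.execute()

// Run an action so that the scan is actually performed.
val n = rows.count()
println(s"rows scanned: $n")

// Metrics are updated as a side effect of running the job.
plan.collectLeaves().foreach { leaf =>
  leaf.metrics.get("numOutputRows").foreach { m =>
    println(s"${leaf.getClass.getSimpleName} numOutputRows = ${m.value}")
  }
}
```

Because the table is created with IF NOT EXISTS and rows are appended on every run, the printed counts grow across repeated runs; the sketch only shows where the values come from.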
