DataWritingCommand Contract — Logical Commands That Write Query Data

DataWritingCommand is an extension of the Command contract for logical commands that, when executed, write the result of executing a structured query (query data) to a relation.

DataWritingCommand is resolved to a DataWritingCommandExec physical operator when the BasicOperators execution planning strategy is executed (i.e. when a logical plan is planned to a physical plan).
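The planning step can be sketched with a pattern match. The following is a minimal model, not Spark's code: every type is a simplified stand-in defined in the block, and ToyInsertCommand, LocalRelation, LocalScanExec and ToyBasicOperators are hypothetical names used for illustration only.

```scala
// Simplified stand-ins for Spark's logical and physical plan types.
// None of these are the real Spark classes.
trait LogicalPlan
trait SparkPlan

// A leaf logical plan (hypothetical, for illustration).
final case class LocalRelation(name: String) extends LogicalPlan

// Stand-in for the contract: a logical command that carries the query to write.
trait DataWritingCommand extends LogicalPlan {
  def query: LogicalPlan
}

// A toy command (hypothetical).
final case class ToyInsertCommand(query: LogicalPlan) extends DataWritingCommand

// Physical stand-ins.
final case class LocalScanExec(name: String) extends SparkPlan
final case class DataWritingCommandExec(cmd: DataWritingCommand, child: SparkPlan)
  extends SparkPlan

object ToyBasicOperators {
  // Plans the command's query first, then wraps the command and the
  // planned query in the physical operator.
  def apply(plan: LogicalPlan): Seq[SparkPlan] = plan match {
    case d: DataWritingCommand => DataWritingCommandExec(d, planLater(d.query)) :: Nil
    case LocalRelation(name)   => LocalScanExec(name) :: Nil
    case _                     => Nil
  }

  // Assumption: child planning is modelled as a direct recursive call here.
  private def planLater(plan: LogicalPlan): SparkPlan = apply(plan).head
}
```

In this model, planning ToyInsertCommand(LocalRelation("t")) gives a single DataWritingCommandExec whose child is the physical plan of the command's query.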

Table 1. DataWritingCommand Contract
Property Description

outputColumnNames

outputColumnNames: Seq[String]

The output column names of the analyzed input query plan.

Used when DataWritingCommand is requested for the outputColumns.

query

query: LogicalPlan

The analyzed logical query plan representing the data to write (i.e. whose result will be inserted into a relation)

Used when DataWritingCommand is requested for the child nodes and outputColumns.

run

run(
  sparkSession: SparkSession,
  child: SparkPlan): Seq[Row]

Executes the command to write query data (the result of executing a structured query).

Used when:

When requested for the child nodes, DataWritingCommand simply returns the logical query plan.
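The contract in Table 1, together with the child-nodes behaviour, can be modelled as a trait. This is a sketch under stated assumptions: LogicalPlan, SparkPlan, SparkSession and Row are simplified stand-ins declared in the block (not Spark's classes), and Relation and ToyWriteCommand are hypothetical.

```scala
// Simplified stand-ins; not the real Spark classes.
trait SparkSession
trait SparkPlan
final case class Row(values: Seq[Any])

trait LogicalPlan {
  def children: Seq[LogicalPlan]
}

// A leaf plan with no children (hypothetical).
final case class Relation(name: String) extends LogicalPlan {
  val children: Seq[LogicalPlan] = Nil
}

// The three abstract members mirror the properties in Table 1.
trait DataWritingCommand extends LogicalPlan {
  def outputColumnNames: Seq[String]
  def query: LogicalPlan
  def run(sparkSession: SparkSession, child: SparkPlan): Seq[Row]

  // The only child node is the query whose result is to be written.
  final override def children: Seq[LogicalPlan] = query :: Nil
}

// A toy implementation (hypothetical) that writes nothing.
final case class ToyWriteCommand(
    query: LogicalPlan,
    outputColumnNames: Seq[String]) extends DataWritingCommand {
  // A real command would write the rows produced by `child` to a relation.
  def run(sparkSession: SparkSession, child: SparkPlan): Seq[Row] = Seq.empty
}
```

An implementation supplies query, outputColumnNames and run. The children are derived from query and cannot be overridden in this model.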

DataWritingCommand defines custom performance metrics.

Table 2. DataWritingCommand’s Performance Metrics
Key               Name (in web UI)

numFiles          number of written files

numOutputBytes    bytes of written output

numOutputRows     number of output rows

numParts          number of dynamic part
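The metrics in Table 2 can be pictured as a map from a metric key to a named counter. The sketch below is an illustration only: ToyMetric is a hypothetical stand-in for Spark's SQL metric type, and ToyWriteMetrics is not part of Spark.

```scala
// Stand-in for a SQL metric: a named counter that accumulates a sum.
final class ToyMetric(val name: String) {
  private var total = 0L
  def add(v: Long): Unit = total += v
  def value: Long = total
}

object ToyWriteMetrics {
  // Keys and display names exactly as listed in Table 2.
  def create(): Map[String, ToyMetric] = Map(
    "numFiles"       -> new ToyMetric("number of written files"),
    "numOutputBytes" -> new ToyMetric("bytes of written output"),
    "numOutputRows"  -> new ToyMetric("number of output rows"),
    "numParts"       -> new ToyMetric("number of dynamic part")
  )
}
```

A writer would look a metric up by key and add to it as files and rows are written, e.g. metrics("numOutputRows").add(3).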

The performance metrics are used when:

Table 3. DataWritingCommands (Direct Implementations and Extensions Only)
DataWritingCommand Description

CreateDataSourceTableAsSelectCommand

CreateHiveTableAsSelectCommand

InsertIntoHadoopFsRelationCommand

SaveAsHiveFile

Commands that write a query result as Hive files (i.e. InsertIntoHiveDirCommand and InsertIntoHiveTable)

basicWriteJobStatsTracker Method

basicWriteJobStatsTracker(hadoopConf: Configuration): BasicWriteJobStatsTracker

basicWriteJobStatsTracker simply creates and returns a new BasicWriteJobStatsTracker (with the given Hadoop Configuration and the metrics).
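A minimal model of that method, assuming nothing beyond the description above: the tracker is built from the given configuration and the command's metrics. ToyConfiguration, ToyMetric, ToyWriteJobStatsTracker and ToyDataWritingCommand are hypothetical stand-ins, not Spark or Hadoop classes.

```scala
// Hypothetical stand-ins for the Hadoop Configuration, a SQL metric and the tracker.
final case class ToyConfiguration(settings: Map[String, String])
final class ToyMetric(val name: String)
final class ToyWriteJobStatsTracker(
    val hadoopConf: ToyConfiguration,
    val metrics: Map[String, ToyMetric])

trait ToyDataWritingCommand {
  // Created once per command, so every tracker reports into the same metrics.
  lazy val metrics: Map[String, ToyMetric] = Map(
    "numFiles"       -> new ToyMetric("number of written files"),
    "numOutputBytes" -> new ToyMetric("bytes of written output"),
    "numOutputRows"  -> new ToyMetric("number of output rows"),
    "numParts"       -> new ToyMetric("number of dynamic part")
  )

  // Creates a new tracker on every call, passing the configuration and the metrics.
  def basicWriteJobStatsTracker(hadoopConf: ToyConfiguration): ToyWriteJobStatsTracker =
    new ToyWriteJobStatsTracker(hadoopConf, metrics)
}
```

In this model each call returns a fresh tracker, while the metrics map is shared across trackers of the same command.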

Note

basicWriteJobStatsTracker is used when:

Output Columns — outputColumns Method

outputColumns: Seq[Attribute]

outputColumns…​FIXME
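The description of outputColumns is not filled in here. Table 1 does state that both query and outputColumnNames are used for it, so one plausible reading (an assumption, not confirmed by this page) is that the output attributes of the query are paired with the output column names. The sketch below models only that reading. Attribute is a simplified stand-in, and ToyOutputColumns is hypothetical.

```scala
// Simplified stand-in for an output attribute of a query plan.
final case class Attribute(name: String, dataType: String) {
  def withName(newName: String): Attribute = copy(name = newName)
}

object ToyOutputColumns {
  // Assumption: each output attribute of the query is renamed to the
  // column name at the same position.
  def outputColumns(
      queryOutput: Seq[Attribute],
      outputColumnNames: Seq[String]): Seq[Attribute] = {
    require(
      queryOutput.length == outputColumnNames.length,
      "The query output and the output column names must have the same length")
    queryOutput.zip(outputColumnNames).map { case (attr, name) => attr.withName(name) }
  }
}
```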

Note

outputColumns is used when:
