InsertIntoHadoopFsRelationCommand Logical Command

InsertIntoHadoopFsRelationCommand is a concrete DataWritingCommand that inserts the result of executing a query to an output path in the given FileFormat (and other properties).

InsertIntoHadoopFsRelationCommand is created when:

InsertIntoHadoopFsRelationCommand uses partitionOverwriteMode option that overrides spark.sql.sources.partitionOverwriteMode property for dynamic partition inserts.

Executing Logical Command — run Method

run(sparkSession: SparkSession, child: SparkPlan): Seq[Row]
Note
run is part of DataWritingCommand Contract to execute (run) a logical command to write query data

run uses the spark.sql.hive.manageFilesourcePartitions configuration property to…​FIXME

Caution
FIXME When is the catalogTable defined?
Caution
FIXME When is tracksPartitionsInCatalog of CatalogTable enabled?

run gets the partitionOverwriteMode option…​FIXME

For insertion, run simply requests the FileFormatWriter object to write and then…​FIXME (does some table-specific "tasks").

Otherwise (for non-insertion case), run simply prints out the following INFO message to the logs and finishes.

Skipping insertion into a relation that already exists.

Creating InsertIntoHadoopFsRelationCommand Instance

InsertIntoHadoopFsRelationCommand takes the following when created:

  • Output Hadoop’s Path

  • Static table partitions (Map[String, String])

  • ifPartitionNotExists flag

  • Partition columns (Seq[Attribute])

  • BucketSpec

  • FileFormat

  • Options (Map[String, String])

  • Logical plan

  • SaveMode

  • CatalogTable

  • FileIndex

  • Output column names

Note

staticPartitions may hold zero or more partitions as follows:

With that, staticPartitions are simply the partitions of an InsertIntoTable logical operator.

results matching ""

    No results matching ""