val q1 = spark.read.option("header", true).csv("../datasets/people.csv")
scala> println(q1.queryExecution.logical.numberedTreeString)
00 Relation[id#72,name#73,age#74] csv
val q2 = sql("select * from `csv`.`../datasets/people.csv`")
scala> println(q2.queryExecution.optimizedPlan.numberedTreeString)
00 Relation[_c0#175,_c1#176,_c2#177] csv
LogicalRelation Leaf Logical Operator — Representing BaseRelations in Logical Plan
LogicalRelation is a leaf logical operator that represents a BaseRelation in a logical query plan.
LogicalRelation is created when:

- DataFrameReader loads data from a data source that supports multiple paths (through SparkSession.baseRelationToDataFrame)
- DataFrameReader is requested to load data from an external table using JDBC (through SparkSession.baseRelationToDataFrame)
- TextInputCSVDataSource and TextInputJsonDataSource are requested to infer schema
- ResolveSQLOnFile converts a logical plan
- FindDataSourceTable logical evaluation rule is executed
- RelationConversions logical evaluation rule is executed
- CreateTempViewUsing logical command is requested to run
- Structured Streaming's FileStreamSource creates batches of records
The simple text representation of a LogicalRelation (aka simpleString) is Relation[output] [relation] (using the output and the BaseRelation).
val q = spark.read.text("README.md")
val logicalPlan = q.queryExecution.logical
scala> println(logicalPlan.simpleString)
Relation[value#2] text
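As a self-contained sketch (illustrative stand-in types, not Spark's actual classes), the Relation[output] [relation] text above can be modeled as a simple string assembly over the operator's output attributes and the relation's short name:

```scala
// Sketch only: Attribute and simpleString are hypothetical stand-ins for
// Spark's AttributeReference and LogicalRelation.simpleString.
case class Attribute(name: String, exprId: Int) {
  // Render an attribute the way plans print it, e.g. value#2
  override def toString: String = s"$name#$exprId"
}

def simpleString(output: Seq[Attribute], relation: String): String =
  s"Relation[${output.mkString(",")}] $relation"

// Mirrors the text relation shown above
println(simpleString(Seq(Attribute("value", 2)), "text"))
// Relation[value#2] text
```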
Creating LogicalRelation Instance

LogicalRelation takes the following when created:

- BaseRelation
- Output schema attributes (AttributeReferences)
- Optional CatalogTable
- isStreaming flag
apply Factory Utility

apply(
  relation: BaseRelation,
  isStreaming: Boolean = false): LogicalRelation

apply(
  relation: BaseRelation,
  table: CatalogTable): LogicalRelation

apply creates a LogicalRelation for the input BaseRelation (with the given CatalogTable or the optional isStreaming flag).
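The two overloads can be sketched in a self-contained way (stand-in types only, not Spark's API; the factory name LogicalRelationFactory is hypothetical to avoid clashing with the case class companion):

```scala
// Illustrative stand-ins for Spark's types
case class BaseRelation(name: String)
case class CatalogTable(identifier: String)
case class LogicalRelation(
    relation: BaseRelation,
    catalogTable: Option[CatalogTable],
    isStreaming: Boolean)

object LogicalRelationFactory {
  // Overload 1: no catalog table, optional isStreaming flag
  def apply(relation: BaseRelation, isStreaming: Boolean = false): LogicalRelation =
    LogicalRelation(relation, catalogTable = None, isStreaming)

  // Overload 2: relation backed by a catalog table, always batch
  def apply(relation: BaseRelation, table: CatalogTable): LogicalRelation =
    LogicalRelation(relation, Some(table), isStreaming = false)
}

val streaming = LogicalRelationFactory(BaseRelation("files"), isStreaming = true)
val fromTable = LogicalRelationFactory(BaseRelation("csv"), CatalogTable("people"))
```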
refresh Method

refresh(): Unit

Note: refresh is part of the LogicalPlan contract to refresh itself.

Note: refresh does the work for HadoopFsRelation relations only.
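The note above can be sketched with stand-in types (not Spark's actual classes): refresh only touches a HadoopFsRelation-backed relation and is a no-op for everything else.

```scala
// Hypothetical stand-ins modeling the dispatch described in the note
sealed trait BaseRelation
class HadoopFsRelation extends BaseRelation {
  var fileListingRefreshed = false
  // Stand-in for re-listing the input files of a file-based relation
  def refreshFileIndex(): Unit = fileListingRefreshed = true
}
class JdbcRelation extends BaseRelation

def refresh(relation: BaseRelation): Unit = relation match {
  case fs: HadoopFsRelation => fs.refreshFileIndex() // file-based: do the work
  case _                    => ()                    // anything else: no-op
}

val fs = new HadoopFsRelation
refresh(fs)               // marks the file listing as refreshed
refresh(new JdbcRelation) // leaves the relation untouched
```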