// FIXME: ScalarSubquery in a logical plan
Subqueries (Subquery Expressions)
As of Spark 2.0, Spark SQL supports subqueries.
A subquery (aka subquery expression) is a query that is nested inside of another query.
There are the following kinds of subqueries:
A subquery as a source (inside a SQL
A scalar subquery or a predicate subquery (as a column)
Every subquery can also be correlated or uncorrelated.
A scalar subquery is a structured query that returns a single row and a single column only. Spark SQL uses ScalarSubquery (SubqueryExpression) expression to represent scalar subqueries (while parsing a SQL statement).
ScalarSubquery expression appears as scalar-subquery#[exprId] [conditionString] in a logical plan.
// FIXME: Name of a ScalarSubquery in a logical plan
It is said that scalar subqueries should be used very rarely if at all and you should join instead.
Catalyst Optimizer uses the following optimizations for subqueries:
FIXME Describe how a physical ScalarSubquery is executed (cf.