package org.apache.spark.sql.execution.joins
trait HashedRelation extends KnownSizeEstimation {
// only required methods that have no implementation
// the others follow
def asReadOnlyCopy(): HashedRelation
def close(): Unit
def get(key: InternalRow): Iterator[InternalRow]
def getAverageProbesPerLookup: Double
def getValue(key: InternalRow): InternalRow
def keyIsUnique: Boolean
}
HashedRelation
HashedRelation
is the contract for "relations" with values hashed by some key.
HashedRelation
is a KnownSizeEstimation.
Note
|
HashedRelation is a private[execution] contract.
|
Method | Description | ||
---|---|---|---|
Gives a read-only copy of this Used exclusively when |
|||
Gives internal rows for the given key or Used when |
|||
Gives the value internal row for a given key
|
|||
Used when…FIXME |
getValue
Method
getValue(key: Long): InternalRow
Note
|
This is getValue that takes a long key. There is the more generic getValue that takes an internal row instead.
|
getValue
simply reports an UnsupportedOperationException
(and expects concrete HashedRelations
to provide a more meaningful implementation).
Note
|
getValue is used exclusively when LongHashedRelation is requested to get the value for a given key.
|
Creating Concrete HashedRelation Instance (for Build Side of Hash-based Join) — apply
Factory Method
apply(
input: Iterator[InternalRow],
key: Seq[Expression],
sizeEstimate: Int = 64,
taskMemoryManager: TaskMemoryManager = null): HashedRelation
apply
creates a LongHashedRelation when the input key
collection has a single expression of type long or UnsafeHashedRelation otherwise.
Note
|
The input
|
Note
|
|