UnsafeFixedWidthAggregationMap

UnsafeFixedWidthAggregationMap is a tiny layer (extension) around Spark Core’s BytesToBytesMap to allow for UnsafeRow keys and values.

Whenever requested for performance metrics (i.e. average number of probes per key lookup and peak memory used), UnsafeFixedWidthAggregationMap simply requests the underlying BytesToBytesMap.

UnsafeFixedWidthAggregationMap is created when:

Table 1. UnsafeFixedWidthAggregationMap’s Internal Properties (e.g. Registries, Counters and Flags)
Name Description

currentAggregationBuffer

Re-used pointer (as an UnsafeRow with the number of fields to match the aggregationBufferSchema) to the current aggregation buffer

Used exclusively when UnsafeFixedWidthAggregationMap is requested to getAggregationBufferFromUnsafeRow.

emptyAggregationBuffer

Empty aggregation buffer (encoded in UnsafeRow format)

groupingKeyProjection

UnsafeProjection for the groupingKeySchema (to encode grouping keys as UnsafeRows)

map

Spark Core’s BytesToBytesMap with the taskMemoryManager, initialCapacity, pageSizeBytes and performance metrics enabled

supportsAggregationBufferSchema Static Method

boolean supportsAggregationBufferSchema(StructType schema)

supportsAggregationBufferSchema is a predicate that is enabled (true) unless there is a field (in the fields of the input schema) whose data type is not mutable.

import org.apache.spark.sql.execution.UnsafeFixedWidthAggregationMap

import org.apache.spark.sql.types._
val schemaWithImmutableField = StructType(StructField("string", StringType) :: Nil)
assert(UnsafeFixedWidthAggregationMap.supportsAggregationBufferSchema(schemaWithImmutableField) == false)

val schemaWithMutableFields = StructType(
  StructField("int", IntegerType) :: StructField("bool", BooleanType) :: Nil)
assert(UnsafeFixedWidthAggregationMap.supportsAggregationBufferSchema(schemaWithMutableFields))
Note
supportsAggregationBufferSchema is used exclusively when HashAggregateExec is requested to supportsAggregate.

Creating UnsafeFixedWidthAggregationMap Instance

UnsafeFixedWidthAggregationMap takes the following when created:

  • Empty aggregation buffer (as an InternalRow)

  • Aggregation buffer schema

  • Grouping key schema

  • Spark Core’s TaskMemoryManager

  • Initial capacity

  • Page size (in bytes)

UnsafeFixedWidthAggregationMap initializes the internal registries and counters.

getAggregationBufferFromUnsafeRow Method

UnsafeRow getAggregationBufferFromUnsafeRow(UnsafeRow key) (1)
UnsafeRow getAggregationBufferFromUnsafeRow(UnsafeRow key, int hash)
  1. Uses the hash code of the key

getAggregationBufferFromUnsafeRow requests the BytesToBytesMap to lookup the input key (to get a BytesToBytesMap.Location).

getAggregationBufferFromUnsafeRow…​FIXME

Note

getAggregationBufferFromUnsafeRow is used when:

  • TungstenAggregationIterator is requested to processInputs (exclusively when TungstenAggregationIterator is created)

  • (for testing only) UnsafeFixedWidthAggregationMap is requested to getAggregationBuffer

getAggregationBuffer Method

UnsafeRow getAggregationBuffer(InternalRow groupingKey)

getAggregationBuffer…​FIXME

Note
getAggregationBuffer seems to be used exclusively for testing.

Getting KVIterator — iterator Method

KVIterator<UnsafeRow, UnsafeRow> iterator()

iterator…​FIXME

Note

iterator is used when:

  • HashAggregateExec physical operator is requested to finishAggregate

  • TungstenAggregationIterator is created (and pre-loads the first key-value pair from the map)

getPeakMemoryUsedBytes Method

long getPeakMemoryUsedBytes()

getPeakMemoryUsedBytes…​FIXME

Note

getPeakMemoryUsedBytes is used when:

getAverageProbesPerLookup Method

double getAverageProbesPerLookup()

getAverageProbesPerLookup…​FIXME

Note

getAverageProbesPerLookup is used when:

free Method

void free()

free…​FIXME

Note

free is used when:

destructAndCreateExternalSorter Method

UnsafeKVExternalSorter destructAndCreateExternalSorter() throws IOException

destructAndCreateExternalSorter…​FIXME

Note

destructAndCreateExternalSorter is used when:

results matching ""

    No results matching ""