ExternalAppendOnlyUnsafeRowArray — Append-Only Array for UnsafeRows (with Disk Spill Threshold)

ExternalAppendOnlyUnsafeRowArray is an append-only array for UnsafeRows that spills content to disk when a predefined spill threshold of rows is reached.

Note
Choosing a proper spill threshold of rows is a performance optimization.

ExternalAppendOnlyUnsafeRowArray is created when:

  • WindowExec physical operator is executed (and creates an internal buffer for window frames)

  • WindowFunctionFrame is prepared

  • SortMergeJoinExec physical operator is executed (and creates a RowIterator for INNER and CROSS joins) and for getBufferedMatches

  • SortMergeJoinScanner creates an internal bufferedMatches

  • UnsafeCartesianRDD is computed

Table 1. ExternalAppendOnlyUnsafeRowArray’s Internal Registries and Counters
Name Description

initialSizeOfInMemoryBuffer

FIXME

Used when…​FIXME

inMemoryBuffer

FIXME

Can grow up to numRowsSpillThreshold rows (i.e. new UnsafeRows are added)

Used when…​FIXME

spillableArray

UnsafeExternalSorter

Used when…​FIXME

numRows

Used when…​FIXME

modificationsCount

Used when…​FIXME

numFieldsPerRow

Used when…​FIXME

Tip

Enable INFO logging level for org.apache.spark.sql.execution.ExternalAppendOnlyUnsafeRowArray logger to see what happens inside.

Add the following line to conf/log4j.properties:

log4j.logger.org.apache.spark.sql.execution.ExternalAppendOnlyUnsafeRowArray=INFO

Refer to Logging.

generateIterator Method

generateIterator(): Iterator[UnsafeRow]
generateIterator(startIndex: Int): Iterator[UnsafeRow]
Caution
FIXME

add Method

add(unsafeRow: UnsafeRow): Unit
Caution
FIXME
Note

add is used when:

clear Method

clear(): Unit
Caution
FIXME

Creating ExternalAppendOnlyUnsafeRowArray Instance

ExternalAppendOnlyUnsafeRowArray takes the following when created:

ExternalAppendOnlyUnsafeRowArray initializes the internal registries and counters.

results matching ""

    No results matching ""