add(
batchId: Long,
logs: Array[FileEntry]): Boolean
FileStreamSourceLog
FileStreamSourceLog is a concrete CompactibleFileStreamLog (of FileEntry metadata) of FileStreamSource.
FileStreamSourceLog uses a fixed-size cache of metadata of compaction batches.
FileStreamSourceLog uses spark.sql.streaming.fileSource.log.compactInterval configuration property (default: 10) for the default compaction interval.
FileStreamSourceLog uses spark.sql.streaming.fileSource.log.cleanupDelay configuration property (default: 10 minutes) for the fileCleanupDelayMs.
FileStreamSourceLog uses spark.sql.streaming.fileSource.log.deletion configuration property (default: true) for the isDeletingExpiredLog.
Creating FileStreamSourceLog Instance
FileStreamSourceLog (like the parent CompactibleFileStreamLog) takes the following to be created:
Storing (Adding) Metadata of Streaming Batch — add Method
|
Note
|
add is part of the MetadataLog Contract to store (add) metadata of a streaming batch.
|
add requests the parent CompactibleFileStreamLog to store metadata (possibly compacting logs if the batch is compaction).
If so (and this is a compation batch), add adds the batch and the logs to fileEntryCache internal registry (and possibly removing the eldest entry if the size is above the cacheSize).
get Method
get(
startId: Option[Long],
endId: Option[Long]): Array[(Long, Array[FileEntry])]
|
Note
|
get is part of the MetadataLog Contract to…FIXME.
|
get…FIXME
Internal Properties
| Name | Description |
|---|---|
|
Size of the fileEntryCache that is exactly the compact interval Used when the fileEntryCache is requested to add a new entry in add and get a compaction batch |
|
Metadata of a streaming batch (
Used when get (for a compaction batch) |