add(
batchId: Long,
logs: Array[FileEntry]): Boolean
FileStreamSourceLog
FileStreamSourceLog is a concrete CompactibleFileStreamLog (of FileEntry metadata) of FileStreamSource.
FileStreamSourceLog uses a fixed-size cache of metadata of compaction batches.
FileStreamSourceLog uses the spark.sql.streaming.fileSource.log.compactInterval configuration property (default: 10) for the default compaction interval.
FileStreamSourceLog uses the spark.sql.streaming.fileSource.log.cleanupDelay configuration property (default: 10 minutes) for the fileCleanupDelayMs.
FileStreamSourceLog uses the spark.sql.streaming.fileSource.log.deletion configuration property (default: true) for the isDeletingExpiredLog.
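The three properties above can be set when a SparkSession is built. The following is a minimal sketch, assuming Spark SQL is on the classpath; the local master and the application name are placeholders chosen for illustration, and the values simply restate the defaults listed above.

```scala
import org.apache.spark.sql.SparkSession

// Sketch only: the master URL and the application name are hypothetical.
val spark = SparkSession.builder()
  .master("local[*]")
  .appName("file-source-log-demo")
  // Default compaction interval (in batches)
  .config("spark.sql.streaming.fileSource.log.compactInterval", "10")
  // Used for fileCleanupDelayMs ("10m" is a time string for 10 minutes)
  .config("spark.sql.streaming.fileSource.log.cleanupDelay", "10m")
  // Used for isDeletingExpiredLog
  .config("spark.sql.streaming.fileSource.log.deletion", "true")
  .getOrCreate()
```

Since the values match the defaults, setting them explicitly changes nothing; the snippet is mainly useful as a starting point for tuning.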
Creating FileStreamSourceLog Instance
FileStreamSourceLog (like the parent CompactibleFileStreamLog) takes the following to be created:
Storing (Adding) Metadata of Streaming Batch: add Method
Note: add is part of the MetadataLog Contract to store (add) metadata of a streaming batch.
add requests the parent CompactibleFileStreamLog to store the metadata (possibly compacting logs if the batch is a compaction batch).
If the metadata was stored (and this is a compaction batch), add adds the batch and the logs to the fileEntryCache internal registry (possibly removing the eldest entry if the size is above the cacheSize).
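The flow of add and the fixed-size cache can be sketched in plain Scala. This is a minimal illustration, not Spark's code: SourceLogSketch, parentAdd, stored and cachedBatchIds are hypothetical names, FileEntry is a stand-in case class with assumed fields, the parent log is replaced by an in-memory map that rejects an already-stored batch, and the rule that every compactInterval-th batch is a compaction batch is an assumption of this sketch.

```scala
import java.util.{LinkedHashMap => JLinkedHashMap}
import java.util.Map.Entry
import scala.collection.mutable

// Hypothetical stand-in for Spark's FileEntry (fields are assumptions).
final case class FileEntry(path: String, timestamp: Long, batchId: Long)

// Illustrative only: not Spark's FileStreamSourceLog.
class SourceLogSketch(compactInterval: Int) {
  // The cache size is exactly the compact interval.
  private val cacheSize: Int = compactInterval

  // Insertion-ordered map that drops its eldest entry once it grows past cacheSize.
  private val fileEntryCache = new JLinkedHashMap[Long, Array[FileEntry]] {
    override def removeEldestEntry(eldest: Entry[Long, Array[FileEntry]]): Boolean =
      size() > cacheSize
  }

  // Stand-in for the parent log's storage (assumption: an in-memory map).
  private val stored = mutable.Map.empty[Long, Array[FileEntry]]

  // Assumption: the parent refuses a batch ID that is already stored.
  private def parentAdd(batchId: Long, logs: Array[FileEntry]): Boolean =
    if (stored.contains(batchId)) false
    else { stored(batchId) = logs; true }

  // Assumption: every compactInterval-th batch is a compaction batch.
  private def isCompactionBatch(batchId: Long): Boolean =
    (batchId + 1) % compactInterval == 0

  // Store through the parent first; cache only successfully stored compaction batches.
  def add(batchId: Long, logs: Array[FileEntry]): Boolean =
    if (parentAdd(batchId, logs)) {
      if (isCompactionBatch(batchId)) fileEntryCache.put(batchId, logs)
      true
    } else false

  // Batch IDs currently cached, eldest first.
  def cachedBatchIds: List[Long] = {
    val it = fileEntryCache.keySet().iterator()
    val buf = mutable.ListBuffer.empty[Long]
    while (it.hasNext) buf += it.next()
    buf.toList
  }
}
```

With a compact interval of 2, adding batches 0 through 7 caches batches 1, 3, 5 and 7 in turn, and the eviction rule leaves only batches 5 and 7 in the cache.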
get Method
get(
startId: Option[Long],
endId: Option[Long]): Array[(Long, Array[FileEntry])]
Note: get is part of the MetadataLog Contract to…FIXME.
get …FIXME
Internal Properties
Name | Description |
---|---|
cacheSize | Size of the fileEntryCache, which is exactly the compact interval. Used when the fileEntryCache is requested to add a new entry in add and to get a compaction batch. |
fileEntryCache | Metadata of a streaming batch (the logs of a compaction batch, by batch ID). Used in get (for a compaction batch). |