CommitLog — HDFSMetadataLog for Batch Completion Log

CommitLog is an HDFSMetadataLog with metadata as regular text (i.e. String).

Note
HDFSMetadataLog is a MetadataLog that uses Hadoop HDFS for reliable storage.

CommitLog is created along with StreamExecution.

add Method

add(batchId: Long): Unit

add…​FIXME

Note
add is used when…​FIXME

add Method

add(batchId: Long, metadata: String): Boolean
Note
add is part of MetadataLog Contract to…​FIXME.

add…​FIXME

serialize Method

serialize(metadata: String, out: OutputStream): Unit
Note
serialize is part of HDFSMetadataLog Contract to write metadata in a serialized format.

serialize writes out the version prefixed with v on a single line (e.g. v1) followed by the empty JSON (i.e. {}).

Note
In Spark 2.2, the version is 1 and the charset is UTF-8.
Note
serialize always writes an empty JSON object because the name of the file (the batch ID) alone carries the meaning.
$ ls -tr [checkpoint-directory]/commits
0 1 2 3 4 5 6 7 8 9

$ cat [checkpoint-directory]/commits/8
v1
{}
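
The file format shown above can be reproduced with a minimal sketch. This is not the Spark source: CommitLogFormatSketch, Version, EmptyJson and render are hypothetical names introduced for illustration only, and the sketch assumes that the metadata argument is ignored and that no trailing newline follows the empty JSON.

```scala
import java.io.{ByteArrayOutputStream, OutputStream}
import java.nio.charset.StandardCharsets.UTF_8

// Hypothetical stand-in for illustration; not the actual CommitLog class.
object CommitLogFormatSketch {
  // Assumption: version 1, as noted for Spark 2.2.
  val Version = 1
  // The payload is always the empty JSON object.
  val EmptyJson = "{}"

  // Writes the version line followed by the empty JSON.
  // The metadata argument is deliberately ignored (assumption based on the note above).
  def serialize(metadata: String, out: OutputStream): Unit = {
    out.write(s"v$Version".getBytes(UTF_8))
    out.write('\n')
    out.write(EmptyJson.getBytes(UTF_8))
  }

  // Helper: returns the text that a commit file would contain.
  def render(metadata: String): String = {
    val out = new ByteArrayOutputStream()
    serialize(metadata, out)
    new String(out.toByteArray, UTF_8)
  }
}
```

Calling CommitLogFormatSketch.render with any string gives the same two lines as the cat output above, which illustrates why the batch ID has to come from the file name rather than from the file content.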

deserialize Method

deserialize(in: InputStream): String
Note
deserialize is part of HDFSMetadataLog Contract to…​FIXME.

deserialize…​FIXME

Creating CommitLog Instance

CommitLog takes the following when created:

  • SparkSession

  • Path of the metadata log directory
