iterator: RecordReaderIterator[Text]
HadoopFileLinesReader
HadoopFileLinesReader is a Scala Iterator of Apache Hadoop’s org.apache.hadoop.io.Text.
HadoopFileLinesReader is created to access datasets in the following data sources:
- SimpleTextSource
- LibSVMFileFormat
- TextInputCSVDataSource
- TextInputJsonDataSource
HadoopFileLinesReader uses the internal iterator that handles accessing files using Hadoop’s FileSystem API.
iterator Internal Property
When created, HadoopFileLinesReader creates an internal iterator that uses Hadoop’s org.apache.hadoop.mapreduce.lib.input.FileSplit with Hadoop’s org.apache.hadoop.fs.Path and file.
iterator creates Hadoop’s TaskAttemptID, TaskAttemptContextImpl and LineRecordReader.
iterator initializes LineRecordReader and passes it on to a RecordReaderIterator.
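The steps above can be sketched directly against the Hadoop API. This is a minimal sketch under stated assumptions, not Spark's actual code: it assumes the Hadoop client libraries are on the classpath and that the input is a local file. The names LineRecordReaderSketch and readLines are hypothetical, the task attempt IDs are placeholders, and a plain while loop stands in for RecordReaderIterator.

```scala
import org.apache.hadoop.conf.Configuration
import org.apache.hadoop.fs.Path
import org.apache.hadoop.mapreduce.{JobID, TaskAttemptID, TaskID, TaskType}
import org.apache.hadoop.mapreduce.lib.input.{FileSplit, LineRecordReader}
import org.apache.hadoop.mapreduce.task.TaskAttemptContextImpl

// Hypothetical helper (not part of Spark) that mirrors the described steps.
object LineRecordReaderSketch {
  def readLines(file: java.io.File): Seq[String] = {
    val conf = new Configuration()

    // A FileSplit that covers the whole file: path, start offset, length, preferred hosts.
    val split = new FileSplit(new Path(file.toURI), 0L, file.length(), Array.empty[String])

    // Placeholder task attempt: LineRecordReader only needs a context to read the configuration.
    val attemptId = new TaskAttemptID(new TaskID(new JobID(), TaskType.MAP, 0), 0)
    val context = new TaskAttemptContextImpl(conf, attemptId)

    val reader = new LineRecordReader()
    reader.initialize(split, context)
    try {
      val lines = Seq.newBuilder[String]
      // Assumption: a simple loop replaces RecordReaderIterator here.
      while (reader.nextKeyValue()) {
        // The reader reuses the same Text instance, so copy its content before advancing.
        lines += reader.getCurrentValue.toString
      }
      lines.result()
    } finally {
      reader.close()
    }
  }
}
```

Calling LineRecordReaderSketch.readLines(new java.io.File("/path/to/file.txt")) returns the lines of the file without their line terminators. The try/finally block releases the underlying stream even when reading fails.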
Note
iterator is used for Iterator-specific methods, i.e. hasNext, next and close.
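The delegation described in the note can be illustrated without Hadoop. This is a hedged sketch of the pattern only: InMemoryLineIterator and LinesReader are hypothetical stand-ins for the internal iterator and for HadoopFileLinesReader, and they read from an in-memory sequence of strings rather than from a file.

```scala
import java.io.Closeable

object DelegationSketch {
  // Hypothetical stand-in for the internal iterator.
  final class InMemoryLineIterator(lines: Seq[String]) extends Iterator[String] with Closeable {
    private val underlying = lines.iterator
    private var closed = false

    // Once closed, the iterator reports that no more elements are available.
    override def hasNext: Boolean = !closed && underlying.hasNext
    override def next(): String = underlying.next()
    override def close(): Unit = { closed = true }
  }

  // Hypothetical stand-in for the reader: hasNext, next and close all forward to `iterator`.
  final class LinesReader(lines: Seq[String]) extends Iterator[String] with Closeable {
    private val iterator = new InMemoryLineIterator(lines)

    override def hasNext: Boolean = iterator.hasNext
    override def next(): String = iterator.next()
    override def close(): Unit = iterator.close()
  }
}
```

Because the outer class holds no reading state of its own, closing it only needs to close the internal iterator, which is why callers typically wrap the reader in try/finally and call close when done.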