DataReaderFactory is a contract…​FIXME

package org.apache.spark.sql.sources.v2.reader;

import java.io.Serializable;

public interface DataReaderFactory<T> extends Serializable {
  // only required methods that have no implementation
  // the others follow
  DataReader<T> createDataReader();
}

DataReaderFactory is an Evolving contract: it is on its way to becoming a stable API, but it is not stable yet and can change from one feature release to another.

In other words, using the contract is like treading on thin ice.

Table 1. DataReaderFactory Contract
Method | Description
createDataReader | Used when…FIXME
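
To make the contract more concrete, here is a minimal sketch of a factory that implements createDataReader. It does not depend on Spark: the nested DataReader and DataReaderFactory interfaces are local stand-ins that only mirror the shape of the real ones, and RangeReaderFactory, roundTrip and readAll are hypothetical names introduced for illustration. The sketch assumes that the factory is what gets serialized and sent to where the reading happens (hence Serializable), while the reader itself is created only afterwards.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.Closeable;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.io.UncheckedIOException;
import java.util.ArrayList;
import java.util.List;

class RangeFactoryDemo {
  // Local stand-in for Spark's DataReader (an assumption, not the real class).
  interface DataReader<T> extends Closeable {
    boolean next() throws IOException;
    T get();
  }

  // Local stand-in for DataReaderFactory, reduced to its one required method.
  interface DataReaderFactory<T> extends Serializable {
    DataReader<T> createDataReader();
  }

  // Hypothetical factory that keeps only serializable state: the range bounds.
  static final class RangeReaderFactory implements DataReaderFactory<Integer> {
    private static final long serialVersionUID = 1L;
    private final int start;
    private final int end;

    RangeReaderFactory(int start, int end) {
      this.start = start;
      this.end = end;
    }

    @Override
    public DataReader<Integer> createDataReader() {
      // The reader is created lazily, after the factory has been shipped.
      return new DataReader<Integer>() {
        private int current = start - 1;
        @Override public boolean next() { current++; return current < end; }
        @Override public Integer get() { return current; }
        @Override public void close() { }
      };
    }
  }

  // Simulates shipping the factory elsewhere using plain Java serialization.
  @SuppressWarnings("unchecked")
  static <T> DataReaderFactory<T> roundTrip(DataReaderFactory<T> factory) {
    try {
      ByteArrayOutputStream bytes = new ByteArrayOutputStream();
      try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
        out.writeObject(factory);
      }
      try (ObjectInputStream in =
          new ObjectInputStream(new ByteArrayInputStream(bytes.toByteArray()))) {
        return (DataReaderFactory<T>) in.readObject();
      }
    } catch (IOException | ClassNotFoundException e) {
      throw new IllegalStateException("factory could not be serialized", e);
    }
  }

  // Drains a reader created by the factory into a list.
  static <T> List<T> readAll(DataReaderFactory<T> factory) {
    List<T> records = new ArrayList<>();
    try (DataReader<T> reader = factory.createDataReader()) {
      while (reader.next()) {
        records.add(reader.get());
      }
    } catch (IOException e) {
      throw new UncheckedIOException(e);
    }
    return records;
  }

  public static void main(String[] args) {
    DataReaderFactory<Integer> shipped = roundTrip(new RangeReaderFactory(0, 3));
    System.out.println(readAll(shipped)); // prints [0, 1, 2]
  }
}
```

Keeping the factory's fields small and serializable, and opening any resources only inside createDataReader, is the design choice this sketch tries to illustrate.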

Specifying Preferred Locations —  preferredLocations Method

default String[] preferredLocations()

preferredLocations defaults to an empty array of host names (the preferred locations), which means that the task has no location preference.
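
The difference between the default and an override can be sketched as follows. This is an illustration only: the DataReaderFactory interface below is a local stand-in that keeps just preferredLocations (createDataReader is left out on purpose), and NoPreferenceFactory, HostPinnedFactory, describe and the host names are all hypothetical.

```java
import java.io.Serializable;

class PreferredLocationsDemo {
  // Local stand-in for the contract, limited to preferredLocations.
  interface DataReaderFactory extends Serializable {
    // Empty array: no location preference.
    default String[] preferredLocations() {
      return new String[0];
    }
  }

  // Hypothetical factory that relies on the default.
  static final class NoPreferenceFactory implements DataReaderFactory {
    private static final long serialVersionUID = 1L;
  }

  // Hypothetical factory that prefers the hosts holding its data.
  static final class HostPinnedFactory implements DataReaderFactory {
    private static final long serialVersionUID = 1L;
    private final String[] hosts;

    HostPinnedFactory(String... hosts) {
      this.hosts = hosts.clone();
    }

    @Override
    public String[] preferredLocations() {
      return hosts.clone(); // defensive copy, callers cannot mutate our state
    }
  }

  // Shows how a caller might interpret the returned host names.
  static String describe(DataReaderFactory factory) {
    String[] hosts = factory.preferredLocations();
    return hosts.length == 0 ? "no preference" : String.join(",", hosts);
  }

  public static void main(String[] args) {
    System.out.println(describe(new NoPreferenceFactory()));
    System.out.println(describe(new HostPinnedFactory("host-a", "host-b")));
  }
}
```

The returned names are only a hint about where to run the task; the sketch does not model how a scheduler weighs them.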


preferredLocations is used when:

  • DataSourceRDD is requested for getPreferredLocations

  • RowToUnsafeRowDataReaderFactory is requested for preferredLocations

  • Spark Structured Streaming’s ContinuousDataSourceRDD is requested for getPreferredLocations
