RelationProvider — Data Sources With Schema Inference

RelationProvider is a contract for data source providers that support schema inference and can be accessed using SQL’s USING clause, i.e. in CREATE TEMPORARY VIEW and CREATE TABLE DDL statements.

Schema inference is also called schema discovery.

RelationProvider is used exclusively when:

  • DataSource creates a BaseRelation (with no user-defined schema, or with a user-defined schema that matches the schema of the relation the RelationProvider creates)

BaseRelation models a collection of tuples from an external data source with a schema.

Table 1. RelationProvider’s Known Implementations
Name Description

Use SchemaRelationProvider for relation providers that require a user-defined schema.
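
To make the contrast concrete, here is a minimal sketch of a provider that requires a user-defined schema. The class name EmptyProvider is hypothetical, and the relation deliberately returns no rows; it only shows how the caller's schema is passed through SchemaRelationProvider's three-argument createRelation.

```scala
import org.apache.spark.rdd.RDD
import org.apache.spark.sql.{Row, SQLContext}
import org.apache.spark.sql.sources.{BaseRelation, SchemaRelationProvider, TableScan}
import org.apache.spark.sql.types.StructType

// Hypothetical provider (not part of Spark): it cannot infer a schema,
// so it relies entirely on the schema supplied by the caller.
class EmptyProvider extends SchemaRelationProvider {
  override def createRelation(
      ctx: SQLContext,
      parameters: Map[String, String],
      userSchema: StructType): BaseRelation =
    new BaseRelation with TableScan {
      override val sqlContext: SQLContext = ctx
      // The relation reports exactly the schema the user asked for
      override val schema: StructType = userSchema
      // No rows: the sketch only demonstrates the schema hand-off
      override def buildScan(): RDD[Row] = ctx.sparkContext.emptyRDD[Row]
    }
}
```

Assuming the class is on the classpath, it could be loaded with spark.read.format(classOf[EmptyProvider].getName).schema("id LONG, name STRING").load(). Omitting the schema(...) call is expected to fail at load time, since this kind of provider has nothing to infer a schema from.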

RelationProvider Contract

package org.apache.spark.sql.sources

import org.apache.spark.sql.SQLContext

trait RelationProvider {
  // Creates a BaseRelation for the given options
  def createRelation(
    sqlContext: SQLContext,
    parameters: Map[String, String]): BaseRelation
}

Table 2. RelationProvider Contract
Method          Description
createRelation  Creates a BaseRelation. Accepts optional parameters (from SQL’s OPTIONS clause).
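
The contract can be illustrated with a small custom provider. What follows is a sketch under stated assumptions, not an implementation taken from Spark: NumbersProvider, NumbersRelation, and the numRows option are hypothetical names, and the relation uses TableScan as the simplest way to produce rows. The schema is fixed by the relation itself, so the user never has to supply one.

```scala
import org.apache.spark.rdd.RDD
import org.apache.spark.sql.{Row, SQLContext}
import org.apache.spark.sql.sources.{BaseRelation, RelationProvider, TableScan}
import org.apache.spark.sql.types.{LongType, StructField, StructType}

// Hypothetical provider: exposes the numbers 0 until numRows as a relation.
class NumbersProvider extends RelationProvider {
  override def createRelation(
      sqlContext: SQLContext,
      parameters: Map[String, String]): BaseRelation = {
    // "numRows" is an assumed option name; it defaults to 3 when absent
    val numRows = parameters.getOrElse("numRows", "3").toLong
    new NumbersRelation(sqlContext, numRows)
  }
}

// Hypothetical relation: the schema is known up front (a single column "n"),
// which is what lets the provider work without a user-defined schema.
class NumbersRelation(val sqlContext: SQLContext, numRows: Long)
    extends BaseRelation with TableScan {

  override def schema: StructType =
    StructType(Seq(StructField("n", LongType, nullable = false)))

  // Build the rows on the cluster from a simple range of longs
  override def buildScan(): RDD[Row] =
    sqlContext.sparkContext.range(0, numRows).map(n => Row(n))
}
```

Assuming the classes are on the classpath, the provider could be used with spark.read.format(classOf[NumbersProvider].getName).option("numRows", "5").load(). If it lived in a (hypothetical) package com.example, the SQL equivalent would be CREATE TEMPORARY VIEW numbers USING com.example.NumbersProvider OPTIONS (numRows '5'), where the OPTIONS entries arrive in createRelation as the parameters map.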
