JdbcUtils Helper Object

JdbcUtils is a Scala object with methods to support JDBCRDD, JDBCRelation and JdbcRelationProvider.

Table 1. JdbcUtils API
Name Description

createConnectionFactory

Used when:

createTable

dropTable

getCommonJDBCType

getCustomSchema

Replaces data types in a table schema

Used exclusively when JDBCRelation is created (and customSchema JDBC option was defined)

getInsertStatement

getSchema

Used when JDBCRDD is requested to resolveTable

getSchemaOption

Used when JdbcRelationProvider is requested to write the rows of a structured query (a DataFrame) to a table

resultSetToRows

Used when…​FIXME

resultSetToSparkInternalRows

Used when JDBCRDD is requested to compute a partition

schemaString

saveTable

tableExists

Used when JdbcRelationProvider is requested to write the rows of a structured query (a DataFrame) to a table

truncateTable

Used when…​FIXME

createConnectionFactory Method

createConnectionFactory(options: JDBCOptions): () => Connection

createConnectionFactory…​FIXME

Note

createConnectionFactory is used when:

getCommonJDBCType Method

getCommonJDBCType(dt: DataType): Option[JdbcType]

getCommonJDBCType…​FIXME

Note
getCommonJDBCType is used when…​FIXME

getCatalystType Internal Method

getCatalystType(
  sqlType: Int,
  precision: Int,
  scale: Int,
  signed: Boolean): DataType

getCatalystType…​FIXME

Note
getCatalystType is used when…​FIXME

getSchemaOption Method

getSchemaOption(conn: Connection, options: JDBCOptions): Option[StructType]

getSchemaOption…​FIXME

Note
getSchemaOption is used when…​FIXME

getSchema Method

getSchema(
  resultSet: ResultSet,
  dialect: JdbcDialect,
  alwaysNullable: Boolean = false): StructType

getSchema…​FIXME

Note
getSchema is used when…​FIXME

resultSetToRows Method

resultSetToRows(resultSet: ResultSet, schema: StructType): Iterator[Row]

resultSetToRows…​FIXME

Note
resultSetToRows is used when…​FIXME

resultSetToSparkInternalRows Method

resultSetToSparkInternalRows(
  resultSet: ResultSet,
  schema: StructType,
  inputMetrics: InputMetrics): Iterator[InternalRow]

resultSetToSparkInternalRows…​FIXME

Note
resultSetToSparkInternalRows is used when…​FIXME

schemaString Method

schemaString(
  df: DataFrame,
  url: String,
  createTableColumnTypes: Option[String] = None): String

schemaString…​FIXME

Note
schemaString is used exclusively when JdbcUtils is requested to create a table.

parseUserSpecifiedCreateTableColumnTypes Internal Method

parseUserSpecifiedCreateTableColumnTypes(
  df: DataFrame,
  createTableColumnTypes: String): Map[String, String]

parseUserSpecifiedCreateTableColumnTypes…​FIXME

Note
parseUserSpecifiedCreateTableColumnTypes is used exclusively when JdbcUtils is requested to schemaString.

saveTable Method

saveTable(
  df: DataFrame,
  tableSchema: Option[StructType],
  isCaseSensitive: Boolean,
  options: JDBCOptions): Unit

saveTable takes the url, table, batchSize, isolationLevel options and createConnectionFactory.

saveTable getInsertStatement.

saveTable takes the numPartitions option and applies coalesce operator to the input DataFrame if the number of partitions of its RDD is less than the numPartitions option.

In the end, saveTable requests the possibly-repartitioned DataFrame for its RDD (it may have changed after the coalesce operator) and executes savePartition for every partition (using RDD.foreachPartition).

Note
saveTable is used exclusively when JdbcRelationProvider is requested to write the rows of a structured query (a DataFrame) to a table.

Replacing Data Types In Table Schema — getCustomSchema Method

getCustomSchema(
  tableSchema: StructType,
  customSchema: String,
  nameEquality: Resolver): StructType

getCustomSchema replaces the data type of the fields in the input tableSchema schema that are included in the input customSchema (if defined).

Internally, getCustomSchema branches off per the input customSchema.

If the input customSchema is undefined or empty, getCustomSchema simply returns the input tableSchema unchanged.

Otherwise, if the input customSchema is not empty, getCustomSchema requests CatalystSqlParser to parse it (i.e. create a new StructType for the given customSchema canonical schema representation).

getCustomSchema then uses SchemaUtils to checkColumnNameDuplication (in the column names of the user-defined customSchema schema with the input nameEquality).

In the end, getCustomSchema replaces the data type of the fields in the input tableSchema that are included in the input userSchema.

Note
getCustomSchema is used exclusively when JDBCRelation is created (and customSchema JDBC option was defined).

dropTable Method

dropTable(conn: Connection, table: String): Unit

dropTable…​FIXME

Note
dropTable is used when…​FIXME

Creating Table Using JDBC — createTable Method

createTable(
  conn: Connection,
  df: DataFrame,
  options: JDBCOptions): Unit

createTable builds the table schema (given the input DataFrame with the url and createTableColumnTypes options).

createTable uses the table and createTableOptions options.

In the end, createTable concatenates all the above texts into a CREATE TABLE [table] ([strSchema]) [createTableOptions] SQL DDL statement followed by executing it (using the input JDBC Connection).

Note
createTable is used exclusively when JdbcRelationProvider is requested to write the rows of a structured query (a DataFrame) to a table.

getInsertStatement Method

getInsertStatement(
  table: String,
  rddSchema: StructType,
  tableSchema: Option[StructType],
  isCaseSensitive: Boolean,
  dialect: JdbcDialect): String

getInsertStatement…​FIXME

Note
getInsertStatement is used when…​FIXME

getJdbcType Internal Method

getJdbcType(dt: DataType, dialect: JdbcDialect): JdbcType

getJdbcType…​FIXME

Note
getJdbcType is used when…​FIXME

tableExists Method

tableExists(conn: Connection, options: JDBCOptions): Boolean

tableExists…​FIXME

Note
tableExists is used exclusively when JdbcRelationProvider is requested to write the rows of a structured query (a DataFrame) to a table.

truncateTable Method

truncateTable(conn: Connection, options: JDBCOptions): Unit

truncateTable…​FIXME

Note
truncateTable is used exclusively when JdbcRelationProvider is requested to write the rows of a structured query (a DataFrame) to a table.

Saving Rows (Per Partition) to Table — savePartition Method

savePartition(
  getConnection: () => Connection,
  table: String,
  iterator: Iterator[Row],
  rddSchema: StructType,
  insertStmt: String,
  batchSize: Int,
  dialect: JdbcDialect,
  isolationLevel: Int): Iterator[Byte]

savePartition creates a JDBC Connection using the input getConnection function.

savePartition tries to set the input isolationLevel if it is different than TRANSACTION_NONE and the database supports transactions.

savePartition then writes rows (in the input Iterator[Row]) using batches that are submitted after batchSize rows where added.

Note
savePartition is used exclusively when JdbcUtils is requested to saveTable.

results matching ""

    No results matching ""