implicits Object: Implicit Conversions
implicits object provides implicit conversions for converting Scala objects (incl. RDDs) into a Dataset, DataFrame, or Columns, or for supporting such conversions (through Encoders).
Name | Description |
---|---|
localSeqToDatasetHolder | Creates a DatasetHolder with the input Seq[T]. Declared as implicit def localSeqToDatasetHolder[T : Encoder](s: Seq[T]): DatasetHolder[T] |
Encoders | Encoders for primitive and object types in Scala and Java (aka boxed types) |
 | Converts |
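The Encoder context bound on localSeqToDatasetHolder means that toDS only compiles when an implicit Encoder[T] is in scope. The following is a minimal sketch of that requirement, assuming a local-mode session; the Person case class, the application name, and the EncodersDemo object are hypothetical names chosen for illustration.

```scala
import org.apache.spark.sql.{Dataset, SparkSession}

// Hypothetical record type. It is declared at the top level because case classes
// declared inside a method often fail encoder derivation.
final case class Person(name: String, age: Int)

object EncodersDemo {
  def main(args: Array[String]): Unit = {
    // Assumption: local mode and this application name are illustrative choices.
    val spark: SparkSession = SparkSession.builder()
      .master("local[*]")
      .appName("encoders-demo")
      .getOrCreate()

    // Brings the implicit conversions and the implicit Encoders into scope.
    import spark.implicits._

    // Compiles because an implicit Encoder[Person] is available for the case class.
    val people: Dataset[Person] = Seq(Person("Ann", 31), Person("Bob", 25)).toDS()

    // map needs an Encoder[Int]; the same import supplies encoders for primitive types.
    val ages: Dataset[Int] = people.map(_.age)

    ages.show()
    spark.stop()
  }
}
```

Without the import, the call to toDS would not compile, because Seq has no such method and no conversion to DatasetHolder would be in scope.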
implicits object is defined inside SparkSession, so you must create a SparkSession instance before importing the implicit conversions.
import org.apache.spark.sql.SparkSession
val spark: SparkSession = ...
import spark.implicits._
scala> val ds = Seq("I am a shiny Dataset!").toDS
ds: org.apache.spark.sql.Dataset[String] = [value: string]
scala> val df = Seq("I am an old grumpy DataFrame!").toDF
df: org.apache.spark.sql.DataFrame = [value: string]
scala> val df = Seq("I am an old grumpy DataFrame with text column!").toDF("text")
df: org.apache.spark.sql.DataFrame = [text: string]
val rdd = sc.parallelize(Seq("hello, I'm a very low-level RDD"))
scala> val ds = rdd.toDS
ds: org.apache.spark.sql.Dataset[String] = [value: string]
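The REPL session above relies on the spark and sc values that the shell predefines. In a standalone application you create them yourself. The sketch below shows one way to do that, assuming a local-mode session; the ImplicitsDemo object, the application name, and the column names are hypothetical.

```scala
import org.apache.spark.sql.{DataFrame, Dataset, SparkSession}

object ImplicitsDemo {
  def main(args: Array[String]): Unit = {
    // Assumption: local mode and this application name are illustrative choices.
    // The import below needs a stable identifier, so spark is a val (a var would not compile).
    val spark: SparkSession = SparkSession.builder()
      .master("local[*]")
      .appName("implicits-demo")
      .getOrCreate()

    import spark.implicits._

    // Local Seq to Dataset (through localSeqToDatasetHolder).
    val ds: Dataset[String] = Seq("hello", "world").toDS()

    // Local Seq of tuples to DataFrame with explicit column names.
    val df: DataFrame = Seq(("a", 1), ("b", 2)).toDF("letter", "number")

    // RDD to Dataset (through rddToDatasetHolder); sc in the shell is spark.sparkContext here.
    val rdd = spark.sparkContext.parallelize(Seq("hello, I'm a very low-level RDD"))
    val fromRdd: Dataset[String] = rdd.toDS()

    ds.show()
    df.printSchema()
    fromRdd.show(truncate = false)

    spark.stop()
  }
}
```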
Tip: In Scala REPL-based environments, use :imports to show the import history and identify the sources of names.
scala> :help imports
show import history, identifying sources of names
scala> :imports
1) import org.apache.spark.SparkContext._ (69 terms, 1 are implicit)
2) import spark.implicits._ (1 types, 67 terms, 37 are implicit)
3) import spark.sql (1 terms)
4) import org.apache.spark.sql.functions._ (354 terms)
implicits object extends the SQLImplicits abstract class.
DatasetHolder Scala Case Class
DatasetHolder is a Scala case class that, when created, takes a Dataset[T].
DatasetHolder is created (implicitly) when the rddToDatasetHolder and localSeqToDatasetHolder implicit conversions are used.
DatasetHolder has toDS and toDF methods that simply return the Dataset[T] it was created with or a DataFrame (using the Dataset.toDF operator), respectively.
toDS(): Dataset[T]
toDF(): DataFrame
toDF(colNames: String*): DataFrame
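To make the holder pattern concrete, here is a toy model in plain Scala with no Spark dependency. It is not Spark's implementation: ToyDataset, ToyHolder, and localSeqToToyHolder are hypothetical stand-ins for Dataset, DatasetHolder, and localSeqToDatasetHolder, and the Encoder context bound is left out for brevity.

```scala
import scala.language.implicitConversions

object HolderSketch {
  // Hypothetical stand-in for Dataset[T]; it only wraps a local Seq.
  final case class ToyDataset[T](rows: Seq[T]) {
    // Stand-in for Dataset.toDF: pairs every row with the given column name.
    def toDF(colName: String): Seq[Map[String, T]] =
      rows.map(r => Map(colName -> r))
  }

  // Plays the role of DatasetHolder: keeps the dataset it was created with.
  final case class ToyHolder[T](ds: ToyDataset[T]) {
    def toDS(): ToyDataset[T] = ds
    def toDF(colName: String): Seq[Map[String, T]] = ds.toDF(colName)
  }

  // Plays the role of localSeqToDatasetHolder (without the Encoder context bound).
  implicit def localSeqToToyHolder[T](s: Seq[T]): ToyHolder[T] =
    ToyHolder(ToyDataset(s))

  def main(args: Array[String]): Unit = {
    // Seq has no toDS or toDF method; the compiler inserts the implicit conversion.
    val ds = Seq("a", "b").toDS()
    println(ds.rows)
    println(Seq(1, 2).toDF("n"))
  }
}
```

The holder is an intermediate type: the conversion adds only toDS and toDF to a Seq, rather than exposing every Dataset method on it.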