val dataset: Dataset[Token] = ...
scala> val tokensByName = dataset.groupByKey(_.name)
tokensByName: org.apache.spark.sql.KeyValueGroupedDataset[String,Token] = org.apache.spark.sql.KeyValueGroupedDataset@1e3aad46
KeyValueGroupedDataset — Typed Grouping
KeyValueGroupedDataset is an experimental interface to calculate aggregates over groups of objects in a typed Dataset.
Note: RelationalGroupedDataset is used for untyped Row-based aggregates.
KeyValueGroupedDataset is created using the Dataset.groupByKey operator.
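As a minimal sketch of typed grouping, the following script builds a small Dataset, groups it with groupByKey, and computes two per-key results. The fields of Token other than name, the sample rows, and the local SparkSession settings are assumptions made for illustration; the source only shows that Token has a name field. The code is written in spark-shell style (top-level definitions), so it can be pasted with :paste or run as a Scala script with Spark on the classpath.

```scala
import org.apache.spark.sql.{Dataset, KeyValueGroupedDataset, SparkSession}

// Hypothetical record type: only the `name` field appears in the source.
case class Token(name: String, productId: Int, score: Double)

// In spark-shell, getOrCreate() returns the session that already exists.
val spark = SparkSession.builder()
  .master("local[*]")
  .appName("typed-grouping-sketch")
  .getOrCreate()
import spark.implicits._

// Made-up sample data with two distinct keys.
val dataset: Dataset[Token] = Seq(
  Token("aaa", 100, 0.12),
  Token("aaa", 200, 0.29),
  Token("bbb", 200, 0.53),
  Token("bbb", 300, 0.42)).toDS()

// Typed grouping: the key type (String) and value type (Token) are both preserved.
val tokensByName: KeyValueGroupedDataset[String, Token] = dataset.groupByKey(_.name)

// count() gives the number of objects per key as Dataset[(String, Long)].
val counts: Map[String, Long] = tokensByName.count().collect().toMap

// mapGroups receives each key together with an iterator over its objects.
val maxScores: Map[String, Double] = tokensByName
  .mapGroups { (name, tokens) => (name, tokens.map(_.score).max) }
  .collect()
  .toMap
```

For plain aggregations, reduceGroups or agg is usually the better choice, since mapGroups does not support partial aggregation and shuffles all the data of every group.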
KeyValueGroupedDataset holds the keys that were used to group the objects. The keys operator returns them as a Dataset.
scala> tokensByName.keys.show
+-----+
|value|
+-----+
|  aaa|
|  bbb|
+-----+
aggUntyped Internal Method
aggUntyped(columns: TypedColumn[_, _]*): Dataset[_]
aggUntyped …FIXME
Note: aggUntyped is used exclusively when the KeyValueGroupedDataset.agg typed operator is used.
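Since aggUntyped itself is internal, the sketch below only shows the public agg typed operator that the note refers to, called with two TypedColumn arguments. The Token fields other than name, the sample rows, and the SumProductId aggregator are hypothetical and exist only for illustration. The code is in spark-shell style (top-level definitions).

```scala
import org.apache.spark.sql.{Dataset, Encoder, Encoders, SparkSession}
import org.apache.spark.sql.expressions.Aggregator
import org.apache.spark.sql.functions.count

// Hypothetical record type: only the `name` field appears in the source.
case class Token(name: String, productId: Int, score: Double)

// Hypothetical typed aggregator that sums productId per group.
object SumProductId extends Aggregator[Token, Long, Long] {
  def zero: Long = 0L
  def reduce(acc: Long, token: Token): Long = acc + token.productId
  def merge(left: Long, right: Long): Long = left + right
  def finish(acc: Long): Long = acc
  def bufferEncoder: Encoder[Long] = Encoders.scalaLong
  def outputEncoder: Encoder[Long] = Encoders.scalaLong
}

// In spark-shell, getOrCreate() returns the session that already exists.
val spark = SparkSession.builder()
  .master("local[*]")
  .appName("typed-agg-sketch")
  .getOrCreate()
import spark.implicits._

// Made-up sample data with two distinct keys.
val dataset: Dataset[Token] = Seq(
  Token("aaa", 100, 0.12),
  Token("aaa", 200, 0.29),
  Token("bbb", 200, 0.53),
  Token("bbb", 300, 0.42)).toDS()

// count("*") is a TypedColumn[Any, Long]; toColumn turns the Aggregator
// into a TypedColumn[Token, Long]. Both are accepted by the typed agg operator.
val aggregated: Dataset[(String, Long, Long)] =
  dataset.groupByKey(_.name).agg(count("*"), SumProductId.toColumn)

// Collect and sort by key so the result order is deterministic.
val result: Seq[(String, Long, Long)] = aggregated.collect().sortBy(_._1).toSeq
```

Each row of the result carries the grouping key followed by one value per TypedColumn, so the result type is known at compile time, unlike Row-based aggregates.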