val dataset: Dataset[Token] = ...
scala> val tokensByName = dataset.groupByKey(_.name)
tokensByName: org.apache.spark.sql.KeyValueGroupedDataset[String,Token] = org.apache.spark.sql.KeyValueGroupedDataset@1e3aad46
KeyValueGroupedDataset — Typed Grouping
KeyValueGroupedDataset is an experimental interface to calculate aggregates over groups of objects in a typed Dataset.
Note: RelationalGroupedDataset is used for untyped Row-based aggregates.
KeyValueGroupedDataset is created using the Dataset.groupByKey operator, which takes a function that computes the grouping key for every object in the Dataset.
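A minimal sketch of creating a KeyValueGroupedDataset and running typed aggregates over it. The shape of `Token` (beyond its `name` field), the sample rows, and the local `SparkSession` settings are assumptions made for illustration only; in spark-shell the session and its implicits already exist.

```scala
import org.apache.spark.sql.{Dataset, KeyValueGroupedDataset, SparkSession}

// Hypothetical record type: only the `name` field is implied by the text above.
case class Token(name: String, score: Double)

// Assumed local session for a standalone run
val spark = SparkSession.builder()
  .master("local[*]")
  .appName("KeyValueGroupedDataset sketch")
  .getOrCreate()
import spark.implicits._

// Made-up sample data
val dataset: Dataset[Token] = Seq(
  Token("aaa", 0.5),
  Token("aaa", 1.5),
  Token("bbb", 2.0)
).toDS()

// groupByKey maps every object to its grouping key
val tokensByName: KeyValueGroupedDataset[String, Token] =
  dataset.groupByKey(_.name)

// count is a typed aggregate: it yields a Dataset[(String, Long)], not Rows
val counts: Map[String, Long] = tokensByName.count().collect().toMap

// mapGroups receives each key together with an iterator over its objects
val totals: Map[String, Double] = tokensByName
  .mapGroups((name: String, tokens: Iterator[Token]) => (name, tokens.map(_.score).sum))
  .collect()
  .toMap
```

Both results keep their static types (`String` keys, `Long` and `Double` values), which is the practical difference from the Row-based results of `Dataset.groupBy`.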
| Operator | Description |
|---|---|
KeyValueGroupedDataset gives access to the unique keys the objects were grouped by, as a Dataset, through the keys operator.
scala> tokensByName.keys.show
+-----+
|value|
+-----+
|  aaa|
|  bbb|
+-----+
aggUntyped Internal Method
aggUntyped(columns: TypedColumn[_, _]*): Dataset[_]
aggUntyped…FIXME
Note: aggUntyped is used exclusively when the KeyValueGroupedDataset.agg typed operator is used.
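Since aggUntyped is only reached through the public agg typed operator, a usage sketch of agg may help. The example below builds a TypedColumn from a custom Aggregator and passes it to agg. `Token`, its `score` field, the `SumScore` aggregator, the column alias, and the sample data are all hypothetical names introduced for illustration; the sketch does not show what aggUntyped does internally.

```scala
import org.apache.spark.sql.{Dataset, Encoder, Encoders, SparkSession, TypedColumn}
import org.apache.spark.sql.expressions.Aggregator

// Hypothetical record type: only the `name` field is implied by the text above.
case class Token(name: String, score: Double)

// Hypothetical typed aggregator that sums the score of every Token in a group
object SumScore extends Aggregator[Token, Double, Double] {
  def zero: Double = 0.0
  def reduce(acc: Double, token: Token): Double = acc + token.score
  def merge(left: Double, right: Double): Double = left + right
  def finish(acc: Double): Double = acc
  def bufferEncoder: Encoder[Double] = Encoders.scalaDouble
  def outputEncoder: Encoder[Double] = Encoders.scalaDouble
}

// Assumed local session for a standalone run
val spark = SparkSession.builder()
  .master("local[*]")
  .appName("typed agg sketch")
  .getOrCreate()
import spark.implicits._

// Made-up sample data
val dataset: Dataset[Token] = Seq(
  Token("aaa", 0.5),
  Token("aaa", 1.5),
  Token("bbb", 2.0)
).toDS()

// toColumn turns the Aggregator into a TypedColumn that agg accepts
val sumScore: TypedColumn[Token, Double] = SumScore.toColumn.name("total")

// agg with one TypedColumn returns a Dataset of (key, aggregate) pairs
val totalsByName: Dataset[(String, Double)] =
  dataset.groupByKey(_.name).agg(sumScore)

val result: Map[String, Double] = totalsByName.collect().toMap
```

An Aggregator is used here instead of the helpers in `org.apache.spark.sql.expressions.scalalang.typed` because those helpers are deprecated in recent Spark releases.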