PipelineStage — ML Pipeline Component

The PipelineStage abstract class represents a single stage in a Pipeline.

PipelineStage has two direct implementations, Estimator and Transformer, both of which are abstract classes themselves.

Each PipelineStage transforms an input schema into an output schema using the transformSchema family of methods:

transformSchema(schema: StructType): StructType
transformSchema(schema: StructType, logging: Boolean): StructType

StructType describes the schema of a DataFrame.
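
As a minimal sketch of schema transformation, the following uses Spark's Tokenizer, one concrete PipelineStage, to derive an output schema from an input schema without touching any data. The object name TransformSchemaSketch and the column names text and words are assumptions made up for this illustration, and the code assumes spark-mllib is on the classpath.

```scala
import org.apache.spark.ml.feature.Tokenizer
import org.apache.spark.sql.types.{StringType, StructField, StructType}

// Hypothetical example object; not part of Spark.
object TransformSchemaSketch {
  // Input schema with a single string column (column name is an assumption).
  val inputSchema: StructType =
    StructType(Seq(StructField("text", StringType, nullable = true)))

  // Tokenizer is a Transformer, and therefore a PipelineStage.
  val tokenizer: Tokenizer =
    new Tokenizer().setInputCol("text").setOutputCol("words")

  // Only the schema is computed here; no DataFrame or SparkSession is needed.
  val outputSchema: StructType = tokenizer.transformSchema(inputSchema)

  def main(args: Array[String]): Unit =
    println(outputSchema.treeString)
}
```

Because transformSchema works on the schema alone, it can be used to check that the stages of a pipeline fit together before any data is processed.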

Enable the DEBUG logging level for the respective PipelineStage implementation to see what happens under the hood, for example the input schema and the expected output schema.
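
One way to try this out, sketched under assumptions, is to raise the log level programmatically with SparkContext.setLogLevel and then run a stage. Note that this call affects the whole application, so the output is noisy; restricting DEBUG to a single logger (named after the implementation class) in the logging configuration is usually preferable, but the configuration file format depends on the Spark version. The object name DebugLoggingSketch, the application name, and the sample data are made up for this illustration.

```scala
import org.apache.spark.ml.feature.Tokenizer
import org.apache.spark.sql.SparkSession

// Hypothetical example object; not part of Spark.
object DebugLoggingSketch {
  // Runs a Tokenizer with DEBUG logging on and returns the resulting column names.
  def run(): Array[String] = {
    val spark = SparkSession.builder()
      .master("local[*]")
      .appName("pipeline-stage-debug") // made-up application name
      .getOrCreate()
    import spark.implicits._

    // Raises the log level for the entire application, not just one stage.
    spark.sparkContext.setLogLevel("DEBUG")

    val df = Seq("hello pipeline stage").toDF("text")
    val tokenizer = new Tokenizer().setInputCol("text").setOutputCol("words")

    // transform validates the schema first; schema messages should appear at DEBUG.
    val columns = tokenizer.transform(df).columns

    spark.stop()
    columns
  }

  def main(args: Array[String]): Unit =
    println(run().mkString(", "))
}
```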

