Why scikit-learn's fit transform is probably not for you

Stéphan Tulkens · May 22, 2026 · 4:31 AM UTC · 4 min read

⚡ TL;DR · AI summary

The article discusses the fit transform paradigm in scikit-learn and its potential drawbacks. It argues that this approach conflates the roles of object creation and usage, which may not suit all codebases. The author suggests separating the factory and object into distinct classes for cleaner code.

Key facts

▪ Scikit-learn's fit transform paradigm allows for zero-cost chaining of transformations.
▪ The article critiques how this paradigm mixes object instantiation with usage, potentially complicating code.
▪ The author proposes a design change by separating the factory and the object into two distinct classes.
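The last point can be sketched in a few lines. The full article's own design is not visible in the excerpt, so this is only one possible reading: the class names ScalerFactory and FittedScaler are hypothetical, and the scaler works on plain lists of floats to stay dependency-free.

```python
from dataclasses import dataclass
from typing import List, Sequence


@dataclass(frozen=True)
class FittedScaler:
    """Hypothetical fitted object: it can only transform, never be refit."""

    mean: float
    std: float

    def transform(self, X: Sequence[float]) -> List[float]:
        return [(x - self.mean) / self.std for x in X]


class ScalerFactory:
    """Hypothetical factory: fit returns a new object instead of mutating self."""

    def fit(self, X: Sequence[float]) -> FittedScaler:
        mean = sum(X) / len(X)
        variance = sum((x - mean) ** 2 for x in X) / len(X)
        # Fall back to 1.0 so constant input does not cause division by zero.
        std = variance ** 0.5 or 1.0
        return FittedScaler(mean=mean, std=std)


scaler = ScalerFactory().fit([1.0, 2.0, 3.0])
scaled = scaler.transform([2.0, 4.0])
```

With this split, an unfitted object never exposes a transform method, so calling transform before fit is not something the code can express.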

Original article

Opening excerpt (first ~120 words)

Scikit-learn's fit transform paradigm is probably not for you

python | May 17, 2026

If you've ever used code from scikit-learn, you will have seen the following pattern:

    import numpy as np
    from sklearn.preprocessing import StandardScaler

    # 100 samples with 32 features each; randn takes the dimensions
    # as separate arguments, not as a tuple.
    X = np.random.randn(100, 32)

    scaler = StandardScaler()
    scaler.fit(X)
    X_transformed = scaler.transform(X)
    # Or equivalently
    X_transformed = scaler.fit_transform(X)

For all scikit-learn transformers (1), the fit call sets the internal state of the object, while the transform call uses the set internal state to transform some data into something else. (2)

This paradigm is really useful because it allows for zero-cost chaining: any sequence of transformations can be fit_transformed by simply calling fit_transform on all transformations in sequence.
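The chaining described above can be illustrated with a minimal, dependency-free sketch. The two toy transformers and the helper functions below are hypothetical stand-ins that only mimic the fit, transform and fit_transform interface; they are not scikit-learn classes.

```python
from typing import List, Sequence


class ToyMeanCenterer:
    """Toy transformer: fit learns the mean, transform subtracts it."""

    def fit(self, X: Sequence[float]) -> "ToyMeanCenterer":
        self.mean_ = sum(X) / len(X)
        return self

    def transform(self, X: Sequence[float]) -> List[float]:
        return [x - self.mean_ for x in X]

    def fit_transform(self, X: Sequence[float]) -> List[float]:
        return self.fit(X).transform(X)


class ToyMaxAbsScaler:
    """Toy transformer: fit learns the largest absolute value, transform divides by it."""

    def fit(self, X: Sequence[float]) -> "ToyMaxAbsScaler":
        # Fall back to 1.0 so all-zero input does not cause division by zero.
        self.scale_ = max(abs(x) for x in X) or 1.0
        return self

    def transform(self, X: Sequence[float]) -> List[float]:
        return [x / self.scale_ for x in X]

    def fit_transform(self, X: Sequence[float]) -> List[float]:
        return self.fit(X).transform(X)


def fit_transform_chain(steps, X):
    """Fit each step on the output of the previous one, in order."""
    for step in steps:
        X = step.fit_transform(X)
    return X


def transform_chain(steps, X):
    """Apply already-fitted steps to new data, in the same order."""
    for step in steps:
        X = step.transform(X)
    return X


steps = [ToyMeanCenterer(), ToyMaxAbsScaler()]
train_out = fit_transform_chain(steps, [1.0, 2.0, 3.0, 6.0])
new_out = transform_chain(steps, [3.0])
```

Because every step exposes the same three methods, the chain needs no knowledge of what each step does. In scikit-learn itself, sklearn.pipeline.Pipeline plays the role of these two helper functions.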

…

Excerpt limited to ~120 words for fair-use compliance. The full article is at Stéphan Tulkens.
