
Why scikit-learn's fit transform is probably not for you

Stéphan Tulkens · 4 min read
#programming #python #data science
⚡ TL;DR · AI summary

The article discusses the fit transform paradigm in scikit-learn and its potential drawbacks. It argues that this approach conflates the roles of object creation and usage, which may not suit all codebases. The author suggests separating the factory and object into distinct classes for cleaner code.
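To make the suggested separation concrete, here is a minimal sketch of what splitting the factory from the fitted object could look like. The names ScalerFactory and FittedScaler, and the eps parameter, are hypothetical and chosen only for illustration; they are not taken from the article or from scikit-learn.

```python
from dataclasses import dataclass

import numpy as np


@dataclass(frozen=True, eq=False)
class FittedScaler:
    """Hypothetical fitted object: holds learned state and can only transform."""

    mean: np.ndarray
    scale: np.ndarray

    def transform(self, X: np.ndarray) -> np.ndarray:
        # Apply the statistics learned at fit time.
        return (X - self.mean) / self.scale


class ScalerFactory:
    """Hypothetical factory: holds configuration only and never stores fitted state."""

    def __init__(self, eps: float = 1e-12) -> None:
        self.eps = eps

    def fit(self, X: np.ndarray) -> FittedScaler:
        mean = X.mean(axis=0)
        std = X.std(axis=0)
        # Avoid dividing by zero for (near-)constant columns.
        scale = np.where(std < self.eps, 1.0, std)
        return FittedScaler(mean=mean, scale=scale)


rng = np.random.default_rng(0)
X = rng.standard_normal((100, 32))

scaler = ScalerFactory().fit(X)      # creation happens here
X_transformed = scaler.transform(X)  # usage happens here
```

In this sketch an unfitted object has no transform method and a fitted object has no fit method, so calling transform before fit is not expressible. Whether that trade-off suits a given codebase is a design judgment.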

Original article by Stéphan Tulkens: opening excerpt

Scikit-learn's fit transform paradigm is probably not for you

python | May 17, 2026

If you've ever used code from scikit-learn, you will have seen the following pattern:

    import numpy as np
    from sklearn.preprocessing import StandardScaler

    # randn takes the dimensions as separate arguments, not as a tuple
    X = np.random.randn(100, 32)
    scaler = StandardScaler()
    scaler.fit(X)
    X_transformed = scaler.transform(X)
    # Or equivalently
    X_transformed = scaler.fit_transform(X)

For all scikit-learn transformers (1), the fit call sets the internal state of the object, while the transform call uses that state to transform some data into something else. (2) This paradigm is really useful because it allows for zero-cost chaining: any sequence of transformations can be fit and applied by calling fit_transform on each transformation in turn, passing each output to the next.

Excerpt limited to ~120 words for fair-use compliance. The full article is at Stéphan Tulkens.
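The chaining described in the excerpt can be sketched as a plain loop. The choice of StandardScaler followed by MinMaxScaler is an assumption made only for illustration; any sequence of scikit-learn transformers should work the same way.

```python
import numpy as np
from sklearn.preprocessing import MinMaxScaler, StandardScaler

rng = np.random.default_rng(0)
X = rng.standard_normal((100, 32))

# Illustrative choice of steps, not taken from the article.
steps = [StandardScaler(), MinMaxScaler()]

X_out = X
for step in steps:
    # Each step learns its state from the previous step's output, then applies it.
    X_out = step.fit_transform(X_out)

# Once fitted, the same steps can transform unseen data without refitting.
X_new = rng.standard_normal((5, 32))
for step in steps:
    X_new = step.transform(X_new)
```

scikit-learn's Pipeline class packages the same idea behind a single fit_transform call.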
