Family 1 of 4

Classical ML.

If your business has a table, this is your AI. The largest and quietest family in production.

Thesis

Classical machine learning is the workhorse of production AI in 2026. Linear and logistic regression, decision trees, random forests, gradient boosting. On tabular data, it typically beats deep learning and large language models on the dimensions that matter in production: accuracy, latency, cost, interpretability, auditability.

Myth: Classical ML is old technology that LLMs replaced.
Reality: Classical ML is the right tool for tabular data, and most production AI runs on tabular data.

How it works, in one paragraph

You have 100,000 historical examples. Each row is a customer, a transaction, a sensor reading. One column is the label: whether the customer churned, whether the transaction was fraud. The model learns the pattern that connects the input columns to the label column. At prediction time, you hand it a new row and it returns a number or a class. That is the entire mechanism.
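
To make the mechanism concrete, here is a minimal sketch in plain Python. It assumes a toy eight-row table with two made-up columns (months_inactive, support_tickets) and a churn label, fits a logistic regression by gradient descent, and scores one new row. The column names, data, and hyperparameters are illustrative assumptions; in practice a library would do the fitting.

```python
import math

def sigmoid(z):
    # Numerically stable logistic function.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)

def fit_logistic(rows, labels, lr=0.1, epochs=2000):
    """Fit weights and a bias by batch gradient descent on log loss."""
    n = len(rows)
    n_features = len(rows[0])
    weights = [0.0] * n_features
    bias = 0.0
    for _ in range(epochs):
        grad_w = [0.0] * n_features
        grad_b = 0.0
        for x, y in zip(rows, labels):
            p = sigmoid(sum(w * xi for w, xi in zip(weights, x)) + bias)
            err = p - y  # gradient of log loss with respect to the score
            for j, xi in enumerate(x):
                grad_w[j] += err * xi
            grad_b += err
        weights = [w - lr * g / n for w, g in zip(weights, grad_w)]
        bias -= lr * grad_b / n
    return weights, bias

def predict_proba(model, x):
    """Return the predicted probability of the positive class for one row."""
    weights, bias = model
    return sigmoid(sum(w * xi for w, xi in zip(weights, x)) + bias)

# Hypothetical toy table: [months_inactive, support_tickets]; label 1 = churned.
ROWS = [[0, 0], [1, 0], [0, 1], [1, 1], [5, 3], [6, 2], [4, 4], [7, 5]]
LABELS = [0, 0, 0, 0, 1, 1, 1, 1]

model = fit_logistic(ROWS, LABELS)       # learn the pattern from history
churn_probability = predict_proba(model, [6, 4])  # hand it a new row
```

The returned value is the "number" from the paragraph above; thresholding it (for example at 0.5) gives the "class".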

The three methods that cover 95%

Linear and logistic regression. Fast, interpretable, the right baseline. Always start here.
Decision trees and random forests. The model asks yes/no questions in sequence. A “forest” votes across hundreds of trees. Robust and explainable.
Gradient boosting (XGBoost, LightGBM, CatBoost). Trees built one at a time, each fixing the last one’s mistakes. The default winning method on tabular data since ~2014. This is what your data scientist actually uses.
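
The "each fixing the last one's mistakes" idea can be sketched in a few lines of plain Python. This is a simplified illustration, assuming squared-error loss, one input column, and one-split trees (stumps); it is not how XGBoost, LightGBM, or CatBoost are implemented, and the data and hyperparameters are made up.

```python
def fit_stump(xs, residuals):
    """Pick the single threshold on x that minimises squared error.

    Assumes xs contains at least two distinct values.
    """
    best = None
    for threshold in sorted(set(xs))[:-1]:
        left = [r for x, r in zip(xs, residuals) if x <= threshold]
        right = [r for x, r in zip(xs, residuals) if x > threshold]
        left_mean = sum(left) / len(left)
        right_mean = sum(right) / len(right)
        sse = (sum((r - left_mean) ** 2 for r in left)
               + sum((r - right_mean) ** 2 for r in right))
        if best is None or sse < best[0]:
            best = (sse, threshold, left_mean, right_mean)
    return best[1:]  # (threshold, left_mean, right_mean)

def stump_predict(stump, x):
    threshold, left_mean, right_mean = stump
    return left_mean if x <= threshold else right_mean

def fit_boosted(xs, ys, n_trees=50, learning_rate=0.3):
    """Start from the mean, then repeatedly fit a stump to the current errors."""
    base = sum(ys) / len(ys)
    preds = [base] * len(xs)
    stumps = []
    for _ in range(n_trees):
        residuals = [y - p for y, p in zip(ys, preds)]  # the last model's mistakes
        stump = fit_stump(xs, residuals)
        stumps.append(stump)
        preds = [p + learning_rate * stump_predict(stump, x)
                 for p, x in zip(preds, xs)]
    return base, learning_rate, stumps

def boosted_predict(model, x):
    base, learning_rate, stumps = model
    return base + learning_rate * sum(stump_predict(s, x) for s in stumps)

def mse(model, xs, ys):
    return sum((boosted_predict(model, x) - y) ** 2 for x, y in zip(xs, ys)) / len(xs)

# Hypothetical toy data: a noisy three-level step function.
XS = [1, 2, 3, 4, 5, 6, 7, 8]
YS = [1.0, 1.2, 0.9, 3.1, 3.0, 5.2, 5.0, 5.1]

mean_only = fit_boosted(XS, YS, n_trees=0)
one_tree = fit_boosted(XS, YS, n_trees=1)
many_trees = fit_boosted(XS, YS, n_trees=50)
```

Comparing `mse(mean_only, XS, YS)`, `mse(one_tree, XS, YS)`, and `mse(many_trees, XS, YS)` shows the training error shrinking as trees are added. Training error alone says nothing about overfitting, which is why real use relies on held-out data.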

The decision rule

If your problem has...	Family 1?
Tabular data (rows + columns)	Yes
Sub-100 ms latency budget	Yes
Regulatory or auditing requirement	Yes
Output is a number or a class	Yes
Fewer than ~1M training examples	Yes
Output is free-form text	No (Family 3)
Input is messy unstructured text or images	No (Family 2 or 3)
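
For teams that want the table as a checklist in code, one possible encoding is sketched below. The `Problem` fields and the `suggest_family` function are hypothetical names, and letting the "No" rows take priority over the "Yes" rows is an assumption about how to read the table, not something it states.

```python
from dataclasses import dataclass

@dataclass
class Problem:
    # Hypothetical fields mirroring rows of the table above.
    tabular_input: bool = False
    output_is_number_or_class: bool = False
    output_is_free_text: bool = False
    input_is_unstructured: bool = False

def suggest_family(problem: Problem) -> str:
    # Assumption: the "No" rows rule Family 1 out before the "Yes" rows are checked.
    if problem.output_is_free_text:
        return "Family 3"
    if problem.input_is_unstructured:
        return "Family 2 or 3"
    if problem.tabular_input and problem.output_is_number_or_class:
        return "Family 1"
    return "No clear match"

churn = Problem(tabular_input=True, output_is_number_or_class=True)
chatbot = Problem(output_is_free_text=True)
```

Here `suggest_family(churn)` returns "Family 1" and `suggest_family(chatbot)` returns "Family 3".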

When NOT to use it

Family 1 cannot generate text. It cannot caption an image. It cannot answer a question it has not been specifically trained for. If your problem is communication-shaped (a chatbot, a summary, a report), you are looking at Family 3.

Family 1 also cannot extrapolate beyond its training distribution. If you train on 2020-2024 data and the world changes, the model degrades silently. Detect drift via monitoring; retrain quarterly.
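
One common way to monitor for drift is the Population Stability Index (PSI), which compares how a feature is distributed in training data versus recent data. The sketch below is a minimal plain-Python version, assuming equal-width bins taken from the training sample and a small floor to avoid dividing by zero. The bin count, the floor, and the toy data are assumptions, and alert thresholds (0.25 is a frequently quoted rule of thumb) should be tuned per feature.

```python
import math

def psi(expected, actual, n_bins=10, eps=1e-6):
    """Population Stability Index between a training sample and a recent sample."""
    lo, hi = min(expected), max(expected)
    width = (hi - lo) / n_bins or 1.0  # fall back to 1.0 if all training values are equal

    def proportions(values):
        counts = [0] * n_bins
        for v in values:
            idx = int((v - lo) / width)
            idx = min(max(idx, 0), n_bins - 1)  # clamp out-of-range values into the edge bins
            counts[idx] += 1
        # Floor each share at eps so the log term below stays defined.
        return [max(c / len(values), eps) for c in counts]

    e = proportions(expected)
    a = proportions(actual)
    return sum((ai - ei) * math.log(ai / ei) for ai, ei in zip(a, e))

# Hypothetical feature values: training data, an unchanged sample, a shifted sample.
train = [i % 100 for i in range(1000)]
unchanged = list(train)
shifted = [v + 40 for v in train]

psi_unchanged = psi(train, unchanged)
psi_shifted = psi(train, shifted)
```

A PSI near zero means the recent data looks like the training data; a large value is a signal to investigate and possibly retrain ahead of schedule.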

The hidden superpower: feature engineering

In Family 1, the data scientist’s real job is not picking the model. It is engineering the right input features. Days since last purchase, average order value over 30 days, ratio of returns to orders. These derived features are where the accuracy comes from. Juniors who can run XGBoost are plentiful. A senior who knows which features to build is rare and worth five times the rate.
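
The three derived features named above can be computed from raw order rows roughly as follows. This is a sketch under assumptions: the order schema (`date`, `value`, `returned`), the function name, and the sample data are all hypothetical.

```python
from datetime import date, timedelta

def customer_features(orders, as_of):
    """Build features for one customer from raw orders.

    orders: list of dicts with 'date' (datetime.date), 'value' (float),
            and 'returned' (bool). This schema is hypothetical.
    as_of:  the date the prediction is made.
    """
    # Use only orders known at prediction time, so no future data leaks in.
    past = [o for o in orders if o["date"] <= as_of]
    if not past:
        return {"days_since_last_purchase": None,
                "avg_order_value_30d": 0.0,
                "return_ratio": 0.0}

    last_purchase = max(o["date"] for o in past)
    window_start = as_of - timedelta(days=30)
    recent = [o for o in past if o["date"] > window_start]

    return {
        "days_since_last_purchase": (as_of - last_purchase).days,
        "avg_order_value_30d": (sum(o["value"] for o in recent) / len(recent)
                                if recent else 0.0),
        "return_ratio": sum(1 for o in past if o["returned"]) / len(past),
    }

# Hypothetical order history for one customer.
ORDERS = [
    {"date": date(2024, 3, 1), "value": 80.0, "returned": True},
    {"date": date(2024, 6, 10), "value": 100.0, "returned": False},
    {"date": date(2024, 6, 20), "value": 50.0, "returned": False},
    {"date": date(2024, 7, 5), "value": 999.0, "returned": False},  # after as_of, ignored
]

features = customer_features(ORDERS, as_of=date(2024, 6, 30))
```

The `as_of` cutoff is the design choice that matters most: features computed with data from after the prediction date make a model look better in testing than it will be in production.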

Named exemplars

Bank fraud detection. Gradient boosting on engineered features, sub-50 ms decisions, GDPR-explainable via SHAP.
B2B churn prediction. Logistic regression baseline plus XGBoost lift. See the four-families-of-ai notebook for a live run-through.
Insurance pricing. Generalized linear models, regulator-mandated interpretability.
Manufacturing defect detection. Gradient boosting on sensor telemetry.

For your team

Three questions to ask before they reach for an LLM: Is the input a table? Is the output a number or class? Do we need to explain the prediction? Yes to all three means Family 1.