Family 1 of 4

Classical ML.

If your business has a table, this is your AI. The largest and quietest family in production.

Thesis

Classical machine learning is the workhorse of production AI in 2026. Linear and logistic regression, decision trees, random forests, gradient boosting. On tabular data, it beats deep learning and beats large language models on every dimension that matters in production: accuracy, latency, cost, interpretability, auditability.

Myth: Classical ML is old technology that LLMs replaced. Reality: Classical ML is the right tool for tabular data, and most production AI runs on tabular data.

How it works, in one paragraph

You have 100,000 historical examples. Each row is a customer, a transaction, a sensor reading. One column is the label: whether the customer churned, whether the transaction was fraud. The model learns the pattern that connects the input columns to the label column. At prediction time, you hand it a new row and it returns a number or a class. That is the entire mechanism.
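The mechanism above can be sketched in a few lines. This is a minimal illustration, not a production recipe: the eight-row churn table, the column meanings (months inactive, support tickets), and the learning-rate and epoch settings are all assumptions made up for the example. It hand-rolls logistic regression with gradient descent so that nothing is hidden. A real project would usually reach for a library and far more rows.

```python
import math

# Hypothetical toy table: each row is (months_inactive, support_tickets).
# The label column says whether that customer churned (1) or stayed (0).
ROWS = [(0.0, 0.0), (1.0, 0.0), (1.0, 1.0), (2.0, 1.0),
        (4.0, 2.0), (5.0, 3.0), (6.0, 2.0), (7.0, 4.0)]
LABELS = [0, 0, 0, 0, 1, 1, 1, 1]


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def fit(rows, labels, lr=0.1, epochs=2000):
    """Learn one weight per input column plus a bias, by batch gradient descent on log loss."""
    n_features = len(rows[0])
    weights = [0.0] * n_features
    bias = 0.0
    for _ in range(epochs):
        grad_w = [0.0] * n_features
        grad_b = 0.0
        for x, y in zip(rows, labels):
            error = sigmoid(sum(w * xi for w, xi in zip(weights, x)) + bias) - y
            for j, xi in enumerate(x):
                grad_w[j] += error * xi
            grad_b += error
        weights = [w - lr * g / len(rows) for w, g in zip(weights, grad_w)]
        bias -= lr * grad_b / len(rows)
    return weights, bias


def predict_proba(model, row):
    """Hand the model a new row, get back a churn probability between 0 and 1."""
    weights, bias = model
    return sigmoid(sum(w * xi for w, xi in zip(weights, row)) + bias)


model = fit(ROWS, LABELS)
p_low = predict_proba(model, (0.5, 0.0))   # recently active, no tickets
p_high = predict_proba(model, (6.5, 3.0))  # long inactive, several tickets
```

Training is the `fit` call, prediction is the `predict_proba` call. Everything else in a Family 1 project is deciding what goes into the rows.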

The three methods that cover 95%

  1. Linear and logistic regression. Fast, interpretable, the right baseline. Always start here.
  2. Decision trees and random forests. The model asks yes/no questions in sequence. A “forest” votes across hundreds of trees. Robust and explainable.
  3. Gradient boosting (XGBoost, LightGBM, CatBoost). Trees built one at a time, each fixing the last one’s mistakes. The default winning method on tabular data since ~2014. This is what your data scientist actually uses.
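To make "each fixing the last one's mistakes" concrete, here is a rough sketch of boosting with one-split trees (stumps) on a single numeric feature and squared error. The data, the function names, and the settings (`n_rounds`, `learning_rate`) are illustrative assumptions. This is not how XGBoost, LightGBM, or CatBoost are implemented; those libraries build deeper trees over many features and add a great deal on top. It only shows the core loop: fit a small tree to the current residuals, add a damped version of it to the ensemble, repeat.

```python
def fit_stump(xs, residuals):
    """Find the one threshold split that minimizes squared error on the residuals.

    Assumes xs contains at least two distinct values.
    """
    best = None
    for threshold in sorted(set(xs))[:-1]:  # skip the max so the right side is never empty
        left = [r for x, r in zip(xs, residuals) if x <= threshold]
        right = [r for x, r in zip(xs, residuals) if x > threshold]
        left_mean = sum(left) / len(left)
        right_mean = sum(right) / len(right)
        sse = (sum((r - left_mean) ** 2 for r in left)
               + sum((r - right_mean) ** 2 for r in right))
        if best is None or sse < best[0]:
            best = (sse, threshold, left_mean, right_mean)
    _, threshold, left_mean, right_mean = best
    return threshold, left_mean, right_mean


def stump_predict(stump, x):
    threshold, left_mean, right_mean = stump
    return left_mean if x <= threshold else right_mean


def boost(xs, ys, n_rounds=50, learning_rate=0.3):
    """Start from the mean, then add stumps one at a time, each fit to what is still wrong."""
    base = sum(ys) / len(ys)
    stumps = []
    preds = [base] * len(xs)
    for _ in range(n_rounds):
        residuals = [y - p for y, p in zip(ys, preds)]
        stump = fit_stump(xs, residuals)
        stumps.append(stump)
        preds = [p + learning_rate * stump_predict(stump, x) for p, x in zip(preds, xs)]
    return base, stumps, learning_rate


def predict(model, x):
    base, stumps, learning_rate = model
    return base + learning_rate * sum(stump_predict(s, x) for s in stumps)


def mse(model, xs, ys):
    return sum((predict(model, x) - y) ** 2 for x, y in zip(xs, ys)) / len(xs)


# Hypothetical toy data with three rough plateaus.
XS = [1, 2, 3, 4, 5, 6, 7, 8]
YS = [1.0, 1.2, 0.9, 3.1, 3.0, 2.9, 6.0, 6.2]

one_round = boost(XS, YS, n_rounds=1)
fifty_rounds = boost(XS, YS, n_rounds=50)
```

Comparing `mse(one_round, XS, YS)` with `mse(fifty_rounds, XS, YS)` shows the training error shrinking as trees are added. In practice the number of rounds is chosen against held-out data, since training error alone will keep falling.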

The decision rule

| If your problem has... | Family 1? |
| --- | --- |
| Tabular data (rows + columns) | Yes |
| Sub-100 ms latency budget | Yes |
| Regulatory or auditing requirement | Yes |
| Output is a number or a class | Yes |
| Less than ~1M training examples | Yes |
| Output is free-form text | No (Family 3) |
| Input is messy unstructured text or images | No (Family 2 or 3) |
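The table can also be written down as a small checklist function, which some teams find handy in a design review. This is one possible reading of the table, not a formal rule: the `Problem` class, its field names, and the choice to check the two "No" rows before the "Yes" rows are all assumptions made for the sketch.

```python
from dataclasses import dataclass


@dataclass
class Problem:
    """Hypothetical checklist mirroring the rows of the decision table."""
    tabular_data: bool = False
    sub_100ms_latency: bool = False
    regulated_or_audited: bool = False
    output_number_or_class: bool = False
    under_1m_examples: bool = False
    output_free_form_text: bool = False
    input_unstructured: bool = False  # messy text or images


def recommend(problem):
    """Return a family suggestion. Disqualifiers are checked first (an assumption)."""
    if problem.output_free_form_text:
        return "Family 3"
    if problem.input_unstructured:
        return "Family 2 or 3"
    yes_signals = [
        problem.tabular_data,
        problem.sub_100ms_latency,
        problem.regulated_or_audited,
        problem.output_number_or_class,
        problem.under_1m_examples,
    ]
    return "Family 1" if any(yes_signals) else "No clear signal"


churn = Problem(tabular_data=True, output_number_or_class=True, regulated_or_audited=True)
chatbot = Problem(output_free_form_text=True)
```

Here `recommend(churn)` returns `"Family 1"` and `recommend(chatbot)` returns `"Family 3"`.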

When NOT to use it

Family 1 cannot generate text. It cannot caption an image. It cannot answer a question it has not been specifically trained for. If your problem is communication-shaped (a chatbot, a summary, a report), you are looking at Family 3.

Family 1 also cannot extrapolate beyond its training distribution. If you train on 2020-2024 data and the world changes, the model degrades silently. Detect drift via monitoring; retrain quarterly.
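One common way to monitor drift is the Population Stability Index (PSI), which compares how a feature is distributed in the training data against how it looks in live traffic. The sketch below is a minimal version under stated assumptions: equal-width bins taken from the training range, a small `eps` floor to avoid dividing by zero, and made-up sample data. The 0.25 alert threshold mentioned afterwards is a widely quoted rule of thumb, not something this article prescribes.

```python
import math


def psi(expected, actual, n_bins=10, eps=1e-6):
    """Population Stability Index between a training sample and a live sample."""
    lo, hi = min(expected), max(expected)
    width = (hi - lo) / n_bins or 1.0  # fall back to 1.0 if all training values are equal

    def shares(values):
        counts = [0] * n_bins
        for v in values:
            idx = int((v - lo) / width)
            counts[min(max(idx, 0), n_bins - 1)] += 1  # clamp out-of-range values to edge bins
        return [max(c / len(values), eps) for c in counts]

    e, a = shares(expected), shares(actual)
    return sum((ai - ei) * math.log(ai / ei) for ei, ai in zip(e, a))


# Hypothetical feature values: training data, then live data after the world changed.
train = [i % 100 for i in range(1000)]
live_shifted = [v + 40 for v in train]

psi_same = psi(train, train)
psi_shifted = psi(train, live_shifted)
```

A PSI near zero means the live data still looks like the training data. Many teams treat a value above roughly 0.25 as a signal to investigate and probably retrain, rather than waiting for the next scheduled retrain.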

The hidden superpower: feature engineering

In Family 1, the data scientist’s real job is not picking the model. It is engineering the right input features: days since last purchase, average order value over 30 days, ratio of returns to orders. These derived features are where the accuracy comes from. Juniors who can run XGBoost are plentiful. A senior who knows which features to build is rare and worth five times the rate.
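The three features named above can be derived from a raw order log roughly like this. The log layout, the `build_features` helper, and the dates are hypothetical. The one design choice worth noting is the `as_of` cutoff: features are computed only from orders on or before the prediction date, so the model never sees information from the future.

```python
from datetime import date, timedelta

# Hypothetical order log for one customer: (order_date, order_value, was_returned)
ORDERS = [
    (date(2026, 1, 5), 40.0, False),
    (date(2026, 2, 20), 60.0, True),
    (date(2026, 3, 1), 80.0, False),
    (date(2026, 3, 10), 100.0, False),
]


def build_features(orders, as_of):
    """Derive model inputs from raw orders placed on or before `as_of`."""
    past = [o for o in orders if o[0] <= as_of]
    if not past:
        return {"days_since_last_purchase": None,
                "avg_order_value_30d": 0.0,
                "return_ratio": 0.0}
    window_start = as_of - timedelta(days=30)
    recent_values = [value for day, value, _ in past if day > window_start]
    return {
        "days_since_last_purchase": (as_of - max(day for day, _, _ in past)).days,
        "avg_order_value_30d": sum(recent_values) / len(recent_values) if recent_values else 0.0,
        "return_ratio": sum(1 for _, _, returned in past if returned) / len(past),
    }


features = build_features(ORDERS, as_of=date(2026, 3, 15))
```

For this customer the result is 5 days since the last purchase, an average order value of 80.0 over the last 30 days, and a return ratio of 0.25. Those three numbers, not the raw log, are what the model sees as one row of the table.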

For your team

Three questions to ask before they reach for an LLM: Is the input a table? Is the output a number or class? Do we need to explain the prediction? Yes to all three means Family 1.
Mario Deubler
