Research notes on making AI systems cheaper, faster, and more specialized.

Field notes from Understudy on model optimization, post-training, evals, agent workflows, and the path from production traces to specialist open models. For interactive lessons and demos, see Understudy University.

Index

Why Domain Experts, Not ML Teams, Define the Reward Signal

Expert feedback

Reward signals encode product judgment. ML teams can build the harness, but domain experts know which errors matter, which tradeoffs are acceptable, and what good work looks like.

2026-05-25 / 7 min

Open Models Are Not Cheaper Until They Are Specialized

Model economics

Open weights change the economics only after the workflow has an eval, an output contract, a serving path, and enough repetition to amortize optimization.

2026-05-24 / 7 min

The Optimization Ladder: Prompts, SFT, RL, and Routing

Model optimization

LLM optimization should start with cheap control fixes and climb to heavier training only when the eval proves the next step is worth it.

2026-05-23 / 7 min

When to Replace a Frontier Model With a Specialist Model

Model economics

Frontier models are the right baseline for new workflows. Specialist models become attractive once the task repeats, the eval is stable, and the cost or latency curve starts limiting the product.

2026-05-22 / 7 min

How Production Traces Become Evals

Evals

Production traces are not just logs. With the right capture, review, and holdout discipline, they become the evals that make model optimization safe.

2026-05-21 / 7 min

How to Cut LLM Cost Without Making the Product Worse

Model economics

Cost reduction only matters when quality survives. Understudy's sentiment benchmark shows how a specialist open model can make warehouse-scale labeling viable without giving up frontier-style coverage.

2026-05-20 / 7 min

Small Models Need Output Control Before Training

Evals

Understudy's operations benchmark showed that scaffolding and output control can make small models reliable before sparse fine-tuning work begins.

2026-05-19 / 7 min

Self-Distillation Lets AI Teach Itself

Model optimization

Self-distillation turns rich feedback from compilers, users, and environments into model improvement instead of collapsing everything into a pass/fail reward.

2026-05-18 / 8 min
