A working glossary for model optimization.
Short definitions for the vocabulary behind Understudy: production traces, evals, post-training, routing, and specialist open models.
Model optimization
The work of improving an AI system for a specific production task by changing any part of the route (harness, model, or supply path), for example through prompts, context, routing, fine-tuning, reinforcement learning, or serving strategy.
Route
The full execution path for a task: harness, model, and supply path. Two routes can use the same nominal model ID and still behave differently because templates, quantization, parsers, reasoning defaults, logprobs, batching, or schedulers differ.
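One way to picture a route is as a small record with three parts. The sketch below is illustrative only: the `Route` class, its field names, and the example values are assumptions, not an Understudy data model.

```python
from dataclasses import dataclass


# Hypothetical record; the glossary only names the three parts of a route.
@dataclass(frozen=True)
class Route:
    model_id: str      # nominal model identifier
    harness: str       # label for the prompt/parser/reasoning configuration
    supply_path: str   # label for where inference is served


def same_nominal_model(a: Route, b: Route) -> bool:
    """Return True when two routes share a model ID, whatever else differs."""
    return a.model_id == b.model_id


# Made-up example values: same model ID, different supply path.
route_a = Route("example-model", "json-schema-v1", "serverless-api")
route_b = Route("example-model", "json-schema-v1", "local-runtime-int4")

# The model ID matches, but the routes are not equal, so each is measured separately.
print(same_nominal_model(route_a, route_b), route_a == route_b)  # prints: True False
```

Treating the whole record as the unit of comparison keeps a quantized local deployment from being mistaken for the hosted version of the same model.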
Harness
The control surface around a model: prompt, schema, tool-call adapter, reasoning mode, token cap, scorer, retry policy, batching, context compaction, and parser.
Supply path
The serving path that provides inference for a route, such as a serverless API, dedicated deployment, customer cloud endpoint, or local runtime. Understudy measures the full route instead of assuming provider or model identity determines behavior.
Post-training
The stage after pretraining where a base model is adapted to follow instructions, use tools, reason through workflows, and match task-specific quality requirements.
Self-distillation
A training pattern where a model uses richer context or feedback to teach a future version of itself what it should have done on the original task.
Evals
Task-specific tests that measure whether a model did the work correctly enough to ship, replace a baseline, or receive more production traffic.
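A minimal sketch of an eval as a pass rate over task examples. The function names, the exact-match scorer, and the sample data are assumptions chosen for illustration; real evals often need task-specific scoring.

```python
from typing import Callable, Sequence


def exact_match(output: str, expected: str) -> bool:
    """Hypothetical scorer: compare after trimming surrounding whitespace."""
    return output.strip() == expected.strip()


def pass_rate(
    outputs: Sequence[str],
    expected: Sequence[str],
    is_correct: Callable[[str, str], bool] = exact_match,
) -> float:
    """Fraction of examples the scorer marks as correct."""
    if len(outputs) != len(expected):
        raise ValueError("outputs and expected must have the same length")
    if not outputs:
        raise ValueError("at least one example is required")
    passed = sum(1 for o, e in zip(outputs, expected) if is_correct(o, e))
    return passed / len(outputs)


# Made-up data: two of three outputs match their expected answers.
rate = pass_rate(["42", "blue ", "7"], ["42", "blue", "8"])
```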
Production traces
Records of real model calls, tool calls, user corrections, outputs, failures, and review decisions from a live workflow.
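As a rough illustration, a single trace could be stored as one JSON record. The `Trace` class and its field names are assumptions based on the items listed above, not a documented schema.

```python
import json
from dataclasses import asdict, dataclass, field
from typing import List, Optional


# Hypothetical trace record; field names are illustrative.
@dataclass
class Trace:
    request: str
    output: str
    tool_calls: List[str] = field(default_factory=list)
    user_correction: Optional[str] = None   # what the user changed the output to
    failed: bool = False
    review_decision: Optional[str] = None   # e.g. "approved" or "rejected"


# Made-up example: the model misread a total and a reviewer corrected it.
trace = Trace(
    request="Extract the invoice total.",
    output="120.00",
    tool_calls=["ocr"],
    user_correction="1200.00",
    review_decision="rejected",
)

# One JSON object per line is a common way to log records like this.
line = json.dumps(asdict(trace))
restored = Trace(**json.loads(line))
```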
Specialist open models
Open-weight models adapted to a bounded workflow so they can beat a general frontier model on cost, latency, reliability, or task-specific quality.
Frontier model replacement
The process of moving repeated production work from a general frontier API to a cheaper, faster, specialized route after the candidate passes held-out evals.
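The "passes held-out evals" condition can be sketched as a simple gate. The function name, the per-example pass/fail inputs, and the `max_regression` tolerance are assumptions; an actual gate may also weigh cost, latency, and statistical confidence.

```python
from typing import Sequence


def passes_gate(
    candidate_passed: Sequence[bool],
    baseline_passed: Sequence[bool],
    max_regression: float = 0.0,
) -> bool:
    """Compare pass rates on the same held-out examples.

    The candidate passes when its pass rate is no more than
    `max_regression` below the baseline's.
    """
    if len(candidate_passed) != len(baseline_passed) or not candidate_passed:
        raise ValueError("both routes must be scored on the same non-empty set")
    candidate_rate = sum(candidate_passed) / len(candidate_passed)
    baseline_rate = sum(baseline_passed) / len(baseline_passed)
    return candidate_rate >= baseline_rate - max_regression


# Made-up results on four held-out examples.
baseline = [True, True, True, False]
strong_candidate = [True, True, False, True]
weak_candidate = [True, False, False, True]

ok = passes_gate(strong_candidate, baseline)
not_ok = passes_gate(weak_candidate, baseline)
```

Scoring both routes on examples that were never used for tuning keeps the comparison from rewarding a candidate that merely memorized the development set.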
Routing
The serving decision that sends each request to the right model or prompt path based on task, risk, cost, latency, and measured quality.
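A minimal sketch of one possible routing rule: among routes that meet a quality floor and a latency budget, pick the cheapest. The `RouteStats` class, the thresholds, and the numbers are all hypothetical; the quality floor stands in for task risk.

```python
from dataclasses import dataclass
from typing import Optional, Sequence


# Hypothetical per-route measurements for one task.
@dataclass(frozen=True)
class RouteStats:
    name: str
    quality: float      # measured eval pass rate, 0..1
    cost: float         # cost per request, arbitrary units
    latency_ms: float


def pick_route(
    candidates: Sequence[RouteStats],
    min_quality: float,
    max_latency_ms: float,
) -> Optional[RouteStats]:
    """Return the cheapest eligible route, or None if nothing qualifies."""
    eligible = [
        r for r in candidates
        if r.quality >= min_quality and r.latency_ms <= max_latency_ms
    ]
    return min(eligible, key=lambda r: r.cost) if eligible else None


# Made-up measurements.
routes = [
    RouteStats("frontier-api", quality=0.97, cost=10.0, latency_ms=900),
    RouteStats("specialist-open", quality=0.95, cost=1.0, latency_ms=200),
]

low_risk = pick_route(routes, min_quality=0.90, max_latency_ms=1000)
high_risk = pick_route(routes, min_quality=0.96, max_latency_ms=1000)
```

Raising the quality floor for higher-risk requests sends them to the more expensive route, while routine requests go to the cheaper one.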
Reward signal
The evidence a training or optimization loop uses to decide which behavior is better, often from expert review, verifier output, tests, or user corrections.
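As an illustration, verifier output can be turned into a number and used to choose between two behaviors. The checks, the `reward` and `prefer` helpers, and the sample outputs are assumptions for this sketch; expert review or user corrections could feed the same comparison.

```python
import json
from typing import Callable, Sequence


def is_json_object(output: str) -> bool:
    """Hypothetical verifier: the output parses as a JSON object."""
    try:
        return isinstance(json.loads(output), dict)
    except json.JSONDecodeError:
        return False


def has_total_field(output: str) -> bool:
    """Hypothetical verifier: the JSON object contains a "total" key."""
    return is_json_object(output) and "total" in json.loads(output)


def reward(output: str, checks: Sequence[Callable[[str], bool]]) -> float:
    """Fraction of checks the output satisfies."""
    if not checks:
        raise ValueError("at least one check is required")
    return sum(1 for check in checks if check(output)) / len(checks)


def prefer(a: str, b: str, checks: Sequence[Callable[[str], bool]]) -> str:
    """Return the output with the higher reward; ties go to `a`."""
    return a if reward(a, checks) >= reward(b, checks) else b


checks = [is_json_object, has_total_field]
structured = '{"total": 12}'
free_text = "total is 12"
winner = prefer(free_text, structured, checks)
```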