A working glossary for model optimization.

Short definitions for the vocabulary behind Understudy: production traces, evals, post-training, routing, and specialist open models.

Terms

Model optimization

The work of improving an AI system for a specific production task by changing part of its route. The levers include the harness, model, and supply path, along with prompts, context, routing, fine-tuning, reinforcement learning, and serving strategy.

bench research/the-optimization-ladder-prompts-sft-rl-and-routing research/self-distillation-lets-ai-teach-itself

Route

The full execution path for a task: harness, model, and supply path. Two routes can use the same nominal model ID and still behave differently because templates, quantization, parsers, reasoning defaults, logprobs, batching, or schedulers differ.
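
As a rough illustration of why the model ID alone does not identify a route, the sketch below models a route as a small record. The field names, the model ID, and the labels are hypothetical and are not Understudy's actual schema.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Route:
    """Hypothetical record for one execution path (illustrative fields only)."""
    model_id: str       # nominal model identifier
    harness: str        # label for the prompt/parser/tool setup
    supply_path: str    # label for where inference is served
    quantization: str   # one of the details that can differ between routes


# Two routes that share a nominal model ID but differ in serving details.
a = Route("example-model-8b", "json-schema-v1", "serverless-api", "fp16")
b = Route("example-model-8b", "json-schema-v1", "local-runtime", "int4")

same_model = a.model_id == b.model_id   # the IDs match
same_route = a == b                     # the full routes do not
print(same_model, same_route)
```

Comparing whole records rather than `model_id` is the point of the sketch: any measurement should be attached to the full route.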

research/the-optimization-ladder-prompts-sft-rl-and-routing bench-operations compare

Harness

The control surface around a model: prompt, schema, tool-call adapter, reasoning mode, token cap, scorer, retry policy, batching, context compaction, and parser.
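
A minimal sketch of a harness, assuming a JSON output format and covering only three of the controls listed above (prompt, output cap, retry policy) plus a parser. The `Harness` class, the `run` function, and the fake model are hypothetical stand-ins, and the character cap is a simplification of a real token cap.

```python
import json
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass
class Harness:
    prompt_template: str    # prompt with a {task} placeholder
    max_output_chars: int   # stand-in for a token cap
    max_retries: int        # extra attempts after the first call


def run(harness: Harness, model_fn: Callable[[str], str], task: str) -> Optional[dict]:
    """Call the model, truncate the output, and retry until it parses as a JSON object."""
    prompt = harness.prompt_template.format(task=task)
    for _ in range(harness.max_retries + 1):
        raw = model_fn(prompt)[: harness.max_output_chars]
        try:
            parsed = json.loads(raw)
        except json.JSONDecodeError:
            continue  # retry policy: try again on unparseable output
        if isinstance(parsed, dict):
            return parsed
    return None


# Fake model for illustration: the first reply is malformed, the second is valid.
replies = iter(["not json", '{"label": "refund"}'])


def fake_model(prompt: str) -> str:
    return next(replies)


result = run(Harness("Classify: {task}", 200, 2), fake_model, "I want my money back")
print(result)
```

The same model would look worse with `max_retries=0`, which is why the harness is treated as part of the thing being measured.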

research/small-models-need-output-control-before-training bench-operations use-cases

Supply path

The serving path that provides inference for a route, such as a serverless API, dedicated deployment, customer cloud endpoint, or local runtime. Understudy measures the full route instead of assuming provider or model identity determines behavior.

research/when-to-replace-a-frontier-model-with-a-specialist-model compare bench-operations

Post-training

The stage after pretraining where a base model is adapted to follow instructions, use tools, reason through workflows, and match task-specific quality requirements.

research/self-distillation-lets-ai-teach-itself use-cases

Self-distillation

A training pattern where a model uses richer context or feedback to teach a future version of itself what it should have done on the original task.

research/self-distillation-lets-ai-teach-itself

Evals

Task-specific tests that measure whether a model did the work correctly enough to ship, replace a baseline, or receive more production traffic.
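
One simple form an eval can take is a pass rate over labeled cases compared against a ship threshold. The sketch below assumes exact-match scoring; the cases, the toy `candidate` function, and the 0.9 threshold are illustrative assumptions, not values from Understudy.

```python
def pass_rate(predict, cases):
    """Fraction of (input, expected) cases where the prediction matches exactly."""
    passed = sum(1 for text, expected in cases if predict(text) == expected)
    return passed / len(cases)


# Illustrative labeled cases for a support-ticket classification task.
cases = [
    ("refund please", "refund"),
    ("where is my order", "shipping"),
    ("cancel my plan", "cancel"),
    ("card was charged twice", "billing"),
]


def candidate(text):
    """Toy stand-in for a model under test."""
    if "refund" in text:
        return "refund"
    if "order" in text:
        return "shipping"
    if "cancel" in text:
        return "cancel"
    return "other"


SHIP_THRESHOLD = 0.9  # hypothetical bar for shipping
rate = pass_rate(candidate, cases)
ready_to_ship = rate >= SHIP_THRESHOLD
print(rate, ready_to_ship)
```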

bench bench-operations compare

Production traces

Records of real model calls, tool calls, user corrections, outputs, failures, and review decisions from a live workflow.
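
To make the shape of a trace concrete, here is a hypothetical record with one field per item in the definition, plus a helper that pulls out the calls a user corrected. The field names and sample values are assumptions for illustration, not a real trace format.

```python
from dataclasses import dataclass, field
from typing import List, Optional, Tuple


@dataclass
class TraceRecord:
    """Hypothetical schema for one production call."""
    request_id: str
    model_input: str
    model_output: str
    tool_calls: List[str] = field(default_factory=list)
    user_correction: Optional[str] = None   # what the user changed the output to
    failed: bool = False
    review_decision: Optional[str] = None   # e.g. "approved" or "rejected"


def corrected_pairs(traces: List[TraceRecord]) -> List[Tuple[str, str]]:
    """Return (input, corrected output) pairs for traces a user corrected."""
    return [
        (t.model_input, t.user_correction)
        for t in traces
        if t.user_correction is not None
    ]


traces = [
    TraceRecord("r1", "Summarize ticket 1", "Customer wants a refund."),
    TraceRecord(
        "r2",
        "Summarize ticket 2",
        "Customer is happy.",
        user_correction="Customer reports a billing error.",
        review_decision="rejected",
    ),
]

pairs = corrected_pairs(traces)
print(pairs)
```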

use-cases contact

Specialist open models

Open-weight models adapted to a bounded workflow so they can beat a general frontier model on cost, latency, reliability, or task-specific quality.

bench bench-sentiment

Frontier model replacement

The process of moving repeated production work from a general frontier API to a cheaper, faster, specialized route after the candidate passes held-out evals.
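
The replacement decision can be sketched as a gate on held-out results. The version below is one possible rule, assuming the candidate must stay within an allowed quality drop and cost less than the baseline; it ignores latency for brevity. The names, numbers, and cost units are all illustrative.

```python
from dataclasses import dataclass


@dataclass
class RouteResult:
    name: str
    heldout_pass_rate: float   # measured on held-out evals
    cost_per_1k_calls: float   # hypothetical cost unit


def should_replace(baseline: RouteResult, candidate: RouteResult,
                   max_quality_drop: float = 0.0) -> bool:
    """True if the candidate is cheaper and stays within the allowed quality drop."""
    quality_ok = (
        candidate.heldout_pass_rate >= baseline.heldout_pass_rate - max_quality_drop
    )
    cheaper = candidate.cost_per_1k_calls < baseline.cost_per_1k_calls
    return quality_ok and cheaper


frontier = RouteResult("frontier-api", heldout_pass_rate=0.95, cost_per_1k_calls=3.0)
specialist = RouteResult("specialist-small", heldout_pass_rate=0.94, cost_per_1k_calls=0.2)

strict = should_replace(frontier, specialist)                           # no drop allowed
tolerant = should_replace(frontier, specialist, max_quality_drop=0.02)  # small drop allowed
print(strict, tolerant)
```

How much quality drop is acceptable is a product decision; the sketch only shows where that tolerance enters the gate.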

compare bench-operations

Routing

The serving decision that sends each request to the right model or prompt path based on task, risk, cost, latency, and measured quality.
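
A minimal sketch of a routing rule, assuming a lookup table of measured quality and cost per task. It sends high-risk or unknown work to a fallback route and otherwise picks the cheapest route that clears a quality bar. The table, route names, and thresholds are hypothetical.

```python
# Hypothetical measurements: task -> route name -> (quality, relative cost).
ROUTES = {
    "classify": {"specialist-small": (0.96, 0.2), "frontier-api": (0.97, 3.0)},
    "summarize": {"specialist-small": (0.81, 0.2), "frontier-api": (0.95, 3.0)},
}


def choose_route(task: str, risk: str, min_quality: float = 0.9,
                 fallback: str = "frontier-api") -> str:
    """Pick the cheapest route that meets the quality bar; fall back when unsure."""
    if risk == "high" or task not in ROUTES:
        return fallback
    eligible = [
        (cost, name)
        for name, (quality, cost) in ROUTES[task].items()
        if quality >= min_quality
    ]
    if not eligible:
        return fallback
    return min(eligible)[1]  # lowest cost among eligible routes


low_risk_classify = choose_route("classify", "low")
low_risk_summarize = choose_route("summarize", "low")
high_risk_classify = choose_route("classify", "high")
print(low_risk_classify, low_risk_summarize, high_risk_classify)
```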

bench-operations use-cases

Reward signal

The evidence a training or optimization loop uses to decide which behavior is better, often from expert review, verifier output, tests, or user corrections.
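
As a toy illustration, the sketch below combines two of the evidence sources named above, test results and user corrections, into a single score and uses it to rank two outputs. The weighting, including the 0.5 penalty, is an arbitrary assumption chosen for the example.

```python
def reward(tests_passed: int, tests_total: int, user_corrected: bool) -> float:
    """Score an output from test results, with a penalty if a user had to correct it."""
    score = tests_passed / tests_total
    if user_corrected:
        score -= 0.5  # hypothetical penalty weight
    return score


# Output A passes every test but was corrected by a user; output B fails one test.
reward_a = reward(4, 4, user_corrected=True)
reward_b = reward(3, 4, user_corrected=False)

preferred = "B" if reward_b > reward_a else "A"
print(reward_a, reward_b, preferred)
```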

research/self-distillation-lets-ai-teach-itself contact