What are Reasoning Models?
Reasoning models are language models trained to spend extra computation 'thinking' before they answer — generating internal reasoning steps to solve harder problems in math, code and logic. They trade latency and cost for accuracy on complex, multi-step tasks. The key idea is test-time compute: letting a model reason longer at inference, rather than only making the model bigger, can substantially improve results.
Definition
Reasoning models are language models optimized to perform extended step-by-step reasoning at inference time — using additional test-time compute — to improve accuracy on complex, multi-step problems.
Key takeaways
- They 'think' before answering, using extra inference compute.
- Test-time compute is a new scaling axis beyond model size.
- Best for math, code, logic and multi-step planning.
- They trade latency and token cost for accuracy.
- Overkill for simple tasks — match the model to the problem.
Context
Standard models begin answering immediately and spend roughly the same compute per generated token regardless of difficulty. Reasoning models break that pattern: they first generate a chain of internal reasoning, effectively spending more compute on harder questions, which lifts performance on tasks that need multi-step deduction.
This introduced a second scaling axis. Beyond making models larger (train-time compute), you can let them reason longer at inference (test-time compute) — a major driver of recent progress on hard benchmarks.
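A toy calculation can make the test-time compute idea concrete. The sketch below does not model a long reasoning chain. It swaps in a simpler mechanism, majority voting over several independent samples (often called self-consistency), because its effect can be computed exactly. It assumes right-or-wrong answers, independent samples, and a per-sample accuracy `p`; all of these are simplifying assumptions, not properties of any particular model.

```python
from math import comb


def majority_vote_accuracy(p: float, n: int) -> float:
    """Probability that a strict majority of n independent samples is correct.

    Assumes each sample is correct with probability p, answers are
    right-or-wrong, and n is odd so there are no ties.
    """
    if n % 2 == 0:
        raise ValueError("use an odd n to avoid ties")
    needed = n // 2 + 1  # smallest number of correct samples that wins the vote
    return sum(
        comb(n, k) * p**k * (1 - p) ** (n - k) for k in range(needed, n + 1)
    )


if __name__ == "__main__":
    # More samples means more inference compute for the same model.
    for n in (1, 3, 5, 9):
        print(n, round(majority_vote_accuracy(0.6, n), 3))
```

Under these assumptions, accuracy rises with the number of samples when `p` is above 0.5 and falls when it is below 0.5, so extra inference compute helps only when the model is already more often right than wrong.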
Architecture
Reasoning models are typically trained to produce long internal reasoning before a final answer, often using reinforcement learning that rewards correct outcomes. At inference, more 'thinking' tokens tend to improve answers on hard problems, though the gains are not guaranteed and longer reasoning is not always more correct.
In agentic systems, reasoning models serve as strong planners and decision-makers, while cheaper, faster models can handle routine steps. Routing between them by task difficulty is a common cost-control pattern.
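The routing pattern described above can be sketched as a small function. This is a minimal illustration, not a production design: the model names, the keyword list, the threshold and the budget formula are all hypothetical, and real systems often score difficulty with a classifier or a cheap model rather than keywords.

```python
from dataclasses import dataclass

# Hypothetical model identifiers; substitute whatever your provider offers.
FAST_MODEL = "fast-model"
REASONING_MODEL = "reasoning-model"

# Illustrative keyword hints only; not a reliable difficulty signal.
HARD_HINTS = ("prove", "derive", "debug", "plan", "optimize", "step by step")


@dataclass
class Route:
    model: str
    thinking_budget: int  # max reasoning tokens to allow (0 means none)


def estimate_difficulty(task: str) -> float:
    """Return a crude difficulty score in [0, 1] from keywords and length."""
    text = task.lower()
    hint_hits = sum(hint in text for hint in HARD_HINTS)
    length_score = min(len(text.split()) / 200, 1.0)
    return min(1.0, 0.3 * hint_hits + 0.4 * length_score)


def route(task: str, threshold: float = 0.5) -> Route:
    """Send easy tasks to the fast model and hard ones to the reasoning model."""
    score = estimate_difficulty(task)
    if score < threshold:
        return Route(FAST_MODEL, 0)
    # Scale the (hypothetical) reasoning budget with estimated difficulty.
    return Route(REASONING_MODEL, int(2000 + 6000 * score))


if __name__ == "__main__":
    print(route("What is the capital of France?"))
    print(route("Prove the lemma, then derive the bound and plan the proof step by step"))
```

Keeping the difficulty estimate separate from the routing decision makes it easy to replace the heuristic later without touching the code that calls `route`.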
Components
Benefits
- Higher accuracy on complex, multi-step problems.
- Strong at math, coding and planning.
- Reasoning effort can be scaled per query.
- Good planners at the core of capable agents.
Risks
- Higher latency and token cost.
- Overkill — and wasteful — for simple tasks.
- Longer reasoning is not always more correct.
- Internal reasoning can be hard to audit or trust verbatim.
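The latency and cost risk listed above can be estimated with back-of-the-envelope arithmetic. The sketch below assumes reasoning tokens are billed at the output-token rate and that decoding speed is constant; both are assumptions to check against your provider, and every price and speed in the example is made up for illustration.

```python
def request_cost(
    prompt_tokens: int,
    thinking_tokens: int,
    answer_tokens: int,
    price_in_per_m: float,
    price_out_per_m: float,
) -> float:
    """Estimated cost of one request, with prices given per million tokens.

    Assumes thinking tokens are billed like output tokens (an assumption).
    """
    input_cost = prompt_tokens * price_in_per_m / 1_000_000
    output_cost = (thinking_tokens + answer_tokens) * price_out_per_m / 1_000_000
    return input_cost + output_cost


def decode_latency_s(
    thinking_tokens: int, answer_tokens: int, tokens_per_second: float
) -> float:
    """Rough generation time, assuming a constant decoding speed."""
    return (thinking_tokens + answer_tokens) / tokens_per_second


if __name__ == "__main__":
    # Hypothetical figures: 1,000-token prompt, 200-token answer,
    # 1.0 per million input tokens, 4.0 per million output tokens, 50 tokens/s.
    for thinking in (0, 4000):
        cost = request_cost(1000, thinking, 200, 1.0, 4.0)
        latency = decode_latency_s(thinking, 200, 50.0)
        print(f"thinking={thinking} cost={cost:.4f} latency={latency:.1f}s")
```

With these made-up figures, 4,000 thinking tokens raise both cost and generation time by roughly an order of magnitude for the same final answer length, which is why reserving reasoning for hard tasks matters.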
Tools & technologies
Examples
- Solving a multi-step math or logic problem that trips up a standard model.
- Planning a complex agent task before execution.
- Routing only hard tickets to a reasoning model to control cost.
FAQs
- How are reasoning models different from standard LLMs?
  They are trained and configured to reason at length before answering, spending more inference compute on hard problems instead of answering immediately with roughly fixed effort.
- What is test-time compute?
  Computation spent at inference (the model 'thinking' longer), as opposed to train-time compute spent building the model. It is a distinct way to improve results.
- Should I always use a reasoning model?
  No. They cost more and add latency. Use them for hard, multi-step problems and route simpler tasks to faster, cheaper models.
- Do they eliminate hallucination?
  No. Reasoning improves accuracy on many tasks but does not guarantee correctness; grounding, tools and evaluation remain necessary.