Claude Sonnet 5: what's new, the advantages, and a price comparison
Anthropic ships Claude Sonnet 5: adaptive thinking on by default, a new tokenizer, a one-million-token context window. A look at the real advantages and the pricing, compared with Opus 4.8 and Haiku 4.5.
Anthropic has just released Claude Sonnet 5, the new generation of its mid-range model. It’s a direct upgrade from Sonnet 4.6, at the same price per token, with clear gains on coding and agentic tasks. Here’s what actually changes, and what it costs once the details are taken into account.
The new model at a glance
Claude Sonnet 5 (API identifier claude-sonnet-5) is positioned as “the best combination of speed and intelligence”. Its main characteristics:
- A one-million-token context window by default, with no smaller variant.
- 128k output tokens maximum.
- Adaptive thinking enabled by default.
- The same set of tools and features as Sonnet 4.6.
Anthropic presents Sonnet 5 as a step up in capability, with the most noticeable progress on development and automated (agentic) tasks. It’s also an option for workloads that need more than Sonnet 4.6 without stepping up to a more expensive Opus-class model.
What actually changes
Adaptive thinking on by default
On Sonnet 4.6, a request with no thinking parameter ran without thinking. On Sonnet 5, the same request now triggers adaptive thinking, meaning the model adjusts its reasoning effort on its own based on how hard the task is. To turn it off, you now have to say so explicitly.
One thing to watch: since the max_tokens limit covers thinking plus the answer, you need to revisit that value for jobs that used to run without thinking, otherwise the output can get truncated.
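One rough way to revisit that value is to add headroom for thinking on top of the answer length you expect. The sketch below is only an illustration: the function name and the headroom you pass in are assumptions, not Anthropic guidance, and the 128k ceiling is taken here as 128,000 tokens.

```python
# Output ceiling quoted for Sonnet 5 in this article, taken here as 128,000 (assumption).
MAX_OUTPUT_TOKENS = 128_000


def size_max_tokens(expected_answer_tokens: int, thinking_headroom_tokens: int) -> int:
    """Return a max_tokens value that covers both thinking and the visible answer."""
    if expected_answer_tokens < 0 or thinking_headroom_tokens < 0:
        raise ValueError("token counts must be non-negative")
    budget = expected_answer_tokens + thinking_headroom_tokens
    # Never ask for more than the model's output ceiling.
    return min(budget, MAX_OUTPUT_TOKENS)
```

A job that used to run with max_tokens set to the answer length alone would now pass that length plus whatever thinking headroom you measure on your own prompts.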
A new tokenizer
This is the most important change for your budget. Sonnet 5 uses a new tokenizer: for the same text, it produces roughly 30% more tokens than Sonnet 4.6.
It isn’t an API change, and no code needs to be modified. But everything measured in tokens is affected: the usage counters, how much text the context window actually holds, and above all the cost of an equivalent request. I’ll come back to this below, because that’s where the real price hides.
Real-time cybersecurity safeguards
Sonnet 5 is the first Sonnet-class model with real-time cybersecurity safeguards. Requests involving prohibited or high-risk topics may be refused. The refusal isn’t a technical error: it comes back as a successful response with a dedicated stop reason, to be handled on the application side.
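On the application side, that means checking the stop reason before using the text. Here is a minimal sketch that works on a response already parsed into a dictionary. The exact stop-reason value ("refusal") and the response shape (a stop_reason field plus a list of content blocks) are assumptions to verify against the API reference, and the function name is hypothetical.

```python
from typing import Any

# Assumption: the dedicated stop reason is reported as "refusal".
# Confirm the exact value in the API reference before relying on it.
REFUSAL_STOP_REASON = "refusal"


def extract_text_or_refusal(response: dict[str, Any]) -> tuple[bool, str]:
    """Return (refused, text) for a response already parsed into a dict."""
    refused = response.get("stop_reason") == REFUSAL_STOP_REASON
    # Concatenate the text blocks; other block types are ignored.
    text = "".join(
        block.get("text", "")
        for block in response.get("content", [])
        if block.get("type") == "text"
    )
    return refused, text
```

The caller can then show a neutral message or route the request elsewhere when refused is true, instead of treating the empty or partial text as a normal answer.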
Sonnet 5 pricing
The standard price per token is unchanged compared with Sonnet 4.6. On top of that, Anthropic is applying a reduced launch rate until 31 August 2026, before reverting to the standard rate on 1 September 2026.
| Period | Input (per million tokens) | Output (per million tokens) |
|---|---|---|
| Launch (until 31 August 2026) | $2 | $10 |
| Standard (from 1 September 2026) | $3 | $15 |
Prompt caching and the Batch API follow the same logic. At the launch rate, a cache read costs $0.20 per million tokens, and the Batch API (asynchronous processing, 50% discount) drops to $1 input and $5 output.
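To make the rate card easier to reuse in an estimate, the figures above can be put into a small lookup. This is a sketch built only from the numbers quoted in this article. The function name is hypothetical, and the standard-rate batch prices are derived by applying the same 50% discount, which is an assumption.

```python
from datetime import date

LAUNCH_END = date(2026, 8, 31)   # last day of the launch rate
LAUNCH_RATE = (2.00, 10.00)      # (input, output) in USD per million tokens
STANDARD_RATE = (3.00, 15.00)
BATCH_DISCOUNT = 0.5             # Batch API: 50% discount


def sonnet5_rate(on: date, batch: bool = False) -> tuple[float, float]:
    """Return (input, output) USD per million tokens for a given date."""
    input_rate, output_rate = LAUNCH_RATE if on <= LAUNCH_END else STANDARD_RATE
    if batch:
        return input_rate * BATCH_DISCOUNT, output_rate * BATCH_DISCOUNT
    return input_rate, output_rate
```

For example, sonnet5_rate(date(2026, 8, 1), batch=True) gives the $1 / $5 batch figures quoted above.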
Sonnet 5 against Opus 4.8 and Haiku 4.5
This is where the choice really gets made. Sonnet 5 sits in the middle of the range, between the fastest model and the most powerful one.
| Model | Input / Output (per M tokens) | Context | Max output | Positioning |
|---|---|---|---|---|
| Opus 4.8 | $5 / $25 | 1M | 128k | Complex reasoning, agentic coding |
| Sonnet 5 | $3 / $15 ($2 / $10 at launch) | 1M | 128k | Balanced speed and intelligence |
| Haiku 4.5 | $1 / $5 | 200k | 64k | Fastest, simple tasks |
In practice: on output, Sonnet 5 costs 40% less than Opus 4.8 at the standard rate ($15 versus $25) and 60% less at the launch rate ($10 versus $25), while offering the same one-million-token context window. Haiku 4.5 remains the economical choice for simple, high-volume tasks, but caps out at 200k tokens of context.
The real cost: don’t forget the tokenizer
Here’s the trap to avoid. The price per token is identical to Sonnet 4.6, but the new tokenizer generates roughly 30% more tokens for the same text. An identical request can therefore cost more, even though the rate card hasn’t moved.
The launch rate ($2 / $10) more than offsets this effect until 31 August 2026. After that date, at the standard rate ($3 / $15), you need to factor the higher token count into your estimates. The right reflex: recount your prompts with the model’s token-counting tool rather than reusing figures measured on a previous version.
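The arithmetic behind that claim fits in a few lines. The sketch below compares the cost of the same text on Sonnet 5 and on Sonnet 4.6, taking the "roughly 30%" figure at face value. The real ratio depends on your content, so treat the factor as an assumption to replace with your own measurements. The function name is hypothetical.

```python
# "Roughly 30% more tokens" for the same text, per this article; approximate.
TOKEN_INFLATION = 1.30


def relative_cost(new_rate: float, old_rate: float, inflation: float = TOKEN_INFLATION) -> float:
    """Cost of the same text on Sonnet 5 relative to Sonnet 4.6 (1.0 = unchanged)."""
    return inflation * new_rate / old_rate


# Input rates in USD per million tokens; the output rates give the same ratios.
launch = relative_cost(2.0, 3.0)
standard = relative_cost(3.0, 3.0)
print(f"launch:   {launch:.2f}x")    # about 0.87x, so roughly 13% cheaper
print(f"standard: {standard:.2f}x")  # 1.30x, so roughly 30% more expensive
```

Under these assumptions, the same text costs about 13% less than on Sonnet 4.6 during the launch period and about 30% more once the standard rate applies.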
For developers: three migration points
Moving from Sonnet 4.6 to Sonnet 5 is a drop-in replacement (you just change the model identifier), but three behaviours have changed:
- Manual extended thinking is removed: it now returns an error. Use adaptive thinking and the effort parameter instead.
- Sampling parameters (temperature, top_p, top_k) can no longer be changed: any non-default value returns an error. Steer the model through system instructions.
- Token budgets need reviewing, because of the new tokenizer.
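For the sampling-parameter point, a small pre-flight helper can clean up existing request settings before they are sent. This is a sketch, not an official migration tool: the function name is hypothetical, it assumes your request settings live in a plain dictionary, and it drops the three sampling keys outright rather than checking whether their values are the defaults. It does not touch thinking settings, which need a manual review.

```python
from typing import Any

# The three sampling parameters named above.
SAMPLING_PARAMS = ("temperature", "top_p", "top_k")


def migrate_request(
    params: dict[str, Any], model: str = "claude-sonnet-5"
) -> tuple[dict[str, Any], list[str]]:
    """Return a copy of params targeting the new model, plus the sampling keys dropped."""
    migrated = {k: v for k, v in params.items() if k not in SAMPLING_PARAMS}
    migrated["model"] = model
    dropped = [k for k in SAMPLING_PARAMS if k in params]
    return migrated, dropped
```

Logging the dropped keys makes it easy to spot the places where the old behaviour has to be recreated through system instructions.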
Which model to choose
The rule still holds: Haiku for simple, high-volume tasks, Sonnet for most production workloads, Opus for the most complex reasoning. Sonnet 5 reinforces that central position by offering more capability at the same price, which makes it a reasonable default for most integrations.
If you’re considering integrating a Claude model into a product or a website, choosing the model and keeping token costs under control are among the decisions that weigh on the real budget. Let’s talk: I’ll help you scope the integration and size the spend before you commit.