Claude Sonnet 5: what's new, the advantages, and a price comparison

Anthropic has just released Claude Sonnet 5, the new generation of its mid-range model. It’s a direct upgrade from Sonnet 4.6, at the same price per token, with clear gains on coding and agentic tasks. Here’s what actually changes, and what it costs once the details are taken into account.

The new model at a glance

Claude Sonnet 5 (API identifier claude-sonnet-5) is positioned as “the best combination of speed and intelligence”. Its main characteristics:

A one-million-token context window by default, with no smaller variant.
128k output tokens maximum.
Adaptive thinking enabled by default.
The same set of tools and features as Sonnet 4.6.

Anthropic presents Sonnet 5 as a step up in capability, with the most noticeable progress on development and automated (agentic) tasks. It’s also an option for workloads that need more than Sonnet 4.6 without stepping up to a more expensive Opus-class model.

What actually changes

Adaptive thinking on by default

On Sonnet 4.6, a request with no thinking parameter ran without thinking. On Sonnet 5, the same request now triggers adaptive thinking, meaning the model adjusts its reasoning effort on its own based on how hard the task is. To turn it off, you now have to say so explicitly.

One thing to watch: the max_tokens limit covers thinking and the answer combined. Jobs that used to run without thinking therefore need that value revisited, otherwise the answer can be truncated.
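As a rough illustration of that adjustment, here is a minimal sketch that adds headroom for thinking on top of the expected answer length. The function name and the default headroom are my own assumptions, not Anthropic recommendations, and I'm reading the 128k output ceiling as 128,000 tokens.

```python
MAX_OUTPUT_TOKENS = 128_000  # output ceiling quoted in this article, read as 128,000


def size_max_tokens(expected_answer_tokens: int, thinking_headroom: int = 8_000) -> int:
    """Return a max_tokens value that leaves room for thinking plus the answer.

    thinking_headroom is a hypothetical allowance; tune it from observed usage.
    """
    if expected_answer_tokens <= 0 or thinking_headroom < 0:
        raise ValueError("expected_answer_tokens must be > 0 and thinking_headroom >= 0")
    # Never ask for more than the model can emit.
    return min(expected_answer_tokens + thinking_headroom, MAX_OUTPUT_TOKENS)


print(size_max_tokens(2_000))    # 10000
print(size_max_tokens(125_000))  # 128000
```

The right headroom depends on the workload, so measure a few real requests before settling on a figure.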

A new tokenizer

This is the most important change for your budget. Sonnet 5 uses a new tokenizer: for the same text, it produces roughly 30% more tokens than Sonnet 4.6.

It isn’t an API change, and no code needs to be modified. But everything measured in tokens is affected: the usage counters, how much text the context window actually holds, and above all the cost of an equivalent request. I’ll come back to this below, because that’s where the real price hides.
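To make the effect on budgets and context concrete, here is a small sketch that converts token counts between the two tokenizers. It assumes the 30% figure applies uniformly, which is a simplification: the real ratio will vary with language and content. The function names are illustrative.

```python
TOKEN_INFLATION = 1.30  # roughly 30% more tokens for the same text (figure from this article)


def rescale_budget(old_token_count: int, inflation: float = TOKEN_INFLATION) -> int:
    """Estimate the Sonnet 5 token count of a prompt measured on Sonnet 4.6."""
    return round(old_token_count * inflation)


def equivalent_old_tokens(new_token_budget: int, inflation: float = TOKEN_INFLATION) -> int:
    """Express a Sonnet 5 token budget as the Sonnet 4.6 tokens holding the same text."""
    return int(new_token_budget / inflation)


# A prompt counted at 10,000 tokens on Sonnet 4.6:
print(rescale_budget(10_000))            # 13000

# How much "old-tokenizer text" fits in a one-million-token window:
print(equivalent_old_tokens(1_000_000))  # 769230
```

In other words, a one-million-token window holds roughly the text that about 770,000 Sonnet 4.6 tokens would have covered.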

Real-time cybersecurity safeguards

Sonnet 5 is the first Sonnet-class model with real-time cybersecurity safeguards. Requests involving prohibited or high-risk topics may be refused. The refusal isn’t a technical error: it comes back as a successful response with a dedicated stop reason, to be handled on the application side.
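On the application side, that means checking the stop reason before using the text. The sketch below works on a response already parsed into a plain dictionary. This article doesn't name the stop reason, so the value "refusal" is an assumption to confirm against the API reference, and the helper name is mine.

```python
from typing import Optional

REFUSAL_STOP_REASON = "refusal"  # assumed value; confirm it in the API reference


def extract_text(response: dict) -> Optional[str]:
    """Return the answer text, or None when the model declined the request."""
    if response.get("stop_reason") == REFUSAL_STOP_REASON:
        return None  # a successful response, but nothing to show the user
    # Concatenate the text blocks of the response, ignoring other block types.
    return "".join(
        block.get("text", "")
        for block in response.get("content", [])
        if block.get("type") == "text"
    )


answered = {"stop_reason": "end_turn", "content": [{"type": "text", "text": "Done."}]}
declined = {"stop_reason": "refusal", "content": []}

print(extract_text(answered))  # Done.
print(extract_text(declined))  # None
```

Returning None rather than raising keeps the refusal out of your error monitoring, which matches the fact that the request itself succeeded.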

Sonnet 5 pricing

The price per token is unchanged compared with Sonnet 4.6. But Anthropic is applying a reduced launch rate until 31 August 2026, before reverting to the standard rate on 1 September 2026.

Period	Input (per million tokens)	Output (per million tokens)
Launch (until 31 August 2026)	$2	$10
Standard (from 1 September 2026)	$3	$15

Prompt caching and the Batch API follow the same logic. At the launch rate, a cache read costs $0.20 per million tokens, and the Batch API (asynchronous processing, 50% discount) drops to $1 input and $5 output.
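The derived rates can be recomputed from the base rates, as in the sketch below. The 50% batch discount is stated above. The 10% cache-read ratio is only implied by the $0.20 against $2 launch figures, so treat it, and its extension to the standard rate, as an assumption.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Rates:
    input_per_m: float   # dollars per million input tokens
    output_per_m: float  # dollars per million output tokens


LAUNCH = Rates(2.0, 10.0)    # until 31 August 2026
STANDARD = Rates(3.0, 15.0)  # from 1 September 2026

BATCH_DISCOUNT = 0.5    # stated above
CACHE_READ_RATIO = 0.1  # assumption, implied by $0.20 cache read vs $2 input


def batch(rates: Rates) -> Rates:
    """Apply the Batch API discount to both input and output."""
    factor = 1 - BATCH_DISCOUNT
    return Rates(rates.input_per_m * factor, rates.output_per_m * factor)


def cache_read(rates: Rates) -> float:
    """Price of a cache read per million tokens, rounded to avoid float noise."""
    return round(rates.input_per_m * CACHE_READ_RATIO, 4)


print(batch(LAUNCH))         # Rates(input_per_m=1.0, output_per_m=5.0)
print(cache_read(LAUNCH))    # 0.2
print(batch(STANDARD))       # Rates(input_per_m=1.5, output_per_m=7.5)
```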

Sonnet 5 against Opus 4.8 and Haiku 4.5

This is where the choice really gets made. Sonnet 5 sits in the middle of the range, between the fastest model and the most powerful one.

Model	Input / Output (per M tokens)	Context	Max output	Positioning
Opus 4.8	$5 / $25	1M	128k	Complex reasoning, agentic coding
Sonnet 5	$3 / $15 ($2 / $10 at launch)	1M	128k	Balanced speed and intelligence
Haiku 4.5	$1 / $5	200k	64k	Fastest, simple tasks

Figure: price per million tokens, input and output, at standard rates, for Opus 4.8, Sonnet 5 and Haiku 4.5. Sonnet 5 has an introductory rate of $2 / $10 until 31 August 2026.

In practice: at standard rates, Sonnet 5 costs 40% less than Opus 4.8 on both input and output ($3 / $15 against $5 / $25), while offering the same one-million-token context window. Haiku 4.5 remains the economical choice for simple, high-volume tasks, but caps out at 200k tokens of context.
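To put the rate card in terms of a monthly bill, here is a quick sketch using the standard rates from the table. The workload (10 million input tokens, 2 million output tokens) is made up, and the comparison assumes each model bills the same token counts for the same work, which won't hold exactly when tokenizers differ.

```python
# Standard rates from the table above: (input, output) in dollars per million tokens.
RATES = {
    "Opus 4.8": (5.0, 25.0),
    "Sonnet 5": (3.0, 15.0),
    "Haiku 4.5": (1.0, 5.0),
}


def workload_cost(model: str, input_tokens: int, output_tokens: int) -> float:
    """Cost in dollars of a workload at the model's standard rates."""
    input_rate, output_rate = RATES[model]
    return (input_tokens * input_rate + output_tokens * output_rate) / 1_000_000


# Hypothetical monthly volume: 10M input tokens, 2M output tokens.
for model in RATES:
    print(f"{model}: ${workload_cost(model, 10_000_000, 2_000_000):.2f}")
# Opus 4.8: $100.00
# Sonnet 5: $60.00
# Haiku 4.5: $20.00
```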

The real cost: don’t forget the tokenizer

Here’s the trap to avoid. The price per token is identical to Sonnet 4.6, but the new tokenizer generates roughly 30% more tokens for the same text. An identical request can therefore cost more, even though the rate card hasn’t moved.

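The launch rate ($2 / $10) more than offsets this effect until 31 August 2026: a rate cut of one third combined with roughly 30% more tokens still leaves an equivalent request about 13% cheaper than on Sonnet 4.6. After that date, at the standard rate ($3 / $15), the same request costs roughly 30% more, so you need to factor the higher token count into your estimates. The right reflex: recount your prompts with the model’s token-counting tool rather than reusing figures measured on a previous version.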
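Here is the same reasoning as a worked example for a single request. The request size (8,000 input and 1,000 output tokens, as measured on Sonnet 4.6) is invented, the 30% inflation is applied uniformly to input and output, and thinking tokens are left out, so read the results as orders of magnitude rather than a quote.

```python
TOKEN_INFLATION = 1.30  # roughly 30% more tokens for the same text

# (input, output) in dollars per million tokens, from the pricing tables above.
SONNET_46 = (3.0, 15.0)
SONNET_5_LAUNCH = (2.0, 10.0)
SONNET_5_STANDARD = (3.0, 15.0)


def request_cost(input_tokens, output_tokens, rates, inflation=1.0):
    """Cost in dollars of one request, scaling token counts by the tokenizer factor."""
    input_rate, output_rate = rates
    billed_in = round(input_tokens * inflation)
    billed_out = round(output_tokens * inflation)
    return (billed_in * input_rate + billed_out * output_rate) / 1_000_000


old = request_cost(8_000, 1_000, SONNET_46)
launch = request_cost(8_000, 1_000, SONNET_5_LAUNCH, TOKEN_INFLATION)
standard = request_cost(8_000, 1_000, SONNET_5_STANDARD, TOKEN_INFLATION)

print(f"Sonnet 4.6:          ${old:.4f}")       # $0.0390
print(f"Sonnet 5 (launch):   ${launch:.4f}")    # $0.0338
print(f"Sonnet 5 (standard): ${standard:.4f}")  # $0.0507
```

For real estimates, replace the inflation factor with counts obtained from the token-counting tool on your own prompts.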

For developers: three migration points

Moving from Sonnet 4.6 to Sonnet 5 is mostly a drop-in replacement (you change the model identifier), but three behaviours have changed, and the first two now return errors:

Manual extended thinking is removed: it now returns an error. Use adaptive thinking and the effort parameter instead.
Sampling parameters (temperature, top_p, top_k) can no longer be changed: any non-default value returns an error. Steer the model through system instructions.
Token budgets need reviewing, because of the new tokenizer.
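Before switching the identifier, it can help to scan your existing request parameters for the first two points. The sketch below is a simple checker over a plain dictionary of request parameters. It assumes manual extended thinking is recognisable by a budget_tokens key inside the thinking setting, as on earlier models, and it flags any sampling parameter that is present rather than trying to guess the defaults. The function name is mine.

```python
from typing import List

SAMPLING_PARAMS = ("temperature", "top_p", "top_k")


def migration_issues(params: dict) -> List[str]:
    """List the request parameters likely to be rejected after the migration."""
    issues = []
    # Conservative check: flag the parameter whenever it is set at all.
    for name in SAMPLING_PARAMS:
        if name in params:
            issues.append(f"remove '{name}': sampling parameters can no longer be changed")
    # Assumed shape of a manual thinking request: a dict carrying budget_tokens.
    thinking = params.get("thinking")
    if isinstance(thinking, dict) and "budget_tokens" in thinking:
        issues.append("replace manual extended thinking with adaptive thinking and effort")
    return issues


legacy_request = {
    "model": "claude-sonnet-5",
    "max_tokens": 4_000,
    "temperature": 0.2,
    "thinking": {"type": "enabled", "budget_tokens": 2_000},
}

for issue in migration_issues(legacy_request):
    print(issue)
```

The third point, token budgets, can't be checked statically: it needs a recount of your actual prompts.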

Which model to choose

The rule still holds: Haiku for simple, high-volume tasks, Sonnet for most production workloads, Opus for the most complex reasoning. Sonnet 5 reinforces that central position by offering more capability at the same price, which makes it a reasonable default for most integrations.

If you’re considering integrating a Claude model into a product or a website, choosing the model and keeping token costs under control are among the decisions that weigh on the real budget. Let’s talk: I’ll help you scope the integration and size the spend before you commit.