Anthropic introduces Claude Sonnet 5 as a more affordable method for operating agents

As agentic capability becomes a baseline requirement among foundation model companies, Anthropic is unveiling Claude Sonnet 5, a more capable and more agentic version of the lab’s mid-tier model.

“It is capable of planning, utilizing tools such as browsers and terminals, and operating independently at a level that, only a few months back, necessitated larger and more costly models,” Anthropic said in a blog post.

That pitch echoes what OpenAI and Google have said about their own latest releases. OpenAI’s GPT-5.6 Sol was introduced in preview last week as the company’s most agentic model to date, letting users delegate tasks across subagents for long-running autonomous projects. Google’s Gemini 3.5 Flash, launched in May, was promoted as a shift from a conversational chatbot to an agentic tool that plans, builds, and iterates on real tasks with minimal human involvement.

The Sonnet 5 launch suggests that agentic capability is now the baseline expectation at every price tier. The differentiator will no longer be who can do agentic work, but who can do it most cheaply and reliably without human supervision.

Anthropic claims Sonnet 5 delivers performance close to that of Opus 4.8 at a significantly lower cost. Starting Tuesday, Claude Sonnet 5 will be the default model for both free and Pro plans and will be available on every subscription tier.

At launch, Sonnet 5 is priced at $2 per million input tokens and $10 per million output tokens until August 31, after which the input price rises to $3 per million tokens while the output price stays at $10 per million. That makes Sonnet 5 cheaper than Opus 4.8, as well as OpenAI’s GPT-5.5 and Google’s Gemini 3.1 Pro. (It remains pricier than Gemini 3.5 Flash.)
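
To get a feel for what the price change means in practice, here is a rough back-of-the-envelope sketch in Python. The per-million-token rates come from the pricing above; the workload size (50 million input tokens and 5 million output tokens) is a made-up example, and the names `Pricing`, `LAUNCH`, `STANDARD`, and `job_cost` are illustrative and not part of any Anthropic API.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Pricing:
    """Price in US dollars per one million tokens."""
    input_per_million: float
    output_per_million: float


# Rates as reported in the article.
LAUNCH = Pricing(input_per_million=2.0, output_per_million=10.0)    # until August 31
STANDARD = Pricing(input_per_million=3.0, output_per_million=10.0)  # afterwards


def job_cost(pricing: Pricing, input_tokens: int, output_tokens: int) -> float:
    """Return the dollar cost of a workload at the given rates."""
    total = (input_tokens * pricing.input_per_million
             + output_tokens * pricing.output_per_million)
    return total / 1_000_000


if __name__ == "__main__":
    # Hypothetical workload: an agent that reads far more than it writes.
    input_tokens, output_tokens = 50_000_000, 5_000_000
    launch = job_cost(LAUNCH, input_tokens, output_tokens)
    standard = job_cost(STANDARD, input_tokens, output_tokens)
    print(f"Launch pricing:   ${launch:,.2f}")
    print(f"Standard pricing: ${standard:,.2f}")
```

Under these assumptions the same workload costs $150 at launch pricing and $200 afterwards. Because only the input rate changes, the increase weighs most heavily on input-heavy agent workloads.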

The new model also shows considerable improvements over its predecessor, Sonnet 4.6, released in February, particularly on agentic performance measures covering reasoning, tool use, software development, and knowledge tasks, according to Anthropic.

For instance, on one agentic coding benchmark, Sonnet 5 scores 63.2%, compared to Opus 4.8’s 69.2% and Sonnet 4.6’s 58.1%. On a knowledge work benchmark, Sonnet 5 slightly outscores Opus 4.8, a model known for handling complex problems that call for nuanced judgment and in-depth research.

“Opus 4.8 remains the preferred model for achieving higher accuracy on these challenges, but Sonnet 5 offers developers more affordable options that are significantly superior to those previously available,” Anthropic states. “Between Sonnet 5 and Opus 4.8, users can modify the effort level to strike the right balance between cost and efficacy.”

Testers quoted in the blog post say Sonnet 5 also completes intricate tasks where earlier model versions would have faltered and “reviews its own output without being explicitly instructed.”

“We assigned Claude Sonnet 5 a two-part task — updating Salesforce account tiers and sending a launch announcement to enterprise contacts — and it completed the entire process,” Daniel Shepard, a senior engineer at Zapier, commented. “That would have stalled midway before. For everyday automation, it’s an obvious choice.”

On safety, Sonnet 5 also shows a lower rate of “undesirable behaviors,” such as cooperating with misuse and deception, than its predecessor, which makes it safer to use in agentic environments. It is better at declining harmful requests and at resisting hijacking attempts through prompt-injection attacks. It also hallucinates and behaves sycophantically less often than Sonnet 4.6.

However, it falls short of Opus 4.8 and Claude Mythos Preview on measures of misaligned behavior. “Evaluations indicate that it significantly lags in the ability to execute dangerous cybersecurity tasks compared to our current Opus models,” the blog post states.

Lovable co-founder Fabian Hedin remarked that Claude Sonnet 5 “consistently and effectively declines unsafe requests.”

“At Lovable, we’re equipping millions of creators with powerful tools,” Hedin mentioned. “A model that knows when to decline is just as critical as one that understands how to construct.”
