Claude makes it four: Opus 4 and Sonnet 4 arrive and change everything
Anthropic aims to turn artificial intelligence into a true working partner: more precise, more autonomous, more human.
3' min read
3' min read
The race to have the best performing gen Ai does not stop. Anthropic relaunches the challenge in the world of artificial intelligence with Claude Opus 4 and Claude Sonnet 4, the evolution of its proven models designed to tackle the most complex tasks - from software development to content generation and multi-step reasoning - marking a concrete leap forward towards its stated goal: to turn AI into a true virtual collaborator.
With Claude 4, Anthropic is aiming high: 'we want to set a new standard for man-machine collaboration'. And that is not just a claim. The new models are able to support prolonged activities, integrate external tools, maintain information consistency and solve problems on a large scale. In short: more reliable, more intelligent, more useful.
Claude Opus 4: the AI that programmes (better than most humans)
.Opus 4 is the flagship model and, according to Anthropic, the best coding model in the world. The benchmarks speak for themselves with 72.5% on SWE-bench Verified and 43.2% on Terminal-bench, results that place it at the top of international rankings for real programming tasks. In testing, he managed to work autonomously on a complex project for almost seven consecutive hours. A feat that has impressed companies such as Rakuten, Replit and Cursor, who describe it as a tool capable of writing code on multiple files, fixing bugs, following complex instructions and maintaining consistency on complex projects.
Claude Sonnet 4: controlled power, refined thinking
.Claude Sonnet 4 also makes a quantum leap over its predecessor, version 3.7. It scores 72.7 per cent on SWE-bench, responds more accurately to instructions, handles codebases more effectively and solves complex problems with more refined reasoning. GitHub has already integrated it into its new Copilot agent, while companies such as Sourcegraph, iGent and Augment Code emphasise its positive impact on code quality, navigation and autonomy in multifunctional tasks.
Both models are hybrid, i.e. capable of providing instantaneous answers or of activating a prolonged thinking mode, so-called 'extended thinking'. During this phase, models can access external tools, such as web searches or local files, alternating reasoning and action in a fluid and coordinated manner. Not only that, they can use multiple tools in parallel, improve responses and build a persistent memory. When authorised by developers, they are able to save and update relevant information, maintaining cognitive continuity over articulated projects and over time.

