Anthropic unveils new Claude AI models and ‘computer control’

Date:

Anthropic has announced upgrades to its AI portfolio, including an enhanced Claude 3.5 Sonnet model and the introduction of Claude 3.5 Haiku, alongside a “computer control” feature in public beta.

The upgraded Claude 3.5 Sonnet demonstrates substantial improvements across all metrics, with particularly notable advances in coding capabilities. The model achieved an impressive 49.0% on the SWE-bench Verified benchmark, surpassing all publicly available models, including OpenAI’s offerings and specialist coding systems.

In a pioneering development, Anthropic has introduced computer use functionality that enables Claude to interact with computers similarly to humans: viewing screens, controlling cursors, clicking, and typing. This capability, currently in public beta, marks Claude 3.5 Sonnet as the first frontier AI model to offer such functionality.

Several major technology firms have already begun implementing these new capabilities.

“The upgraded Claude 3.5 Sonnet represents a significant leap for AI-powered coding,” reports GitLab, which noted up to 10% stronger reasoning across use cases without additional latency.

The new Claude 3.5 Haiku model, set for release later this month, matches the performance of the previous Claude 3 Opus whilst maintaining cost-effectiveness and speed. It notably achieved 40.6% on SWE-bench Verified, outperforming many competitive models including the original Claude 3.5 Sonnet and GPT-4o.

Model benchmarks comparing new Claude AI models from Anthropic.
(Credit: Anthropic)

Regarding computer control capabilities, Anthropic has taken a measured approach, acknowledging current limitations whilst highlighting potential. On the OSWorld benchmark, which evaluates computer interface navigation, Claude 3.5 Sonnet achieved 14.9% in screenshot-only tests, significantly outperforming the next-best system’s 7.8%.

The developments have undergone rigorous safety evaluations, with pre-deployment testing conducted in partnership with both the US and UK AI Safety Institutes. Anthropic maintains that the ASL-2 Standard, as detailed in their Responsible Scaling Policy, remains appropriate for these models.

(Image Credit: Anthropic)

See also: IBM unveils Granite 3.0 AI models with open-source commitment

Want to learn more about AI and big data from industry leaders? Check out AI & Big Data Expo taking place in Amsterdam, California, and London. The comprehensive event is co-located with other leading events including Intelligent Automation Conference, BlockX, Digital Transformation Week, and Cyber Security & Cloud Expo.

Explore other upcoming enterprise technology events and webinars powered by TechForge here.

Tags: ai, anthropic, artificial intelligence, claude, haiku, llm, models, sonnet

LEAVE A REPLY

Please enter your comment!
Please enter your name here

Share post:

spot_imgspot_img

Popular

More like this
Related

Alibaba Marco-o1: Advancing LLM reasoning capabilities

Alibaba has announced Marco-o1, a large language model (LLM) designed to tackle both conventional and open-ended problem-solving tasks.

New AI training techniques aim to overcome current challenges

OpenAI and other leading AI companies are developing new training techniques to overcome limitations of current methods. Addressing unexpected delays and complications in the development of larger, more powerful language models, these fresh techniques focus on human-like behaviour to teach algorithms to ‘think.

Ai2 OLMo 2: Raising the bar for open language models

Ai2 is releasing OLMo 2, a family of open-source language models that advances the democratisation of AI and narrows the gap between open and proprietary solutions.

Generative AI use soars among brits, but is it sustainable?

A survey by CloudNine PR shows that 83% of UK adults are aware of generative AI tools, and 45% of those familiar with them want companies to be transparent about the environmental costs associated with the technologies.