Anthropic has released a new Claude 3.5 Sonnet: a model smart enough to control your computer


Reddit users saw it first: Claude had suddenly become smarter and more capable. Now we know why. Anthropic has released significant improvements to its AI models, including an upgraded Claude 3.5 Sonnet and an all-important upgrade to the lightweight Haiku model.

The most surprising update of all: the AI can now operate a computer directly, moving the cursor, scrolling through pages, and clicking buttons just like humans do.

In a video demonstration, Anthropic researcher Sam Ringer showed how Claude can scroll through a spreadsheet, search for company information in a CRM, and then populate the fields of a form on an external website.

“Available today on the API, developers can direct Claude to use computers the way people do: by looking at a screen, moving a cursor, clicking buttons, and typing text. Claude 3.5 Sonnet is the first frontier AI model to offer computer use in public beta,” Anthropic said in an official announcement earlier today. “We're releasing computer use early for feedback from developers, and expect the capability to improve rapidly over time.”

It looks like Anthropic (or its button-pushing AIs? Jk.) released the model before making the announcement. For hours, the Claude and Anthropic subreddits were flooded with people trying to figure out what the hell was going on because their AI was doing such a great job: users reported it was fast, accurate, and, surprisingly, no longer overly apologetic.

“Claude is back, much better. He just gets you, he responds with understanding instead of a dead, lifeless response,” NextGenAIUser said in a Reddit post. “Had been stuck for hours on a code issue using o1-mini and o1-preview, which was getting worse and worse responses. Gave the problem to Claude at exactly the same speed and never had a problem,” Roth_Skyfire said in another comment.

And they were right. Anthropic reported that the upgraded Claude 3.5 Sonnet's coding score rose from 33.4% to 49% on the SWE-bench Verified test, beating competitors like OpenAI's o1-preview. That's not just a little bump. Every single metric reported by Anthropic shows that the new Claude 3.5 Sonnet is significantly better than the original model.

Image: Anthropic

But here's where things get really interesting. The improved Sonnet is not only clever; now it can control your PC. Anthropic calls this new feature “computer use,” and it's currently in public beta. The way it works: you give Claude access to your desktop and let it run. The AI then begins to act like a human using your computer over a remote desktop, moving the cursor, pressing keys, typing commands, and entering text into forms and fields.

However, this feature is only available through the API, so it's not something end users will be able to get their hands on anytime soon.

Anthropic has trained Claude to visually interpret what's happening on your screen. Developers can command it to perform tasks such as filling out forms, browsing web pages, or using software applications. It's a bit like letting your AI sit in front of your computer and do your work for you, except it's tireless and (hopefully) less erratic than we humans tend to be.
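
For developers wondering what that looks like in practice, the loop described above (look at the screen, pick an action, carry it out, look again) can be sketched in a few lines of Python. This is a toy illustration, not Anthropic's API: `FakeDesktop`, `scripted_model`, `Action`, and `run_agent` are all hypothetical names, and the "model" here simply replays a fixed plan instead of calling a real service.

```python
from dataclasses import dataclass, field
from typing import Callable, List


@dataclass
class Action:
    """One step the agent wants to take (hypothetical structure)."""
    kind: str          # "click", "type", or "done"
    x: int = 0
    y: int = 0
    text: str = ""


@dataclass
class FakeDesktop:
    """Stand-in for a real screen: it only records what was done to it."""
    log: List[Action] = field(default_factory=list)

    def screenshot(self) -> str:
        # A real harness would return image data; a text summary is enough here.
        return f"screen after {len(self.log)} actions"

    def apply(self, action: Action) -> None:
        self.log.append(action)


def scripted_model(task: str, screenshot: str, step: int) -> Action:
    """Hypothetical stand-in for the model: replays a fixed three-step plan."""
    plan = [
        Action("click", x=120, y=80),      # focus a form field
        Action("type", text="Acme Corp"),  # enter the company name
        Action("done"),                    # signal that the task is finished
    ]
    return plan[min(step, len(plan) - 1)]


def run_agent(task: str, desktop: FakeDesktop,
              model: Callable[[str, str, int], Action],
              max_steps: int = 10) -> int:
    """Observe, decide, act, repeat. Returns the number of actions performed."""
    for step in range(max_steps):
        action = model(task, desktop.screenshot(), step)
        if action.kind == "done":
            return step
        desktop.apply(action)
    return max_steps  # gave up: the step budget ran out


if __name__ == "__main__":
    desktop = FakeDesktop()
    steps = run_agent("fill in the company name", desktop, scripted_model)
    print(steps, [a.kind for a in desktop.log])
```

The `max_steps` cap is a design choice of this sketch: an agent that can misread a screen needs a hard limit so it cannot loop forever.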

The feature is in beta because it still stumbles on some basics: scrolling and zooming can cause problems. That's why Anthropic is keeping a close eye on things, storing screenshots for at least 30 days and running safety checks to catch any suspicious behavior.

The company's paranoia is well-founded. A few months ago, Microsoft introduced a feature called “Recall” that lets Copilot+ PCs take screenshots of users' screens to make the AI more helpful. There was so much backlash that Microsoft had to delay the rollout after Recall was branded “spyware” and authorities began looking into it.

But Anthropic is made up of good people, and they promise to be different. “We found that the updated Claude 3.5 Sonnet, including its new computer use skill, remains at AI Safety Level 2, that is, it doesn't require a higher standard of safety and security measures than those we currently have in place,” the research team said.

Companies like Replit are integrating Claude's computer use feature to help automate app evaluation, while The Browser Company is experimenting with its ability to streamline web-based workflows. These early adopters are looking for ways to have Claude carry out tasks that take dozens, if not hundreds, of manual steps.

Also, Anthropic's budget-friendly model, Claude 3.5 Haiku, is now on par with the company's previous flagship, Claude 3 Opus. Yet it runs at low cost and with very low latency, which makes it more accessible without sacrificing much performance.

Claude 3.5 Haiku is particularly strong at coding and tool use, scoring 40.6% on SWE-bench Verified. This puts it ahead of pricier models on the market, meaning builders on a budget don't have to compromise on quality.

Claude 3.5 Haiku will be available in November.
