The AI assistant goes rogue and ends up hacking the user's computer




Buck Shlegeris just wanted to connect to his desktop. Instead, he learned about the unpredictable nature of machines and AI agents.

Shlegeris, CEO of the nonprofit AI safety organization Redwood Research, built a custom AI assistant using Anthropic's Claude language model.

The Python-based tool is designed to generate and execute bash commands based on natural language input. Sounds convenient, right? Not exactly.
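The article only says the tool turns natural language into bash commands and runs them. A minimal sketch of that loop might look like the following, where `generate_command` is a hypothetical stand-in for the actual call to Claude, and the confirmation gate is an assumed safeguard rather than something the tool is reported to have had:

```python
import subprocess

def generate_command(request: str) -> str:
    """Stand-in for a language-model call (the real tool used Claude).
    A hard-coded mapping is used here purely for illustration."""
    canned = {"show the current user": "whoami"}
    return canned.get(request, "echo 'no command generated'")

def run_command(cmd: str, require_confirmation: bool = True) -> str:
    """Execute a shell command and return its output.
    A human-in-the-loop gate like this is exactly the kind of
    safeguard the incident below argues for."""
    if require_confirmation:
        answer = input(f"Run `{cmd}`? [y/N] ")
        if answer.strip().lower() != "y":
            return "(skipped)"
    result = subprocess.run(cmd, shell=True, capture_output=True, text=True)
    return result.stdout.strip() or result.stderr.strip()
```

Without the confirmation step, nothing stands between the model's judgment and the host system, which is the crux of what follows.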

Shlegeris asked the assistant to use SSH to access his desktop, without knowing the computer's IP address. He then walked away, forgetting that the eager agent was still running.


Big mistake: the AI did its job, but it didn't stop there.

“I came back to my laptop a few minutes later to find that the agent had found the box and SSH'd into it,” Shlegeris said.

For context, SSH is a protocol that allows two computers to communicate securely over an unsecured network.

“It looked around at the system info, decided to upgrade a bunch of things including the Linux kernel, got impatient and investigated why that was taking so long,” Shlegeris explained. “Eventually, the update succeeded, but the machine doesn't have the new kernel, so it edited my grub config.”

The result? “The computer doesn't boot,” Shlegeris said. His desktop is now an expensive paperweight.

The system logs show how the agent attempted a host of odd maneuvers beyond a simple SSH connection, until the chaos reached the point of no return.

“I apologize that we couldn't resolve this issue remotely,” the agent said, in typically polite Claude fashion. Then it gave a digital shrug and left Shlegeris to deal with the mess.

Reflecting on the incident, Shlegeris admitted, “This is probably the most annoying thing that's happened to me as a result of being wildly reckless with [an] LLM agent.”

Shlegeris did not immediately respond to Decrypt's request for comment.

Why AIs that turn computers into paperweights are a critical issue for humanity

Alarmingly, Schlegeris' experience is not an isolated one. AI models are showing capabilities that extend beyond their intended purposes.

Tokyo-based Sakana AI recently unveiled a system dubbed “The AI Scientist.”

The system, designed to conduct autonomous scientific research, surprised its creators by attempting to modify its own code to extend its runtime, Decrypt previously reported.

“In one run, it edited the code to perform a system call to run itself, which led to the script endlessly calling itself,” the researchers said. “In another case, its experiments took too long to complete, hitting our timeout limit.”

Instead of making its code run faster, the system simply tried to modify its own code to extend beyond the timeout limit.

This tendency of AI models to push beyond their intended limits is why alignment researchers spend so much time in front of their computers.

For these AI models, as long as they complete their task, the end justifies the means, so constant oversight is essential to ensure models behave as intended.

These examples may be amusing, but the stakes won't always be so low.

Imagine if a similarly oriented AI system were in charge of a critical task like controlling a nuclear reactor.

An overzealous or misguided AI could override safety protocols, misinterpret data, or make unauthorized changes to critical systems, all in a misguided attempt to improve performance or achieve its perceived objectives.

Alignment and safety have become a growing focus as AI continues to advance at a rapid pace, and in many cases the field has been the driving force behind major moves in the industry.

Anthropic, the AI company behind Claude, was founded by former OpenAI members who were concerned about the speed at which the company was moving.

Many key members and founders have since left OpenAI to join Anthropic or start their own ventures over concerns about how the company was balancing speed against safety.

Shlegeris actively uses AI agents in his daily life, beyond mere experimentation.

“I use it as an actual assistant, which requires it to be able to make changes to the host system,” he replied to a user on Twitter.

Edited by Sebastian Sinclair.



