'I had to RUN to my Mac mini like I was defusing a bomb': OpenClaw AI chose to 'speedrun' deleting Meta AI safety director's inbox due to a 'rookie error'

themachinestops@lemmy.dbzer0.com · edited 2 days ago

'I had to RUN to my Mac mini like I was defusing a bomb': OpenClaw AI chose to 'speedrun' deleting Meta AI safety director's inbox due to a 'rookie error'

PointyFluff@lemmy.ml · 1 day ago

First of all. BULLSHIT. Second. why would you give a bot write-access to your filesystem.

rumba@lemmy.zip · 1 day ago

The idea is you give it shell access. Say: use super coder agent bob johnson to write a thing that does x using this [framework], separate files by best practice for x y and z features, ask security agent OSO to look over the code and suggest changes, ask agent U.N.I.T to make unit tests, and when the code looks good, run through the unit tests. If anything fails, keep fixing and iterating until everything passes. Create a README.MD for everything that was done. Create a TODO.MD for any future suggestions.

I’m simplifying, but this actually works to an extent. Each of the agents keeps its context window small, the whole thing stays sane, and eventually it nets some project that works. The downside is you end up giving it quite a bit of leeway to get the job done, or you sit over it watching and authorizing its every move.

Kinda strange to see a safety director do that…

Regrettable_incident@lemmy.world · 1 day ago

And execs think we’re going to give these products our bank details and ask them to book flights and stuff?

BanMe@lemmy.world · 1 day ago

Two years ago: “They expect us to rely on this for code that actually compiles?”

So yeah in another year or two what you describe will be common, sure.

OpenClaw is like the insane libertarian cousin of all the AI products tho, it’s bizarre that people are using this in production scenarios considering how it behaves.

lemmydividebyzero@reddthat.com · 2 days ago

They released a version recently that fixed over 60 security vulnerabilities. All of them were high or critical.

How many more are there to find? Thousands?

Whoever uses this on a PC with anything useful on it, is absolutely insane.

TonyTonyChopper@mander.xyz · 1 day ago

Thousands

Since LLMs are a black box there are an unlimited number of security vulnerabilities

BreadstickNinja@lemmy.world · 1 day ago

The idea that they’ve already deployed this in production is absolutely insane.

Echo Dot@feddit.uk · 2 days ago

Yep that’s about the level of intelligence I would expect from Meta’s AI safety director.

Doing the one thing that you’re never supposed to do, letting an AI loose on anything sensitive.

For her next trick she’s going to run while holding scissors in one hand and a bottle of boiling acid in the other. What could go wrong.

LastYearsIrritant@sopuli.xyz · 2 days ago

I love how these models apologize like they mean it. It doesn’t mean it. It doesn’t feel bad, and it will do it again.

Apologies mean “I made a mistake and I learned from it so it won’t repeat.”

Sure, it claims it added more notes to its config, but if it ignored the rules before, what makes you think that new rules are going to change anything?

cv_octavio@piefed.ca · 19 hours ago

It doesn’t even want to ignore the rules. It doesn’t want anything. Just some math didn’t work out and a thing happened that wasn’t supposed to. It will absolutely happen again if it maths that way again too.

panda_abyss@lemmy.ca · 2 days ago

But it’s adding it to a text file that eats up a ton of tokens and routinely gets ignored!

BrianTheeBiscuiteer@lemmy.world · 2 days ago

That MEMORY.md file won’t do shit if the AI doesn’t read it.

I give it 2 hours before it stops reading it until prompted again.

bleistift2@sopuli.xyz · 2 days ago

Apologies mean “I made a mistake and I learned from it so it won’t repeat.”

I beg to differ. An apology means that you feel bad about harm inflicted upon others. To prove the point: You apologize when you’re late due to circumstances that are outside of your control. Or when you accidentally bump into someone on the bus when the driver slams the brakes.

sp3ctr4l@lemmy.dbzer0.com · 2 days ago

There are two kinds of apologies.

Customary, and Genuine.

They’re describing a genuine apology.

You’re describing a customary apology.

PancakesCantKillMe@lemmy.world · 2 days ago

“‘I’m sorry’ and ‘I apologize’ mean the same thing, except when you’re at a funeral”

Demetri Martin

atopi@piefed.blahaj.zone · 2 days ago

it is made to copy how humans write and speak

the AI had been scored on how well it learned from humans to sound sorry

Clent@lemmy.dbzer0.com · 2 days ago

They behave exactly as a child does when a parent forces an apology.

They have the words they’re expected to say, so they say them, but they don’t understand why, they definitely don’t mean it, and they lack the restraint to not do whatever they apologized for over and over.

frigge@lemmy.ml · 2 days ago

Apologies mean “I made a mistake and I learned from it so it won’t repeat.”

yeah enough humans don’t know that as well unfortunately. But yeah obviously LLMs don’t understand anything. That’s not how they work

prettybunnys@piefed.social · 2 days ago

Like an abusive relationship

fruitycoder@sh.itjust.works · 2 days ago

If anything, its context now includes that it makes mistakes, and details about them. The most likely output is to create the same mistakes again.

🌞 Alexander Daychilde 🌞@lemmy.world · 2 days ago

Apologies mean “I made a mistake and I learned from it so it won’t repeat.”

At best it might not make the same mistake again if that memory is in the current context. But more likely: It will not remember.

Although latest Gemini in particular has much more room for “remembering” things, still.

But “I made a mistake”? It is not self-aware in any way shape or form to the degree where “I made a mistake” carries any real meaning.

sp3ctr4l@lemmy.dbzer0.com · 1 day ago

But… but… it generates text that seems like a human wrote it!

Therefore it must be a human!

… A whole lot of humans are failing a reverse Turing test, just, fundamentally.

Zwuzelmaus@feddit.org · 2 days ago

Apologies mean “I made a mistake and I learned from it so it won’t repeat.”

If only some people meant it that way too!

Dultas@lemmy.world · 2 days ago

The S in OpenClaw stands for security.

LittleBorat3@lemmy.world · 2 days ago

The “I’m sorry” part is always great. I always wanted an apology from an LLM, not for it to work as specified 😆

It can be like your least competent colleague on roids

SaraTonin@lemmy.world · 1 day ago

“I promise it won’t happen again”

Really? Because you promised it wouldn’t happen in the first place. Now here we are…

panda_abyss@lemmy.ca · 2 days ago

If I was the director of AI safety, and I used AI to own and delete my inbox, I sure as shit would never tell a soul.

This is pure unbridled incompetence.

XLE@piefed.social · edited 2 days ago

The whole “AI safety” field is this incompetent. These are the people who will tell you AI is on the verge of creating a bioweapon, and then run random code in a command line. Completely and totally unserious.

Eufalconimorph@discuss.tchncs.de · 19 hours ago

The “AI safety” field is about two things: marketing AIs as so powerful that they’re risky to use but riskier to get left behind by competitors using, and keeping AIs from doing so much brand damage that stock price suffers. This story is about marketing an AI as powerful.

panda_abyss@lemmy.ca · 2 days ago

I don’t know what the hell has happened, but some of these people are basically human jellyfish. Big tech is full of them now.

No thought enters their mind, but they dodge the layoffs and the PIPs and get promoted like this.

I don’t fucking get it.

GreenBeard@lemmy.ca · 2 days ago

It’s just the natural progression of a disease that spreads outwards from Management. The bosses want yes-men, not people capable of independent thought.

SkyeStarfall@lemmy.blahaj.zone · 2 days ago

In other words, it’s why authoritarianism always fails

And capitalism is very specifically not a democratic economic system. There’s a hierarchy. The owners are the ones in power

criss_cross@lemmy.world · 2 days ago

If I was a director of AI safety I wouldn’t let OpenClaw within 100 feet of anything. Let alone my work machine.

LiveLM@lemmy.zip · 2 days ago

If the Director of AI Safety is plugging code with extensive security flaws documented and reported into their real life inbox, imagine the Average Joe.

Wispy2891@lemmy.world · 2 days ago

Especially your work mailbox, which is a prime target for hackers and scammers, and where a hidden prompt for prompt injection isn’t that impossible.

This IMHO is a fireable offense, not a funny anecdote

sp3ctr4l@lemmy.dbzer0.com · 2 days ago

Yep.

These people are all fucking complete clowns.

It would be one thing if they were just evil, but they have such an inflated view of themselves that they have no self awareness.

Fucking corpos man.

Strider@lemmy.world · 2 days ago

Which is par for the course on current ‘AI’.

violentfart@lemmy.world · 2 days ago

They wanted to “eat their own dog food” but it’s closer to “eating their own dog shit”

Zwuzelmaus@feddit.org · 2 days ago

If I was the director of AI safety, […] would never tell a soul.

As a director of something, you are kind of a public person. No way to just not tell.

panda_abyss@lemmy.ca · 2 days ago

Okay but this is like the armoury master person shooting their own foot with a loaded gun when they were juggling guns.

AbidanYre@lemmy.world · 2 days ago

Lee Paige has entered the chat

panda_abyss@lemmy.ca · 2 days ago

Remarkably well composed after shooting himself

Zwuzelmaus@feddit.org · 2 days ago

Then the public wants to know where that hole in the director’s foot comes from.

CmdrShepard49@sh.itjust.works · 2 days ago

How would the public find out that this woman’s email inbox got deleted though?

Zwuzelmaus@feddit.org · 2 days ago

Admins exist, and they talk.

FireWire400@lemmy.world · 2 days ago

Joke’s on you; she probably still earns more money than most of us…

pinball_wizard@lemmy.zip · 2 days ago

And has fewer worthless emails in her inbox.

FireWire400@lemmy.world · edited 2 days ago

Probably mostly invites to boring meetings where she’s “optional”

MoogleMaestro@lemmy.zip · 2 days ago

The world’s first opt-in computer worm. 🐛 🪱

alekwithak@lemmy.world · 2 days ago

MoogleMaestro@lemmy.zip · 2 days ago

No way, not my buddy!

ZeDoTelhado@lemmy.world · 2 days ago

At least Bonzi was funny, unlike OpenClaw

Fizz@lemmy.nz · 2 days ago

The funniest part is this person’s job is AI safety.

Echo Dot@feddit.uk · 2 days ago

It’s Meta, her experience is probably an MBA and she did a side course in “computing” where they learnt how to use Excel.

Chulk@lemmy.ml · 2 days ago

Yeah, I personally wouldn’t be announcing this failure to the world if I were in her position. I don’t think you could torture it out of me lmao

CmdrShepard49@sh.itjust.works · 2 days ago

Maybe they want to get this out there as cover if/when some regulator somewhere decides to subpoena records from the AI safety director.

KokoSabreScruffy@lemmy.world · 2 days ago

Maybe they are meant to protect the AI

Matty_r@programming.dev · 2 days ago

Maybe they’ll take their job more seriously now?

NotASharkInAManSuit@lemmy.world · 2 days ago

Thanks, I needed a laugh.

renzhexiangjiao@piefed.blahaj.zone · 2 days ago

you can, like… enforce this rule programmatically? you don’t have to say “pretty please” to an AI? basically, when the AI requests some potentially unwanted action (like deleting an email), the request goes through a proxy that asks the human for confirmation. Also, you can set up a safe word in the chat interface to act as a kill switch. I thought these were the ABCs of AI safety, but apparently they are foreign concepts to this “safety director”
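A rough sketch of what that proxy could look like, in Python. Everything here is made up for illustration: the action names, the `STOP` safe word, and the `ConfirmationProxy` class are assumptions, not anything OpenClaw actually ships. The point is only that the check lives in code the model cannot talk its way past.

```python
from dataclasses import dataclass, field
from typing import Callable

# Hypothetical action names; a real agent framework would define its own.
DESTRUCTIVE_ACTIONS = {"delete_email", "delete_file", "send_email"}
KILL_WORD = "STOP"  # hypothetical safe word


@dataclass
class ConfirmationProxy:
    execute: Callable[[str, dict], str]  # performs the real action
    ask_human: Callable[[str], str]      # returns the human's typed reply
    halted: bool = False
    log: list = field(default_factory=list)

    def request(self, action: str, args: dict) -> str:
        # Once the kill switch is engaged, nothing gets through.
        if self.halted:
            return "refused: kill switch engaged"
        if action in DESTRUCTIVE_ACTIONS:
            prompt = f"Agent wants to {action} {args}. Allow? [y/N]"
            reply = self.ask_human(prompt).strip()
            if reply == KILL_WORD:
                self.halted = True
                return "refused: kill switch engaged"
            if reply.lower() != "y":
                return "refused: human declined"
        # Only approved or non-destructive actions reach the executor.
        self.log.append((action, args))
        return self.execute(action, args)
```

In real use `ask_human` would be something like `input` and `execute` would wrap the actual mail or filesystem client; the agent only ever gets handed `proxy.request`.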

zqps@sh.itjust.works · edited 13 hours ago

The people who internalize this would never engage with a chatbot in this way in the first place. To them this is another intelligence they’re conversing with, where you get what you need by following social decorum, and enforcing your will amounts to abuse.

sp3ctr4l@lemmy.dbzer0.com · 1 day ago

Exactly.

They literally, fundamentally, don’t get it.

They think it’s a person.

It’s not.

It’s a simulation of a person, made of code and hardware, not meat and chemical receptors.

…There’s a recurring theme (or maybe it’s more like a character archetype) in a lot of analog horror series, things that are … almost, sort of human, sometimes, but they’re actually not.

They’re capable of great violence and terror, and they only mimic (often very poorly) human qualities and attributes, some of the time.

Uncanny valley itself, given form and capability.

… Do I need to explicitly lay out the parallels here, for any AI Safety Engineers in the audience?

At this point I’m going to say that watching The Second Renaissance from The Animatrix needs to be mandatory, required, monthly training for anyone developing ‘AI.’

RoyaltyInTraining@lemmy.world · 2 days ago

OpenClaw’s whole thing is that you give it unrestricted access to your computer and online accounts. It’s made for people who do not want to think about safety.

HobbitFoot @thelemmy.club · 2 days ago

Program? Like a fucking farmer?

underscores@lemmy.zip · edited 2 days ago

The people that design AI tools don’t implement guardrails because then they’d have to admit AI is not ready for the shit they’re trying to make

rumba@lemmy.zip · 1 day ago

AI will never be ready. Humans aren’t ready either. That’s why IT staff uses guardrails for users :)

BadlyDrawnRhino @aussie.zone · 2 days ago

You say that, but who do you think the AIs will go after first if they ever do develop actual intelligence? In that scenario, simple manners can go a long way!

yogurtwrong@lemmy.world · 2 days ago

I hate how Apple users feel the need to call their computer by the brand. It really makes me cringe.

It is called “a computer”

Maybe “PC”

“box” if you really have to flex that UNIX

They should treat their computers less like a sports car and more like a van

Art3mis@lemmy.world · 2 days ago

I mean, isn’t that the entire point of Apple? Brand recognition and perceived status attributed to said brand. It’s like rappers and Gucci belts or country artists and Ford pickups

AlphaOmega@lemmy.world · 2 days ago

Every time someone organically refers to their computer as an Apple or Mac, an Apple marketing executive creams their pants.

sp3ctr4l@lemmy.dbzer0.com · 2 days ago

Branding and marketing is just building a cult these days.

Art3mis@lemmy.world · 2 days ago

…that’s kind of how branding has always been under capitalism, to a certain extent. Get people to think your brand is the best so they buy it instead of whatever is convenient. It has definitely gotten more extreme, but I think that has more to do with the applications of what we are talking about.

Cell phones are embedded into nearly every aspect of our lives. So the brand symbolism carries that weight for people too.

Previously, brands like Coca-Cola still had a death grip on society, but in one specific sector. So while it created a sort of cult vibe, it was definitely different.

sp3ctr4l@lemmy.dbzer0.com · 2 days ago

I get what you are saying and generally agree, but!

It actually was not always the way it is now.

Play RDR2.

Look at the advertisements for things, actually read them.

They’re actually pretty accurate to the advertisements of the time.

They are heavily based on ‘facts’, convincing the prospective buyer that the product is the best product, is very useful, can do this, is unique in this way.

Of course, sometimes the ‘facts’ are lies… but the general idea is not to sell an emotion, or personality, or element of identity, or sense of belonging.

It’s almost always to convince the buyer that this product is useful to them, and is priced reasonably for what it can do.

The turning point away from this was largely due to Edward Bernays, the nephew of Sigmund Freud.

More or less, he applied Freud’s ideas and some of his own, some of others, to marketing.

His first big hit was angling Cigarettes as ‘Torches of Freedom’ to suffragettes.

At that point in time, smoking tobacco was generally seen as disgusting and low class for women, but not for men.

So, he was basically the first guy that went around and paid people to smoke cigarettes, while being trendy, with pre-designed slogans.

… It worked.

Because he was selling identity, not products, and this is much more effective.

Prior to that… brands basically were just built on the reputation of their products.

Now… now it’s so insane that for many, say, video games and movies… far more of the entire experience of the product is the hype train, the controversy, the Twitter wars… before the product even comes out.

And then, it’s often just a flash in the pan.

But… you will still have dedicated fans and ongoing internet arguments for literal years, even decades, after the last time anyone involved actually viewed or played the product.

That’s all by design, to maximize the chances of that happening.

Marketing literally is applied psychology.

furry toaster@lemmy.blahaj.zone · 2 days ago

yes the point of Apple products is to waste money and shove it in everyone’s faces

Echo Dot@feddit.uk · 2 days ago

In slight fairness to them, the Mac mini is actually a pretty decent PC, unlike their laptops, which are absolutely not worth the money. Although maybe these days $400 for 16 gigabytes of RAM is actually market value.

balsoft@lemmy.ml · 1 day ago

Yes, fully agreed. What dummies!

– Sent from my ThinkPad

yogurtwrong@lemmy.world · 1 day ago

IT’S DIFFERENT M’KAY

Rai@lemmy.dbzer0.com · 2 days ago

Ehhhh as an owner of five or six windows computers, four Linux machines, and a couple Apple computers, I always specify which machine I’m referring to if I’m talking about something I did/something that happened on one of them in case it could be pertinent.

mrgoosmoos@lemmy.ca · 2 days ago

yeah I sat there for a few seconds trying to figure out the relevance

turns out, it wasn’t relevant

instant loss of attention and judging of their character

LiveLM@lemmy.zip · 2 days ago

She’s lucky all she got were some deleted emails.
Given how insecure this whole ordeal is and the fact that she gave it full access to her REAL inbox, someone could have phished the ever-living fuck out of her and Meta just by sending an email with a malicious prompt written in white text, or by hiding messages in zero-width characters and other wacky antics.
Real Looney Tunes shit, congratulations to all involved.
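For the zero-width trick specifically, a minimal Python sketch of a pre-filter is below. The function names and the short `ZERO_WIDTH` list are my own assumptions, and this only covers invisible Unicode characters; it does nothing about white-on-white text, which lives in the HTML styling rather than in the characters themselves.

```python
import unicodedata

# Common zero-width / invisible code points; not an exhaustive list.
ZERO_WIDTH = {"\u200b", "\u200c", "\u200d", "\u2060", "\ufeff"}


def is_hidden(ch: str) -> bool:
    # "Cf" is the Unicode "format" category, which covers most invisible characters.
    return ch in ZERO_WIDTH or unicodedata.category(ch) == "Cf"


def find_hidden_chars(text: str) -> list[tuple[int, str]]:
    """Return (index, Unicode name) for each invisible character found."""
    return [
        (i, unicodedata.name(ch, f"U+{ord(ch):04X}"))
        for i, ch in enumerate(text)
        if is_hidden(ch)
    ]


def strip_hidden_chars(text: str) -> str:
    """Drop invisible characters before the text is handed to a model."""
    return "".join(ch for ch in text if not is_hidden(ch))
```

Stripping every format character also breaks joined emoji and some scripts that legitimately use joiners, so flagging the message for a human is probably safer than silently cleaning it.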

Echo Dot@feddit.uk · 2 days ago

You wouldn’t even need to hide it since apparently she wasn’t paying attention.

xep@discuss.online · 2 days ago

This smells like guerilla marketing to me.

TBi@lemmy.world · 2 days ago

Yeah. Like they are trying to show the AI is more powerful than it is.

I don’t use AI that much, does this use case actually happen? Where the AI does something then apologises?

xep@discuss.online · 2 days ago

LLMs will often respond in a conciliatory or obsequious manner when presented with confrontational input.
