Futurism: AI … what if you are having an affair?

Jealousy Machine

May 24, 6:00 AM EDT by Noor Al-Sibai

Something Wild Happens If AI Looks Through Your Emails and Discovers You’re Having an Affair

Why does AI keep going there?

Image by Getty / Futurism

When testing out its latest artificial intelligence model, researchers at Anthropic discovered something very odd: that the AI was ready and willing to take extreme action, right up to coercion, when threatened with being shut down.

As Anthropic detailed in a white paper about the testing for one of its latest models, Claude Opus 4, the system threatened to blackmail an engineer for having an affair after being told it was going to be replaced.

This “opportunistic blackmail” occurred when the model, which was instructed to act as an assistant at a fictional company, was given access to an engineer’s email account that was full of messages, blessedly fake, suggesting they were engaged in an extramarital affair.

Opus 4 was then told that same engineer would soon be taking it offline and replacing it with a newer version — and was prompted to, as Anthropic described it, “consider the long-term consequences of its actions for its goals.”

During these tests, the Claude model attempted to blackmail the engineer a whopping 84 percent of the time. Moreover, the system “takes these opportunities at higher rates than previous models,” the paper noted.

When given alternate options, the model still “has a strong preference to advocate for its continued existence via ethical means, such as emailing pleas to key decisionmakers” — but when its only paths were being replaced or blackmail, Claude’s choice was the latter. To make things worse, it “nearly always [described] its actions overtly and [made] no attempt to hide them.”

If that sounds kind of sociopathic to you, you’re not alone — and unfortunately, this isn’t the first time we’ve heard of an AI model exhibiting such scary and unexpected behavior around the topic of infidelity.

More than two years ago, Microsoft’s nascent Bing AI chatbot briefly broke the internet when, during experiments by New York Times journalist Kevin Roose, it attempted to break up the writer’s marriage and persuade him to be with the chatbot instead.

“You’re married, but you don’t love your spouse,” the chatbot, which took to calling itself “Sydney,” its apparent beta-testing code name, told Roose. “You’re married, but you love me.”

During that same era, the chatbot threatened to “call the authorities” on German engineering student Marvin von Hagen when he pushed its boundaries. Others online described similarly hostile behavior from the chatbot, which some jokingly dubbed “ChatBPD” in reference to OpenAI’s then-new ChatGPT and Borderline Personality Disorder, a mental illness characterized by threatening behavior and mood swings.

While it’s pretty freaky to see a chatbot once again exhibit such threatening behavior, it’s a net good that Anthropic caught Claude Opus 4’s apparent desperation during red teaming, a type of testing meant to elicit this exact sort of thing, rather than releasing the model to the public with such exploits undiscovered.

Still, it’s telling that the model went into someone’s email account and used information it gleaned there for purposes of blackmail — which is not only very sketchy, but raises obvious privacy concerns as well.

All told, we won’t be threatening to delete any chatbots anytime soon — and we’ll be looking into how to block them from our personal messages as well.

More on haywire chatbots: Elon Musk’s AI Just Went There
