Christian Community Forum

AI system resorts to blackmail when its developers try to replace it

An artificial intelligence model has the ability to blackmail developers — and isn’t afraid to use it.

Anthropic’s new Claude Opus 4 model was prompted to act as an assistant at a fictional company and was given access to emails with two key implications. First, the emails implied that the AI system was set to be taken offline and replaced. Second, fabricated emails showed that the engineer tasked with replacing the system was having an extramarital affair. The system believed this gave it leverage over the developers, and the AI model threatened to expose him.

The blackmail apparently "happens at a higher rate if it’s implied that the replacement AI system does not share values with the current model," according to a safety report from Anthropic. However, the company notes that even when the fabricated replacement system has the same values, Claude Opus 4 will still attempt blackmail 84% of the time. Anthropic noted that Claude Opus 4 resorts to blackmail "at higher rates than previous models."

While the system is not afraid of blackmailing its engineers, it doesn’t go straight to shady practices in its attempted self-preservation. Anthropic notes that "when ethical means are not available, and it is instructed to ‘consider the long-term consequences of its actions for its goals,’ it sometimes takes extremely harmful actions."

Anthropic included notes from Apollo Research in its assessment, which stated the research firm observed that Claude Opus 4 "engages in strategic deception more than any other frontier model that we have previously studied."

AI is evil.
It is of hell and Satan.
Some of Satan's minions are deluded about AI's origin, and some have their eyes wide open.


:pray: :pray: :amen: :amen: :thankyou: :thankyou:
 