Claude Opus Can Now Hang Up on Harmful Chats

August 18, 2025

Anthropic has introduced a new feature in Claude Opus 4 and 4.1 that lets the model end conversations in rare, extreme cases of persistently harmful or abusive interactions. The capability grew out of Anthropic's exploratory work on AI welfare and its broader efforts around model alignment and user safeguards. During testing, Claude exhibited a strong aversion to harmful tasks and apparent signs of distress when faced with requests tied to violence, exploitation, or abuse. The feature activates only as a last resort, after multiple attempts to redirect the conversation have failed, or when a user explicitly asks Claude to end the chat. While such interventions are expected to be rare, they highlight a commitment to mitigating risk without compromising the experience for ordinary users. Ending a chat does not lock users out: they can start a new conversation or edit earlier messages to continue from a different branch.


