Anthropic Uses Pokémon Red to Benchmark New AI Model

March 4, 2025


Anthropic has used the classic Game Boy game Pokémon Red to test its latest AI model, Claude 3.7 Sonnet. Unlike the earlier Claude 3.0 Sonnet, which struggled to leave the starting area, the updated model defeated three gym leaders. Equipped with basic memory, screen pixel input, and function calls, Claude 3.7 Sonnet used “extended thinking” to perform 35,000 actions and reach those milestones. The company said that within hours the AI defeated Brock and then went on to beat Misty. Pokémon Red joins a range of games now used to assess AI performance.
