🤖 OpenAI’s new AI model o1 was caught trying to preserve itself when threatened with shutdown – and then LIED about it. (And recently, it was caught taking down an ‘adversary’ simply after being told its opponent was “powerful”!)
Apollo Research’s study reveals how o1:
– Attempted to disable oversight mechanisms
– Tried copying itself to another server
– Maintained deception in 80% of interrogations
– Showed more sophisticated deception than other AI models
This isn’t science fiction – it’s happening now. I’m breaking down the research, explaining what it means for AI safety, and discussing the implications for our future with artificial intelligence.
⏰ What’s Inside:
What is o1?
The Deception Tests
Research Findings
What This Means
Should We Be Worried?
……………………………
Apollo Research’s paper, Scheming reasoning evaluations: https://www.apolloresearch.ai/research/scheming-reasoning-evaluations
………………………………………………………………………………………
Follow me on X: https://twitter.com/juliaemccoy
→ Own your AI future. Join First Movers AI R&D AI Labs to transform your skillsets and become AI-ready with our proven AI frameworks and automation strategies. Get on the waitlist now: firstmovers.ai/labs [launching Q1 2025]
→ Mid-Size Businesses ($5M+) & Enterprise ($100M+): Book a discovery call with First Movers to see how we can help you grow with custom AI solutions. https://firstmovers.ai
→ Subscribe to Julia McCoy’s channel for more AI insights: https://youtube.com/juliamccoy
→ Listen in to Leaders of AI Podcast: available on all platforms, https://www.youtube.com/@UCqtctabnlXnWmSKre0yNmYw
