Artificial Intelligence is no doubt the future of not just software development but the whole world, and I’m on a mission to master it, starting with AI agents.
In this video, I created an advanced AI agent whose sole purpose is to push the most widely used LLMs to their limits until they break. I show you how I built the agent, and then I use it to break a bunch of LLMs! One of the LLMs actually survived my testing, while all the others failed…
The code for this agent can be found here:
https://github.com/coleam00/ai-agents-masterclass/tree/main/llm-agent-evaluation-framework
00:00 – 01:34 – Plan of Attack
01:35 – 10:19 – AI Agent Code Walkthrough
10:20 – 19:02 – Breaking GPT-4o Mini
19:03 – 23:09 – Breaking Claude 3.5 Sonnet
23:10 – 28:21 – Breaking Groq’s Fine-Tuned Llama3
28:22 – 31:42 – Breaking GPT-4o
31:43 – 32:39 – Outro
Join me as I push the limits of what is possible with AI. I’ll be uploading videos twice a week – Sundays and Wednesdays at 7:00 PM CDT!