Anthropic is, as far as I can tell, at the very least catching up to OpenAI. It's certainly exceeding ChatGPT in many benchmarks.
Why listen
It goes beyond the title with direct discussion of sleeper agents and Anthropic, including: "The performance of Claude 3 Opus seems to be pretty good."
Key takeaways
01 It's certainly exceeding ChatGPT in many benchmarks.
02 But the gist of it is that one of the things Anthropic has been paying attention to is the idea of deception, or mesa-optimizers. So they've been doing research on sleeper agents.
03 And their most recent blog post was about how to reveal some of these deceptive practices that may or may not be baked into those models.