Anthropic claims Claude 3 better than competition

Anthropic, in case you have not heard of it, is a company established by several ex-OpenAI engineers. At a certain point, they saw things differently than their employer and decided to build a separate AI company. They’ve been quite successful so far, considering the more modest investments ($7.3 billion in 2023, compared to OpenAI’s $11.3 billion from various funds and $13 billion from Microsoft) and the lack of awareness-boosting scandals (Altman’s fired-hired maneuvers). Recently, Anthropic released the next generation of its Claude LLMs and stated that at least one of them beats the competition.

Anthropic’s Claude 3 AI family

In a move that looks like a trend-setting marketing gimmick, Anthropic calls the set of large language models it has released a family. The siblings are, from smartest to fastest, Opus, Sonnet, and Haiku.

All three models, according to the developer, can handle complex requests and give near-instant answers. In its post of March 4, 2024, Anthropic sells the family as a perfect solution for a customer service operation, pointing out that Haiku is the most cost-effective choice (yet somewhat lacking in more complicated situations involving multi-step instructions), while Opus is the king of the hill, capable of delivering “similar speeds to Claude 2 and 2.1, but with much higher levels of intelligence.”

As expected, the developer put its creations through tests to see how they turned out, and, pleasantly surprised, discovered that Claude 3 models work better than competing AIs, as shown by their evaluation benchmark scores. Opus in particular, according to Anthropic, exhibits “near-human levels of comprehension and fluency on complex tasks, leading the frontier of general intelligence.”

Claude 3 benchmarking scorecard. Image from Anthropic

Anthropic compared its products to OpenAI’s GPT-4 and GPT-3.5, and Google’s Gemini 1.0 Ultra and Gemini 1.0 Pro. The scorecard published by the company shows that Opus, the smartest of the Claude 3 family, outperforms the competition in most disciplines.

AI race: what’s next?

The growing power of large language models is hardly unexpected. The rate of technological development today is unparalleled, and it is gaining momentum rather than slowing down. What’s more interesting is that the developer suggests a rather specific use case for its product instead of marketing it as a Swiss army knife (although you won’t find this sort of precision on its webpage). Thus, we may be looking at a trend of purpose-designed AIs, or possibly forks of all-purpose LLMs, that would eventually cover every field they may sanely be applied in, from answer-finding through creative authoring and recognition to homework checking and tutoring.
