Anthropic claims Claude 3 better than competition

Anthropic, in case you have not heard of it, is a company established by several ex-OpenAI engineers. At a certain point, they came to see things differently than their employer and decided to build a separate AI company. They’ve been quite successful so far, considering the more modest investments ($7.3 billion in 2023, compared to OpenAI’s $11.3 billion from various funds and $13 billion from Microsoft) and the lack of awareness-boosting scandals (Altman’s fired-hired maneuvers). Recently, Anthropic released the next generation of its Claude LLMs and stated that at least one of them beats the competition.

Anthropic’s Claude 3 AI family

In a move that looks like a trend-setting marketing gimmick, Anthropic calls the set of large language models it has released a family. The siblings are, smartest to fastest, Opus, Sonnet, and Haiku. 

All three models, according to the developer, can handle complex requests and give near-instant answers. In its post of March 4, 2024, Anthropic sells the family as a perfect solution for a customer service operation, pointing out that Haiku is the most cost-effective choice (yet somewhat lacking in more complicated situations involving multi-step instructions), while Opus is the king of the hill, capable of delivering “similar speeds to Claude 2 and 2.1, but with much higher levels of intelligence.”

As expected, the developer put its creations through tests to see how they turned out, and, pleasantly surprised, discovered that Claude 3 models work better than competing AIs, as shown by their evaluation benchmark scores. All siblings, according to Anthropic, exhibit “near-human levels of comprehension and fluency on complex tasks, leading the frontier of general intelligence.”

Claude 3 benchmarking scorecard. Image from Anthropic

Anthropic compared its products to OpenAI’s GPT-4 and GPT-3.5, and Google’s Gemini 1.0 Ultra and Gemini 1.0 Pro. The scorecard published by the company shows that Opus, the smartest of the Claude 3 family, outperforms the competition in most disciplines.

AI race: what’s next?

The growing power of large language models is not unexpected. The rate of technological development today is unparalleled, and it is only gaining momentum. What’s more interesting is that the developer suggests a rather specific use case for its product instead of marketing it as a Swiss army knife (although you won’t find this sort of precision on its webpage). Thus, we may be looking at a trend of purpose-designed AIs, or, possibly, forks of all-purpose LLMs, that would eventually cover all fields they may sanely be applied in, from answer-finding through creative authoring and recognition to homework checking and tutoring.
