Anthropic claims Claude 3 better than competition

Anthropic, in case you have not heard of it, is a company established by several ex-OpenAI engineers. At a certain point, they saw things differently than their employer and decided to build a separate AI company. They’ve been quite successful so far, considering the more modest investments ($7.3 billion in 2023, compared to OpenAI’s $11.3 billion from various funds and $13 billion from Microsoft) and the lack of awareness-boosting scandals (such as Altman’s firing and rehiring). Recently, Anthropic released the next generation of its Claude LLMs and stated that at least one of them beats the competition.

Anthropic’s Claude 3 AI family

In a move that looks like a trend-setting marketing gimmick, Anthropic calls the set of large language models it has released a family. The siblings are, from smartest to fastest, Opus, Sonnet, and Haiku.

All three models, according to the developer, can handle complex requests and give near-instant answers. In its post of March 4, 2024, Anthropic sells the family as a perfect solution for a customer service operation, pointing out that Haiku is the most cost-effective choice (yet somewhat lacking in more complicated situations involving multi-step instructions), and that Opus is the king of the hill, capable of delivering “similar speeds to Claude 2 and 2.1, but with much higher levels of intelligence.”

As expected, the developer put its creations through tests to see how they turned out, and, pleasantly surprised, discovered that Claude 3 models work better than competing AIs, as shown by their evaluation benchmark scores. All siblings, according to Anthropic, exhibit “near-human levels of comprehension and fluency on complex tasks, leading the frontier of general intelligence.”

Claude 3 benchmarking scorecard. Image from Anthropic

Anthropic compared its products to OpenAI’s GPT-4 and GPT-3.5, and Google’s Gemini 1.0 Ultra and Gemini 1.0 Pro. The scorecard published by the company shows that Opus, the smartest of the Claude 3 family, outperforms the competition in most disciplines.

AI race: what’s next?

The growing power of large language models is not unexpected. The rate of technological development today is unparalleled, and it is gaining momentum rather than slowing down. What’s more interesting is that the developer suggests a rather specific use case for its product instead of marketing it as a Swiss army knife (although you won’t find this sort of precision on its webpage). Thus, we may be looking at a trend of purpose-designed AIs, or, possibly, forks of all-purpose LLMs, that would eventually cover every field where they can sensibly be applied, from answer-finding through creative authoring and recognition to homework checking and tutoring.
