Anthropic claims Claude 3 better than competition

Anthropic, in case you have not heard of it, is a company established by several ex-OpenAI engineers. At a certain point, they saw things differently than their employer and decided to build a separate AI company. They’ve been quite successful so far, considering the more modest investments ($7.3 billion in 2023, compared to OpenAI’s $11.3 billion from various funds and $13 billion from Microsoft) and the lack of awareness-boosting scandals (Altman’s fired-hired maneuvers). Recently, Anthropic released the next generation of its Claude LLMs and stated that at least one of them beats the competition.

Anthropic’s Claude 3 AI family

In a move that looks like a trend-setting marketing gimmick, Anthropic calls the set of large language models it has released a family. The siblings are, smartest to fastest, Opus, Sonnet, and Haiku. 

All three models, according to the developer, can handle complex requests and give near-instant answers. In its post of March 4, 2024, Anthropic sells the family as a perfect solution for a customer service operation, pointing out that Haiku is the most cost-effective choice (yet somewhat lacking in more complicated situations involving multi-step instructions), and that Opus is the king of the hill, capable of delivering “similar speeds to Claude 2 and 2.1, but with much higher levels of intelligence.”

As expected, the developer put its creations through tests to see how they turned out, and, pleasantly surprised, discovered that Claude 3 models work better than competing AIs, as shown by their evaluation benchmark scores. All siblings, according to Anthropic, exhibit “near-human levels of comprehension and fluency on complex tasks, leading the frontier of general intelligence.”

Claude 3 benchmarking scorecard. Image from Anthropic

Anthropic compared its products to OpenAI’s GPT-4 and GPT-3.5, and Google’s Gemini 1.0 Ultra and Gemini 1.0 Pro. The scorecard published by the company shows that Opus, the smartest of the Claude 3 family, outperforms the competition in most disciplines.

AI race: what’s next?

The growing power of large language models is hardly unexpected. The pace of technological development today is unparalleled, and it is only gaining momentum. What’s more interesting is that the developer suggests a rather specific use case for its product instead of marketing it as a Swiss army knife (although you won’t find this sort of precision on its webpage). Thus, we may be looking at a trend of purpose-designed AIs, or, possibly, forks of all-purpose LLMs, that would eventually cover all fields they may sanely be applied in, from answer-finding tasks through creative authoring and recognition to homework checking and tutoring.
