Anthropic Claude 3.5 Sonnet vs. OpenAI GPT-4o: Which Is Better?

Anthropic's new Claude A.I. model bests GPT-4o across numerous benchmarks, according to the startup.

Man and woman pose behind light orange wall
Anthropic was founded by Dario and Daniela Amodei in 2021. Courtesy Anthropic.

Anthropic, an A.I. startup founded by former OpenAI engineers yesterday (June 20) released Claude 3.5 Sonnet, its most powerful A.I. model yet. The new model is not only twice the speed of its predecessor, Claude 3 Opus, released just three months ago, but surpasses OpenAI’s GPT-4o across numerous reasoning, coding and visual comprehension measurements, according to the company. “With today’s launch, we’re taking a step towards what we believe could be a significant shift in how we interact with technology,” said Anthropic CEO and co-founder Dario Amodei in a statement.

Sign Up For Our Daily Newsletter

By clicking submit, you agree to our <a href="http://observermedia.com/terms">terms of service</a> and acknowledge we may use your information to send you emails, product samples, and promotions on this website and other properties. You can opt out anytime.

See all of our newsletters

Anthropic has positioned itself as one of OpenAI’s primary rivals. It was founded in 2021 by Dario Amodei and his sister, Daniela. Both previously worked at OpenAI, respectively overseeing research and its safety and policy policy initiatives, and left the company in 2020 over concerns regarding its direction and lack of safeguards.

Dario Amodei suggested rapid model releases in the near future. The release of Claude 3.5 Sonnet will soon be followed by new releases in the Claude family. “Our aim is to substantially improve the tradeoff curve between intelligence, speed and cost, and we plan to release Claude 3.5 Haiku and Claude 3.6 Opus later this year while also pursuing our safety research to ensure these systems remain safe,” the CEO said. Anthropic is also exploring memory-focused capabilities that will further personalize models to remember specific user preferences and interaction features.

The San Francisco-based startup currently boasts around 375 employees, compared to OpenAI’s headcount of. approximately 2,000. Anthropic’s flurry of releases indicate it is attempting to keep up in a fast-paced A.I. arms race with OpenAI, which debuted GPT-4o in May. Here’s how the two company’s newest models stack up against each other:

Claude comes out on top for reading, coding and math

In addition to showcasing improvements in humor, nuance and writing in a natural and relatable voice, Anthropic said its newest model surpasses GPT-4o across benchmarks in reasoning, knowledge and coding proficiency.

Claude 3.5 Sonnet slightly outperforms GPT-4o in graduate-level reasoning, code, multilingual math and reasoning over text, according to the startup. GPT-4o, meanwhile, displays higher skills in math problem-solving.

Despite Claude’s impressive results, A.I. model benchmarks shouldn’t be taken too seriously as a measurement of capabilities due to skepticism regarding their narrow focus and inability to convey how average individuals interact with models.

Outperforming GPT-4o as a visual model

Another series of benchmarks showcase Claude’s improvements across visual comprehension. Anthropic said its new model surpasses GPT-4o when it comes to visually understanding math, science diagrams, charts and documents. These features are of particular importance for retail, logistics and financial services, which are oftentimes able to “glean more insights from an image, graphic or illustration than from text alone,” according to the company.

Anthropic set to integrate A.I. into the workplace

Anthropic’s new model will additionally debut a feature known as Artifacts that sets it apart from competing models. It will create an integrated workspace allowing users to directly edit and interact with content, such as emails, code or documents, generated by Claude. The new feature represents Anthropic’s desire to serve businesses by transforming Claude from a “conversational A.I.” to a “collaborative work environment.”

Both models are available at no cost

The web and app version of Claude 3.5 Sonnet will be available at no cost. Meanwhile, Claude Pro and Team plan subscribers will be able to access the model with higher rate limits. This move follows a standard set by OpenAI, which launched GPT-4o earlier this year for free and with greater capabilities for paying users.

Prioritizing safety protocols

Claude was subjected to rigorous safety tests, according to Anthropic, which provided the model to the U.K.’s Artificial Intelligence Safety Institute for pre-deployment safety evaluations. OpenAI, meanwhile, has come under fire in recent months from former employees claiming the company is not prioritizing safety protocols. Jan Leike, who previously co-ran a safety team at OpenAI that has since been dissolved, left the company in May and has since joined Anthropic.

“Creating systems that are not only capable but also reliable, safe and aligned with human values is a complex challenge,” said Dario Amodei. “We don’t have all the answers, but we’re dedicated to working on these problems thoughtfully and responsibly”

Anthropic Claude 3.5 Sonnet vs. OpenAI GPT-4o: Which Is Better?