Anthropic Launches Smaller, Faster Claude Haiku 4.5 AI Model

News Room

The newest Claude generative AI model, called Haiku 4.5, has the same coding ability as the company's Sonnet 4 model in a smaller, faster package, Anthropic said in a press release on Wednesday. The new model is being made available to everyone and will be the default for free users on Claude.ai.

Anthropic says Haiku 4.5 is significantly faster than Sonnet 4 and costs a third as much. The company also said Haiku 4.5 is faster and better at agentic tasks in Claude for Chrome, an extension that gives Chrome users AI capabilities in their browser.


Because Haiku 4.5 is a small model, it can be deployed as a sub-agent for Sonnet 4.5. So, while Sonnet 4.5 plans and organizes complex projects, small Haiku sub-agents can finish other tasks in the background. For coding tasks, Sonnet can handle the high-level thinking while Haiku deals with other tasks like refactors and migrations. For financial analysis, Sonnet can do predictive modeling while Haiku monitors data streams and tracks regulatory changes, market signals and portfolio risks. On the research side of things, Sonnet can deal with comprehensive analysis while Haiku reviews literature, gathers data and synthesizes documents from multiple sources. 

Haiku's speed also helps on the chatbot side, letting Claude respond to requests more quickly.

“Haiku 4.5 is the latest iteration of our smallest model, and it’s built for everyone who wants the superior intelligence, the trustworthiness, and the creative partnership of Claude in a lightweight package,” Anthropic CEO Mike Krieger said in a statement provided to CNET.


Given the high expense of training and deploying AI models, companies have been looking for ways to roll out smaller, more efficient models that still perform well. An AI query generally consumes more energy than a Google search, though how much depends on the model's size. According to an MIT Technology Review report, a large model with over 405 billion parameters can use 6,706 joules of energy per query, enough to run a microwave for eight seconds, while a small model with 8 billion parameters may use only 114 joules, like running a microwave for a tenth of a second. A Google search uses about 1,080 joules.

Letting smaller, more efficient models take on the load of simpler queries or background tasks can significantly cut server costs. ChatGPT with GPT-5, for example, can route between models, giving instant responses to lighter questions or leveraging more power for complex queries. Such energy-saving measures matter as AI companies look to recoup the potential trillions being spent on data center investments.


