AT&T Slashes AI Costs 90% by Swapping Large Models for Small Ones

AT&T boosted the efficiency of its internal Ask AT&T personal assistant by reworking the orchestration layer and shifting more of the work from large language models (LLMs) to small language models (SLMs), VentureBeat reported Thursday (Feb. 26).

The change improved latency and response times, cut costs by 90% and enabled the system to process three times as many tokens, according to the report.

“I believe the future of agentic AI is many, many, many small language models,” AT&T Chief Data Officer Andy Markus said, per the report. “We find small language models to be just about as accurate, if not as accurate, as a large language model on a given domain area.”

A small language model is a scaled-down version of a large language model, PYMNTS reported in April. The SLM doesn’t have as many parameters, but users may not need the extra power depending on the task at hand.

In many cases, SLMs are faster and cheaper and offer more control, which is key for companies looking to deploy powerful AI into their operations without breaking the bank. SLMs can perform just as well as LLMs, or even outperform them in certain domains, because they are trained on data from specific industries. LLMs do better on general knowledge.

Nvidia research found that small language models could prove more practical and more profitable in the enterprise because they are powerful enough for many real-world tasks, cost less to run and can be deployed at scale without the same infrastructure burden as large language models.

In systems where AI agents string together multiple steps to complete complex assignments, the bulk of the work doesn’t require the heaviest possible model. Instead, smaller models can handle most of the load, while LLMs can be reserved for rare, high-stakes steps.
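The routing idea described above can be illustrated with a minimal Python sketch. It is not AT&T's implementation, and every name in it (Step, small_model, large_model, route, run_workflow) is hypothetical. It assumes each step in an agent workflow carries a flag marking it as high-stakes, and it uses placeholder functions in place of real model calls.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class Step:
    """One unit of work in a multi-step agent workflow (hypothetical)."""
    name: str
    high_stakes: bool = False  # assumption: the orchestrator can flag such steps


def small_model(step: Step) -> str:
    # Placeholder for a call to a small language model.
    return f"[small] {step.name}"


def large_model(step: Step) -> str:
    # Placeholder for a call to a large language model.
    return f"[large] {step.name}"


def route(step: Step) -> Callable[[Step], str]:
    """Send high-stakes steps to the large model and everything else to the small one."""
    return large_model if step.high_stakes else small_model


def run_workflow(steps: List[Step]) -> List[str]:
    """Run each step on whichever model the router selects, in order."""
    return [route(step)(step) for step in steps]


if __name__ == "__main__":
    workflow = [
        Step("classify request"),
        Step("look up account details"),
        Step("approve refund", high_stakes=True),
    ]
    for line in run_workflow(workflow):
        print(line)
```

In this sketch, two of the three steps go to the small model, and only the step flagged as high-stakes reaches the large one. A real orchestration layer would need its own way of deciding which steps qualify.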

The next phase of AI is about efficiency: building models that are smaller, faster and cheaper to run without sacrificing performance, PYMNTS reported in November. This strategy can allow enterprises to reduce their total cost of ownership at a time when nearly 47% of enterprises cite cost as the top barrier to deploying generative AI.

The post AT&T Slashes AI Costs 90% by Swapping Large Models for Small Ones appeared first on PYMNTS.com.
