Apple Teams Up With NVIDIA to Speed Up AI Language Models
Apple has shared details on a collaboration with NVIDIA to greatly improve the performance of large language models (LLMs) by implementing a new text generation technique that offers substantial speed improvements for AI applications.

Apple earlier this year published and open-sourced Recurrent Drafter (ReDrafter), an approach that combines beam search and dynamic tree attention methods to accelerate text generation. Beam search explores multiple potential text sequences at once for better results, while tree attention organizes and removes redundant overlaps among these sequences to improve efficiency.
Apple has now integrated the technology into NVIDIA's TensorRT-LLM framework, which optimizes LLMs running on NVIDIA GPUs, where it achieved "state of the art performance," according to Apple. The integration saw the technique manage a 2.7x speed increase in tokens generated per second during testing with a production model containing tens of billions of parameters.
Apple says the improved performance not only reduces user-perceived latency but also leads to decreased GPU usage and power consumption. From Apple's Machine Learning Research blog:
"LLMs are increasingly being used to power production applications, and improving inference efficiency can both impact computational costs and reduce latency for users. With ReDrafter's novel approach to speculative decoding integrated into the NVIDIA TensorRT-LLM framework, developers can now benefit from faster token generation on NVIDIA GPUs for their production LLM applications."
Developers interested in implementing ReDrafter can find detailed information on both Apple's website and NVIDIA's developer blog.
Popular Stories
Apple hasn't updated the AirPods Pro since 2022, and the earbuds are due for a refresh. We're counting on a new model this year, and we've seen several hints of new AirPods tucked away in Apple's code. Rumors suggest that Apple has some exciting new features planned that will make it worthwhile to upgrade to the latest model.
Subscribe to the MacRumors YouTube channel for more videos.
Heal...
Chase this week announced a series of new perks for its premium Sapphire Reserve credit card, and one of them is for a pair of Apple services.
Specifically, the credit card now offers complimentary annual subscriptions to Apple TV+ and Apple Music, a value of up to $250 per year.
If you are already paying for Apple TV+ and/or Apple Music directly through Apple, those subscriptions will...
Popular accessory maker Anker this month launched two separate recalls for its power banks, some of which may be a fire risk.
The first recall affects Anker PowerCore 10000 Power Banks sold between June 1, 2016 and December 31, 2022 in the United States. Anker says that these power banks have a "potential issue" with the battery inside, which can lead to overheating, melting of plastic...
In 2020, Apple added a digital car key feature to its Wallet app, allowing users to lock, unlock, and start a compatible vehicle with an iPhone or Apple Watch. The feature is currently offered by select automakers, including Audi, BMW, Hyundai, Kia, Genesis, Mercedes-Benz, Volvo, and a handful of others, and it is set to expand further.
During its WWDC 2025 keynote, Apple said that 13...
Apple's next-generation iPhone 17 Pro and iPhone 17 Pro Max are around three months away, and there are plenty of rumors about the devices.
Apple is expected to launch the iPhone 17, iPhone 17 Air, iPhone 17 Pro, and iPhone 17 Pro Max in September this year.
Below, we recap key changes rumored for the iPhone 17 Pro models:Aluminum frame: iPhone 17 Pro models are rumored to have an...
Apple last month announced the launch of CarPlay Ultra, the long-awaited next-generation version of its CarPlay software system for vehicles.
There was news this week about which automakers will and won't offer CarPlay Ultra, and we have provided an updated list below.
CarPlay Ultra is currently limited to newer Aston Martin vehicles in the U.S. and Canada. Fortunately, if you cannot...
Apple will finally deliver the Apple Watch Ultra 3 sometime this year, according to analyst Jeff Pu of GF Securities Hong Kong (via @jukanlosreve).
The analyst expects both the Apple Watch Series 11 and Apple Watch Ultra 3 to arrive this year (likely alongside the new iPhone 17 lineup, if previous launches are anything to go by), according to his latest product roadmap shared with...
Apple is planning to launch a low-cost MacBook powered by an iPhone chip, according to Apple analyst Ming-Chi Kuo.
In an article published on X, Kuo explained that the device will feature a 13-inch display and the A18 Pro chip, making it the first Mac powered by an iPhone chip. The A18 Pro chip debuted in the iPhone 16 Pro last year. To date, all Apple silicon Macs have contained M-series...