Apple's New Transcription APIs Blow Past Whisper in Speed Tests

Apple's new speech-to-text transcription APIs in iOS 26 and macOS Tahoe are delivering dramatically faster speeds compared to rival tools, including OpenAI's Whisper, based on beta testing conducted by MacStories' John Voorhees.

apple record transcribe phone calls

Call recording and transcription in iOS 18.1

Apple uses its own native speech frameworks to power live transcription features in apps like Notes and Voice Memos, as well as phone call transcription in iOS 18.1. To improve efficiency in iOS 26 and macOS Tahoe, Apple has introduced a new SpeechAnalyzer class and SpeechTranscriber module that deal with similar requests.

According to Voorhees, the new models processed a 34-minute, 7GB video file in just 45 seconds using a command line tool called Yap (developed by Voorhees' son, Finn). That's a full 55% faster than MacWhisper's Large V3 Turbo model, which took 1 minute and 41 seconds for the same file.

Other Whisper-based tools performed even slower, with VidCap taking 1:55 and MacWhisper's Large V2 model requiring 3:55 to complete the same transcription task. Voorhees also reported no noticeable difference in transcription quality across models.

The speed advantage comes from Apple's on-device processing approach, which avoids the network overhead that typically slows cloud-based transcription services.

While the time difference might seem modest for individual files, Voorhees notes that the performance gain increases exponentially when processing multiple videos or longer content. For anyone generating subtitles or transcribing lectures regularly, the efficiency boost could save them hours.

The Speech framework components are available across iPhone, iPad, Mac, and Vision Pro platforms in the current beta releases. Voorhees expects Apple's transcription technology to eventually replace Whisper as the go-to solution for Mac transcription apps.

Related Roundups: iOS 26, iPadOS 26, macOS Tahoe 26
Related Forums: iOS 26, macOS Tahoe

Popular Stories

iOS 26 on Three iPhones

Here's When to Expect the iOS 26 Public Beta

Tuesday July 15, 2025 11:07 am PDT by
Apple previously announced that a public beta of iOS 26 would be available in July, and now a more specific timeframe has surfaced. Bloomberg's Mark Gurman today said that Apple's public betas should be released on or around Wednesday, July 23. In other words, expect the public betas of iOS 26, iPadOS 26, macOS 26, and more to be available at some point next week. Apple will be releasing...
iPhone 17 Colors

All 15 New iPhone 17 and iPhone 17 Pro Colors Revealed in Latest Leak

Wednesday July 16, 2025 6:50 am PDT by
We may finally have a definitive list of all color options for the iPhone 17 series, ahead of the devices launching in September. MacRumors concept In a report for Macworld today, Filipe Espósito said he obtained an "internal document" that allegedly reveals all of the color options for the upcoming iPhone 17, iPhone 17 Air, iPhone 17 Pro, and iPhone 17 Pro Max models. The report includes ...
Apple Watch Ultra 2 Complications

Apple Watch Ultra 3: What to Expect

Sunday July 13, 2025 10:30 am PDT by
The long wait for an Apple Watch Ultra 3 is nearly over, and a handful of new features and changes have been rumored for the device. Below, we recap what to expect from the Apple Watch Ultra 3:Satellite connectivity for sending and receiving text messages when Wi-Fi and cellular coverage is unavailable 5G support, up from LTE on the Apple Watch Ultra 2 Likely a wide-angle OLED display that ...
iPhone 17 Pro in Hand Feature Lowgo

iPhone 17 Pro Coming Soon With These 16 New Features

Friday July 11, 2025 12:40 pm PDT by
Apple's next-generation iPhone 17 Pro and iPhone 17 Pro Max are only two months away, and there are plenty of rumors about the devices. Below, we recap key changes rumored for the iPhone 17 Pro models. Latest Rumors These rumors surfaced in June and July:A redesigned Dynamic Island: It has been rumored that all iPhone 17 models will have a redesigned Dynamic Island interface — it might ...
Apple Hornsby

Apple Store Near Sydney Permanently Closing Later This Year

Monday July 14, 2025 6:14 pm PDT by
Apple today said its store at the Westfield Hornsby shopping mall, in Hornsby, Australia, will be permanently closing in October. Apple Hornsby In a statement shared with Australian tech news website EFTM (via Reddit), Apple said that it has decided not to renew its lease at Westfield Hornsby. Apple said all affected retail employees will be given the opportunity to work at Apple's nearby...
Foldable iPhone 2023 Feature Homescreen

Foldable iPhone's Thickness and Price Range Detailed in New Reports

Wednesday July 16, 2025 11:31 am PDT by
Apple's long-rumored foldable iPhone will likely have a starting price between $1,800 and $2,000 in the U.S., analysts at investment banking firm UBS said this week. If so, the foldable iPhone would cost more than a MacBook Pro, which starts at $1,599. With a starting price of at least $1,800, the foldable iPhone would be the most expensive iPhone model ever released, topping the Pro Max at...
iPhone 17 Pro Dark Blue and Orange

Ranked: The Best Features Rumored for the iPhone 17 Lineup

Wednesday July 16, 2025 4:17 pm PDT by
We have just under two months to go until the debut of Apple's iPhone 17 models, and rumors have been ramping up in recent weeks. We went through everything we know so far, pulling out the most exciting rumors and highlighting some other changes that aren't going to be so great. Top Tier Ultra Thin iPhone 17 Air - The iPhone 17 Air is 2025's most exciting iPhone rumor, because it's the...

Top Rated Comments

Big_D Avatar
4 weeks ago
Impressive, if it is accurate. What the story doesn't mention is how accurate each of those transcriptions was? Were they all identical? Did one or other have more mistakes? What is the accuracy percentage for each one, and how badly wrong were those mistakes?

I'm not trying to defend ChatGPT, just the speed is a single metric, which isn't very useful if the results are garbage. If the Apple one is faster and more accurate, that is incredible, faster and as accurate, impressive, faster but full of errors, not really that useful.

Hopefully it is the first one: it is faster and more accurate.
Score: 26 Votes (Like | Disagree)
neuropsychguy Avatar
4 weeks ago

Impressive, if it is accurate. What the story doesn't mention is how accurate each of those transcriptions was? Were they all identical? Did one or other have more mistakes? What is the accuracy percentage for each one, and how badly wrong were those mistakes?

I'm not trying to defend ChatGPT, just the speed is a single metric, which isn't very useful if the results are garbage. If the Apple one is faster and more accurate, that is incredible, faster and as accurate, impressive, faster but full of errors, not really that useful.

Hopefully it is the first one: it is faster and more accurate.
Nothing scientific, but in the MacStories post: "What stood out above all else was Yap’s speed. By harnessing SpeechAnalyzer and SpeechTranscriber on-device, the command line tool tore through the 7GB video file a full 55% faster than MacWhisper’s Large V3 Turbo model, with no noticeable difference in transcription quality."

It would be good to see more formal comparisons with data you suggested. Also, it would be good to know what computer John was using for the test.
Score: 17 Votes (Like | Disagree)
Big_D Avatar
4 weeks ago

Impressive, if it is accurate.
OK, I read the original article, they all had similar problems with the podcast name, AppStories, writing it as two words instead of CamelCasing it, which is acceptable, and they all had similar problems with people's names. But the Apple tools weren't any less accurate, despite being much faster.
Score: 15 Votes (Like | Disagree)
jmonster Avatar
4 weeks ago
Not mentioning accuracy at all implies it's not. Lots of models are faster than O3, but they're not better.

This is just silly getting sillier. Write something meaningful.

Whisper works in real time. Anything faster is irrelevant for iOS.

And saying it's because network overhead? When you can run OpenAI's whisper locally?....... mhm.

This is a blatant advertisement just regurgitating apples marketing bullets.
Score: 7 Votes (Like | Disagree)
klasma Avatar
4 weeks ago
Speech-to-text is a good use case for on-device processing, but yes, accuracy is an important question, not to mention (multi-)language support.
Score: 5 Votes (Like | Disagree)
Basic75 Avatar
4 weeks ago

While the time difference might seem modest for individual files, Voorhees notes that the performance gain increases exponentially when processing multiple videos or longer content.
That's not how it works. Recommend maths lesson.
Score: 4 Votes (Like | Disagree)