Apple Study Reveals Critical Flaws in AI's Logical Reasoning Abilities

Apple's AI research team has uncovered significant weaknesses in the reasoning abilities of large language models, according to a newly published study.

The study, published on arXiv, describes Apple's evaluation of a range of leading language models, including those from OpenAI, Meta, and other prominent developers, to determine how well they handle mathematical reasoning tasks. The findings show that even slight changes in the phrasing of a question can cause major discrepancies in model performance, undermining the models' reliability in scenarios that require logical consistency.

Apple draws attention to a persistent problem in language models: their reliance on pattern matching rather than genuine logical reasoning. In several tests, the researchers demonstrated that adding irrelevant information to a question—details that should not affect the mathematical outcome—can lead to vastly different answers from the models.

One example given in the paper involves a simple math problem asking how many kiwis a person collected over several days. When irrelevant details about the size of some kiwis were introduced, models such as OpenAI's o1 and Meta's Llama incorrectly adjusted the final total, despite the extra information having no bearing on the solution.
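To make the arithmetic concrete, here is a minimal sketch of that kind of problem. The daily counts and the number of smaller kiwis are illustrative assumptions, not figures quoted in this article. The sketch shows that the correct total ignores the size detail, while the mistake described above treats it as something to subtract.

```python
# Illustrative numbers only (assumed for this sketch, not taken from the article).
daily_counts = {"Friday": 40, "Saturday": 50, "Sunday": 80}

# Irrelevant detail: some kiwis are smaller than average, but they still count.
smaller_than_average = 5

# Correct reasoning: the total is the sum of the daily counts.
correct_total = sum(daily_counts.values())

# The error described above: subtracting the irrelevant detail from the total.
distracted_total = correct_total - smaller_than_average

print(correct_total)     # 170
print(distracted_total)  # 165
```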

We found no evidence of formal reasoning in language models. Their behavior is better explained by sophisticated pattern matching—so fragile, in fact, that changing names can alter results by ~10%.

This fragility in reasoning prompted the researchers to conclude that the models do not use real logic to solve problems but instead rely on sophisticated pattern recognition learned during training. They found that "simply changing names can alter results," a potentially troubling sign for the future of AI applications that require consistent, accurate reasoning in real-world contexts.
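One way to picture the name-swapping test is as a template whose answer cannot depend on the name. The sketch below is a hypothetical illustration, not the researchers' actual benchmark code: `TEMPLATE`, `make_variants`, `accuracy`, and `toy_solver` are all assumed names, and `toy_solver` is a stand-in for a real model call.

```python
# Hypothetical template: only the name changes between variants.
TEMPLATE = (
    "{name} picks {a} apples on Monday and {b} apples on Tuesday. "
    "How many apples does {name} have?"
)

def make_variants(names, a, b):
    """Return (question, expected_answer) pairs that differ only in the name."""
    return [(TEMPLATE.format(name=n, a=a, b=b), a + b) for n in names]

def accuracy(solver, variants):
    """Fraction of variants the solver answers correctly."""
    correct = sum(1 for question, expected in variants if solver(question) == expected)
    return correct / len(variants)

def toy_solver(question):
    """Stand-in for a model: sums every standalone integer in the question."""
    return sum(int(tok) for tok in question.split() if tok.isdigit())

variants = make_variants(["Sophie", "Liam", "Mei"], 12, 30)
print(accuracy(toy_solver, variants))  # 1.0
```

A solver that reasons over the numbers alone scores the same on every variant, so any drop in accuracy between name variants would point to sensitivity to surface details rather than to the underlying math.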

According to the study, all models tested, from smaller open-source versions like Llama to proprietary models like OpenAI's GPT-4o, showed significant performance degradation when faced with seemingly inconsequential variations in the input data. Apple suggests that AI might need to combine neural networks with traditional, symbol-based reasoning, an approach known as neurosymbolic AI, to achieve more accurate decision-making and problem-solving.


Top Rated Comments

Timpetus
15 months ago
If this surprises you, you've been lied to. Next, figure out why they wanted you to think "AI" was actually thinking in a way qualitatively similar to humans. Was it just for money? Was it to scare you and make you easier to control?
Score: 61 Votes
johnediii
15 months ago
All you have to do to avoid the coming rise of the machines is change your name. :)
Score: 33 Votes
Mitthrawnuruodo
15 months ago
This shows quite clearly that LLMs aren't "intelligent" in any reasonable sense of the word; they're just highly advanced at (speech/writing) pattern recognition.

Basically electronic parrots.

They can be highly useful, though. I've used ChatGPT (4o with canvas and o1-preview) quite a lot for tweaking code examples to show in class, for instance.
Score: 27 Votes
jaster2
15 months ago
Apple should know how asking for something in different ways can skew results. Siri has been demonstrating that quite effectively for years.
Score: 26 Votes
applezulu
15 months ago

Quoting Timpetus: If this surprises you, you've been lied to. Next, figure out why they wanted you to think "AI" was actually thinking in a way qualitatively similar to humans. Was it just for money? Was it to scare you and make you easier to control?
Much of it is just popular hype from people who don't know enough to know the difference. Think of the NY Times article that sort of kicked it all off in the popular media a couple of years ago. The writer seemed convinced that the AI was obsessing over him and actually asking him to leave his wife. The actual transcript, for anyone who's seen this stuff back through the decades, showed the AI program bouncing off programmed parameters and being pushed by the writer into shallow territory where it lacked sufficient data to create logical interactions. The writer, and most people reading it, however, thought the AI was being borderline sentient.

The simpler Occam's razor explanation for why AI businesses have rolled with that perception, or at least haven't tried much to refute it, is that it provides cover for the LLM "learning" process that steals copyrighted intellectual property and then regurgitates it in whole or in collage form. The sheen of possible sentience clouds the theft ("people also learn by consuming the work of others") as well as the plagiarism ("people are influenced by the work of others, so what then constitutes originality?"). When it's made clear that LLM AI is merely hoovering, blending and regurgitating with no involvement of any sort of reasoning process, it becomes clear that the theft of intellectual property is just that: theft of intellectual property.
Score: 24 Votes
Photoshopper
15 months ago
Why has no one else reported this? It took the “newcomer” Apple to figure it out and to tell the truth?
Score: 19 Votes