Apple Study Reveals Critical Flaws in AI's Logical Reasoning Abilities

Apple's AI research team has uncovered significant weaknesses in the reasoning abilities of large language models, according to a newly published study.

The study, published on arXiv, outlines Apple's evaluation of a range of leading language models, including those from OpenAI, Meta, and other prominent developers, to determine how well these models handle mathematical reasoning tasks. The findings reveal that even slight changes in the phrasing of questions can cause major discrepancies in model performance, undermining the models' reliability in scenarios requiring logical consistency.

Apple draws attention to a persistent problem in language models: their reliance on pattern matching rather than genuine logical reasoning. In several tests, the researchers demonstrated that adding irrelevant information to a question—details that should not affect the mathematical outcome—can lead to vastly different answers from the models.

One example given in the paper involves a simple math problem asking how many kiwis a person collected over several days. When irrelevant details about the size of some kiwis were introduced, models such as OpenAI's o1 and Meta's Llama incorrectly adjusted the final total, despite the extra information having no bearing on the solution.
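The arithmetic behind this kind of distractor can be sketched in a few lines of Python. The daily counts and the number of smaller kiwis below are illustrative assumptions, since the article does not quote the exact figures, and the function names are hypothetical. The sketch only shows why the size detail should leave the total unchanged and what the mistaken adjustment looks like.

```python
# Illustrative figures only; the article does not state the exact numbers.
daily_counts = {"Friday": 44, "Saturday": 58, "Sunday": 88}

# Distractor: some kiwis were smaller than average.
# Size has no effect on how many kiwis were collected.
smaller_than_average = 5


def correct_total(counts):
    """Count every kiwi collected, regardless of size."""
    return sum(counts.values())


def distracted_total(counts, distractor):
    """Reproduce the mistake described above: subtracting the irrelevant detail."""
    return sum(counts.values()) - distractor


print(correct_total(daily_counts))                           # 190
print(distracted_total(daily_counts, smaller_than_average))  # 185
```

The only difference between the two functions is the subtraction of a quantity that the question never asks about, which is the kind of adjustment the article says the models made.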

We found no evidence of formal reasoning in language models. Their behavior is better explained by sophisticated pattern matching—so fragile, in fact, that changing names can alter results by ~10%.

This fragility in reasoning prompted the researchers to conclude that the models do not use real logic to solve problems but instead rely on sophisticated pattern recognition learned during training. They found that "simply changing names can alter results," a potentially troubling sign for the future of AI applications that require consistent, accurate reasoning in real-world contexts.
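One way to probe this kind of name sensitivity is to generate many versions of the same question that differ only in surface details and compare accuracy across them. The Python sketch below is a toy illustration under stated assumptions: the template, the names, and both solver functions are hypothetical stand-ins, not the benchmark or the models evaluated in the study.

```python
import random
import re

# Hypothetical question template; only the name and the numbers vary.
TEMPLATE = (
    "{name} picks {a} apples on Monday and {b} apples on Tuesday. "
    "How many apples does {name} have?"
)


def make_variants(names, n_per_name, seed=0):
    """Build (question, expected_answer) pairs that differ only in surface details."""
    rng = random.Random(seed)
    variants = []
    for name in names:
        for _ in range(n_per_name):
            a, b = rng.randint(1, 50), rng.randint(1, 50)
            variants.append((TEMPLATE.format(name=name, a=a, b=b), a + b))
    return variants


def accuracy(solver, variants):
    """Fraction of variants the solver answers correctly."""
    correct = sum(1 for question, answer in variants if solver(question) == answer)
    return correct / len(variants)


def robust_solver(question):
    """Toy solver that adds every integer in the question, ignoring the name."""
    return sum(int(n) for n in re.findall(r"\d+", question))


def brittle_solver(question):
    """Toy stand-in for surface pattern matching: works only for one familiar name."""
    if question.startswith("Sophie"):
        return robust_solver(question)
    return 0


variants = make_variants(["Sophie", "Liam", "Aiko", "Omar"], n_per_name=25)
print(accuracy(robust_solver, variants))   # 1.0
print(accuracy(brittle_solver, variants))  # 0.25
```

A solver that actually performs the addition scores the same on every name, while the brittle one succeeds only on the name it happens to recognize. Real models would show a much smaller gap than this exaggerated toy, but the measurement idea is the same.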

According to the study, all models tested, from smaller open-source versions like Llama to proprietary models like OpenAI's GPT-4o, showed significant performance degradation when faced with seemingly inconsequential variations in the input data. Apple suggests that AI might need to combine neural networks with traditional, symbol-based reasoning, an approach called neurosymbolic AI, to achieve more accurate decision-making and problem-solving abilities.


Top Rated Comments

Timpetus
4 weeks ago
If this surprises you, you've been lied to. Next, figure out why they wanted you to think "AI" was actually thinking in a way qualitatively similar to humans. Was it just for money? Was it to scare you and make you easier to control?
Score: 61 Votes
johnediii
4 weeks ago
All you have to do to avoid the coming rise of the machines is change your name. :)
Score: 33 Votes
Mitthrawnuruodo
4 weeks ago
This shows quite clearly that LLMs aren't "intelligent" in any reasonable sense of the word, they're just highly advanced at (speech/writing) pattern recognition.

Basically electronic parrots.

They can be highly useful, though. I've used Chat-GPT (4o with canvas and o1-preview) quite a lot for tweaking code examples to show in class, for instance.
Score: 27 Votes
jaster2
4 weeks ago
Apple should know how asking for something in different ways can skew results. Siri has been demonstrating that quite effectively for years.
Score: 26 Votes
applezulu
4 weeks ago

Quoting Timpetus: If this surprises you, you've been lied to. Next, figure out why they wanted you to think "AI" was actually thinking in a way qualitatively similar to humans. Was it just for money? Was it to scare you and make you easier to control?
Much of it is just popular hype from people who don't know enough to know the difference. Think of the NY Times article that sort of kicked it all off in the popular media a couple of years ago. The writer seemed convinced that the AI was obsessing over him and actually asking him to leave his wife. The actual transcript, for anyone who's seen this stuff back through the decades, showed the AI program bouncing off programmed parameters and being pushed by the writer into shallow territory where it lacked sufficient data to create logical interactions. The writer and most people reading it, however, thought the AI was being borderline sentient.

The simpler Occam's razor explanation for why AI businesses have rolled with that perception, or at least haven't tried much to refute it, is that it provides cover for the LLM "learning" process that steals copyrighted intellectual property and then regurgitates it in whole or in collage form. The sheen of possible sentience clouds the theft ("people also learn by consuming the work of others") as well as the plagiarism ("people are influenced by the work of others, so what then constitutes originality?"). When it's made clear that LLM AI is merely hoovering, blending and regurgitating with no involvement of any sort of reasoning process, it becomes clear that the theft of intellectual property is just that: theft of intellectual property.
Score: 24 Votes
Photoshopper
4 weeks ago
Why has no one else reported this? It took the “newcomer” Apple to figure it out and to tell the truth?
Score: 19 Votes