Apple Study Reveals Critical Flaws in AI's Logical Reasoning Abilities

Apple's AI research team has uncovered significant weaknesses in the reasoning abilities of large language models, according to a newly published study.

The study, published on arXiv, outlines Apple's evaluation of a range of leading language models, including those from OpenAI, Meta, and other prominent developers, to determine how well these models handle mathematical reasoning tasks. The findings reveal that even slight changes in the phrasing of a question can cause major swings in model performance, undermining the models' reliability in scenarios that require logical consistency.

Apple draws attention to a persistent problem in language models: their reliance on pattern matching rather than genuine logical reasoning. In several tests, the researchers demonstrated that adding irrelevant information to a question—details that should not affect the mathematical outcome—can lead to vastly different answers from the models.

One example given in the paper involves a simple math problem asking how many kiwis a person collected over several days. When irrelevant details about the size of some kiwis were introduced, models such as OpenAI's o1 and Meta's Llama incorrectly adjusted the final total, despite the extra information having no bearing on the solution.
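The kind of perturbation test described above can be sketched in a few lines of Python. This is a minimal illustration, not the researchers' actual benchmark code: the `Problem` class, the `make_problem` and `is_consistent` helpers, the names, and the quantities (40 and 50 kiwis) are all hypothetical and chosen only for the example. The point it shows is that renaming the person or inserting an irrelevant sentence changes the wording of the question but not the correct answer, so a model that reasons consistently should return the same total for every variant.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Problem:
    text: str    # the word problem as it would be shown to a model
    answer: int  # the ground-truth total


def make_problem(name: str, day1: int, day2: int, distractor: str = "") -> Problem:
    """Build a kiwi-counting problem. The distractor never affects the answer."""
    sentences = [f"{name} picks {day1} kiwis on Friday and {day2} kiwis on Saturday."]
    if distractor:
        sentences.append(distractor)
    sentences.append(f"How many kiwis does {name} have in total?")
    return Problem(text=" ".join(sentences), answer=day1 + day2)


def is_consistent(model_answers: list[int], problems: list[Problem]) -> bool:
    """True only if every model answer matches the ground truth of its variant."""
    return all(a == p.answer for a, p in zip(model_answers, problems))


# Hypothetical variants: same arithmetic, different surface form.
base = make_problem("Oliver", 40, 50)
renamed = make_problem("Sofia", 40, 50)
noisy = make_problem(
    "Oliver", 40, 50,
    distractor="Five of the kiwis were a bit smaller than average.",
)
variants = [base, renamed, noisy]
```

A model that subtracts the five smaller kiwis on the last variant would produce answers such as `[90, 90, 85]`, which `is_consistent` flags as a failure even though two of the three answers are correct.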

We found no evidence of formal reasoning in language models. Their behavior is better explained by sophisticated pattern matching—so fragile, in fact, that changing names can alter results by ~10%.

This fragility in reasoning prompted the researchers to conclude that the models do not use real logic to solve problems but instead rely on sophisticated pattern recognition learned during training. They found that "simply changing names can alter results," a potentially troubling sign for the future of AI applications that require consistent, accurate reasoning in real-world contexts.

According to the study, all models tested, from smaller open-source versions like Llama to proprietary models like OpenAI's GPT-4o, showed significant performance degradation when faced with seemingly inconsequential variations in the input data. Apple suggests that AI might need to combine neural networks with traditional, symbol-based reasoning, an approach known as neurosymbolic AI, to achieve more accurate decision-making and problem-solving abilities.

Top Rated Comments

Timpetus
13 months ago
If this surprises you, you've been lied to. Next, figure out why they wanted you to think "AI" was actually thinking in a way qualitatively similar to humans. Was it just for money? Was it to scare you and make you easier to control?
Score: 61 Votes
johnediii
13 months ago
All you have to do to avoid the coming rise of the machines is change your name. :)
Score: 33 Votes
Mitthrawnuruodo
13 months ago
This shows quite clearly that LLMs aren't "intelligent" in any reasonable sense of the word, they're just highly advanced at (speech/writing) pattern recognition.

Basically electronic parrots.

They can be highly useful, though. I've used Chat-GPT (4o with canvas and o1-preview) quite a lot for tweaking code examples to show in class, for instance.
Score: 27 Votes
jaster2
13 months ago
Apple should know how asking for something in different ways can skew results. Siri has been demonstrating that quite effectively for years.
Score: 26 Votes
applezulu
13 months ago

Timpetus said: "If this surprises you, you've been lied to. Next, figure out why they wanted you to think 'AI' was actually thinking in a way qualitatively similar to humans. Was it just for money? Was it to scare you and make you easier to control?"
Much of it is just popular hype from people who don't know enough to know the difference. Think of the NY Times article that sort of kicked it all off in the popular media a couple of years ago. The writer seemed convinced that the AI was obsessing over him and actually asking him to leave his wife. The actual transcript, for anyone who's seen this stuff back through the decades, showed the AI program bouncing off programmed parameters and being pushed by the writer into shallow territory where it lacked sufficient data to create logical interactions. The writer and most people reading it, however, thought the AI was being borderline sentient.

The simpler Occam's razor explanation for why AI businesses have rolled with that perception, or at least haven't tried much to refute it, is that it provides cover for the LLM "learning" process that steals copyrighted intellectual property and then regurgitates it in whole or in collage form. The sheen of possible sentience clouds the theft ("people also learn by consuming the work of others") as well as the plagiarism ("people are influenced by the work of others, so what then constitutes originality?"). When it's made clear that LLM AI is merely hoovering, blending and regurgitating with no involvement of any sort of reasoning process, it becomes clear that the theft of intellectual property is just that: theft of intellectual property.
Score: 24 Votes
Photoshopper
13 months ago
Why has no one else reported this? It took the “newcomer” Apple to figure it out and to tell the truth?
Score: 19 Votes