Apple Study Reveals Critical Flaws in AI's Logical Reasoning Abilities

Apple's AI research team has uncovered significant weaknesses in the reasoning abilities of large language models, according to a newly published study.

Apple Silicon AI Optimized Feature Siri 1
The study, published on arXiv, outlines Apple's evaluation of a range of leading language models, including those from OpenAI, Meta, and other prominent developers, to determine how well these models could handle mathematical reasoning tasks. The findings reveal that even slight changes in the phrasing of questions can cause major discrepancies in model performance that can undermine their reliability in scenarios requiring logical consistency.

Apple draws attention to a persistent problem in language models: their reliance on pattern matching rather than genuine logical reasoning. In several tests, the researchers demonstrated that adding irrelevant information to a question—details that should not affect the mathematical outcome—can lead to vastly different answers from the models.

One example given in the paper involves a simple math problem asking how many kiwis a person collected over several days. When irrelevant details about the size of some kiwis were introduced, models such as OpenAI's o1 and Meta's Llama incorrectly adjusted the final total, despite the extra information having no bearing on the solution.

We found no evidence of formal reasoning in language models. Their behavior is better explained by sophisticated pattern matching—so fragile, in fact, that changing names can alter results by ~10%.

This fragility in reasoning prompted the researchers to conclude that the models do not use real logic to solve problems but instead rely on sophisticated pattern recognition learned during training. They found that "simply changing names can alter results," a potentially troubling sign for the future of AI applications that require consistent, accurate reasoning in real-world contexts.

According to the study, all models tested, from smaller open-source versions like Llama to proprietary models like OpenAI's GPT-4o, showed significant performance degradation when faced with seemingly inconsequential variations in the input data. Apple suggests that AI might need to combine neural networks with traditional, symbol-based reasoning called neurosymbolic AI to obtain more accurate decision-making and problem-solving abilities.

Popular Stories

Generic iOS 18

Apple Announces iOS 18.2 Launching Today With These New Features

Wednesday December 11, 2024 5:23 am PST by
Apple has announced that iOS 18.2, iPadOS 18.2, and macOS Sequoia 15.2 will be released today following more than six weeks of beta testing. For the iPhone 15 Pro and iPhone 16 models, the update introduces additional Apple Intelligence features, including Genmoji for creating custom emoji, Image Playground and Image Wand for generating images, and ChatGPT integration for Siri. There is also ...
Generic iOS 18

Apple Seeds Second Release Candidate Versions of iOS 18.2 and More With Genmoji, Image Playground and ChatGPT Integration

Monday December 9, 2024 10:06 am PST by
Apple today seeded the second release candidate versions of upcoming iOS 18.2, iPadOS 18.2, and macOS 15.2 updates to developers and public beta testers for testing purposes, a week after releasing the first RCs. The first iOS 18.2 RC had a build number of 22C150, while the second RC's build number is 22C151. Release candidates represent the final version of beta software that's expected to see a ...
iPhone SE 4 Single Camera Thumb 3

iPhone SE 4 Said to Feature 48MP Rear Lens, 12MP TrueDepth Camera

Monday December 9, 2024 4:48 am PST by
Apple's forthcoming iPhone SE 4 will feature a single 48-megapixel rear camera and a 12-megapixel TrueDepth camera on the front, according to details revealed in a new Korean supply chain report. ET News reports that Korea-based LG Innotek is the main supplier of the front and rear camera modules for the more budget-friendly ~$400 device, which is expected to launch in the first quarter of...
iOS 18

Here Are Apple's Full Release Notes for iOS 18.2

Thursday December 5, 2024 11:48 am PST by
Apple seeded the release candidate version of iOS 18.2 today, which means it's going to see a public launch imminently. Release candidates represent the final version of new software that will be provided to the public should no last minute bugs be found, and Apple includes release notes with the RC launch. The iOS 18.2 release notes provide a look at all of the new features that are coming...
New Things Your iPhone Can Do in iOS 18

20 New Things Your iPhone Can Do in iOS 18.2

Friday December 6, 2024 4:42 am PST by
Apple is set to release iOS 18.2 in the second week of December, bringing the second round of Apple Intelligence features to iPhone 15 Pro and iPhone 16 models. This update brings several major advancements to Apple's AI integration, including completely new image generation tools and a range of Visual Intelligence-based enhancements. There are a handful of new non-AI related feature controls...
Apple MacBook Pro M4 hero

MacBook Pros With OLED Displays Won't Have a Notch, Roadmap Shows

Monday December 9, 2024 7:36 am PST by
Apple plans to remove the notch from the MacBook Pro in a few years from now, according to a roadmap shared by research firm Omdia. The roadmap shows that 14-inch and 16-inch MacBook Pro models released in 2026 will have a hole-punch camera at the top of the display, instead of a notch. It is unclear if there would simply be a pinhole in the display, or if Apple would expand the iPhone's...
airpods pro 2 gradient

AirPods Pro 3 Expected Next Year: Here's What We Know

Thursday November 28, 2024 3:30 am PST by
Despite being released over two years ago, Apple's AirPods Pro 2 continue to dominate the wireless earbud market. However, with the AirPods Pro 3 expected to launch sometime in 2025, anyone thinking of buying Apple's premium earbuds may be wondering if the next generation is worth holding out for. Apart from their audio and noise-canceling performance, which are generally regarded as...
vipps nfc tap to pay iphone

World's First Apple Pay Alternative for iPhone Launches in Norway

Monday December 9, 2024 1:28 am PST by
Norwegian payment service Vipps has become the world's first company to launch a competing tap-to-pay solution to Apple Pay on iPhone, following Apple's agreement with European regulators to open up its NFC technology to third parties. Starting December 9, Vipps users in Norway can make contactless payments in stores using their iPhones. The service initially supports customers of SpareBank...

Top Rated Comments

Timpetus Avatar
8 weeks ago
If this surprises you, you've been lied to. Next, figure out why they wanted you to think "AI" was actually thinking in a way qualitatively similar to humans. Was it just for money? Was it to scare you and make you easier to control?
Score: 61 Votes (Like | Disagree)
johnediii Avatar
8 weeks ago
All you have to do to avoid the coming rise of the machines is change your name. :)
Score: 33 Votes (Like | Disagree)
Mitthrawnuruodo Avatar
8 weeks ago
This shows quite clearly that LLMs aren't "intelligent" in any reasonable sense of the word, they're just highly advanced at (speech/writing) pattern recognition.

Basically electronic parrots.

They can be highly useful, though. I've used Chat-GPT (4o with canvas and o1-preview) quite a lot for tweaking code examples to show in class, for instance.
Score: 27 Votes (Like | Disagree)
jaster2 Avatar
8 weeks ago
Apple should know how asking for something in different ways can skew results. Siri has been demonstrating that quite effectively for years.
Score: 26 Votes (Like | Disagree)
applezulu Avatar
8 weeks ago

If this surprises you, you've been lied to. Next, figure out why they wanted you to think "AI" was actually thinking in a way qualitatively similar to humans. Was it just for money? Was it to scare you and make you easier to control?
Much of it is just popular hype from people who don't know enough to know the difference. Think of the NY Times article that sort of kicked it all off in the popular media a couple of years ago. The writer seemed convinced that the AI was obsessing over him and actually asking him to leave his wife. The actual transcript for anyone who's seen this stuff back through the decades, showed the AI program bouncing off programmed parameters and being pushed by the writer into shallow territory where it lacked sufficient data to create logical interactions. The writer and most people reading it, however, thought the AI was being borderline sentient.

The simpler occam's razor explanation why AI businesses have rolled with that perception or at least haven't tried much to refute it, is that it provides cover for the LLM "learning" process that steals copyrighted intellectual property and then regurgitates it in whole or in collage form. The sheen of possible sentience clouds the theft ("people also learn by consuming the work of others") as well as the plagiarism ("people are influenced by the work of others, so what then constitutes originality?"). When it's made clear that LLM AI is merely hoovering, blending and regurgitating with no involvement of any sort of reasoning process, it becomes clear that the theft of intellectual property is just that: theft of intellectual property.
Score: 24 Votes (Like | Disagree)
Photoshopper Avatar
8 weeks ago
Why has no one else reported this? It took the “newcomer” Apple to figure it out and to tell the truth?
Score: 19 Votes (Like | Disagree)