Apple Researchers Reveal New AI System That Can Beat GPT-4

Apple researchers have developed an artificial intelligence system named ReALM (Reference Resolution as Language Modeling) that aims to radically enhance how voice assistants understand and respond to commands.

In a research paper (via VentureBeat), Apple outlines a new system for how large language models tackle reference resolution, which involves deciphering ambiguous references to on-screen entities, as well as understanding conversational and background context. As a result, ReALM could lead to more intuitive and natural interactions with devices.

Reference resolution is an important part of natural language understanding, enabling users to use pronouns and other indirect references in conversation without confusion. For digital assistants, this capability has historically been a significant challenge, limited by the need to interpret a wide range of verbal cues and visual information. Apple's ReALM system seeks to address this by converting the complex process of reference resolution into a pure language modeling problem. In doing so, it can comprehend references to visual elements displayed on a screen and integrate this understanding into the conversational flow.
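
To make the idea of treating reference resolution as a language modeling problem more concrete, here is a minimal Python sketch. It turns a user request and a list of candidate entities into a single text prompt, then maps a model's textual answer back to the entities it picked. The prompt wording and the `build_prompt` and `parse_answer` helpers are hypothetical and are not taken from Apple's paper. The call to an actual language model is deliberately left out.

```python
from typing import List


def build_prompt(user_request: str, candidates: List[str]) -> str:
    """Serialize candidate entities and the request into one text prompt.

    The layout of this prompt is an assumption made for illustration only.
    """
    lines = ["Candidate entities:"]
    for number, candidate in enumerate(candidates, start=1):
        lines.append(f"{number}. {candidate}")
    lines.append(f"User request: {user_request}")
    lines.append("Relevant entity numbers:")
    return "\n".join(lines)


def parse_answer(answer: str, candidates: List[str]) -> List[str]:
    """Map a textual answer such as "2" or "1, 2" back to candidate entities.

    Tokens that are not valid candidate numbers are ignored.
    """
    picked = []
    for token in answer.replace(",", " ").split():
        if token.isdecimal() and 1 <= int(token) <= len(candidates):
            picked.append(candidates[int(token) - 1])
    return picked


if __name__ == "__main__":
    phone_numbers = ["555-0100", "555-0199"]
    print(build_prompt("Call the bottom one", phone_numbers))
    # A model's reply would be passed to parse_answer; "2" is a stand-in here.
    print(parse_answer("2", phone_numbers))
```

Because both the input and the output are plain text, the whole task can in principle be handled by a fine-tuned language model without a separate vision component, which is the framing the article describes.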

ReALM reconstructs the visual layout of a screen using textual representations. This involves parsing on-screen entities and their locations to generate a textual format that captures the screen's content and structure. Apple researchers found that this strategy, combined with fine-tuning language models specifically for reference resolution, significantly outperforms traditional methods and, on this task, OpenAI's GPT-4.
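
As a rough illustration of what such a textual screen representation might look like, the Python sketch below orders on-screen entities top to bottom and left to right, and joins entities that sit on roughly the same row with tabs. The `ScreenEntity` fields, the `screen_to_text` function, and the `line_margin` threshold are assumptions made for this example, not details confirmed by the article.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class ScreenEntity:
    """A piece of on-screen text with its bounding box (hypothetical schema)."""
    text: str
    left: float
    top: float
    width: float
    height: float

    @property
    def center_x(self) -> float:
        return self.left + self.width / 2

    @property
    def center_y(self) -> float:
        return self.top + self.height / 2


def screen_to_text(entities: List[ScreenEntity], line_margin: float = 10.0) -> str:
    """Render entities as text: one row per line, entities in a row tab-separated."""
    # Sort by vertical center first, then by horizontal center.
    ordered = sorted(entities, key=lambda e: (e.center_y, e.center_x))

    rows: List[List[ScreenEntity]] = []
    for entity in ordered:
        # An entity joins the current row if its vertical center is within
        # line_margin of the first entity in that row; otherwise it starts a new row.
        if rows and abs(entity.center_y - rows[-1][0].center_y) <= line_margin:
            rows[-1].append(entity)
        else:
            rows.append([entity])

    # Within each row, order entities left to right.
    return "\n".join(
        "\t".join(e.text for e in sorted(row, key=lambda e: e.center_x))
        for row in rows
    )


if __name__ == "__main__":
    screen = [
        ScreenEntity("Open 9-5", left=0, top=40, width=100, height=20),
        ScreenEntity("555-0100", left=120, top=2, width=80, height=20),
        ScreenEntity("Contact Us", left=0, top=0, width=100, height=20),
    ]
    print(screen_to_text(screen))
```

In this sketch, "Contact Us" and "555-0100" end up on the same line because their vertical centers are close, while "Open 9-5" starts a new line. A text layout like this is something a language model can read directly alongside the user's request.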

ReALM could let users refer to whatever is currently displayed on their screen when talking to a digital assistant, without having to give precise, detailed instructions. This has the potential to make voice assistants much more useful in a variety of settings, such as helping drivers operate infotainment systems on the road or assisting users with disabilities by providing an easier and more accurate means of indirect interaction.

Apple has now published several AI research papers. Last month, the company revealed a new method for training large language models that seamlessly integrates both text and visual information. Apple is widely expected to unveil an array of AI features at WWDC in June.

Top Rated Comments

HackMacDaddy
16 weeks ago
Can't wait for it to show me what it found on the web…
Score: 38 Votes
truthsteve
16 weeks ago

enabling users to use pronouns and other indirect references in conversation without confusion.
oh boy

I'm going to stand on the sidelines to see what group A and group B says about this.
Score: 14 Votes
magicschoolbus
16 weeks ago
Big claim from the same company that introduced Siri :rolleyes:
Score: 13 Votes
Japan Ricardo
16 weeks ago

It's good if AI understands "Can you repeat that?" properly.

/thread
Me: Remind me about this later.
Siri: Tell me what you'd like to be reminded about.
Me: This.
Siri: Okay. I've added a reminder called 'this' to your reminders.
Score: 13 Votes
aknabi
16 weeks ago
I assume anything their current research is talking about won't impact their offerings for several years and in the meantime they'll do what they did with outsourcing Maps until they got their solution "ready" (of course then there was the bumps until it was a competitive offering, which will likely be more so with AI)
Score: 9 Votes
coffeemilktea
16 weeks ago
Does this mean SiriGPT won't rely on Google Gemini? Not only is Gemini behind its competitors like OpenAI's models or Anthropic's, but having less Google in Apple products is always a relief.
Score: 9 Votes