Apple Researchers Reveal New AI System That Can Beat GPT-4

Apple researchers have developed an artificial intelligence system named ReALM (Reference Resolution as Language Modeling) that aims to radically enhance how voice assistants understand and respond to commands.

hey siri banner apple
In a research paper (via VentureBeat), Apple outlines a new system for how large language models tackle reference resolution, which involves deciphering ambiguous references to on-screen entities, as well as understanding conversational and background context. As a result, ReALM could lead to more intuitive and natural interactions with devices.

Reference resolution is an important part of natural language understanding, enabling users to use pronouns and other indirect references in conversation without confusion. For digital assistants, this capability has historically been a significant challenge, limited by the need to interpret a wide range of verbal cues and visual information. Apple's ReALM system seeks to address this by converting the complex process of reference resolution into a pure language modeling problem. In doing so, it can comprehend references to visual elements displayed on a screen and integrate this understanding into the conversational flow.

ReALM reconstructs the visual layout of a screen using textual representations. This involves parsing on-screen entities and their locations to generate a textual format that captures the screen's content and structure. Apple researchers found that this strategy, combined with specific fine-tuning of language models for reference resolution tasks, significantly outperforms traditional methods, including the capabilities of OpenAI's GPT-4.

ReALM could enable users to interact with digital assistants much more efficiently with reference to what is currently displayed on their screen without the need for precise, detailed instructions. This has the potential to make voice assistants much more useful in a variety of settings, such as helping drivers navigate infotainment systems while driving or assisting users with disabilities by providing an easier and more accurate means of indirect interaction.

Apple has now published several AI research papers. Last month, the company revealed a new method for training large language models that seamlessly integrates both text and visual information. Apple is widely expected to unveil an array of AI features at WWDC in June.

Top Rated Comments

HackMacDaddy Avatar
3 weeks ago
Can‘t wait for it to show me what it found on the web…
Score: 38 Votes (Like | Disagree)
truthsteve Avatar
3 weeks ago

enabling users to use pronouns and other indirect references in conversation without confusion.
oh boy

I'm going to stand on the sidelines to see what group A and group B says about this.
Score: 14 Votes (Like | Disagree)
magicschoolbus Avatar
3 weeks ago
Big claim from the same company that introduced Siri :rolleyes:
Score: 13 Votes (Like | Disagree)
Japan Ricardo Avatar
3 weeks ago

It's good if AI understands "Can you repeat that?" properly.

/thread
Me: Remind me about this later.
Siri: Tell me what you'd like to be reminded about.
Me: This.
Siri: Okay. I've added a reminder called 'this' to your reminders.
Score: 13 Votes (Like | Disagree)
aknabi Avatar
3 weeks ago
I assume anything their current research is talking about won't impact their offerings for several years and in the meantime they'll do what they did with outsourcing Maps until they got their solution "ready" (of course then there was the bumps until it was a competitive offering, which will likely be more so with AI)
Score: 9 Votes (Like | Disagree)
coffeemilktea Avatar
3 weeks ago
Does this mean SiriGPT won't rely on Google Gemini? Not only is Gemini behind its competitors like OpenAI's models or Anthropic's, but having less Google in Apple products is always a relief. ?
Score: 9 Votes (Like | Disagree)

Popular Stories

Provenance Emulator

PlayStation, GameCube, Wii, and SEGA Emulator for iPhone and Apple TV Coming to App Store

Friday April 19, 2024 8:29 am PDT by
The lead developer of the multi-emulator app Provenance has told iMore that his team is working towards releasing the app on the App Store, but he did not provide a timeframe. Provenance is a frontend for many existing emulators, and it would allow iPhone and Apple TV users to emulate games released for a wide variety of classic game consoles, including the original PlayStation, GameCube, Wii,...
iPhone 15 Pro FineWoven

Apple Reportedly Stops Production of FineWoven Accessories

Sunday April 21, 2024 6:03 am PDT by
Apple has stopped production of FineWoven accessories, according to the Apple leaker and prototype collector known as "Kosutami." In a post on X (formerly Twitter), Kosutami explained that Apple has stopped production of FineWoven accessories due to its poor durability. The company may move to another non-leather material for its premium accessories in the future. Kosutami has revealed...
Delta Feature

Delta Game Emulator Now Available From App Store on iPhone

Wednesday April 17, 2024 9:58 am PDT by
Game emulator apps have come and gone since Apple announced App Store support for them on April 5, but now popular game emulator Delta from developer Riley Testut is available for download. Testut is known as the developer behind GBA4iOS, an open-source emulator that was available for a brief time more than a decade ago. GBA4iOS led to Delta, an emulator that has been available outside of...
iPad Air 12

12.9-Inch iPad Air Now Rumored to Feature Mini-LED Display

Thursday April 18, 2024 7:37 am PDT by
The rumored 12.9-inch iPad Air that is expected to be announced in May will be equipped with a mini-LED display like the current 12.9-inch iPad Pro, according to Ross Young, CEO of research firm Display Supply Chain Consultants. The existing 10.9-inch iPad Air is equipped with a standard LCD panel, and the move to mini-LED technology for the 12.9-inch model would provide increased brightness for...