Apple Publishes Details About New 'MM1' AI Model
Apple researchers have developed a new method for training large language models (LLMs) that seamlessly integrates both text and visual information.
The company's findings, detailed in a research paper titled "MM1: Methods, Analysis & Insights from Multimodal LLM Pre-training," showcase a new approach to creating more intelligent and flexible AI systems. By utilizing a diverse dataset comprising image-caption pairs, interleaved image-text documents, and text-only data, Apple claims the MM1 model sets a new standard in AI's ability to perform tasks such as image captioning, visual question answering, and natural language inference with a high level of accuracy.
Apple's research focuses on the combination of different types of training data and model architectures, which enables the AI to understand and generate language based on a mix of visual and linguistic cues. This capability is vital for tasks that require a nuanced comprehension of the world, such as interpreting complex images or answering questions that involve visual elements.
The paper also highlights the MM1 model's exceptional in-context learning abilities, particularly in the largest 30 billion parameter configuration of the model. This version apparently exhibits remarkable capabilities for multi-step reasoning over multiple images using few-shot "chain-of-thought" prompting, a technique that allows the AI to perform complex, open-ended problem solving based on minimal examples.
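For readers unfamiliar with the technique, a few-shot "chain-of-thought" prompt simply prepends a small number of worked examples, each with its reasoning spelled out, before the new question. The sketch below is a hypothetical, text-only illustration of that structure; the `[image N]` placeholders, example content, and `build_cot_prompt` helper are inventions for illustration, not MM1's actual input format, which interleaves image and text tokens directly.

```python
def build_cot_prompt(examples, question):
    """Concatenate worked examples (with explicit reasoning steps)
    before a new question, so the model imitates the step-by-step
    answering style shown in the examples."""
    parts = []
    for ex in examples:
        parts.append(f"Q: {ex['question']}")
        parts.append(f"A: {ex['reasoning']} So the answer is {ex['answer']}.")
    parts.append(f"Q: {question}")
    parts.append("A:")  # the model continues generating from here
    return "\n".join(parts)

# One hypothetical worked example involving reasoning over two images.
examples = [
    {
        "question": "[image 1] How many apples are on the table, and "
                    "[image 2] how many are in the bowl? What is the total?",
        "reasoning": "Image 1 shows 3 apples on the table. "
                     "Image 2 shows 2 apples in the bowl. 3 + 2 = 5.",
        "answer": "5",
    },
]

prompt = build_cot_prompt(
    examples,
    "[image 3] [image 4] How many cups appear across both photos?",
)
print(prompt)
```

Because the worked example demonstrates counting per image and then combining the counts, a model with strong in-context learning can follow the same multi-step pattern for the new question despite seeing only a single demonstration.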
This research emerges as part of Apple's broader initiative to enhance its AI capabilities amid growing competition. Earlier today, Bloomberg's Mark Gurman reported that Apple is in discussions with Google to license Google's Gemini generative large-language models to power new features coming to the iPhone as part of iOS 18.