Mac Studio With M3 Ultra Runs Massive DeepSeek R1 AI Model Locally

March 17, 2025

YouTuber Dave Lee of Dave2D fame has demonstrated how Apple's new Mac Studio equipped with an M3 Ultra chip can efficiently run a huge version of the DeepSeek R1 AI model locally, provided that users spec the machine with the maximum 512GB of memory.

According to Lee's testing, the 671-billion-parameter model runs directly on Apple's high-end workstation, but it needs substantial memory resources: the model occupies 404GB of storage, and 448GB of the machine's unified memory must be manually allocated as video RAM through Terminal commands.
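
The article does not name the Terminal command used for the memory allocation. As a rough sketch only, the snippet below builds one plausible form of it, assuming the `iogpu.wired_limit_mb` sysctl key that recent macOS releases on Apple silicon expose for the GPU memory limit. The key, the helper names, and the choice of binary units (1GB treated as 1024MB) are assumptions, not details from Lee's video. The script only prints the command; it does not run it.

```python
# Sketch: build (but do not execute) a Terminal command that raises the
# GPU memory limit on an Apple silicon Mac.
# ASSUMPTION: the sysctl key below is not named in the article.
GPU_LIMIT_SYSCTL_KEY = "iogpu.wired_limit_mb"


def gib_to_mib(gib: int) -> int:
    """Convert GiB to MiB (binary units are an assumption here)."""
    return gib * 1024


def build_limit_command(limit_gib: int, total_gib: int = 512) -> str:
    """Return the sysctl command string for a given GPU memory limit.

    The limit must leave some memory for the rest of the system, so it
    has to be strictly below the machine's total memory.
    """
    if not 0 < limit_gib < total_gib:
        raise ValueError("limit must be positive and below total memory")
    return f"sudo sysctl {GPU_LIMIT_SYSCTL_KEY}={gib_to_mib(limit_gib)}"


if __name__ == "__main__":
    # 448GB out of 512GB, the figures quoted in the article.
    print(build_limit_command(448))
```

With the article's figures this prints `sudo sysctl iogpu.wired_limit_mb=458752`, leaving 64GB for macOS and other applications.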

The M3 Ultra's unified memory architecture is key to this performance, allowing the system to handle a 4-bit quantized version of DeepSeek R1 efficiently. The quantization slightly reduces accuracy, but it maintains all parameters and delivers approximately 17-18 tokens per second, which is sufficient for many practical applications.
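
The figures above can be sanity-checked with some quick arithmetic. The sketch below assumes decimal gigabytes (1GB = 10^9 bytes) and a hypothetical 1,000-token response; neither assumption comes from the article, and the function names are illustrative.

```python
# Sketch: back-of-the-envelope checks on the article's numbers.
PARAMS = 671e9          # parameter count quoted in the article
MODEL_SIZE_GB = 404     # storage footprint quoted in the article


def nominal_size_gb(params: float, bits_per_param: float) -> float:
    """Size in decimal GB if every parameter used exactly this many bits."""
    return params * bits_per_param / 8 / 1e9


def effective_bits_per_param(size_gb: float, params: float) -> float:
    """Average bits per parameter implied by a given file size."""
    return size_gb * 1e9 * 8 / params


def generation_seconds(num_tokens: int, tokens_per_second: float) -> float:
    """Time to generate a response at a steady token rate."""
    return num_tokens / tokens_per_second


if __name__ == "__main__":
    print(f"Pure 4-bit size:  {nominal_size_gb(PARAMS, 4):.1f} GB")
    print(f"Pure 16-bit size: {nominal_size_gb(PARAMS, 16):.1f} GB")
    print(f"Implied bits per parameter at 404GB: "
          f"{effective_bits_per_param(MODEL_SIZE_GB, PARAMS):.2f}")
    # ASSUMPTION: a 1,000-token response, chosen only for illustration.
    for rate in (17, 18):
        print(f"1,000 tokens at {rate} tok/s: "
              f"{generation_seconds(1000, rate):.1f} s")
```

A strict 4 bits per parameter would come to about 335.5GB, so the 404GB footprint implies an average of roughly 4.8 bits per parameter. One plausible reading is that some tensors are stored at higher precision, though the article does not say. At 17-18 tokens per second, a 1,000-token response would take roughly 56 to 59 seconds.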

Perhaps most impressively, the Mac Studio accomplishes this while consuming under 200 watts of power. Comparable performance on traditional PC hardware would require multiple GPUs drawing approximately ten times more electricity.
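
To put the power gap in money terms, here is a rough sketch of the monthly electricity cost. The usage pattern (1 hour per day), the $0.18 per kWh rate, the 32GB and 400W per-GPU figures, and the 16-GPU scenario are assumptions borrowed from reader comments on this story, not measurements. The 200W figure is treated as an upper bound for the Mac Studio, and a 30-day month is assumed.

```python
import math


def monthly_cost_usd(watts: float, hours_per_day: float = 1.0,
                     usd_per_kwh: float = 0.18, days: int = 30) -> float:
    """Electricity cost for running a device at a steady power draw."""
    kwh = watts / 1000 * hours_per_day * days
    return kwh * usd_per_kwh


def gpus_needed(target_memory_gb: float, memory_per_gpu_gb: float) -> int:
    """Number of GPUs required to reach a total memory target."""
    return math.ceil(target_memory_gb / memory_per_gpu_gb)


if __name__ == "__main__":
    mac_watts = 200                 # article: "under 200 watts"
    pc_watts = mac_watts * 10       # article: roughly ten times more
    # ASSUMPTION (from a reader comment): 32GB and ~400W per GPU under load.
    gpu_count = gpus_needed(512, 32)
    rig_watts = gpu_count * 400     # GPUs only, excludes CPU and the rest

    print(f"Mac Studio:        ${monthly_cost_usd(mac_watts):.2f}/month")
    print(f"10x PC estimate:   ${monthly_cost_usd(pc_watts):.2f}/month")
    print(f"{gpu_count}-GPU rig (GPUs): ${monthly_cost_usd(rig_watts):.2f}/month")
```

Under these assumptions the Mac Studio costs about $1.08 per month to run, the ten-times estimate about $10.80, and a 16-GPU rig about $34.56 before counting the CPU and other components.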

The capability to run such advanced AI models locally offers privacy advantages for sensitive applications like healthcare data analysis, where sending information to cloud services raises security concerns.


However, this performance doesn't come cheap: a Mac Studio configured with M3 Ultra and 512GB of RAM starts at around $10,000. Fully maxed out, with 16TB of SSD storage and an M3 Ultra chip with a 32-core CPU, 80-core GPU, and 32-core Neural Engine, the Mac Studio costs a cool $14,099. Still, for organizations that need to process sensitive data locally, it offers a relatively power-efficient alternative to other hardware configurations.

Apple says the M3 Ultra is the fastest Mac chip it has ever released, thanks to its strategy of fusing two M3 Max chips together using the company's "UltraFusion" technology. This makes the chip's specs double those of the M3 Max.

Top Rated Comments

FSMBP
37 minutes ago at 06:39 am
Cool - but how fast can it load MacRumors.com in Safari??? I want real-world cases for myself before I plunk down $15K.
Score: 4 Votes
surfzen21
25 minutes ago at 06:50 am
If LLMs are a significant part of the future of computing and privacy is going to be a huge part of that, then Apple has a huge advantage from a hardware perspective. This is the real unspoken hero of what Apple is doing.

While everyone is focused on a delayed "AI" end-user rollout, and some are absolutely losing their stuff over it, Apple is creating the hardware that is blowing away all the competition. Once the software side catches up, Apple will be lightyears ahead.

Keep in mind big players like META and Google are absolutely pirating any data they can get their hands on. Unfortunately, they are too big to fail like thepiratebay is.

Try to buy an Nvidia 5090 with a measly 32GB of VRAM that needs a disgusting amount of power to run. The fake MSRP is $2,000, and they are being sold on eBay for $7,000.

With all the crying going on, I think Apple is doing this exactly right.
Score: 4 Votes
AusMness
34 minutes ago at 06:41 am
This new Mac Studio is the king of local LLMs
Score: 3 Votes
bigboy29
22 minutes ago at 06:54 am

"No smartphone will have 512GB of RAM in 10 years. At least not from Apple, if we can go by history. Unless 32GB of iPhone RAM is analogous to 512GB of Mac Studio RAM"

While that is most likely true, it is also likely true that today's LLM models will be long forgotten as inefficient 10 years from now. I expect that the model efficiency and memory requirements will meet somewhere in the middle.
Score: 3 Votes
bunce66
43 minutes ago at 06:32 am
Would it be safe to say that in 5-10 years a smartphone will be able to run a model like this internally and without the internet?
Score: 2 Votes
neuropsychguy
29 minutes ago at 06:46 am
"Perhaps most impressively, the Mac Studio accomplishes this while consuming under 200 watts of power. Comparable performance on traditional PC hardware would require multiple GPUs drawing approximately ten times more electricity."

Assuming 1 hour of LLM use per day at $0.18 per kWh, that's about $11 per month to run a "comparable" PC versus about $1 for the Mac Studio. The Mac Studio is cheap to run.

That was just an estimation in the video about how much more electricity it would use.

The reality is worse for the non-Mac solution, not even counting the fact that you'd need many GPUs to match the available RAM on the Mac Studio. You'd need 16 5090 GPUs to get 512GB of RAM. Each of those idles at maybe 50 W. Let's say 400 W under load. Add in the draw of the CPU and more.

Using that estimate 1 hour of LLM use per day would be about $40 per month in electricity costs versus $1 for the Mac Studio.
Score: 1 Vote