Apple’s iPhone 17 Pro has pulled off the unlikely feat of running a 400-billion-parameter Large Language Model (LLM), despite hardware that would seem to rule it out. Conventional wisdom says a model of that size needs roughly 200GB of RAM, but the open-source Flash-MoE project sidesteps that constraint by streaming the model’s weights from the phone’s SSD to the GPU on demand, so the full model never has to sit in memory.

The trade-off is speed: generation crawls along at just 0.6 tokens per second, or roughly one word every 1.5 to 2 seconds. Frustrating as that is for users, the demonstration shows what on-device LLMs on smartphones could become. Notably, the approach keeps inference entirely on the device, ensuring 100% privacy and working without an internet connection, albeit at a steep cost in battery life. In short, running a 400B LLM on an iPhone 17 Pro is feasible, but significant optimization will be needed before it is practical.
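The article does not detail Flash-MoE’s internals, but the core idea it describes — keeping expert weights on storage, memory-mapping them, and reading only the few experts the router selects for each token — can be sketched in plain Python with NumPy’s `memmap`. Everything below, from the layer sizes to the toy router, is an illustrative assumption, not Flash-MoE’s actual code:

```python
import os
import tempfile
import numpy as np

# Toy Mixture-of-Experts layer whose expert weights live on disk.
# The real project streams flash-resident weights to the GPU; this sketch
# only shows the principle: memory-map the weight file and touch just the
# experts the router picks, so resident RAM stays far below model size.
NUM_EXPERTS, DIM = 8, 4

# Write toy expert weights (NUM_EXPERTS square matrices) to a file,
# standing in for the model checkpoint on the phone's SSD.
path = os.path.join(tempfile.mkdtemp(), "experts.bin")
np.arange(NUM_EXPERTS * DIM * DIM, dtype=np.float32).tofile(path)

# Memory-map instead of loading: the OS pages in only what is accessed.
experts = np.memmap(path, dtype=np.float32, mode="r",
                    shape=(NUM_EXPERTS, DIM, DIM))

# Fixed toy routing matrix; a real router is a learned projection.
router = np.linspace(0.0, 1.0, DIM * NUM_EXPERTS,
                     dtype=np.float32).reshape(DIM, NUM_EXPERTS)

def moe_forward(x, top_k=2):
    """Route the input to top_k experts and combine their outputs."""
    scores = x @ router                      # one score per expert
    chosen = np.argsort(scores)[-top_k:]     # indices of top_k experts
    # Only the chosen experts' pages are read from the mapped file.
    return sum(experts[e] @ x for e in chosen)

out = moe_forward(np.ones(DIM, dtype=np.float32))
print(out.shape)  # (4,)
```

Because the file is mapped rather than loaded, a layer with hundreds of experts occupies almost no RAM until an expert is actually used; the cost shifts to storage bandwidth, which is consistent with the slow token rate reported above.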