
Revolutionary Microsoft AI Model Accelerates Edge Device Reasoning by 10x


Microsoft has unveiled Phi-4-mini-flash-reasoning, a lightweight AI model optimized for rapid on-device logical reasoning. With 3.8 billion parameters and a 64K-token context length, the model is engineered for latency-sensitive settings such as mobile apps. Microsoft reports up to 10 times the throughput of its predecessor, Phi-4-mini-reasoning, along with significantly reduced latency.

The model uses a "decoder-hybrid-decoder" architecture called SambaY, which combines state-space models with a Gated Memory Unit (GMU) that shares representations between layers through lightweight element-wise gating. This design improves inference efficiency, making the model practical on a single GPU and in real-time applications such as tutoring tools.

Microsoft emphasizes its commitment to responsible AI through supervised fine-tuning and reinforcement learning from human feedback. The model is available via Azure AI Foundry, Hugging Face, and the NVIDIA API Catalog. Separately, Hugging Face's SmolLM3, a competitive 3B-parameter model, illustrates how quickly small language models are improving on-device AI performance.
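Microsoft describes the GMU only at a high level, as an element-wise gating mechanism that lets later layers reuse memory computed by an earlier layer instead of recomputing attention over the full context. A minimal sketch of that idea is below; the tensor shapes, the learned gating projection, and the module structure are all assumptions for illustration, not Microsoft's implementation:

```python
# Hypothetical sketch of a Gated Memory Unit (GMU). The announcement
# describes element-wise gating that lets a layer reuse cached memory
# from an earlier layer; the projection and shapes here are assumptions.
import torch
import torch.nn as nn

class GatedMemoryUnit(nn.Module):
    def __init__(self, d_model: int):
        super().__init__()
        # Learned gate computed from the current hidden state (assumed form).
        self.gate_proj = nn.Linear(d_model, d_model)

    def forward(self, hidden: torch.Tensor, memory: torch.Tensor) -> torch.Tensor:
        # hidden: activations of the current layer (batch, seq, d_model)
        # memory: cached representations from an earlier layer, same shape
        gate = torch.sigmoid(self.gate_proj(hidden))
        # Element-wise gating blends the cached memory into the current
        # layer, avoiding a fresh attention pass over the whole sequence.
        return gate * memory

# Usage: gate a cached memory tensor with the current layer's hidden state.
gmu = GatedMemoryUnit(d_model=512)
hidden = torch.randn(2, 16, 512)
memory = torch.randn(2, 16, 512)
print(gmu(hidden, memory).shape)  # torch.Size([2, 16, 512])
```

Because the gate is a single linear projection followed by a multiply, it is far cheaper than an attention layer, which is consistent with the throughput gains Microsoft attributes to the hybrid design.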
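Since the checkpoint is distributed through Hugging Face, it can be loaded with the standard transformers API. The model id below follows Microsoft's Hugging Face listing; the prompt and generation settings are illustrative assumptions:

```python
# Illustrative loading of the published checkpoint via transformers.
# The model id follows Microsoft's Hugging Face listing; the prompt and
# max_new_tokens value are arbitrary choices for demonstration.
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "microsoft/Phi-4-mini-flash-reasoning"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id, torch_dtype="auto", device_map="auto"
)

messages = [{"role": "user", "content": "Solve: if 3x + 5 = 20, what is x?"}]
inputs = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)

outputs = model.generate(inputs, max_new_tokens=256)
# Decode only the newly generated tokens, skipping the prompt.
print(tokenizer.decode(outputs[0][inputs.shape[-1]:], skip_special_tokens=True))
```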

