
Revolutionary Microsoft AI Model Accelerates Edge Device Reasoning by 10x


Microsoft has unveiled Phi-4-mini-flash-reasoning, a lightweight AI model optimized for rapid on-device logical reasoning. With 3.8 billion parameters and a 64K-token context length, the model is engineered for latency-sensitive settings such as mobile apps. Microsoft reports up to 10 times the throughput of its predecessor, Phi-4-mini-reasoning, along with significantly reduced latency.

The model uses a "decoder-hybrid-decoder" architecture called SambaY, which combines state-space models with a Gated Memory Unit (GMU) that shares representations between layers through lightweight element-wise gating. This design improves inference efficiency, making the model practical on a single GPU and in real-time applications such as tutoring tools.

Microsoft emphasizes its commitment to responsible AI through supervised fine-tuning and reinforcement learning from human feedback. The model is available via Azure AI Foundry, Hugging Face, and the NVIDIA API Catalog. Separately, Hugging Face's SmolLM3, a competitive 3B-parameter model, illustrates how quickly small language models are improving on-device AI performance.
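Microsoft describes the GMU only at a high level, as an element-wise gating mechanism that lets later layers reuse memory computed by an earlier layer instead of recomputing attention over the full context. A minimal sketch of that idea is below; the tensor shapes, the learned gating projection, and the module structure are all assumptions for illustration, not Microsoft's implementation:

```python
# Hypothetical sketch of a Gated Memory Unit (GMU). The announcement
# describes element-wise gating that lets a layer reuse cached memory
# from an earlier layer; the projection and shapes here are assumptions.
import torch
import torch.nn as nn

class GatedMemoryUnit(nn.Module):
    def __init__(self, d_model: int):
        super().__init__()
        # Learned gate computed from the current hidden state (assumed form).
        self.gate_proj = nn.Linear(d_model, d_model)

    def forward(self, hidden: torch.Tensor, memory: torch.Tensor) -> torch.Tensor:
        # hidden: activations of the current layer (batch, seq, d_model)
        # memory: cached representations from an earlier layer, same shape
        gate = torch.sigmoid(self.gate_proj(hidden))
        # Element-wise gating blends the cached memory into the current
        # layer, avoiding a fresh attention pass over the whole sequence.
        return gate * memory

# Usage: gate a cached memory tensor with the current layer's hidden state.
gmu = GatedMemoryUnit(d_model=512)
hidden = torch.randn(2, 16, 512)
memory = torch.randn(2, 16, 512)
print(gmu(hidden, memory).shape)  # torch.Size([2, 16, 512])
```

Because the gate is a single linear projection followed by a multiply, it is far cheaper than an attention layer, which is consistent with the throughput gains Microsoft attributes to the hybrid design.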
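Since the checkpoint is distributed through Hugging Face, it can be loaded with the standard transformers API. The model id below follows Microsoft's Hugging Face listing; the prompt and generation settings are illustrative assumptions:

```python
# Illustrative loading of the published checkpoint via transformers.
# The model id follows Microsoft's Hugging Face listing; the prompt and
# max_new_tokens value are arbitrary choices for demonstration.
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "microsoft/Phi-4-mini-flash-reasoning"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id, torch_dtype="auto", device_map="auto"
)

messages = [{"role": "user", "content": "Solve: if 3x + 5 = 20, what is x?"}]
inputs = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)

outputs = model.generate(inputs, max_new_tokens=256)
# Decode only the newly generated tokens, skipping the prompt.
print(tokenizer.decode(outputs[0][inputs.shape[-1]:], skip_special_tokens=True))
```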

