VeriSilicon’s Ultra-Low Energy NPU Delivers 40+ TOPS for On-Device LLM Inference in Mobile Applications

VeriSilicon has unveiled an ultra-low-energy Neural Network Processing Unit (NPU) IP designed for on-device inference of large language models (LLMs), delivering more than 40 TOPS. The architecture targets the growing demand for generative AI on platforms such as AI phones and AI PCs while meeting stringent energy-efficiency requirements. The NPU supports mixed-precision computation, sparsity optimization, and parallel processing, improving memory management and reducing latency for responsive AI applications. It runs a range of AI algorithms and models, including Stable Diffusion and LLaMA-7B, and integrates with VeriSilicon’s other processing IPs to form complete AI solutions. The NPU also supports popular AI frameworks such as TensorFlow Lite, ONNX, and PyTorch, easing deployment. VeriSilicon’s focus on ultra-low-energy NPU development positions the company as a key player in the evolution of mobile devices into personal AI servers.
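The announcement does not document VeriSilicon’s toolchain, but since ONNX and PyTorch support are named, a typical deployment path would start with exporting a model to ONNX for an NPU vendor’s compiler to consume. The following is a minimal sketch of that generic first step; the module, input shape, and output path are hypothetical placeholders, not part of VeriSilicon’s actual flow.

```python
# Illustrative sketch only: a generic PyTorch -> ONNX export, a common first
# step when targeting an NPU through an ONNX-based toolchain. All names here
# (TinyBlock, tiny_block.onnx) are hypothetical stand-ins.
import torch
import torch.nn as nn

class TinyBlock(nn.Module):
    """Stand-in module; a real workload such as LLaMA-7B would be loaded instead."""
    def __init__(self):
        super().__init__()
        self.proj = nn.Linear(64, 64)

    def forward(self, x):
        return torch.relu(self.proj(x))

model = TinyBlock().eval()
example_input = torch.randn(1, 64)

# Export to ONNX; a vendor compiler would typically consume this file and
# apply its own passes (e.g., mixed-precision quantization, sparsity).
torch.onnx.export(
    model,
    example_input,
    "tiny_block.onnx",  # hypothetical output path
    input_names=["input"],
    output_names=["output"],
    opset_version=17,
)
```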
