Unlocking On-Device GenAI with LiteRT-LM for Chrome, Chromebook Plus, and Pixel Watch
LiteRT-LM brings large language models (LLMs) directly to user surfaces such as Chrome, Chromebook Plus, and Pixel Watch. This production-ready inference framework improves user experiences through offline capability and cost efficiency, since running on-device eliminates per-API-call charges. LiteRT-LM also delivers high performance, achieving sub-second latency across a range of edge hardware while simplifying complex generative tasks. Developers can use the MediaPipe LLM Inference API today and get early access to LiteRT-LM's underlying C++ interface for building custom AI pipelines.
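To give a concrete feel for the workflow, here is a minimal C++ sketch of loading a model and generating text with the Engine/Session pattern. The identifiers (ModelAssets, Engine::CreateEngine, CreateSession, GenerateContent, and the litert::lm namespace) are modeled on the samples in the LiteRT-LM GitHub repository; treat them as illustrative assumptions and check the repository for the current headers and signatures.

```cpp
#include <iostream>

// Illustrative include; see the LiteRT-LM GitHub repository for the
// actual header paths. All identifiers below are assumptions modeled
// on the repo's samples, not a confirmed API surface.
#include "litert_lm/engine/engine.h"

int main() {
  // Load a model bundle from disk (the path is a placeholder).
  auto model_assets =
      litert::lm::ModelAssets::Create("/data/models/gemma3-1b.litertlm");
  if (!model_assets.ok()) return 1;

  // The Engine owns the model weights and other shared resources.
  auto engine = litert::lm::Engine::CreateEngine(
      litert::lm::EngineSettings::CreateDefault(*model_assets,
                                                litert::lm::Backend::CPU));
  if (!engine.ok()) return 1;

  // A Session is a lightweight handle holding per-conversation state.
  auto session =
      (*engine)->CreateSession(litert::lm::SessionConfig::CreateDefault());
  if (!session.ok()) return 1;

  // Run one blocking generation call and print the reply.
  auto responses = (*session)->GenerateContent(
      {litert::lm::InputText("Why run an LLM on-device?")});
  if (responses.ok()) std::cout << *responses << std::endl;
  return 0;
}
```

The same flow applies to accelerated backends: in this sketch, targeting GPU or NPU would come down to changing the backend selection when creating the Engine.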
Noteworthy features include cross-platform compatibility, hardware acceleration, and a modular design that enables tailored deployment on everything from powerful laptops to resource-constrained wearables. LiteRT-LM also manages shared resources efficiently through its Engine and Session architecture: a single Engine owns the model weights and other shared resources, while lightweight Sessions maintain per-task state, keeping both binary size and memory usage low.
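The resource-sharing point becomes concrete in code: the Engine is created once, and each feature gets its own Session on top of it. Continuing with the illustrative names from the sketch above (again assumptions, not the confirmed API), two independent tasks can share one loaded model:

```cpp
// Two features share one Engine, so the model weights are loaded into
// memory exactly once. Identifiers remain illustrative assumptions,
// not the confirmed LiteRT-LM API.
void RunTwoFeatures(litert::lm::Engine& engine) {
  // Each Session keeps its own conversation state, so the two tasks
  // below cannot leak context into each other.
  auto summarizer =
      engine.CreateSession(litert::lm::SessionConfig::CreateDefault());
  auto chat =
      engine.CreateSession(litert::lm::SessionConfig::CreateDefault());
  if (!summarizer.ok() || !chat.ok()) return;

  (*summarizer)->GenerateContent({litert::lm::InputText(
      "Summarize this note in one sentence: ...")});
  (*chat)->GenerateContent(
      {litert::lm::InputText("Can you answer questions offline?")});
}
```

This pattern matters most on watch-class hardware: one memory-resident model can back several features without multiplying RAM usage, which is how the architecture keeps memory in check.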
Start building LLM-powered applications today: visit the LiteRT Hugging Face community and the GitHub repository to explore our documentation and sample code. Empower your products with advanced, efficient AI technology.