Wednesday, July 9, 2025

Hugging Face Unveils Enhanced Small Language Model with Advanced Reasoning Abilities

Hugging Face has unveiled SmolLM3, a 3-billion-parameter language model featuring long-context reasoning, multilingual support, and dual-mode inference. Released under the Apache 2.0 license and trained on 11.2 trillion tokens, SmolLM3 surpasses similarly sized competitors such as Llama-3.2-3B and Qwen2.5-3B and challenges larger models such as Gemma3 and Qwen3.

The model supports six languages (English, French, Spanish, German, Italian, and Portuguese) and handles context lengths of up to 128k tokens using the NoPE and YaRN techniques. It ships in both a base and an instruction-tuned version, with the instruction-tuned model letting users toggle between an extended-reasoning mode and a direct-answer mode, as in the sketch below.
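As a minimal sketch of that dual-mode behavior, the snippet below runs the same prompt with reasoning on and off via the Transformers library. It assumes the instruct checkpoint is published as `HuggingFaceTB/SmolLM3-3B` and that its chat template accepts an `enable_thinking` flag, as several dual-mode models do; consult the model card for the exact interface.

```python
# Sketch: toggling SmolLM3's reasoning mode (assumed model id and
# enable_thinking flag; verify both against the official model card).
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "HuggingFaceTB/SmolLM3-3B"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id, device_map="auto")

messages = [{"role": "user", "content": "Summarize YaRN context extension in two sentences."}]

for thinking in (True, False):
    # enable_thinking is forwarded to the chat template and switches between
    # the extended-reasoning and direct-answer modes.
    prompt = tokenizer.apply_chat_template(
        messages,
        add_generation_prompt=True,
        enable_thinking=thinking,
        tokenize=False,
    )
    inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
    outputs = model.generate(**inputs, max_new_tokens=256)
    new_tokens = outputs[0][inputs["input_ids"].shape[-1]:]
    print(f"--- thinking={thinking} ---")
    print(tokenizer.decode(new_tokens, skip_special_tokens=True))
```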

SmolLM3's training drew on web, code, and math datasets, with post-training methods such as Anchored Preference Optimization (APO) used to refine its performance. The model ranks highly across 12 benchmarks covering multilingual tasks and coding, and Hugging Face has published the full training process on GitHub. Following SmolLM2's success, the company continues to iterate on its small language models.
