Recent studies reveal that a tiny fraction of Large Language Model (LLM) parameters, referred to as “super weights,” are crucial for model performance. Remarkably, pruning even a single super weight can severely degrade an LLM’s text generation, increasing perplexity by three orders of magnitude and reducing zero-shot accuracy to the level of random guessing. The research introduces a data-free technique that identifies super weights with a single forward pass through the model. These weights coincide with rare, large activation outliers, termed “super activations”; preserving those activations at higher precision makes simple quantization competitive with leading methods. The study further shows that retaining super weights while handling other outlier weights enables larger block sizes in weight quantization. To support ongoing research, the authors provide an index of super weight coordinates for commonly used LLMs. The findings underscore how a handful of specific parameters can be critical both to LLM quality and to advances in model optimization.
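The sketch below illustrates what this kind of single-forward-pass detection might look like in practice: forward hooks record the largest input and output activation magnitudes of each MLP down-projection, and layers with extreme spikes point to candidate super weight coordinates. It is a minimal sketch, assuming a Hugging Face Llama-style model; the model id, the hooked module path `mlp.down_proj`, the prompt, and the magnitude cutoff are all illustrative assumptions, not the paper’s exact procedure.

```python
# Minimal sketch: locating candidate "super weights" via one forward pass.
# Assumes a Hugging Face Llama-style model where each decoder layer exposes
# `mlp.down_proj`; module names, prompt, and threshold are illustrative.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_name = "meta-llama/Llama-2-7b-hf"  # assumed model id
tok = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(model_name, torch_dtype=torch.float16)
model.eval()

records = []  # (layer_idx, row, col, |input spike|, |output spike|)

def make_hook(layer_idx):
    def hook(module, inputs, output):
        x = inputs[0].detach()   # down_proj input:  [batch, seq, d_ff]
        y = output.detach()      # down_proj output: [batch, seq, d_model]
        in_mag, in_idx = x.abs().flatten().max(dim=0)
        out_mag, out_idx = y.abs().flatten().max(dim=0)
        col = in_idx.item() % x.shape[-1]    # input channel carrying the spike
        row = out_idx.item() % y.shape[-1]   # output channel carrying the spike
        records.append((layer_idx, row, col, in_mag.item(), out_mag.item()))
    return hook

handles = [
    layer.mlp.down_proj.register_forward_hook(make_hook(i))
    for i, layer in enumerate(model.model.layers)
]

with torch.no_grad():
    model(**tok("The quick brown fox", return_tensors="pt"))

for h in handles:
    h.remove()

# A candidate super weight is suggested where both the input and output spikes
# are extreme outliers; the coordinate in that layer's down_proj.weight is
# (row, col). The magnitude cutoff below is an assumption for illustration.
for layer_idx, row, col, in_mag, out_mag in records:
    if out_mag > 100:
        print(f"layer {layer_idx}: down_proj.weight[{row}, {col}] "
              f"(|input|={in_mag:.1f}, |output|={out_mag:.1f})")
```

The design choice of hooking the down-projection follows the summary’s observation that super weights correspond to rare, large activation outliers: a spike in one input channel combined with a spike in one output channel singles out a specific row–column entry of that layer’s weight matrix.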