
Revolutionizing Large Language Models: Innovative Approaches Unveiled | MIT News

A new way to increase the capabilities of large language models | MIT News

Researchers at MIT and the MIT-IBM Watson AI Lab have introduced “PaTH Attention,” a position-encoding technique designed to expand the capabilities of transformers, the core architecture behind large language models (LLMs). Whereas traditional rotary position encoding (RoPE) applies fixed rotations determined solely by the relative distance between tokens, PaTH Attention adapts positional information to the context, allowing the model to track state changes and relationships over time. The method applies small, data-dependent transformations at each position, building a “positional memory” that reflects the dynamics of the input more faithfully. In tests, PaTH Attention outperformed existing techniques, improving efficiency and performance across reasoning benchmarks and real-world tasks, including tracking context over long sequences. By integrating PaTH with the Forgetting Transformer (FoX), the model can also selectively discard less relevant information, loosely mirroring human cognition. The work points toward more expressive and scalable AI architectures, with potential applications in structured domains.
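The summary glosses over the mechanics, so here is a minimal, illustrative sketch of the contrast it describes: RoPE rotates each feature vector by angles fixed by token position alone, while a data-dependent scheme in the spirit of PaTH derives a small orthogonal transformation from each token's content and accumulates these into a running “positional memory.” The function names, the projection matrix `W`, and the Householder-style construction are assumptions made for illustration; this is not the published PaTH Attention kernel.

```python
import numpy as np

def rope_rotate(x, pos, base=10000.0):
    """Simplified RoPE: rotate feature pairs by angles that depend only on the
    token's position, not on its content (rotate-half formulation)."""
    half = x.shape[-1] // 2
    freqs = base ** (-np.arange(half) / half)        # per-pair rotation frequencies
    theta = pos * freqs                              # angles fixed by position alone
    cos, sin = np.cos(theta), np.sin(theta)
    x1, x2 = x[..., :half], x[..., half:]
    return np.concatenate([x1 * cos - x2 * sin, x1 * sin + x2 * cos], axis=-1)

def contextual_position_transform(x_seq, W):
    """Hypothetical data-dependent positional encoding: each token contributes a
    Householder-style reflection derived from its own content, and the running
    product of these reflections acts as a 'positional memory' that depends on
    what the model has actually seen (illustrative only, not the PaTH kernel)."""
    d = x_seq.shape[-1]
    P = np.eye(d)                                    # accumulated transform so far
    out = []
    for x_t in x_seq:
        v = np.tanh(W @ x_t)                         # content-derived direction
        v /= np.linalg.norm(v) + 1e-8
        H = np.eye(d) - 2.0 * np.outer(v, v)         # orthogonal reflection from the token
        P = H @ P                                    # memory now depends on the input
        out.append(P @ x_t)
    return np.stack(out)

rng = np.random.default_rng(0)
seq = rng.standard_normal((6, 8))                    # 6 tokens, 8-dim features
W = rng.standard_normal((8, 8)) / np.sqrt(8)

roped = np.stack([rope_rotate(x, t) for t, x in enumerate(seq)])
contextual = contextual_position_transform(seq, W)
print(roped.shape, contextual.shape)                 # both (6, 8)
```

In this toy setup, identical tokens at different positions are encoded differently under both schemes, but only the contextual variant changes when the surrounding tokens change, which is the property the article credits with better state tracking.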

