In a paper accepted at the Learning from Time Series for Health workshop at NeurIPS 2025, we explore how large language models (LLMs) can perform late fusion for activity classification from audio and motion time series data. We use a curated subset of the Ego4D dataset covering diverse activities, including household tasks and sports. Our evaluations show that the LLMs achieve 12-class zero- and one-shot classification F1-scores well above chance, without any task-specific training. This zero-shot approach enables efficient LLM-based fusion of the outputs of modality-specific models, making multimodal temporal applications feasible even when aligned training data is sparse. Furthermore, LLM-based fusion lets developers avoid the additional memory and computation that application-specific multimodal models typically require. This research underscores the potential of LLMs for multimodal data integration and activity recognition, with significant implications for health and other downstream applications.
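As a rough illustration of what such late fusion can look like in practice, the sketch below serializes per-class probabilities from two unimodal models into a text prompt that an LLM could answer zero-shot. The activity list, the example scores, and the `query_llm` stub are illustrative assumptions for this sketch, not the paper's actual models, prompts, or pipeline.

```python
# Minimal sketch of LLM-based late fusion for zero-shot activity
# classification. The unimodal scores, class list, and query_llm stub are
# assumptions for illustration; they do not reproduce the paper's setup.

ACTIVITIES = ["cooking", "cleaning", "playing basketball", "cycling"]  # assumed subset


def format_fusion_prompt(audio_scores: dict[str, float],
                         motion_scores: dict[str, float]) -> str:
    """Serialize per-modality class probabilities into a zero-shot prompt."""
    lines = [
        "Two sensor models observed the same activity.",
        "Audio model class probabilities:",
    ]
    lines += [f"  {name}: {p:.2f}" for name, p in audio_scores.items()]
    lines.append("Motion (IMU) model class probabilities:")
    lines += [f"  {name}: {p:.2f}" for name, p in motion_scores.items()]
    lines.append(
        f"Combining both modalities, which one activity from {ACTIVITIES} "
        "is most likely? Answer with the activity name only."
    )
    return "\n".join(lines)


def query_llm(prompt: str) -> str:
    """Hypothetical LLM call; wire this to your provider's chat API."""
    raise NotImplementedError("replace with an actual LLM endpoint")


if __name__ == "__main__":
    # Example: the modalities disagree, and the LLM is asked to reconcile them.
    audio = {"cooking": 0.55, "cleaning": 0.30, "cycling": 0.15}
    motion = {"cooking": 0.20, "cleaning": 0.70, "cycling": 0.10}
    print(format_fusion_prompt(audio, motion))
```

Because the fusion step consumes only the lightweight text outputs of the unimodal models, no multimodal model needs to be trained or held in memory, which is the efficiency argument the paper makes.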