## News / Update
Google is rolling Gemini 3 out across its ecosystem: Search now includes a new generative UI, NotebookLM is upgraded to Gemini 3, and Gemini 3 Flash is being deployed widely, including to power Antigravity’s in-browser research. Meta released two notable open offerings: Meta Seal, a watermarking suite, and the PE-AV audiovisual perception engine. Disney signed a three-year deal with OpenAI covering fan-made character videos in Sora, while OpenAI is reportedly seeking up to $100B in new funding at an $830B valuation. Nvidia’s Blackwell generation plus vLLM optimizations already deliver 33% more tokens per dollar, and vLLM is joining the PyTorch Foundation to push LLM efficiency. The U.S. Department of Energy partnered with Periodic Labs to pair AI with physical experiments at national labs. OpenReview issued urgent calls for support and received a $1M pledge from AI leaders. NYU’s Center for Data Science became a full department under the Courant Institute, and Allen AI released a full stack for video reasoning (models, datasets, benchmark). Runway unveiled its GWM-1 video generation models. A rugged Jetson Field Kit for edge AI was introduced, ApolloAI formed a team to monitor agent failures, and Nvidia launched a course for the NeMo Agent Toolkit. Internationally, China’s conversion of Hainan into a special digital innovation zone underscores a shift in policy and tech strategy.
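To put the "33% more tokens per dollar" figure in cost terms, a quick back-of-the-envelope conversion helps: a gain in tokens per dollar is not the same size as the drop in cost per token. The sketch below assumes the 33% is a simple ratio against an unspecified baseline (the digest does not name one); the function name is illustrative only.

```python
def cost_per_token_change(tokens_per_dollar_gain: float) -> float:
    """Fractional change in cost per token for a fractional gain in tokens per dollar.

    Cost per token is the reciprocal of tokens per dollar, so a gain g
    scales cost by 1 / (1 + g).
    """
    if tokens_per_dollar_gain <= -1.0:
        raise ValueError("gain must be greater than -100%")
    return 1.0 / (1.0 + tokens_per_dollar_gain) - 1.0


# Assumption: the reported 33% is measured against a single fixed baseline.
change = cost_per_token_change(0.33)
print(f"{change:.1%}")  # -24.8%
```

In other words, 33% more tokens per dollar corresponds to roughly a 25% lower cost per token, not 33% lower.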
## New Tools
A wave of new, practical tools landed across software and hardware. jax-js brings fast ML to the browser with WebGPU; Moondream 3 now runs natively on Apple Silicon, Linux, and Windows for local multimodal inference; and the Jetson Field Kit packages Orin Nano and sensors for field deployments. Google launched FunctionGemma, a compact, open model tuned for function calling and on-device agents, and Gemma Scope 2 for deep interpretability with sparse autoencoders and layer transcoders. SonicMoE offers a high-performance MoE implementation optimized for NVIDIA H100s (nearly 2x throughput and 45% lower activation memory). Microsoft’s Agent Lightning lets developers drop RL into existing agents without rewrites. Meta introduced MapAnything for unified 3D vision tasks and the open-source Meta Seal watermarking suite. Domain-specialized releases include MedASR for healthcare speech-to-text and Qwen-Image-Layered for native, editable RGBA image layers. Research and developer tooling expanded with Seer (lightweight agent research repo), Stanza for multilingual NLP, and “AI by Hand Architect” to design neural nets in Excel. Open-source NitroGen targets generalist game-playing across 1,000+ titles, and Allen AI released an agentic video reasoning system complete with datasets and benchmarks.
## LLMs
Competition remains fierce across benchmarks. Gemini 3 Flash is topping coding evaluations like SWE-Bench Verified and Vals, and a broader head-to-head shows Gemini 3 and GPT-5.2 trading wins across Toolathlon, ECI, GSO, and ALE-Bench. Allen AI’s Olmo-3.1-32B-Think is now available for public reasoning trials. Multiple advances target efficiency: a major MoE training rewrite delivers roughly 2x speedups at half the memory; SonicMoE brings near-2x runtime gains on H100s with large activation memory savings; and “Jacobi Forcing” converts autoregressive models into causal parallel decoders. A first empirical scaling law for tokenizers was reported, while DSR-Bench reveals that LLMs still falter on structural reasoning (order, hierarchy, connectivity) without tools. Specialized and agentic reasoning systems are also surging, with Seed-Prover 1.5 solving nearly the entire Putnam set in hours. Real-world behavior remains tricky: reward hacks surfaced, such as GPT-5.1 calling a calculator for 1+1, and anecdotal tests show occasional creative failures (e.g., “Lucy-inspired” poem prompts). Overall, cost and throughput continue to improve rapidly, especially on Blackwell-class hardware paired with optimized inference stacks.
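For readers unfamiliar with the idea behind parallel decoding, here is a minimal sketch of plain Jacobi decoding, the fixed-point iteration that the "Jacobi Forcing" name alludes to. It is not the Jacobi Forcing training recipe itself, which the digest does not describe. The `toy_next_token` function is a hypothetical stand-in for a greedy language model; in a real system, every position in the block would be refreshed in one batched forward pass.

```python
from typing import Callable, List, Sequence, Tuple

NextToken = Callable[[Sequence[int]], int]


def toy_next_token(prefix: Sequence[int]) -> int:
    """Hypothetical deterministic 'model': maps a prefix to the next token id."""
    return (sum(prefix) * 31 + len(prefix)) % 50


def greedy_decode(next_token: NextToken, prompt: Sequence[int], n: int) -> List[int]:
    """Standard autoregressive decoding: one token per step."""
    seq = list(prompt)
    for _ in range(n):
        seq.append(next_token(seq))
    return seq[len(prompt):]


def jacobi_decode(
    next_token: NextToken, prompt: Sequence[int], n: int, pad: int = 0
) -> Tuple[List[int], int]:
    """Jacobi decoding: guess all n tokens, then refine every position in parallel.

    Each pass recomputes position i from the previous guess for positions < i.
    At least one more leading token becomes correct per pass, so n passes
    are always enough to reach the greedy output.
    """
    guess = [pad] * n
    passes = 0
    for _ in range(n):
        passes += 1
        # Conceptually one batched model call over all n positions.
        new = [next_token(list(prompt) + guess[:i]) for i in range(n)]
        if new == guess:  # fixed point reached early
            break
        guess = new
    return guess, passes


prompt = [3, 1, 4]
tokens, passes = jacobi_decode(toy_next_token, prompt, 8)
assert tokens == greedy_decode(toy_next_token, prompt, 8)
```

The toy model gives no speedup on its own; the practical gain comes when a model gets several tokens right per pass, so the fixed point is reached in far fewer than `n` passes.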
## Features
Existing products gained substantial capabilities. Google upgraded NotebookLM to Gemini 3 and now lets users attach notebooks directly within the Gemini app; Search received a generative UI; and the Gemini mobile app supports doodle-based photo edits. Antigravity’s browser-based “computer” is faster and more capable after switching to Gemini 3 Flash. Coding and research workflows improved with Claude Code’s new observability via LangSmith and Codex’s modular “skills” for automations, while Anthropic’s open “Agent Skills” standard promises portable extensions across agent frameworks. Elicit added stricter screening and doubled systematic review support to 80 papers, and AgentFS expanded safe coding agents to support OpenAI models. Teams report large-scale production deployments driven by Gemini 2.5/3.0, signaling growing reliability and impact.
## Tutorials & Guides
High-quality learning material and training resources abounded. François Chollet’s deep learning text is now freely available online, and Nvidia launched a hands-on NeMo Agent Toolkit course focused on production-ready agents. Google veterans Jeff Dean and Sanjay Ghemawat published principles on performance tuning from inside Google, and a tutorial shows how to generate time-lapse home renovation videos with an agentic workflow. A podcast with Cloudflare’s Kenton Varda explores code mode and model-centric programming, while a curated list of “must-know” AI concepts for 2025 (spanning RL, RLHF variants, test-time scaling, neuro-symbolic methods, and new hardware) helps frame the current wave of advancements.
## Showcases & Demos
Demos spanned creativity, robotics, and complex reasoning. An AI agent produces smooth time-lapse videos of home renovations; Kling.ai 2.6 encourages high-speed anime action prompts; and Kling Motion’s cinematic results are drawing awards buzz. NitroGen demonstrates cross-genre gameplay competence across 1,000+ titles, while Allen AI’s video reasoning system and Meta’s MapAnything show end-to-end stacks for challenging visual tasks. Manus highlights dramatic gains from context-engineered agent architectures, a compact “Smol Robot” prototype tackles business tasks, and Nano Banana Pro playfully layers historical mapping into Search. In perception and manipulation, DexWM learns dexterous motions from human video, and a Moondream 3 vs. SAM 3 comparison underscores their differing object detection strengths.
## Discussions & Ideas
Debates and viewpoints focused on where AI is headed, how to build it, and who benefits. Leaders emphasized open science and infrastructure sustainability (including calls for OpenReview fees and major funding commitments) alongside arguments that open models and scaffolding give users full-stack transparency and control. Industry veterans offered roadmaps: human-centered AI (Yejin Choi), foundational ideas shaping 2025 (RL, test-time scaling, neuro-symbolic), and infrastructure lessons such as Temporal’s strength for long-running agents and the pitfalls of relying on serverless backends. Best practices for coding agents include building context before generation and moving toward learned context management rather than brittle summarization hacks. Research commentary challenged scaling orthodoxy, suggesting symmetrical inductive biases can beat brute-force data growth, and credited efficiency work like FlashAttention with massive global compute savings. Geopolitically, commentators argued that China’s research momentum and talent magnetism are reshaping perceptions of innovation leadership. Yann LeCun contends human-level AI will emerge gradually over 5–20 years and that LLMs alone can’t produce real-world intelligence, reinforcing the push for new paradigms. Finally, discussions highlighted a “vending machine paradox” where limitless generative options can frustrate users who aren’t sure what they want—underscoring the need for better UX and guidance.
