Speakers

Christopher Manning / Stanford University

Language models have been around for decades but have suddenly taken the world by storm. In a surprising third act for anyone doing NLP in the 70s, 80s, 90s, or 2000s, in much of the popular media, artificial intelligence is now synonymous with language models. In this talk, I want to take a look backward at where language models came from and why they were so slow to emerge, and a look forward at some topics of recent research, with an emphasis on where a linguistic perspective still has relevance. I emphasize the importance of systematic generalization, which encourages continuing to look for alternative neural architectures such as Mixture-of-Experts Universal Transformers (MoEUTs), and of models with exploration, reasoning, and interaction, which encourages looking at tasks that interact with the world, such as web agents.
Speaker Bio: Christopher Manning is the inaugural Thomas M. Siebel Professor in Machine Learning in the Departments of Computer Science and Linguistics at Stanford University, a Senior Fellow at the Stanford Institute for Human-Centered Artificial Intelligence (HAI), and a General Partner at AIX Ventures. He was the Director of the Stanford Artificial Intelligence Laboratory (SAIL) 2018–2025. His research is on computers that can intelligently process, understand, and generate human languages. Chris is the most-cited researcher within Natural Language Processing (NLP), with best paper awards at the ACL, Coling, EMNLP, and CHI conferences and three consecutive ACL Test of Time awards for his early work on neural network or deep learning approaches to human language understanding, which led into modern Large Language Models and Generative AI. He is a member of the U.S. National Academy of Engineering and the American Academy of Arts and Sciences and recipient of the 2024 IEEE John von Neumann Medal and an honorary doctorate from U. Amsterdam. He founded the Stanford NLP group, has written widely used NLP textbooks, and teaches the popular NLP class CS224N, watched by hundreds of thousands online.
Nilou Salehi / Across.AI

This session is a panel discussion on agentic AI, moderated by Nilou Salehi.

Ronan Collobert / Apple

We introduce MLX, an efficient and flexible array framework highly tuned for Apple silicon. MLX can be used for a wide variety of applications, ranging from numerical simulations and scientific computing to machine learning. We will dive into the programming stack of MLX, with particular attention to its unique features compared with other mainstream ML frameworks. We will show how MLX, combined with an Apple silicon Mac, is becoming increasingly popular among researchers and developers who want a consumer-level platform on which to experiment with the latest AI models and techniques, in particular large language models.
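To make MLX's programming model concrete, here is a minimal sketch (not taken from the talk) that fits a linear model with mlx.core. It assumes MLX is installed (for example via pip install mlx) on an Apple silicon Mac; the shapes and learning rate are arbitrary illustration values. It shows two features the talk touches on: computation is lazy until mx.eval is called, and mx.grad is a composable function transformation.

    # Minimal MLX sketch: least-squares regression with lazy arrays and mx.grad.
    import mlx.core as mx

    def loss(w, x, y):
        # Mean squared error of a linear model; nothing is computed yet (lazy graph).
        return mx.mean((x @ w - y) ** 2)

    x = mx.random.normal((256, 8))        # synthetic inputs
    w_true = mx.random.normal((8,))       # ground-truth weights
    y = x @ w_true                        # synthetic targets
    w = mx.zeros((8,))                    # parameters to learn

    grad_fn = mx.grad(loss)               # gradient w.r.t. the first argument (w)
    for _ in range(200):
        w = w - 0.1 * grad_fn(w, x, y)    # plain gradient descent
    mx.eval(w)                            # force evaluation of the lazy graph
    print(w)                              # should end up close to w_true

Because MLX arrays live in unified memory on Apple silicon, the same code runs on CPU or GPU without explicit device transfers.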
Speaker Bio: Ronan Collobert is a Distinguished Research Scientist at Apple (Machine Learning Research), where he leads the MLX efforts, pushing ML innovation on Apple silicon. Previously, he was at Meta for eight years and NEC Labs for six years, interleaved with a short academic career of four years at the Idiap Research Institute in Switzerland. Ronan’s research interests are around deep learning in general, efficient algorithms in particular, with applications in natural language, image and speech processing. Ronan received several Test of Time awards for his early work on deep learning for NLP and speech processing. In addition, Ronan is a fervent supporter of open-source, and since the early 2000s he has led a number of popular machine learning projects, including Torch, Flashlight and more recently MLX.

Bryan Catanzaro / NVIDIA

From healthcare and finance to science and government, AI is touching every part of our world. But AI is not one size fits all: every organization needs models tuned to its own challenges and data, and tightly integrated into the systems it uses to solve problems. The key to unlocking that diversity is openness: sharing models, datasets, research, and techniques so we can innovate together. At NVIDIA, we are building the Nemotron datasets, techniques, and foundation models as part of a full-stack co-design effort, connecting GPUs, networking, systems software, and models to push the limits of both efficiency and performance. In this talk, we'll explore why openness is essential to trustworthy AI, how full-stack co-design is reshaping the future of computing, and how collaboration across industry and research will accelerate the breakthroughs that define the next era of AI.
Speaker Bio: Bryan Catanzaro is Vice President of Applied Deep Learning Research at NVIDIA, where he helps lead the Nemotron team that builds NVIDIA's open foundation models. Bryan helped create cuDNN, NVIDIA's first AI product; DLSS, the most widely deployed neural rendering system, which uses AI to make graphics 10X more compute efficient; and Megatron, which set speed records at scale for training large language models and forms the technical basis for many Generative AI projects around the industry. A strong advocate for open science, Bryan and his team contribute models, datasets, and techniques through Nemotron to ensure developers everywhere can build, customize, and safely deploy AI. With more than 45,000 academic citations, his research spans deep learning, computer graphics, and large-scale systems. Bryan received his PhD in Electrical Engineering and Computer Sciences from the University of California, Berkeley.
Mingqiu Wang / Google DeepMind

As audio interfaces move from novelty to necessity, generative models are beginning to transform how machines understand, produce, and reason over spoken language. This talk will explore the evolving landscape of audio in the AGI era. We will look at recent advances in contextual text-to-speech (TTS) and in native audio dialogue systems that seamlessly integrate speech, reasoning, and interaction, as well as how these capabilities surface in Gemini products. On the research side, we will discuss post-training methodologies, including supervised fine-tuning (SFT), knowledge distillation, and reinforcement learning (RL), as well as emerging techniques for integrating thinking, tool use, and multi-turn audio reasoning into conversational agents.
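As a generic illustration of the SFT objective mentioned above (a sketch only, not Gemini's actual training recipe), the snippet below computes next-token cross-entropy over just the response span of a prompt/response pair, with prompt tokens masked out; the model outputs and targets are toy placeholder tensors, and in a real audio system the prompt would carry audio features.

    # Hypothetical SFT loss sketch: supervise only the response tokens (PyTorch).
    import torch
    import torch.nn.functional as F

    vocab_size, seq_len = 1000, 12
    logits = torch.randn(1, seq_len, vocab_size)           # toy model outputs
    targets = torch.randint(0, vocab_size, (1, seq_len))   # toy next-token targets
    prompt_len = 5                                         # tokens belonging to the prompt

    loss_mask = torch.zeros(1, seq_len, dtype=torch.bool)
    loss_mask[:, prompt_len:] = True                       # supervise only the response span

    loss = F.cross_entropy(logits[loss_mask], targets[loss_mask])
    print(loss.item())

In broad terms, distillation swaps the hard targets for a teacher model's output distribution, and RL replaces this per-token loss with a reward-driven objective.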
Speaker Bio: Mingqiu Wang is a Research Engineer at Google DeepMind, where she currently leads audio post-training in Gemini. She has co-led the launches of native audio generation models in Gemini, powering 1) real-time native audio dialog, 2) contextual speech synthesis, and 3) sophisticated audio understanding for products including Google Cloud (AI Studio and Vertex AI), Astra, Gemini Live, the Gemini App, and NotebookLM. She also leads core research on advanced training recipes, from SFT to distillation to RLHF, as well as reasoning/thinking and agentic/tool-use capabilities for audio. Before joining DeepMind, she worked at Google Brain as a key contributor to Bard and developed the foundational speech-to-X model that became the audio backbone for Gemini 1.0.