2:00 PM – 5:00 PM, February 24, CSL B02
The rapid evolution of machine learning and generative AI is reshaping how we process, understand, and create information from complex signals and data. From foundation models that can reason across modalities to sophisticated signal processing techniques that extract meaning from noisy environments, the boundaries of what’s computationally possible continue to expand. This session brings together research spanning the theoretical foundations of learning algorithms, generative modeling, and signal processing with their diverse applications. Topics include (but are not limited to) deep learning theory and optimization, large language models and multimodal systems, computer vision, speech and audio processing, computational imaging, time-series analysis, and the intersection of classical signal processing with modern neural approaches. Whether you’re interested in understanding why transformers generalize, how diffusion models learn to generate, or what principles guide the design of robust perception systems, we invite you to join us as we explore the cutting edge of intelligent systems that learn from and interact with the world around us.

Keynote Speaker – Jia-Bin Huang, University of Maryland
Time: 2:00 – 3:00 PM
“Controllable Visual Imagination”
Abstract: Generative models have empowered human creators to visualize their imaginations without artistic skills and labor. A prominent example is large-scale text-to-image/video generation models. However, these models are often difficult to control and do not respect 3D perspective geometry and the temporal consistency of videos. In this talk, I will showcase several of our recent efforts to improve controllability for visual imagination. Specifically, I will discuss how we enable semantic and spatial control for 2D image generation, facilitate layered decompositions for video editing, and synthesize object and camera motions from monocular videos.
Biography: Jia-Bin Huang is a Capital One-endowed Associate Professor of Computer Science at the University of Maryland, College Park. Before coming to UMD, Huang was a research scientist at Meta Reality Labs and an Assistant Professor of Electrical and Computer Engineering at Virginia Tech. Huang received his Ph.D. from the University of Illinois, Urbana-Champaign (UIUC) in 2016. His research interests include 3D computer vision, generative models, and computational photography. Huang is the recipient of the Thomas & Margaret Huang Award, the NSF CRII Award, faculty awards from Samsung, Google, 3M, Qualcomm, and Meta, and a Google Research Scholar Award.

Industry Speaker – Zhuangzhuang Ding, XPENG
Time: 3:00 – 3:30 PM
“The Architectural Necessity of Imagination: A Dual-Track World Model Approach to Autonomous Driving”
Abstract: Moravec’s paradox remains a fundamental challenge in autonomous driving: high-level reasoning requires relatively little computation, whereas low-level sensorimotor skills demand massive computational resources. We argue that world modeling serves as the fundamental architecture to bridge this gap: it functions as a predictive core for the agent’s proactive planning and a generative engine for large-scale, closed-loop environmental simulation. We propose a dual-track world model consisting of two components: a predictive world model that forecasts how the world evolves before the driving policy takes action, and a generative world model that synthesizes diverse driving scenarios beyond those seen in the real world. The predictive world model helps the agent build an architectural closed loop between driving policy networks and coarse-grained environment understanding, while the generative world model provides wide controllability to generate fine-grained environments, which can greatly address the long-tail distribution of rare corner cases and enable robust evaluation. Together, the interplay of the agent-side (predictive) and environment-side (generative) world models elevates computer vision-based sensorimotor skills to a new level and paves the way toward truly scalable autonomous driving.
Biography: Zhuangzhuang Ding is a Senior Staff Software Engineer at XPENG, currently a member of the World Model team. His research focuses on autonomous driving technologies, including world models, vision-language-action (VLA) models, large language models (LLMs), 3D reconstruction, and perception.
Prior to joining XPENG, he worked at Cruise LLC and Horizon Robotics. He won 1st Place in the Waymo Open Dataset Challenge for two consecutive years (2020 and 2021).

Invited Student Speaker – Yash Savani, Carnegie Mellon University
Time: 3:30 – 4:15 PM
“Antidistillation Sampling: Protecting Reasoning Models from Capability Theft”
Abstract: As frontier AI models demonstrate increasingly sophisticated reasoning capabilities, a critical security vulnerability has emerged: distillation attacks can extract these expensive capabilities into smaller, unauthorized models. This talk presents antidistillation sampling, a method that preserves model performance on legitimate queries while provably limiting the effectiveness of distillation attacks. I’ll discuss the theoretical foundations connecting sampling strategies to model extractability, practical implementations that maintain reasoning quality, and implications for deploying secure AI systems. The work addresses a fundamental tension in AI deployment: making models useful while preventing unauthorized capability replication.
Biography: Yash Savani is a fifth-year PhD candidate in Computer Science at Carnegie Mellon University, advised by Prof. Zico Kolter. His research focuses on differentiable steering of foundation models, combining differential geometry, optimal control theory, and stochastic differential equations to control LLMs and diffusion models. His work on antidistillation sampling addresses critical security vulnerabilities in AI reasoning systems. He has published at venues including NeurIPS and ICML, and previously worked on early transformer architectures at Primer.AI and neural architecture search at Abacus.AI.

Student Presentations
Time: 4:15 – 5:00 PM
Vignesh Sundaresha: “An Efficient Test-Time Scaling Approach for Image Generation”
Priyanka Kargupta: “Cognitive Foundations for Reasoning and Their Manifestation in LLMs”
Aniket Vashishtha: “Executable Counterfactuals: Improving LLMs’ Causal Reasoning through Code”