Optimization, Control and Reinforcement Learning

2-5 PM, February 15

Optimization is a cornerstone of contemporary computation, providing the foundation for addressing intricate challenges. Combined with the theoretical foundations and insights of control theory and the modern data-driven approaches of reinforcement learning, it has nurtured and will continue to nurture innovation, both within and beyond these fields. Their confluence also holds the potential for new fields and fresh research directions to emerge.

Keynote Speaker – Dr Eric Mazumdar, Caltech

“Function Approximation in Strategic Environments: Scaling Laws and Sample Complexity”

Time: 2-2:50 PM, February 15

Abstract: Machine learning (ML) algorithms are increasingly being deployed into environments in which they must interact with other strategic agents with potentially misaligned objectives. The presence of these other agents gives rise to non-stationary environments and can break many of the underlying assumptions upon which machine learning algorithms are built. To deal with this, we need to design new algorithms for learning in the presence of strategic agents.

Towards this goal, in this talk, I will focus on multi-agent reinforcement learning and Markov games as a model of a strategic environment. I will present new work that demonstrates how strategic interactions can result in scaling laws that depart from our conventional ML intuition that larger models and more data improve performance. In the second part of the talk, I will present work on algorithm design for strategic interactions, with a particular focus on designing provable algorithms for competitive RL problems. In particular, I will show how a small change to the well-known DQN algorithm from deep reinforcement learning yields an algorithm for learning in zero-sum Markov games that (1) incorporates function approximation and (2) has finite-time last-iterate convergence guarantees.

Biography: Eric Mazumdar is an Assistant Professor in Computing and Mathematical Sciences and Economics at Caltech. His research lies at the intersection of machine learning and economics, where he is broadly interested in developing the tools and understanding necessary to confidently deploy machine learning algorithms into societal-scale systems. Eric is the recipient of an NSF CAREER Award and was a fellow at the Simons Institute for Theoretical Computer Science for the semester on Learning in Games. He obtained his Ph.D. in Electrical Engineering and Computer Science at UC Berkeley and received his B.S. in Electrical Engineering and Computer Science at the Massachusetts Institute of Technology (MIT).

Keynote Speaker Q&A: 2:50-3 PM, February 15

Invited Student Speaker – Gokul Swamy, Carnegie Mellon University

“Fast Algorithms for Inverse Reinforcement Learning”

Time: 3:10-3:40 PM, February 15

Abstract: Interactive approaches to imitation learning like inverse reinforcement learning (IRL) have become the preferred approach for problems that range from autonomous driving to mapping. Despite its impressive empirical performance, robustness to compounding errors and causal confounders, and sample efficiency, IRL comes with a strong computational burden: the requirement to repeatedly solve a reinforcement learning (RL) problem in the inner loop. If we pause and take a step back, this is rather odd: we’ve reduced the easier problem of imitation to the harder problem of RL. In this talk, we will discuss a new paradigm for IRL that leverages a more informed reduction to expert-competitive RL (rather than to globally optimal RL), allowing us to provide strong guarantees at a lower computational cost. Specifically, we will present a trifecta of efficient algorithms for IRL that use information from the expert demonstrations during RL to curtail unnecessary exploration, allowing us to dramatically speed up the overall procedure, both in theory and in practice.

Biography: Gokul Swamy is a PhD candidate in the Robotics Institute at Carnegie Mellon University working on efficient interactive learning with unobserved confounders. Gokul works with Drew Bagnell, Steven Wu, and Sanjiban Choudhury. He completed an M.S. at UC Berkeley with Anca Dragan on Learning with Humans in the Loop. He has spent summers working on ML @ SpaceX, Autonomous Vehicles @ NVIDIA, Motion Planning @ Aurora, and Research @ Microsoft and @ Google.

Invited Student Speaker Q&A: 3:40-3:50 PM, February 15

Student Speakers

Time: 4-5 PM, February 15

Yashaswini Murthy

“Performance Bounds for Policy-Based Average Reward Reinforcement Learning Algorithms”

Kristina Miller

“Correct-by-construction controller synthesis for nonlinear models with ω-regular specifications”

Olivier Massicot

“Almost-Bayesian quadratic persuasion with a scalar prior”

Minjun Sung

“Robust Model Based Reinforcement Learning using L1 Adaptive Control”