Machine Learning and Signal Processing

9 AM-12 PM, February 16

Advances in machine learning and signal processing have revolutionized capabilities in various fields, with innovations and applications emerging ever more frequently. We invite you to the Machine Learning and Signal Processing Session of the CSL student conference if you are curious about when, how, and why machine learning algorithms work. Besides the theoretical aspects of machine learning, this session covers topics including (but not limited to) computer vision, deep learning, acoustics, and signal processing.

Keynote Speaker – Dr David Fouhey, New York University

“From the Hands to the Sun: Using Computer Vision to Measure The Universe”

Time: 9-9:50 AM, February 16

Abstract: It’s exciting times as computer vision has started to change from a discipline with potential to a discipline with impact all around us. During this transition period, one opportunity I’m particularly excited about is collaborating with researchers in the sciences to help them obtain better data. Accordingly, I’ve been translating what I’ve learned from the past decade in vision to science domains. In this talk, I’ll show off two lines of work that I’ve built up with members of my research group and collaborators, one in a more traditional vision topic and the other outside what’s normally shown at CVPR. While studying objects of radically different sizes (about 10 orders of magnitude), both are unified by a desire to build systems that just work and produce outputs people and models want to use.

I’ll start off with our more everyday work on understanding human hands engaged in contact and discuss my progress towards systems that produce information about hands, their contacts, and their 3D shape. I’ll then turn to solar physics, where I’ve been leading work on building systems to obtain better measurements of the Sun’s vector magnetic field. These vector maps of the solar magnetic field serve as cornerstones for scientific investigation of the Sun as well as important space weather monitoring efforts. I’ll discuss how the field gets measured in practice and show our methods that combine capabilities of multiple instruments. The systems produce maps of the magnetic field with higher fidelity, fewer artifacts, and far higher throughput. As a byproduct, our efforts have resolved long-standing calibration issues as well.

This line of work is joint work with many, many researchers, but the PhD student collaborators with whom these results are co-authored are Richard Higgins, Ruoyu Wang, and Dandan Shan.

Biography: David Fouhey is an Assistant Professor at NYU, jointly appointed between Computer Science in the Courant Institute of Mathematical Sciences and Electrical and Computer Engineering in the Tandon School of Engineering. From 2019 to 2023, he was an Assistant Professor at CSE at the University of Michigan. Before that, he was a postdoctoral fellow at UC Berkeley, working with Alyosha Efros and Jitendra Malik. David received a Ph.D. in robotics from CMU, where he worked with Abhinav Gupta and Martial Hebert. David works on learning-based computer vision and is interested in understanding 3D from pictorial cues, understanding the interactive world, and measurement systems for basic sciences, especially solar physics.

Keynote Speaker Q&A: 9:50-10 AM, February 16

Invited Student Speaker – Zhiqing Sun, Carnegie Mellon University

“Scalable Alignment of Large Language and Multimodal Models”

Time: 10:10-10:40 AM, February 16

Abstract: There has been an increasing desire to bridge the gap between what we expect AI systems to generate and what they actually produce. At the forefront of this exploration, we introduce novel methodologies for aligning Large Language Models (LLMs) and Large Multimodal Models (LMMs): Principle-Driven Self-Alignment (SELF-ALIGN), Self-Alignment with Principle-Following Reward Models (SALT), and Aligning Large Multimodal Models with Factually-Augmented RLHF (Fact-RLHF). Driven by the aspiration to diminish the constraints of exhaustive human supervision and to magnify the reliability of AI outputs, SELF-ALIGN ingeniously harnesses principle-driven reasoning with LLMs’ generative prowess, crafting content that resonates with human values. Building on this motivation, SALT evolves the alignment landscape further by seamlessly integrating minimal human guidance with reinforcement learning from synthetic preferences, offering a tantalizing glimpse into the future of self-aligned AI agents. Shifting our lens to the multimodal realm, where misalignment often translates into AI “hallucinations” that are inconsistent with the multimodal inputs, Fact-RLHF emerges as a general and scalable solution. By merging RLHF’s strength with factual augmentations, this method not only mitigates misalignments but also pioneers in setting robust standards for AI’s vision-language capabilities.

Biography: Zhiqing Sun is a Ph.D. student at CMU LTI, advised by Prof. Yiming Yang. He is supported by the Google PhD Fellowship in Natural Language Processing (2023) and was named one of UChicago’s 2023 Rising Stars in Data Science. His interests are in machine learning and natural language reasoning. His recent research focuses on aligning foundation models in a scalable manner. He is particularly interested in enhancing the reliability of foundation models, including large language models (LLMs) and large multimodal models (LMMs), through minimal human supervision and scalable oversight.

Invited Student Speaker Q&A: 10:40-10:50 AM

Student Speakers

Time: 11 AM-12 PM, February 16

Zhi-Hao Lin

“Video2Game: Real-time, Interactive, Realistic and Browser-Compatible Environment from a Single Video”

Asher Mai

“Shadows Don’t Lie and Lines Can’t Bend! Generative Models don’t know Projective Geometry…for now”

Shengcao Cao

“SOHES: Self-supervised Open-world Hierarchical Entity Segmentation”

Andy Zhou

“Robust Prompt Optimization for Defending Language Models”