Machine Learning for Signal Processing

Sponsored by VAIL Systems!

In the current wave of artificial intelligence, machine learning, which aims at extracting practical information from data, is the driving force of many applications; and signals, which represent the world around us, provide a great application area for machine learning. In addition, development of machine learning algorithms, such as deep learning, advances signal processing by providing new tools and makes it possible to solve signal processing problems that were considered difficult with traditional approaches.

The theme of this session is thus to present research ideas from machine learning and signal processing. We welcome all research works related to (but not limited to) the following areas: deep learning, neural networks, statistical inference, computer vision, image and video processing, speech and audio processing, pattern recognition, and information-theoretic signal processing.

Keynote Speaker – Dr. Michael Picheny, NYU-Courant Computer Science and the Center for Data Science

Speech Recognition: What’s Left?

Abstract:

Recent results on the SWITCHBOARD corpus suggest that, thanks to advances in Deep Learning, we now achieve Word Error Rates comparable to those of human listeners. Does this mean the speech recognition problem is solved and the community can move on to a different set of problems? In this talk, we examine speech recognition issues that still plague the community and compare and contrast them to what is known about human perception. We specifically highlight issues in accented speech, noisy/reverberant speech, speaking style, rapid adaptation to new domains, and multilingual speech recognition. We try to demonstrate that, compared to human perception, there is still much room for improvement, so significant work in speech recognition research is still required from the community.

Bio
Dr. Picheny has worked in the Speech Recognition area since 1981, joining IBM after finishing his doctorate at MIT. He was heavily involved in the development of almost all of IBM’s recognition systems, ranging from the world’s first real-time large vocabulary discrete system in 1984 through IBM’s product lines for telephony and embedded systems in the 1990s, and most recently was responsible for putting out a set of Speech Services for both Speech Recognition and Speech Synthesis during his tenure in IBM’s Watson Group. He has published numerous papers in both journals and conferences on almost all aspects of speech recognition (see web page for details). He is the co-holder of over 50 patents and was named a Master Inventor by IBM in 1995 and again in 2000. In addition to professional volunteer service (as indicated below), he served multiple times as an Adjunct Professor in the Electrical Engineering Department of Columbia University and co-taught a course in speech recognition. He is a Fellow of both the IEEE and of ISCA.

Dr. Picheny was a manager for 35 years in the Speech area at IBM, and led the Speech team in Yorktown Heights from 2007. He recently retired from IBM and joined NYU-Courant Computer Science and the Center for Data Science as a part-time Research Professor. At NYU, he hopes to continue speech recognition research, focusing on challenging problems such as accented and disfluent speech and rapid domain adaptation, as well as looking into cross-modality synergies involving text and vision.

Student Speakers

 

Invited Speaker – Abolfazl Hashemi, University of Texas at Austin
Progressive Stochastic Greedy Sparse Reconstruction and Support Selection
Peixin Chang, UofI
Robot Sound Interpretation: Combining Sight and Sound in Learning-Based Control
Raymond Yeh, UofI
Chirality Nets for Human Pose Regression
Varun Kelkar, UofI
Compressible Latent Space Invertible Networks for Compressive MRI
Ziwei Ji, UofI
Polylogarithmic Width Suffices for Gradient Descent to Achieve Arbitrarily Small Test Error with Shallow ReLU Networks

Contact

For additional details, feel free to contact the session chair, Leda Sari.