Title: Fitting Polynomials to Noisy Data
Abstract:
The problem of finding the polynomial that best fits a noisy data-set (or polynomial reconstruction) has a long history, dating back to curve-fitting problems studied in the 1800s. In the last two decades, there has been tremendous progress on this problem in computer science, driven by the discovery of powerful new algorithms. These results have spurred exciting new developments in Coding theory, Computational learning, Cryptography and Hardness of Approximation. In this talk, we will explore this problem from the perspectives of Coding theory and Computational learning.
We begin with an algorithm for decoding a well-studied family of binary error-correcting codes called Reed-Muller codes, which are obtained from low-degree polynomials. The salient feature of this algorithm is that it works even when the number of errors far exceeds the so-called Johnson bound.
I will present an algorithm for agnostically learning decision trees under the uniform distribution. This is the first polynomial time algorithm for learning decision trees in a harsh noise model. This algorithm solves the reconstruction problem for real polynomials using tools from convex optimization.
I will also discuss settings where the reconstruction problem seems intractable. We will see evidence that the notorious Noisy Parity problem is hard under the uniform distribution. We will see hardness results suggesting that learning simple concepts with noise is impossible for arbitrary distributions.
Biography:
Parikshit Gopalan grew up in India in the city of Bombay (now called Mumbai). He graduated with an undergraduate degree from IIT-Bombay (whose name, thankfully, has not changed). He received his Ph.D from Georgia Tech in August 2006, under the guidance of Dick Lipton. Following this, he did a short stint as a postdoctoral researcher at the University of Texas at Austin. He is currently a postdoc at the University of Washington, visiting Princeton University.
His research focuses on theoretical computer science, especially on algebraic problems arising from algorithms and complexity. He also likes to dabble in other areas such as Data-stream algorithms and Communication complexity.