New Directions in Theoretical Machine Learning (2019)
Organizers:
Sanjeev Arora, Princeton University
Maria-Florina Balcan, Carnegie Mellon University
Sanjoy Dasgupta, University of California, San Diego
Sham Kakade, University of Washington
The Simons Symposium on New Directions in Theoretical Machine Learning brought together a select group of experts with diverse skills and backgrounds to discuss the following questions:
- How can we build on the recent success in supervised learning for perceptual and related tasks?
- What’s next for ML if perception gets solved?
- Is the current set of methods sufficient to take us to the next level of “intelligent” reasoning?
- If not, what is missing, and how can we rectify it?
- What role do classical ideas in Reasoning, Representation Learning, Reinforcement Learning, Interactive Learning, etc., have to play?
- What modes of analyses do we need to even conceptualize the next level of machine capabilities, and what will be good ways to test those capabilities?
Overview: The workshop brought together a select group of experts with different backgrounds to discuss the next set of theoretical challenges for advancing machine learning and artificial intelligence. There have been a number of breakthrough developments in the last few years: language models are rapidly getting better (in the last few months, there have been impressive advances in computer-generated text); image recognition is working far more accurately in new domains (relevant to scene parsing and navigation); and progress in robotics is rapidly accelerating (with much focus on self-driving cars). The workshop had leading experts in all of these areas and focused on formulating new foundational and algorithmic questions to accelerate the progress of the field.
Developments and Novel Questions:
What is the role of models and physics in robotics, reinforcement learning, interactive learning and self-driving cars?
A central question in robotics and interactive learning is how to make use of models of the environment (such as physical simulators). While much of machine learning does not utilize models of the world (ML is largely prediction based), robotics is an area where it is increasingly clear that models of the environment are important. D. Bagnell, a leading robotics theorist and one of the world’s foremost experts on self-driving cars, discussed the interplay of human demonstration with physical models; the focus was on how to obtain provably better planning algorithms (say, for self-driving cars) by utilizing models of the world along with human examples. E. Todorov is a robotics theorist who has also developed one of the most widely used physics simulators in robotics; he spoke about how ML methods for robotics are unlikely to succeed without incorporating physical models. E. Brunskill discussed the fundamental limits of how accurately we need to learn a model in order to use it for planning purposes.
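As a concrete, if stylized, illustration of how a model of the world enters planning, the following sketch implements random-shooting model-predictive control; it is a generic textbook construction, not any speaker’s specific method, and the toy double-integrator dynamics and quadratic cost are assumptions made purely for illustration.

```python
import numpy as np

# Random-shooting model-predictive control: simulate many candidate action
# sequences through a model of the dynamics, keep the cheapest, execute its
# first action, and replan. All constants below are toy choices.

def plan_first_action(dynamics, cost, state, horizon=10, n_candidates=500, seed=0):
    rng = np.random.default_rng(seed)
    best_action, best_cost = None, np.inf
    for _ in range(n_candidates):
        actions = rng.uniform(-1.0, 1.0, size=horizon)  # candidate action sequence
        s, total = state, 0.0
        for a in actions:
            s = dynamics(s, a)        # roll the sequence out inside the model
            total += cost(s, a)
        if total < best_cost:
            best_action, best_cost = actions[0], total
    return best_action

# Toy example: a 1-D double integrator driven toward the origin.
dynamics = lambda s, a: np.array([s[0] + 0.1 * s[1], s[1] + 0.1 * a])
cost = lambda s, a: s[0] ** 2 + 0.1 * s[1] ** 2 + 0.01 * a ** 2
print(plan_first_action(dynamics, cost, np.array([1.0, 0.0])))
```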
How do our current set of methods need to be modified in order to take us to the next level of ‘intelligent’ reasoning?
There has been much remarkable recent success in natural language processing. A number of talks addressed key shortcomings of current language modeling approaches, notably their limited ability to capture reasoning and long-term dependencies. D. Roth gave a thought-provoking talk on how to incorporate reasoning and logic in language models through a notion of incidental supervision, where an agent learns a way to reason from co-occurrences in the data. G. van den Broeck talked about systems for probabilistic reasoning that can both be learned and used efficiently, overcoming complexity barriers that have traditionally hobbled the field. S. Kakade discussed the challenges with language models that utilize long-term memory, relevant to reasoning from information stored in the ‘deep past.’ On a related topic, C. Szegedy discussed the role of reasoning in the context of mathematics and theorem proving.
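To make the idea of efficiently usable probabilistic reasoning concrete, here is a toy sketch of a probabilistic circuit; it is a generic construction in the spirit of the talk, not the system presented, and the rain/sprinkler distribution is an invented example.

```python
# A toy probabilistic circuit: sum nodes are mixtures, product nodes are
# factorizations, leaves are Bernoulli distributions. Joint and marginal
# queries are each a single bottom-up pass, i.e., time linear in circuit size.

class Leaf:
    def __init__(self, var, p):
        self.var, self.p = var, p
    def value(self, evidence):
        if self.var not in evidence:          # unobserved variable sums out to 1
            return 1.0
        return self.p if evidence[self.var] else 1.0 - self.p

class Product:
    def __init__(self, children):
        self.children = children
    def value(self, evidence):
        result = 1.0
        for child in self.children:
            result *= child.value(evidence)
        return result

class Sum:
    def __init__(self, weighted_children):
        self.weighted_children = weighted_children
    def value(self, evidence):
        return sum(w * child.value(evidence) for w, child in self.weighted_children)

# P(rain, sprinkler) as a two-component mixture of independent regimes.
circuit = Sum([
    (0.3, Product([Leaf("rain", 0.9), Leaf("sprinkler", 0.1)])),
    (0.7, Product([Leaf("rain", 0.2), Leaf("sprinkler", 0.5)])),
])
print(circuit.value({"rain": True, "sprinkler": False}))  # a joint probability
print(circuit.value({"rain": True}))                      # marginal P(rain) = 0.41
```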
How can representation learning and unsupervised learning be used for faster learning in new domains and for finding better perceptual representations?
For example, people learn to identify new objects with just a few examples. There is an increasing belief that such accelerated learning may be possible with machine learning approaches in the near future, where systems can also rapidly learn to identify new objects. P. Isola gave a talk, “Toward the Evolution and Development of Prior Knowledge,” which focused on insightful ideas concerning how to build systems that learn from implicit signals (e.g., how to incorporate video information when learning about object recognition). S. Arora presented a theory of representation learning that elucidates how perceptual representations in language (or other domains, including vision) can be learned in an unsupervised manner (without a teacher), allowing for more rapid learning in new contexts.
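One way to make “learning from implicit signals” concrete is contrastive learning from two views of the same data, sketched below; this is a generic InfoNCE-style construction, not the speakers’ specific proposal, and the synthetic ‘past’/‘future’ embeddings are stand-ins for features of adjacent video frames.

```python
import numpy as np

# An InfoNCE-style contrastive objective over paired "views": row i of z1
# and row i of z2 come from the same underlying scene (e.g., adjacent video
# frames), and the loss rewards representations that place matched pairs
# closer together than mismatched ones.

def contrastive_loss(z1, z2, temperature=0.5):
    z1 = z1 / np.linalg.norm(z1, axis=1, keepdims=True)   # unit-normalize rows
    z2 = z2 / np.linalg.norm(z2, axis=1, keepdims=True)
    logits = z1 @ z2.T / temperature                      # all pairwise similarities
    logits -= logits.max(axis=1, keepdims=True)           # numerical stability
    log_probs = logits - np.log(np.exp(logits).sum(axis=1, keepdims=True))
    return -np.mean(np.diag(log_probs))                   # matched pairs on diagonal

rng = np.random.default_rng(0)
past = rng.normal(size=(8, 16))                 # embeddings of "past" frames
future = past + 0.1 * rng.normal(size=(8, 16))  # noisy embeddings of the "future"
print(contrastive_loss(past, future))           # lower loss = views agree
```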
What are new models of learning, such as lifelong learning, learning with a teacher, etc.?
There is a growing need for systems that evolve over time and gradually improve from experience. T. Mitchell gave a talk about systems that continually learn from experience, drawing on insights from his Never-Ending Language Learner (NELL), a continuously running learning system. S. Dasgupta spoke about models in which a benign teacher chooses examples that enable a learner to quickly acquire an accurate model. D. Sadigh and Y. Yue discussed challenges in settings where a robot or other mechanical device must interact with and learn from a human.
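A standard toy case illustrates why a benign teacher helps (this is a generic example, not the construction from the talk): to teach a threshold classifier over n sorted points, two well-chosen examples suffice, whereas a learner selecting label queries on its own needs on the order of log2(n) of them.

```python
import math

# Teaching a threshold classifier on n sorted points: a helpful teacher
# needs only the two examples that straddle the decision boundary, while a
# learner querying labels on its own needs roughly log2(n) of them.

def teaching_set(points, threshold):
    """Return the two labeled examples that pin down the threshold."""
    below = max(p for p in points if p < threshold)
    above = min(p for p in points if p >= threshold)
    return [(below, 0), (above, 1)]

n = 1024
points = range(n)
print(teaching_set(points, threshold=700))  # 2 examples suffice: [(699, 0), (700, 1)]
print(math.ceil(math.log2(n)))              # vs. ~10 label queries via binary search
```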
Other topics on the frontiers of machine learning
- To what extent, and in what ways, can causality be inferred from data? (B. Schölkopf)
- How can data be used to guide the design of algorithms? (N. Balcan)
- What is the new theoretical landscape of privacy models, in the wake of recent EU regulation? (K. Chaudhuri)
Another important direction discussed was data-driven algorithm design for combinatorial problems, an important aspect of modern data science and algorithm design. Rather than using off-the-shelf algorithms that have only worst-case performance guarantees, practitioners typically optimize over large families of parameterized algorithms, tuning the parameters on a training set of problem instances from their domain to find a configuration with high expected performance on future instances. So far, however, most of this work has come with no performance guarantees. Nina Balcan presented recent provable guarantees for these settings, covering both the batch case, where a collection of typical problem instances from the given application is presented all at once, and the online case, where instances arrive one at a time. The challenge is that for many combinatorial problems of significant importance to machine learning, including partitioning and subset-selection problems, a small tweak to the parameters can cause a cascade of changes in the algorithm’s behavior, so the algorithm’s performance is a discontinuous function of its parameters; this leads to very interesting learning-theory questions as well.
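The discontinuity phenomenon is easy to see in a small simulation; the sketch below uses a toy setup in the spirit of this line of work (not Balcan’s actual analysis): a greedy knapsack heuristic parameterized by an exponent rho, tuned on synthetic training instances.

```python
import numpy as np

# A parameterized greedy knapsack heuristic: score item i by value_i / size_i**rho
# and pack greedily. Tuning rho on training instances is data-driven algorithm
# design; because the packing only changes when rho crosses an ordering
# boundary, average performance is a piecewise-constant function of rho.

def greedy_knapsack_value(values, sizes, capacity, rho):
    order = np.argsort(-(values / sizes ** rho))   # greedy order under score rho
    total_value, used = 0.0, 0.0
    for i in order:
        if used + sizes[i] <= capacity:
            used += sizes[i]
            total_value += values[i]
    return total_value

rng = np.random.default_rng(0)
train = [(rng.uniform(1, 10, 20), rng.uniform(1, 5, 20), 15.0) for _ in range(50)]

for rho in np.linspace(0.0, 2.0, 9):               # sweep the tunable parameter
    avg = np.mean([greedy_knapsack_value(v, s, c, rho) for v, s, c in train])
    print(f"rho={rho:.2f}  avg value={avg:.2f}")   # note flat plateaus and jumps
```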
Future collaborations:
The interactions between the researchers were excellent, with active discussion and potential future follow-ups in a number of areas. These include:
- Tengyu Ma and Sham Kakade plan to examine the question of implicit regularization in deep neural networks used as language models.
- Drew Bagnell, Sham Kakade, Emo Todorov and Elad Hazan discussed questions of error feedback learning (and iterative learning control) as robust control methods, where we could obtain provable guarantees.
- Nina Balcan and Yisong Yue discussed providing fast branch and bound algorithms with provable guarantees for solving mixed-integer programs by using a data-driven algorithm design approach.
- Nina Balcan, Sanjoy Dasgupta, Guy van den Broeck and Dan Roth talked about label-efficient learning in large-scale multitask scenarios where one can exploit known or learned logical constraints among tasks.
- Phillip Isola and Sham Kakade discussed questions regarding a theory and algorithm for learning from ‘two views’ (e.g., learn from a video stream, where one view is the past and the other is the ‘future’).
- Phillip Isola and Sanjeev Arora plan to further discuss ideas for representation learning.
MONDAY
10:00 - 11:00 AM Christian Szegedy | Is Math Only for Humans?
11:30 - 12:30 PM Tom Mitchell | What Questions Should a Theory of Learning Agent Answer?
5:00 - 6:00 PM Dan Roth | Incidental Supervision and Reasoning Challenges
6:15 - 7:15 PM Sanjeev Arora | Discussion: Thoughts on a Theory for Unsupervised Learning, with Applications to RL, Learning to Learn, etc.
TUESDAY
10:00 - 11:00 AM Drew Bagnell | Imitation, Feedback and Games
11:30 - 12:30 PM Emo Todorov | Model-based Control
Andreas Krause | Towards Safe Reinforcement Learning
5:00 - 6:00 PM Yisong Yue | Blending Learning & Control via Functional Regularization
6:15 - 7:15 PM Emma Brunskill | Machine Learning Challenges from Computers that Learn to Help People
WEDNESDAY
9:45 - 2:00 PM Guided Hike to Partnach Gorge
5:00 - 6:00 PM Elad Hazan | New Algorithms and Directions in Reinforcement Learning
6:15 - 7:15 PM Phillip Isola | From Amoebas to Infants: Toward the Evolution and Development of Prior Knowledge
THURSDAY
10:00 - 11:00 AM Dorsa Sadigh | Machine Learning for Human-Robot Systems
11:30 - 12:30 PM Nina Balcan | Data-driven/Machine-learning Augmented Algorithm Design
5:00 - 6:00 PM Bernhard Schölkopf | Toward Causal Learning
6:15 - 7:15 PM Sanjoy Dasgupta | Using Interaction to Overcome Basic Hurdles in Learning
Daniel Hsu | Interactive Learning via Reductions
FRIDAY
10:00 - 11:00 AM Ulrike von Luxburg | Comparison-based Machine Learning
Sham Kakade | Learning, Memory, and Entropy
11:30 - 12:30 PM Guy Van den Broeck | Circuit Languages as a Synthesis of Learning and Reasoning
5:00 - 6:00 PM Kamalika Chaudhuri | New Directions in Privacy-preserving Data Analysis
Moritz Hardt | The Sociotechnical Forces Against Overfitting
6:15 - 7:15 PM Tengyu Ma | Data-dependent Regularization and Sample Complexity Bounds of Deep Neural Networks
Sanjeev Arora, Princeton University
Drew Bagnell, Aurora Innovation
Maria-Florina Balcan, Carnegie Mellon University
Emma Brunskill, Stanford University
Kamalika Chaudhuri, University of California, San Diego
Sanjoy Dasgupta, University of California, San Diego
Moritz Hardt, University of California, Berkeley
Elad Hazan, Princeton University
Daniel Hsu, Columbia University
Phillip Isola, Massachusetts Institute of Technology
Sham Kakade, University of Washington
Andreas Krause, ETH Zürich
Tengyu Ma, Stanford University
Tom Mitchell, Carnegie Mellon University
Dan Roth, University of Pennsylvania
Dorsa Sadigh, Stanford University
Bernhard Schölkopf, Max Planck Institute for Intelligent Systems
Christian Szegedy, Google
Emo Todorov, University of Washington
Guy van den Broeck, University of California, Los Angeles
Ulrike von Luxburg, University of Tübingen
Yisong Yue, California Institute of Technology