One Model to Rule Them All: Creating a Machine Learning Foundation Model for Science
In this age of artificial intelligence, machine learning is often presented as a panacea for our technological woes. Machine learning programs can help us navigate, find an interesting TV show to watch, and even complete our homework. In science, machine learning has long played a critical role, doing everything from categorizing galaxies to identifying genetic disorders to synthesizing new materials. But now, scientists are looking to take machine learning applications to another level.
At the Flatiron Institute, Ruben Ohana is part of the Polymathic AI team, which aims to create very large machine learning models that can be used for a wide range of scientific problems, from fluid dynamics to the evolution of the solar system. This work could make it much easier to analyze big scientific problems that have little data to start with.
Before joining the Flatiron Institute’s Center for Computational Mathematics (CCM) as a Research Fellow, Ohana received his doctorate in machine learning from the École Normale Supérieure in Paris. Prior to that he earned an engineering degree in physics from ESPCI Paris, a master’s degree in condensed matter from the École Normale Supérieure, and a master’s degree in statistics from Sorbonne University.
Ohana recently spoke to the Simons Foundation about his work and its applications. The conversation has been edited for length and clarity.
Can you tell me about your work and what you’re currently focused on?
My work is in applied machine learning. I study many different topics within this broad field, but mostly I specialize in optimization: the study of how to make machine learning models train faster and more efficiently. I'm part of an initiative at the Flatiron Institute called Polymathic AI, where we are building large models, somewhat similar to large language models such as ChatGPT, but designed to help solve problems in science. Typically, machine learning models are small and can only solve the specialized problems they are trained on. The hope is that by scaling up a model to make it bigger and bigger, which is the current trend in modern machine learning, we can make a supersize model that can solve many tasks from many different sources.
What kinds of problems do you hope to solve with this model?
For the moment, we are focusing on problems in astrophysics and fluid dynamics. The goal is to be able to predict how a fluid, be it a jet of water or the plasma that gives rise to flares on the sun, will evolve over time.
Ultimately, we want to have a model that works well in what's known as the low-data regime: a class of problems where scientists don't have much data to start with. The idea is that we'll first train the model on problems where data is plentiful, and once it has learned those, it will be able to extrapolate to related cases that don't have as much data initially.
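To make that train-big-then-transfer idea concrete, here is a minimal sketch in PyTorch of the pretrain-then-fine-tune pattern Ohana describes. This is not Polymathic AI's actual code: the model, shapes, learning rates and data are all invented for illustration, and random tensors stand in for real simulation data.

```python
import torch
import torch.nn as nn

# Hypothetical toy forecaster: learn to map the state of a system at
# time t to its state at time t+1. All sizes here are made up.
model = nn.Sequential(
    nn.Linear(64, 128), nn.ReLU(),  # 64 state variables per time step
    nn.Linear(128, 64),             # predict the next time step
)
loss_fn = nn.MSELoss()

def train(model, states, next_states, steps, lr):
    """One-step-ahead training loop: learn states[t] -> states[t+1]."""
    opt = torch.optim.Adam(model.parameters(), lr=lr)
    for _ in range(steps):
        opt.zero_grad()
        loss = loss_fn(model(states), next_states)
        loss.backward()
        opt.step()

# Stage 1: pretrain on a data-rich problem (e.g., a cheap, well-sampled
# fluid simulation). Random tensors stand in for real trajectories.
big_x, big_y = torch.randn(10_000, 64), torch.randn(10_000, 64)
train(model, big_x, big_y, steps=500, lr=1e-3)

# Stage 2: fine-tune on the low-data problem, where only a handful of
# samples exist. A small learning rate reuses what was learned above.
small_x, small_y = torch.randn(32, 64), torch.randn(32, 64)
train(model, small_x, small_y, steps=50, lr=1e-4)
```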
What kinds of real-world problems can this model help solve?
We hope the model will be useful for any physics-based time series data, meaning data that tracks the evolution of a system over time. This can be astrophysics, fluid dynamics or any other dynamics governed by the laws of physics. Some specific examples would be climate modeling or reconstructing the evolution of the solar system. For example, we have datasets describing the sizes and motions of planets around a star, and the hope is that this model could be used to reconstruct how the system evolved over time and to infer the mass and position of a planet at a given moment.
What does this work look like for you day to day?
I collaborate with 15 other people at the Flatiron Institute in this initiative, and everyone focuses on a different part of it. Currently I'm working closely with a Ph.D. student interning with us. Right now, we're focusing on coding transformers, which are the components of the model that help it learn correlations. For example, in a language model like ChatGPT, transformers help the model understand a sentence like "The black cat is jumping on the table" and know that "black" refers to the cat, not the table. In our case, transformers are helping us understand how different timescales are related. So, if you have a simulation of a planetary system over time, they'd help correlate day 1 with day 1,000.
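To illustrate the mechanism Ohana is describing, here is a minimal sketch of self-attention applied to a trajectory, assuming a toy simulation with invented dimensions. It is not Polymathic AI's code; it only shows how attention lets every time step weigh every other, including ones far apart.

```python
import torch
import torch.nn as nn

# Toy trajectory: 1,000 time steps of a simulated system, with an
# invented 64-dimensional state per step (shapes are illustrative).
T, D = 1000, 64
trajectory = torch.randn(1, T, D)  # batch of one simulation run

# Self-attention: each time step attends to all the others.
attn = nn.MultiheadAttention(embed_dim=D, num_heads=4, batch_first=True)
out, weights = attn(trajectory, trajectory, trajectory)

print(out.shape)           # torch.Size([1, 1000, 64]): updated states
print(weights[0, 999, 0])  # how strongly day 1,000 attends to day 1
```

The `weights` tensor is the learned correlation structure: entry `[0, t, s]` says how much time step `t` draws on time step `s` when updating its representation, which is what lets the model relate day 1 to day 1,000 directly.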
Are you working closely with scientists to create these models?
Actually, most of the people in the initiative, including myself, are physicists, so we already have this base knowledge and a wide range of expertise. That gives us the background needed to train the model on many different kinds of physics.
The Flatiron Institute is the perfect place to do this work because we have all the knowledge in house. If tomorrow we wanted to start a component of the model for biology, we could, because we already have specialists here for that. This project is very exciting to work on, and I think there’s no place I’d rather do it.