Artificial Intelligence Discovers Secret Equation for ‘Weighing’ Galaxy Clusters

Astrophysicists at the Institute for Advanced Study, the Flatiron Institute and their colleagues have leveraged artificial intelligence to uncover a better way to estimate the mass of colossal clusters of galaxies. The AI discovered that by just adding a simple term to an existing equation, scientists can produce far better mass estimates than they previously had.

The improved estimates will enable scientists to calculate the fundamental properties of the universe more accurately, the astrophysicists reported March 17, 2023, in the Proceedings of the National Academy of Sciences.

“It’s such a simple thing; that’s the beauty of this,” says study co-author Francisco Villaescusa-Navarro, a research scientist at the Flatiron Institute’s Center for Computational Astrophysics (CCA) in New York City. “Even though it’s so simple, nobody before found this term. People have been working on this for decades, and still they were not able to find this.”

The work was led by Digvijay Wadekar of the Institute for Advanced Study in Princeton, New Jersey, along with researchers from the CCA, Princeton University, Cornell University and the Center for Astrophysics | Harvard & Smithsonian.

Understanding the universe requires knowing where and how much stuff there is. Galaxy clusters are the most massive objects in the universe: A single cluster can contain anything from hundreds to thousands of galaxies, along with plasma, hot gas and dark matter. The cluster’s gravity holds these components together. Understanding such galaxy clusters is crucial to pinning down the origin and continuing evolution of the universe.

Perhaps the most crucial quantity determining the properties of a galaxy cluster is its total mass. But measuring this quantity is difficult — galaxies cannot be ‘weighed’ by placing them on a scale. The problem is further complicated because the dark matter that makes up much of a cluster’s mass is invisible. Instead, scientists deduce the mass of a cluster from other observable quantities.

This image, taken by NASA's Hubble Space Telescope, shows a spiral galaxy (bottom left) in front of a large galaxy cluster. New research leveraged an artificial intelligence tool to estimate the masses of galaxy clusters more accurately. Credit: ESA/Hubble & NASA

In the early 1970s, Rashid Sunyaev, current distinguished visiting professor at the Institute for Advanced Study’s School of Natural Sciences, and his collaborator Yakov B. Zel’dovich developed a new way to estimate galaxy cluster masses. Their method relies on the fact that as gravity squashes matter together, the matter’s electrons push back. That electron pressure alters how the electrons interact with particles of light called photons. As photons left over from the Big Bang’s afterglow pass through the squeezed material, they scatter off its electrons and pick up energy. The properties of the scattered photons depend on how strongly gravity is compressing the material, which in turn depends on the galaxy cluster’s heft. By measuring the photons, astrophysicists can estimate the cluster’s mass.

However, this ‘integrated electron pressure’ is not a perfect proxy for mass, because the changes in the photon properties vary from one galaxy cluster to another. Wadekar and his colleagues thought an artificial intelligence tool called ‘symbolic regression’ might find a better approach. The tool essentially tries out different combinations of mathematical operators, such as addition and subtraction, applied to different variables, to see which equation best matches the data.
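The search described above can be illustrated with a deliberately tiny brute-force sketch. It is not the researchers' program: the operator set, the variable names `Y` and `c`, the toy data and the tie-breaking rule (prefer the shortest formula) are all assumptions made for illustration, and real symbolic regression tools search far larger spaces with smarter strategies.

```python
import itertools
import operator

# Operators the search is allowed to combine (a deliberately tiny set).
OPERATORS = {"+": operator.add, "-": operator.sub, "*": operator.mul}


def grow(expressions):
    """Return the given expressions plus every `(a op b)` built from pairs of them."""
    grown = dict(expressions)
    for (name_a, a), (name_b, b) in itertools.product(expressions.items(), repeat=2):
        for symbol, func in OPERATORS.items():
            grown[f"({name_a} {symbol} {name_b})"] = [func(x, y) for x, y in zip(a, b)]
    return grown


def mean_squared_error(predicted, target):
    """Average squared difference between a candidate's output and the data."""
    return sum((p - t) ** 2 for p, t in zip(predicted, target)) / len(target)


def best_expression(variables, target, depth=2):
    """Enumerate formulas up to `depth` nested operations and pick the best match."""
    expressions = dict(variables)
    for _ in range(depth):
        expressions = grow(expressions)
    # Lowest error wins; among equal errors, the shortest formula wins.
    return min(
        expressions,
        key=lambda name: (mean_squared_error(expressions[name], target), len(name)),
    )


# Hypothetical toy data: the "hidden law" is Y - Y*c, i.e. Y * (1 - c).
flux = [2.0, 3.0, 5.0, 7.0]
concentration = [0.1, 0.4, 0.2, 0.3]
target = [y - y * c for y, c in zip(flux, concentration)]

found = best_expression({"Y": flux, "c": concentration}, target)
print(found)
```

Even this toy version shows the appeal of the approach: the output is a readable formula rather than a set of opaque model weights.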

Wadekar and his collaborators ‘fed’ their AI program a state-of-the-art universe simulation containing many galaxy clusters. Next, their program, written by CCA research fellow Miles Cranmer, searched for and identified additional variables that might make the mass estimates more accurate.

AI is useful for identifying new parameter combinations that human analysts might overlook. For example, while it is easy for human analysts to identify two significant parameters in a dataset, AI can better parse through high volumes of data, often revealing unexpected influencing factors.

“Right now, a lot of the machine-learning community focuses on deep neural networks,” Wadekar explained. “These are very powerful, but the drawback is that they are almost like a black box. We cannot understand what goes on in them. In physics, if something is giving good results, we want to know why it is doing so. Symbolic regression is beneficial because it searches a given dataset and generates simple mathematical expressions in the form of simple equations that you can understand. It provides an easily interpretable model.”

The performance of the new equation from symbolic regression is shown in the middle panel, and that of the traditional method in the top panel. The lower panel quantifies the reduction in the scatter. Credit: D. Wadekar et al./Proceedings of the National Academy of Sciences 2023

The researchers’ symbolic regression program handed them a new equation that predicts the mass of a galaxy cluster more accurately by adding a single new term to the existing equation. Wadekar and his collaborators then worked backward from this AI-generated equation and found a physical explanation. The new term involves the concentration of gas in the cluster, and they realized that gas concentration correlates with the regions of galaxy clusters where mass inferences are less reliable, such as the cores of galaxies where supermassive black holes lurk. Their new equation improved mass inferences by downplaying the importance of those complex cores in the calculations. In a sense, the galaxy cluster is like a spherical doughnut. The new equation extracts the jelly at the center of the doughnut that can introduce larger errors, and instead concentrates on the doughy outskirts for more reliable mass inferences.
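The article does not reproduce the equation itself. As a rough sketch only, assume the conventional power-law link between cluster mass M and integrated electron pressure Y, and write the added term as a linear correction in a gas concentration c_gas with a fitted constant A. The symbols and the exact functional form here are assumptions for illustration and should be checked against the paper. Under those assumptions the change has this shape:

```latex
M \propto Y^{3/5}
\quad\longrightarrow\quad
M \propto Y^{3/5}\,\bigl(1 - A\,c_{\mathrm{gas}}\bigr)
```

Read this way, a cluster whose gas is more concentrated toward its core has its pressure-based estimate adjusted by the extra factor, which is how a single additional term can reduce the influence of the core.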

The researchers tested the AI-discovered equation on thousands of simulated universes from the CCA’s CAMELS suite. They found that the equation reduced the variability in galaxy cluster mass estimates by around 20 to 30 percent for large clusters compared with the currently used equation.
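To make "reduced the variability" concrete, the sketch below measures scatter as the standard deviation of the logarithmic error between estimated and true masses, then reports the fractional reduction. This definition of scatter is an assumption (the paper may define it differently), and the masses and offsets are hypothetical numbers chosen only to exercise the functions, not results from the study.

```python
import math
import statistics


def log_scatter(predicted_masses, true_masses):
    """Population standard deviation of ln(predicted / true)."""
    residuals = [math.log(p / t) for p, t in zip(predicted_masses, true_masses)]
    return statistics.pstdev(residuals)


def scatter_reduction(old_scatter, new_scatter):
    """Fractional reduction in scatter, e.g. 0.25 means 25 percent less."""
    return 1.0 - new_scatter / old_scatter


# Hypothetical cluster masses (arbitrary units) and made-up log-space errors.
true_masses = [1.0e14, 2.0e14, 4.0e14, 8.0e14]
offsets_old = [0.20, -0.20, 0.20, -0.20]
offsets_new = [0.15, -0.15, 0.15, -0.15]
old_estimates = [m * math.exp(d) for m, d in zip(true_masses, offsets_old)]
new_estimates = [m * math.exp(d) for m, d in zip(true_masses, offsets_new)]

reduction = scatter_reduction(
    log_scatter(old_estimates, true_masses),
    log_scatter(new_estimates, true_masses),
)
print(f"scatter reduced by {reduction:.0%}")
```

Working in logarithms keeps the measure independent of the mass units and treats over- and underestimates symmetrically, which is why it is a common choice for scaling relations.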

The new equation can provide observational astronomers engaged in upcoming galaxy cluster surveys with better insights into the mass of the objects they observe. “There are quite a few surveys targeting galaxy clusters [that] are planned in the near future,” Wadekar noted. “Examples include the Simons Observatory, the Stage 4 CMB experiment and an X-ray survey called eROSITA. The new equations can help us in maximizing the scientific return from these surveys.”

Wadekar also hopes that this publication will be just the tip of the iceberg when it comes to using symbolic regression in astrophysics. “We think that symbolic regression is highly applicable to answering many astrophysical questions,” he said. “In a lot of cases in astronomy, people make a linear fit between two parameters and ignore everything else. But nowadays, with these tools, you can go further. Symbolic regression and other artificial intelligence tools can help us go beyond existing two-parameter power laws in a variety of different ways, ranging from investigating small astrophysical systems like exoplanets, to galaxy clusters, the biggest things in the universe.”

