334 Publications

A class of dimension-free metrics for the convergence of empirical measures

J. Han, Ruimeng Hu, Jihao Long

This paper concerns the convergence of empirical measures in high dimensions. We propose a new class of probability metrics and show that under such metrics, the convergence is free of the curse of dimensionality (CoD). Such a feature is critical for high-dimensional analysis and stands in contrast to classical metrics (e.g., the Wasserstein metric). The proposed metrics fall into the category of integral probability metrics, for which we specify criteria of test function spaces to guarantee the property of being free of CoD. Examples of the selected test function spaces include the reproducing kernel Hilbert spaces, Barron space, and flow-induced function spaces. Three applications of the proposed metrics are presented: 1. The convergence of the empirical measure in the case of random variables; 2. The convergence of the n-particle system to the solution of the McKean–Vlasov stochastic differential equation; 3. The construction of an ɛ-Nash equilibrium for a homogeneous n-player game by its mean-field limit. As a byproduct, we prove that, given a distribution close to the target distribution measured by our metric and a certain representation of the target distribution, we can generate a distribution close to the target one in terms of the Wasserstein metric and relative entropy. Overall, we show that the proposed class of metrics is a powerful tool to analyze the convergence of empirical measures in high dimensions without CoD.
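
As background, an integral probability metric over a test function class F takes the standard form

    d_F(μ, ν) = sup_{f ∈ F} | ∫ f dμ − ∫ f dν |,

and the choice of F governs its strength: taking F to be the 1-Lipschitz functions recovers the Wasserstein-1 metric by Kantorovich–Rubinstein duality, whereas the function spaces above are chosen precisely so that the empirical convergence rate under d_F is dimension-free.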


A Neural Network Warm-Start Approach for the Inverse Acoustic Obstacle Scattering Problem

Mo Zhou, J. Han, M. Rachh, Carlos Borges

In this paper, we consider the inverse acoustic obstacle problem for sound-soft star-shaped obstacles in two dimensions wherein the boundary of the obstacle is determined from measurements of the scattered field at a collection of receivers outside the object. One of the standard approaches for solving this problem is to reformulate it as an optimization problem: finding the boundary of the domain that minimizes the L2 distance between computed values of the scattered field and the given measurement data. The optimization problem is computationally challenging since the local set of convexity shrinks with increasing frequency and results in an increasing number of local minima in the vicinity of the true solution. In many practical experimental settings, low frequency measurements are unavailable due to limitations of the experimental setup or the sensors used for measurement. Thus, obtaining a good initial guess for the optimization problem plays a vital role in this environment. We present a neural network warm-start approach for solving the inverse scattering problem, where an initial guess for the optimization problem is obtained using a trained neural network. We demonstrate the effectiveness of our method with several numerical examples. For high frequency problems, this approach outperforms traditional iterative methods such as Gauss-Newton initialized without any prior (i.e., initialized using a unit circle), or initialized using the solution of a direct method such as the linear sampling method. The algorithm remains robust to noise in the scattered field measurements and also converges to the true solution for limited aperture data. However, the number of training samples required to train the neural network scales exponentially in frequency and the complexity of the obstacles considered. We conclude with a discussion of this phenomenon and potential directions for future research.
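
Schematically, the optimization formulation amounts to a nonlinear least-squares problem over the boundary (notation ours, not the paper's):

    min over ∂Ω of Σ_j | u_scat(x_j; ∂Ω) − d_j |²,

where x_j are the receiver locations and d_j the measured scattered-field data; the trained network maps the measurements to an initial boundary guess from which the Gauss–Newton iterations proceed.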


Variational Inference with Gaussian Score Matching

Variational inference (VI) is a method to approximate the computationally intractable posterior distributions that arise in Bayesian statistics. Typically, VI fits a simple parametric distribution to be close to the target posterior, optimizing an appropriate objective such as the evidence lower bound (ELBO). In this work, we present a new approach to VI. Our method is based on the principle of score matching: namely, that if two distributions are equal, then their score functions (i.e., gradients of the log density) are equal at every point on their support. With this principle, we develop score-matching VI, an iterative algorithm that seeks to match the scores between the variational approximation and the exact posterior. At each iteration, score-matching VI solves an inner optimization, one that minimally adjusts the current variational estimate to match the scores at a newly sampled value of the latent variables. We show that when the variational family is a Gaussian, this inner optimization enjoys a closed-form solution, which we call Gaussian score matching VI (GSM-VI). GSM-VI is a "black box" variational algorithm in that it only requires a differentiable joint distribution, and as such it can be applied to a wide class of models. We compare GSM-VI to black box variational inference (BBVI), which has similar requirements but instead optimizes the ELBO. We first study how GSM-VI behaves as a function of the problem dimensionality, the condition number of the target covariance matrix (when the target is Gaussian), and the degree of mismatch between the approximating and exact posterior distributions. We then study GSM-VI on a collection of real-world Bayesian inference problems from the posteriorDB database of datasets and models. We find that GSM-VI is faster than BBVI and equally or more accurate. Specifically, over a wide range of target posteriors, GSM-VI requires 10-100x fewer gradient evaluations than BBVI to obtain a comparable quality of approximation.
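
The underlying principle can be written compactly: a variational approximation q_φ matches the posterior when

    ∇_z log q_φ(z) = ∇_z log p(z | x) for all z in the support,

and since log p(z | x) = log p(x, z) − log p(x), the posterior score equals the score of the joint in z, which is why GSM-VI needs only a differentiable joint distribution and never the intractable normalizer p(x).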


A Dual-space Multilevel Kernel-splitting Framework for Discrete and Continuous Convolution

We introduce a new class of multilevel, adaptive, dual-space methods for computing fast convolutional transforms. These methods can be applied to a broad class of kernels, from the Green's functions for classical partial differential equations (PDEs) to power functions and radial basis functions such as those used in statistics and machine learning. The DMK (dual-space multilevel kernel-splitting) framework uses a hierarchy of grids, computing a smoothed interaction at the coarsest level, followed by a sequence of corrections at finer and finer scales until the problem is entirely local, at which point direct summation is applied. The main novelty of DMK is that the interaction at each scale is diagonalized by a short Fourier transform, permitting the use of separation of variables, but without requiring the FFT for its asymptotic performance. The DMK framework substantially simplifies the algorithmic structure of the fast multipole method (FMM) and unifies the FMM, Ewald summation, and multilevel summation, achieving speeds comparable to the FFT in work per gridpoint, even in a fully adaptive context. For continuous source distributions, the evaluation of local interactions is further accelerated by approximating the kernel at the finest level as a sum of Gaussians with a highly localized remainder. The Gaussian convolutions are calculated using tensor product transforms, and the remainder term is calculated using asymptotic methods. We illustrate the performance of DMK for both continuous and discrete sources with extensive numerical examples in two and three dimensions.
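
In schematic terms (our notation, not the paper's precise splitting), the finest-level kernel is written as

    K(r) ≈ Σ_j w_j exp(−r²/σ_j²) + K_loc(r),

where the sum of Gaussians is evaluated by tensor product transforms and the highly localized remainder K_loc by asymptotic methods, as described above.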

September 10, 2023

Compressing the memory variables in constant-Q viscoelastic wave propagation via an improved sum-of-exponentials approximation

Xu Guo, S. Jiang, Yunfeng Xiong, Jiwei Zhang

Earth introduces strong attenuation and dispersion to propagating waves. The time-fractional wave equation with very small fractional exponent, based on Kjartansson's constant-Q theory, is widely recognized in the field of geophysics as a reliable model for frequency-independent Q anelastic behavior. Nonetheless, the numerical resolution of this equation poses considerable challenges due to the requirement of storing a complete time history of wavefields. To address this computational challenge, we present a novel approach: a nearly optimal sum-of-exponentials (SOE) approximation to the Caputo fractional derivative with very small fractional exponent, utilizing the machinery of generalized Gaussian quadrature. This method minimizes the number of memory variables needed to approximate the power attenuation law within a specified error tolerance. We establish a mathematical equivalence between this SOE approximation and the continuous fractional stress-strain relationship, relating it to the generalized Maxwell body model. Furthermore, we prove an improved SOE approximation error bound to thoroughly assess the ability of rheological models to replicate the power attenuation law. Numerical simulations of the constant-Q viscoacoustic equation in 3D homogeneous media and of variable-order P- and S-viscoelastic wave equations in 3D inhomogeneous media are performed. These simulations demonstrate that our proposed technique accurately captures changes in amplitude and phase resulting from material anelasticity. This advancement provides a significant step towards the practical use of the time-fractional wave equation in seismic inversion.
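
In outline, the SOE idea replaces the power-law kernel of the Caputo derivative

    D_t^α u(t) = (1/Γ(1−α)) ∫₀ᵗ u′(s) (t−s)^(−α) ds, 0 < α < 1,

by a short exponential sum (t−s)^(−α) ≈ Σ_j w_j exp(−λ_j (t−s)), so that each term can be advanced in time by a recursive update of a single memory variable, eliminating the need to store the full wavefield history.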


A Gentle Introduction to Gradient-Based Optimization and Variational Inequalities for Machine Learning

N. Wadia, Yatin Dandi, Michael I. Jordan

The rapid progress in machine learning in recent years has been based on a highly productive connection to gradient-based optimization. Further progress hinges in part on a shift in focus from pattern recognition to decision-making and multi-agent problems. In these broader settings, new mathematical challenges emerge that involve equilibria and game theory instead of optima. Gradient-based methods remain essential, given the high dimensionality and large scale of machine-learning problems, but simple gradient descent is no longer the point of departure for algorithm design. We provide a gentle introduction to a broader framework for gradient-based algorithms in machine learning, beginning with saddle points and monotone games, and proceeding to general variational inequalities. While we provide convergence proofs for several of the algorithms that we present, our main focus is on providing motivation and intuition.
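
For reference, the general variational inequality referenced here asks for a point x* in a constraint set C such that

    ⟨F(x*), x − x*⟩ ≥ 0 for all x ∈ C,

which recovers the first-order optimality condition when F is a gradient ∇f, and encodes saddle-point and Nash equilibrium conditions when F stacks the players' individual gradient fields.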


Liquid Filled Elastomers: From Linearization to Elastic Enhancement

Juan Casado-Díaz, G. Francfort

Surface tension at cavity walls can play havoc with the mechanical properties of perforated soft solids when the cavities are filled with a fluid. This study is an investigation of the macroscopic elastic properties of elastomers embedding spherical cavities filled with a pressurized liquid in the presence of surface tension, starting with the linearization of the fully nonlinear model and ending with the enhancement properties of the linearized model when many such liquid filled cavities are present.


Reinforcement learning with function approximation: From linear to nonlinear

Jihao Long, J. Han

Function approximation has been an indispensable component in modern reinforcement learning algorithms designed to tackle problems with large state spaces in high dimensions. This paper reviews recent results on error analysis for these reinforcement learning algorithms in linear or nonlinear approximation settings, emphasizing approximation error and estimation error/sample complexity. We discuss various properties related to approximation error and present concrete conditions on the transition probability and reward function under which these properties hold true. Sample complexity analysis in reinforcement learning is more complicated than in supervised learning, primarily due to the distribution mismatch phenomenon. With assumptions on the linear structure of the problem, numerous algorithms in the literature achieve polynomial sample complexity with respect to the number of features, episode length, and accuracy, although the minimax rate has not been achieved yet. These results rely on $L^∞$ and UCB estimates of the estimation error, which can handle the distribution mismatch phenomenon. The problem and analysis become substantially more challenging in the setting of nonlinear function approximation, as both $L^∞$ and UCB estimation are inadequate for bounding the error with a favorable rate in high dimensions. We discuss additional assumptions necessary to address the distribution mismatch and derive meaningful results for nonlinear RL problems.
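
As a concrete instance of the linear setting discussed above (notation ours), the optimal finite-horizon action-value function satisfies the Bellman equation

    Q*_h(s, a) = r_h(s, a) + E_{s′ ∼ P_h(·|s, a)} [ max_{a′} Q*_{h+1}(s′, a′) ],

and is approximated as Q_h(s, a) ≈ φ(s, a)ᵀ θ_h for a fixed feature map φ, so the sample-complexity bounds are stated in terms of the feature dimension, the episode length H, and the target accuracy.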


An equivariant neural operator for developing nonlocal tensorial constitutive models

J. Han, Xu-Hui Zhou, Heng Xiao

Developing robust constitutive models is a fundamental and longstanding problem for accelerating the simulation of complicated physics. Machine learning provides promising tools to construct constitutive models based on various calibration data. In this work, we propose a neural operator to develop nonlocal constitutive models for tensorial quantities through a vector-cloud neural network with equivariance (VCNN-e). The VCNN-e respects all the invariance properties desired by constitutive models, faithfully reflects the region of influence in physics, and is applicable to different spatial resolutions. By design, the model guarantees that the predicted tensor is invariant to frame translation and to the ordering (permutation) of the neighboring points. Furthermore, it is equivariant to frame rotation, i.e., the output tensor co-rotates with the coordinate frame. We evaluate the VCNN-e by using it to emulate the Reynolds stress transport model for turbulent flows, which directly computes the Reynolds stress tensor to close the Reynolds-averaged Navier–Stokes (RANS) equations. The evaluation is performed in two situations: (1) emulating the Reynolds stress model through synthetic data generated from the Reynolds stress transport equations with closure models, and (2) predicting the Reynolds stress by learning from data generated from direct numerical simulations. Such a priori evaluations of the proposed network pave the way for developing and calibrating robust, nonlocal, non-equilibrium closure models for the RANS equations.
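
The symmetry requirements can be stated compactly: for a rank-2 tensor output such as the Reynolds stress, a network f acting on the input point cloud {x_i} must satisfy (schematically)

    f({R x_i + c}) = R f({x_i}) Rᵀ for every rotation R and translation c,

together with invariance under any permutation of the points {x_i}.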


Automatic, high-order, and adaptive algorithms for Brillouin zone integration

J. Kaye, Sophie Beck, A. Barnett, Lorenzo Van Muñoz, Olivier Parcollet

We present efficient methods for Brillouin zone integration with a non-zero but possibly very small broadening factor η, focusing on cases in which downfolded Hamiltonians can be evaluated efficiently using Wannier interpolation. We describe robust, high-order accurate algorithms automating convergence to a user-specified error tolerance ϵ, emphasizing an efficient computational scaling with respect to η. After analyzing the standard equispaced integration method, applicable in the case of large broadening, we describe a simple iterated adaptive integration algorithm effective in the small-η regime. Its computational cost scales as O(log³(η⁻¹)) as η → 0⁺ in three dimensions, as opposed to O(η⁻³) for equispaced integration. We argue that, by contrast, tree-based adaptive integration methods scale only as O(log(η⁻¹)/η²) for typical Brillouin zone integrals. In addition to its favorable scaling, the iterated adaptive algorithm is straightforward to implement, particularly for integration on the irreducible Brillouin zone, for which it avoids the tetrahedral meshes required for tree-based schemes. We illustrate the algorithms by calculating the spectral function of SrVO₃ with broadening on the meV scale.
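
A typical target of such computations (in our schematic notation) is a broadened Green's function integral of the form

    I(ω) = ∫_BZ Tr[(ω + iη − H(k))⁻¹] dk,

whose integrand develops near-singular features of width O(η) as η → 0⁺; resolving these features is what drives equispaced quadrature to O(η⁻³) work in three dimensions and motivates the adaptive schemes above.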
