334 Publications

Chirped amplitude mode in photo-excited superconductors

Thomas Blommel, J. Kaye, Yuta Murakami, Emanuel Gull, Denis Golež

We show that the amplitude mode in superconductors exhibits chirped oscillations under resonant excitation and that the chirping velocity increases as we approach the critical excitation strength. The chirped amplitude mode enables us to determine the local modification of the effective potential even when the system is in a long-lived pre-thermal state. We then show that this chirped amplitude mode is an experimentally observable quantity since the photo-induced (super)-current in pump-probe experiments serves as an efficient proxy for the dynamics of the order parameter, including the chirped dynamics. Our result is based on the attractive Hubbard model using dynamical mean-field theory within the symmetry-broken state after a resonant excitation across the superconducting gap. Since the collective response takes place on emergently long timescales, we extend the hierarchical low-rank compression method for nonequilibrium Green's functions to symmetry-broken states and show that it serves as an efficient representation despite long-lived memory kernels.

For how many iterations should we run Markov chain Monte Carlo?

C. Margossian, Andrew Gelman

Standard Markov chain Monte Carlo (MCMC) admits three fundamental control parameters: the number of chains, the length of the warmup phase, and the length of the sampling phase. These control parameters play a large role in determining the amount of computation we deploy. In practice, we need to walk a line between achieving sufficient precision and not wasting precious computational resources and time. We review general strategies to check the length of the warmup and sampling phases, and examine the three control parameters of MCMC in the contexts of CPU- and GPU-based hardware. Our discussion centers around three tasks: (1) inference about a latent variable, (2) computation of expectation values and quantiles, and (3) diagnostics to check the reliability of the estimators. This chapter begins with general recommendations on the control parameters of MCMC, which have been battle-tested over the years and often motivate defaults in Bayesian statistical software. Usually we do not know ahead of time how a sampler will interact with a target distribution, and so the choice of MCMC algorithm and its control parameters tends to be based on experience, re-evaluated after simulations have been obtained and analyzed. The second part of this chapter provides a theoretical motivation for our recommended approach, with pointers to some concerns and open problems. We also examine recent developments on the algorithmic and hardware fronts, which motivate new computational approaches to MCMC.
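
As a toy illustration of the three control parameters and a standard diagnostic, the sketch below runs several independent random-walk Metropolis chains (a deliberately simple sampler, not the adaptive methods used in practice), discards a warmup phase, and computes split-R-hat on the retained draws; all settings here are illustrative:

```python
import numpy as np

def metropolis_chain(n_iter, step=1.0, seed=0):
    """Random-walk Metropolis targeting a standard normal."""
    rng = np.random.default_rng(seed)
    x = rng.normal()
    out = np.empty(n_iter)
    for t in range(n_iter):
        prop = x + step * rng.normal()
        # accept with probability min(1, p(prop) / p(x)) for p = N(0, 1)
        if np.log(rng.uniform()) < 0.5 * (x**2 - prop**2):
            x = prop
        out[t] = x
    return out

n_chains, n_warmup, n_sample = 4, 500, 1000      # the three control parameters
draws = np.stack([metropolis_chain(n_warmup + n_sample, seed=s)[n_warmup:]
                  for s in range(n_chains)])

# split-R-hat: split each chain in half, compare within- and between-chain variance
half = n_sample // 2
halves = draws.reshape(2 * n_chains, half)
W = halves.var(axis=1, ddof=1).mean()            # mean within-chain variance
B = half * halves.mean(axis=1).var(ddof=1)       # between-chain variance
rhat = np.sqrt(((half - 1) / half * W + B / half) / W)
print(round(float(rhat), 3))                     # close to 1.0 when chains agree
```

A value of R-hat far above 1 would suggest lengthening warmup or sampling, which is exactly the trade-off the chapter examines.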

The magnetic gradient scale length explains why certain plasmas require close external magnetic coils

John Kappel, Matt Landreman, D. Malhotra

The separation between the last closed flux surface of a plasma and the external coils that magnetically confine it is a limiting factor in the construction of fusion-capable plasma devices. This plasma-coil separation must be large enough so that components such as a breeding blanket and neutron shielding can fit between the plasma and the coils. Plasma-coil separation affects reactor size, engineering complexity, and particle loss due to field ripple. For some plasmas it can be difficult to produce the desired flux surface shaping with distant coils, and for other plasmas it is infeasible altogether. Here, we seek to understand the underlying physics that limits plasma-coil separation and explain why some configurations require close external coils. In this paper, we explore the hypothesis that the limiting plasma-coil separation is set by the shortest scale length of the magnetic field as expressed by the gradient tensor $\nabla\mathbf{B}$. We tested this hypothesis on a database of 40 stellarator and tokamak configurations. Within this database, the coil-to-plasma distance compared to the minor radius varies by over an order of magnitude. The magnetic scale length is well correlated with the coil-to-plasma distance of actual coil designs generated using the REGCOIL method (Landreman 2017 Nucl. Fusion 57 046003). Additionally, this correlation reveals a general trend that larger plasma-coil separation is possible with a small number of field periods.
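
As a rough numerical sketch (not the paper's definition of the scale length, which should be consulted for the precise normalization), one simple gradient-based scale length is $|\mathbf{B}|/\|\nabla\mathbf{B}\|_F$, estimated here with a finite-difference gradient tensor on a purely illustrative toy field:

```python
import numpy as np

def B(x, y, z, B0=1.0):
    """Toy magnetic field, chosen only so the gradient tensor is simple."""
    return np.array([-y, x, B0])

def grad_B(x, y, z, h=1e-6):
    """Finite-difference gradient tensor g[i, j] = dB_i / dx_j."""
    g = np.zeros((3, 3))
    p = np.array([x, y, z])
    for j, e in enumerate(np.eye(3)):
        g[:, j] = (B(*(p + h * e)) - B(*(p - h * e))) / (2 * h)
    return g

def scale_length(x, y, z):
    # |B| / ||grad B||_F : a crude gradient scale length of the field
    return np.linalg.norm(B(x, y, z)) / np.linalg.norm(grad_B(x, y, z))

print(scale_length(0.5, 0.0, 0.0))
```

For this linear toy field the gradient tensor is constant, so the scale length varies only through the local field magnitude.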

Scaling Laws for Associative Memories

Vivien Cabannes, Elvis Dohmatob, A. Bietti

Learning arguably involves the discovery and memorization of abstract rules. The aim of this paper is to study associative memory mechanisms. Our model is based on high-dimensional matrices consisting of outer products of embeddings, which relates to the inner layers of transformer language models. We derive precise scaling laws with respect to sample size and parameter size, and discuss the statistical efficiency of different estimators, including optimization-based algorithms. We provide extensive numerical experiments to validate and interpret theoretical results, including fine-grained visualizations of the stored memory associations.
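
A minimal sketch of such an outer-product memory (illustrative only; the paper studies far richer estimators and scaling regimes): store n associations as a sum of outer products between one-hot outputs and unit-norm input embeddings, then recall by a matrix-vector product:

```python
import numpy as np

rng = np.random.default_rng(0)
d, n = 256, 40                    # embedding dimension, number of stored pairs

U = rng.normal(size=(n, d))
U /= np.linalg.norm(U, axis=1, keepdims=True)     # unit-norm input embeddings

# memory matrix M = sum_i e_i u_i^T, one outer product per association;
# since e_i is the one-hot output embedding of pair i, row i of M equals u_i
M = U.copy()

# recall from noisy versions of the stored keys
queries = U + 0.3 * rng.normal(size=(n, d)) / np.sqrt(d)
recalled = np.argmax(M @ queries.T, axis=0)
accuracy = float(np.mean(recalled == np.arange(n)))
print(accuracy)
```

With n much smaller than d the cross-talk between random embeddings is weak and recall succeeds; errors appear as n grows relative to d, which is the regime the scaling laws quantify.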

Generalization in diffusion models arises from geometry-adaptive harmonic representations

Zahra Kadkhodaie, Florentin Guth, E. P. Simoncelli, S. Mallat

Deep neural networks (DNNs) trained for image denoising are able to generate high-quality samples with score-based reverse diffusion algorithms. These impressive capabilities seem to imply an escape from the curse of dimensionality, but recent reports of memorization of the training set raise the question of whether these networks are learning the "true" continuous density of the data. Here, we show that two DNNs trained on non-overlapping subsets of a dataset learn nearly the same score function, and thus the same density, when the number of training images is large enough. In this regime of strong generalization, diffusion-generated images are distinct from the training set, and are of high visual quality, suggesting that the inductive biases of the DNNs are well-aligned with the data density. We analyze the learned denoising functions and show that the inductive biases give rise to a shrinkage operation in a basis adapted to the underlying image. Examination of these bases reveals oscillating harmonic structures along contours and in homogeneous regions. We demonstrate that trained denoisers are inductively biased towards these geometry-adaptive harmonic bases since they arise not only when the network is trained on photographic images, but also when it is trained on image classes supported on low-dimensional manifolds for which the harmonic basis is suboptimal. Finally, we show that when trained on regular image classes for which the optimal basis is known to be geometry-adaptive and harmonic, the denoising performance of the networks is near-optimal.
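
The shrinkage picture can be illustrated with a fixed harmonic basis in one dimension, using oracle Wiener-style weights as a stand-in for the adaptive shrinkage the trained networks are shown to perform (the paper's point is that networks learn geometry-adaptive versions of such bases):

```python
import numpy as np

n = 128
# harmonic basis: eigenvectors of the 1-D discrete Laplacian are cosine modes
L = 2 * np.eye(n) - np.eye(n, k=1) - np.eye(n, k=-1)
_, basis = np.linalg.eigh(L)              # columns of `basis`, ordered by frequency

rng = np.random.default_rng(0)
t = np.arange(n)
clean = np.sin(2 * np.pi * t / n) + 0.5 * np.cos(6 * np.pi * t / n)
noisy = clean + 0.3 * rng.normal(size=n)

# shrinkage denoiser: attenuate each basis coefficient by a factor in [0, 1];
# oracle per-coefficient signal power is used here purely for illustration
coeffs = basis.T @ noisy
power = (basis.T @ clean) ** 2
denoised = basis @ (coeffs * power / (power + 0.3 ** 2))

mse_noisy = float(np.mean((noisy - clean) ** 2))
mse_denoised = float(np.mean((denoised - clean) ** 2))
print(mse_noisy, mse_denoised)            # shrinkage lowers the error
```

Because the signal concentrates on a few harmonic modes, shrinking the remaining coefficients removes most of the noise, which is the mechanism the paper identifies in trained denoisers.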

Directional Smoothness and Gradient Methods: Convergence and Adaptivity

Aaron Mishkin, Ahmed Khaled, Yuanhao Wang, Aaron Defazio, R. M. Gower

We develop new sub-optimality bounds for gradient descent (GD) that depend on the conditioning of the objective along the path of optimization, rather than on global, worst-case constants. Key to our proofs is directional smoothness, a measure of gradient variation that we use to develop upper-bounds on the objective. Minimizing these upper-bounds requires solving implicit equations to obtain a sequence of strongly adapted step-sizes; we show that these equations are straightforward to solve for convex quadratics and lead to new guarantees for two classical step-sizes. For general functions, we prove that the Polyak step-size and normalized GD obtain fast, path-dependent rates despite using no knowledge of the directional smoothness. Experiments on logistic regression show our convergence guarantees are tighter than the classical theory based on $L$-smoothness.
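
A minimal sketch of the Polyak step-size on a convex quadratic, where the optimal value f* = 0 is known: the step adapts to the local geometry without any global smoothness constant:

```python
import numpy as np

rng = np.random.default_rng(0)
A = rng.normal(size=(20, 10))
H = A.T @ A / 20                          # PSD Hessian with anisotropic curvature

f = lambda x: 0.5 * x @ H @ x             # minimizer x = 0, optimal value f* = 0
grad = lambda x: H @ x

x = rng.normal(size=10)
f_star = 0.0
for _ in range(5000):
    g = grad(x)
    gnorm2 = g @ g
    if gnorm2 < 1e-30:
        break
    x = x - (f(x) - f_star) / gnorm2 * g  # Polyak step-size
print(f(x))
```

The same loop with a fixed step 1/L would need the global smoothness constant L up front; the Polyak step instead uses only the current sub-optimality gap and gradient norm.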

A new provably stable weighted state redistribution algorithm

We propose a practical finite volume method on cut cells using state redistribution. Our algorithm is provably monotone, total variation diminishing, and GKS (Gustafsson, Kreiss, Sundström) stable in many situations, and shuts off continuously as the cut cell size approaches a target value. Our analysis reveals why original state redistribution works so well: it results in a monotone scheme for most configurations, though at times subject to a slightly smaller CFL condition. Our analysis also explains why a premerging step is beneficial. We show computational experiments in two and three dimensions.

Uniform approximation of common Gaussian process kernels using equispaced Fourier grids

A. Barnett, Philip Greengard, M. Rachh

The high efficiency of a recently proposed method for computing with Gaussian processes relies on expanding a (translationally invariant) covariance kernel into complex exponentials, with frequencies lying on a Cartesian equispaced grid. Here we provide rigorous error bounds for this approximation for two popular kernels—Matérn and squared exponential—in terms of the grid spacing and size. The kernel error bounds are uniform over a hypercube centered at the origin. Our tools include a split into aliasing and truncation errors, and bounds on sums of Gaussians or modified Bessel functions over various lattices. For the Matérn case, motivated by numerical study, we conjecture a stronger Frobenius-norm bound on the covariance matrix error for randomly-distributed data points. Lastly, we prove bounds on, and study numerically, the ill-conditioning of the linear systems arising in such regression problems.
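
The flavor of the approximation can be sketched in one dimension for the squared-exponential kernel: sample the kernel's Fourier transform on an equispaced frequency grid and sum the resulting exponentials (the grid parameters below are illustrative choices, not the paper's bounds):

```python
import numpy as np

ell = 0.2                       # kernel length scale
h, M = 0.5, 30                  # frequency grid spacing and half-width
xi = h * np.arange(-M, M + 1)   # equispaced Cartesian frequency grid

# Fourier transform of k(r) = exp(-r^2 / (2 ell^2))
khat = ell * np.sqrt(2 * np.pi) * np.exp(-2 * np.pi**2 * ell**2 * xi**2)

def k_approx(r):
    # trapezoid-rule quadrature of the inverse Fourier integral on the grid;
    # the cosine suffices because khat is even in xi
    return float((h * khat * np.cos(2 * np.pi * xi * r)).sum())

r = np.linspace(0, 1, 201)
err = max(abs(k_approx(t) - np.exp(-t**2 / (2 * ell**2))) for t in r)
print(err)
```

The two error sources the paper bounds are visible here: the grid spacing h controls aliasing (periodization of the kernel), while the grid size M controls truncation of the Fourier tail.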

Explainable Equivariant Neural Networks for Particle Physics: PELICAN

A. Bogatskii, Timothy Hoffman, David W. Miller, Jan T. Offermann, Xiaoyang Liu

PELICAN is a novel permutation equivariant and Lorentz invariant or covariant aggregator network designed to overcome common limitations found in architectures applied to particle physics problems. Compared to many approaches that use non-specialized architectures that neglect underlying physics principles and require very large numbers of parameters, PELICAN employs a fundamentally symmetry group-based architecture that demonstrates benefits in terms of reduced complexity, increased interpretability, and raw performance. We present a comprehensive study of the PELICAN algorithm architecture in the context of both tagging (classification) and reconstructing (regression) Lorentz-boosted top quarks, including the difficult task of specifically identifying and measuring the $W$-boson inside the dense environment of the Lorentz-boosted top-quark hadronic final state. We also extend the application of PELICAN to the tasks of identifying quark-initiated vs. gluon-initiated jets, and a multi-class identification across five separate target categories of jets. When tested on the standard task of Lorentz-boosted top-quark tagging, PELICAN outperforms existing competitors with much lower model complexity and high sample efficiency. On the less common and more complex task of 4-momentum regression, PELICAN also outperforms hand-crafted, non-machine learning algorithms. We discuss the implications of symmetry-restricted architectures for the wider field of machine learning for physics.
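
The permutation-equivariance constraint can be illustrated with a toy linear layer acting on an N x N array of pairwise features (only a small subset of the full basis of equivariant maps; PELICAN's actual layers are richer and also encode the Lorentz structure):

```python
import numpy as np

def equivariant_layer(T, w):
    """Linear maps on an N x N array that commute with relabeling the N
    particles, i.e. with simultaneous row/column permutations."""
    n = T.shape[0]
    row = T.sum(axis=1, keepdims=True) / n   # (n, 1): broadcast over columns
    col = T.sum(axis=0, keepdims=True) / n   # (1, n): broadcast over rows
    tot = T.sum() / n**2                     # scalar: broadcast everywhere
    return w[0] * T + w[1] * row + w[2] * col + w[3] * tot

rng = np.random.default_rng(0)
T = rng.normal(size=(5, 5))
w = rng.normal(size=4)
P = np.eye(5)[rng.permutation(5)]            # random permutation matrix

lhs = equivariant_layer(P @ T @ P.T, w)      # permute inputs, then apply layer
rhs = P @ equivariant_layer(T, w) @ P.T      # apply layer, then permute outputs
print(np.allclose(lhs, rhs))                 # equivariance check
```

Restricting the weights to such symmetry-respecting maps is what lets architectures of this kind use far fewer parameters than unconstrained networks.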

Provable convergence guarantees for black-box variational inference

Justin Domke, R. M. Gower, Guillaume Garrigos

Black-box variational inference is widely used in situations where there is no proof that its stochastic optimization succeeds. We suggest this is due to a theoretical gap in existing stochastic optimization proofs—namely the challenge of gradient estimators with unusual noise bounds, and a composite non-smooth objective. For dense Gaussian variational families, we observe that existing gradient estimators based on reparameterization satisfy a quadratic noise bound and give novel convergence guarantees for proximal and projected stochastic gradient descent using this bound. This provides rigorous guarantees that methods similar to those used in practice converge on realistic inference problems.
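
A minimal sketch of the reparameterization gradient estimator with projected SGD, for a one-dimensional Gaussian variational family and a Gaussian target (all settings illustrative; the paper analyzes dense multivariate families and general posteriors):

```python
import numpy as np

rng = np.random.default_rng(0)
mu0, sigma0 = 2.0, 0.7          # target: log p(z) = -(z - mu0)^2 / (2 sigma0^2) + const

def neg_elbo_grad(m, s, n_mc=64):
    """Reparameterization-gradient estimate for q = N(m, s^2)."""
    eps = rng.normal(size=n_mc)
    z = m + s * eps                       # z ~ q via the reparameterization trick
    dlogp = -(z - mu0) / sigma0**2        # d/dz log p(z)
    # ELBO = E_q[log p(z)] + entropy(q); the entropy term contributes 1/s
    g_m = -dlogp.mean()
    g_s = -(dlogp * eps).mean() - 1.0 / s
    return g_m, g_s

m, s, lr = 0.0, 1.0, 0.05
for _ in range(2000):
    g_m, g_s = neg_elbo_grad(m, s)
    m -= lr * g_m
    s = max(s - lr * g_s, 1e-3)           # projection keeps the scale positive
print(m, s)                               # approaches the target mu0, sigma0
```

The projection step is the simplest instance of the constrained updates analyzed in the paper; for this conjugate toy problem the iterates approach the exact posterior parameters.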
