Publications

Inferring biological processes with intrinsic noise from cross-sectional data

Inferring dynamical models from data continues to be a significant challenge in computational biology, especially given the stochastic nature of many biological processes. We explore a common scenario in omics, where statistically independent cross-sectional samples are available at a few time points, and the goal is to infer the underlying diffusion process that generated the data. Existing inference approaches often simplify or ignore noise intrinsic to the system, compromising accuracy for the sake of optimization ease. We circumvent this compromise by inferring the phase-space probability flow that shares the same time-dependent marginal distributions as the underlying stochastic process. Our approach, probability flow inference (PFI), disentangles force from intrinsic stochasticity while retaining the algorithmic ease of ODE inference. Analytically, we prove that for Ornstein-Uhlenbeck processes the regularized PFI formalism yields a unique solution in the limit of well-sampled distributions. In practical applications, we show that PFI enables accurate parameter and force estimation in high-dimensional stochastic reaction networks, and that it allows inference of cell differentiation dynamics with molecular noise, outperforming state-of-the-art approaches.
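
The abstract does not spell out equations, so the following is only a minimal sketch of the probability flow idea for a one-dimensional Ornstein-Uhlenbeck process, not the authors' PFI implementation. It relies on the standard identity that the SDE dx = f(x) dt + sqrt(2D) dW and the ODE dx/dt = f(x) - D d/dx log p_t(x) share the same time-dependent marginals. All parameter values are assumptions chosen for illustration, and the score is taken from the known Gaussian marginals, whereas in practice it would have to be estimated from the cross-sectional samples.

```python
import numpy as np

# Assumed OU process: dx = -theta * x dt + sqrt(2 D) dW, x(0) ~ N(m0, s0^2).
theta, D = 1.0, 0.5
m0, s0 = 2.0, 0.3
T, n_steps = 1.0, 1000
dt = T / n_steps

def mean(t):
    """Analytical mean of the OU marginal at time t."""
    return m0 * np.exp(-theta * t)

def var(t):
    """Analytical variance of the OU marginal at time t."""
    return D / theta + (s0**2 - D / theta) * np.exp(-2 * theta * t)

def flow_velocity(x, t):
    """Probability flow velocity: drift minus D times the score of p_t."""
    score = -(x - mean(t)) / var(t)   # exact score of the Gaussian marginal
    return -theta * x - D * score

# Push initial samples through the deterministic flow with forward Euler.
rng = np.random.default_rng(0)
x = m0 + s0 * rng.standard_normal(20000)
for k in range(n_steps):
    x = x + dt * flow_velocity(x, k * dt)

ode_mean, ode_var = x.mean(), x.var()
print(ode_mean, mean(T))   # sample mean vs. analytical OU mean
print(ode_var, var(T))     # sample variance vs. analytical OU variance
```

Although no noise is injected during the integration, the transported samples should reproduce the mean and variance of the stochastic process up to sampling and discretization error, which is the sense in which the flow separates force from intrinsic noise.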

Self-propulsion, flocking and chiral active phases from particles spinning at intermediate Reynolds numbers

Panyu Chen, S. Weady, M. Shelley, et al.

Vorticity, a measure of the local rate of rotation of a fluid element, is the driver of incompressible flow. In viscous fluids, powering bulk flows requires the continuous injection of vorticity from boundaries to counteract the diffusive effects of viscosity. Here we power a flow from within by suspending approximately cylindrical particles and magnetically driving them to rotate at Reynolds numbers in the intermediate range. We find that a single particle generates a localized three-dimensional region of vorticity around it—which we call a vortlet—that drives a number of remarkable behaviours. Slight asymmetries in the particle shape can deform the vortlet and cause the particle to self-propel. Interactions between vortlets are similarly rich, generating bound dynamical states. When a large number of vortlets interact, they spontaneously form collectively moving flocks. These flocks remain coherent while propelling, splitting and merging. If enough particles are added so as to saturate the flow chamber, a homogeneous three-dimensional active chiral fluid of vortlets is formed, which can be manipulated with gravity or flow chamber boundaries, leading to lively collective dynamics. Our findings demonstrate an inertial regime for synthetic active matter, provide a controlled physical system for the quantitative study of three-dimensional flocking in non-sentient systems and establish a platform for the study of three-dimensional active chiral fluids.

Spatiotemporal dynamics of nucleocytoplasmic transport

A. Rautu, Alexandra Zidovska, M. Shelley

Nucleocytoplasmic transport is essential for cellular function, presenting a canonical example of rapid molecular sorting inside cells. It consists of a coordinated interplay between the import of molecules into the cell nucleus and their export out of it. Here, we investigate the role of spatiotemporal dynamics in nucleocytoplasmic transport and its regulation. We develop a biophysical model that captures the main features of nucleocytoplasmic transport, in particular its regulation through the Ran cycle. Our model yields steady-state profiles for the molecular components of the Ran cycle, their relaxation times, as well as the nuclear-to-cytoplasmic molecule ratio. We show that these quantities are affected by their spatial dynamics and heterogeneity within the nucleus. Specifically, we find that the spatial nonuniformity of Ran guanine exchange factor (RanGEF), particularly its proximity to the nuclear envelope, increases the Ran content in the nucleus. We further show that RanGEF's accumulation near the nuclear envelope results from its intrinsic dynamics as a nuclear cargo, transported by the Ran cycle itself. Overall, our work highlights the critical role of molecular spatial dynamics in cellular processes and proposes new avenues for theoretical and experimental inquiries into nucleocytoplasmic transport.

Active learning of Boltzmann samplers and potential energies with quantum mechanical accuracy

Ana Molina-Taborda, P. Cossio, et al.

Extracting consistent statistics between relevant free energy minima of a molecular system is essential for physics, chemistry, and biology. Molecular dynamics (MD) simulations can aid in this task but are computationally expensive, especially for systems that require quantum accuracy. To overcome this challenge, we developed an approach combining enhanced sampling with deep generative models and active learning of a machine learning potential (MLP). We introduce an adaptive Markov chain Monte Carlo framework that enables the training of one normalizing flow (NF) and one MLP per state, achieving rapid convergence toward the Boltzmann distribution. Leveraging the trained NF and MLP models, we compute thermodynamic observables such as free energy differences and optical spectra. We apply this method to study the isomerization of an ultrasmall silver nanocluster belonging to a set of systems with diverse applications in the fields of medicine and catalysis.

Integral formulation of Klein-Gordon singular waveguides

Guillaume Bal, Jeremy Hoskins, M. Rachh, Solomon Quinn

We consider the analysis of singular waveguides separating insulating phases in two space dimensions. The insulating domains are modeled by a massive Schrödinger equation and the singular waveguide by appropriate jump conditions along the one-dimensional interface separating the insulators. We present an integral formulation of the problem and analyze its mathematical properties. We also implement a fast multipole and sweeping-accelerated iterative algorithm for solving the integral equations, and demonstrate numerically the fast convergence of this method. Several numerical examples of solutions and scattering effects illustrate our theory.

8574 Single cell multiomics suggests tumor cell energy metabolism heterogeneity in non-functioning gonadotroph pituitary adenomas

F M Ruf-Zamojski, X. Chen, O. Troyanskaya, R. Sealfon, et al.

Pituitary adenomas cause hormonal dysregulation and severe morbidity. The absence of clear biomarkers for prognosis/treatment and the high risk of recurrence make management challenging. Non-functioning gonadotroph pituitary adenomas (NFPAs) represent proliferation of gonadotroph lineage cells without an increase in secretion of gonadotropins. Detection is often delayed until mass effects cause visual defects. Higher resolution molecular study may improve classification, diagnosis, and treatment, and provide insight into adenoma biology. We analyzed four NFPAs using unbiased genome-wide same-cell single nucleus 10X Genomics multiome assay (transcriptome and chromatin accessibility) to define individual cell identity/states in comparison with normal gonadotrope cell references from twelve age/sex-matched human post-mortem pituitaries. In total, we generated high quality multiomics single-cell data for 33,832 adenoma cells and 5,451 normal gonadotropes. We identified tumor, immune, vascular, and proliferative cells, confirming cellular heterogeneity. Multidimensional scaling analysis using Manhattan distances between cell types showed that NFPA gonadotropes were closer in gene expression to healthy gonadotropes than to any other pituitary cell type. In addition, we observed differences in the macrophage and endothelial cell populations between tumors and healthy pituitaries. Next, we studied gene regulatory circuitry (regulated gene, modulating transcription factor and its chromatin interaction site) using the new Control of Regulation Extracted from Multiomics Assays (CREMA, [1]) multiome analysis method, which showed high variation between normal and tumor gonadotropes. Within each adenoma, the NFPA tumor cells encompassed two main subgroups of cells, one group (GT+ marker tumor cells) expressing established gonadotrope markers and another group (GT- marker tumor cells) not expressing them. GT+ and GT- tumor cells formed a cluster distinct from healthy gonadotropes. GT+ and GT- tumor cells differed by their level of expression of mitochondrial genes, suggesting decreased aerobic capacity associated with the GT- tumor subgroup. Interestingly, differential analyses between the healthy and the NFPA gonadotropes from same-cell multiome data highlighted dissimilar gene expression and chromatin accessibility patterns. Our newly collected single cell data characterize the difference between NFPA tumor cells and normal gonadotrope cells. Our results suggest a Warburg-effect-like difference in aerobic respiratory capacity among tumor cells within the same pituitary adenoma. Overall, this study brings new insights into the molecular characteristics of NFPAs, including tumor cell heterogeneity, and represents a new resource for the field.

Minimal motifs for habituating systems

M. Smart, S. Shvartsman, Martin Mönnigmann

Habituation—a phenomenon in which a dynamical system exhibits a diminishing response to repeated stimulations that eventually recovers when the stimulus is withheld—is universally observed in living systems from animals to unicellular organisms. Despite its prevalence, generic mechanisms for this fundamental form of learning remain poorly defined. Drawing inspiration from prior work on systems that respond adaptively to step inputs, we study habituation from a nonlinear dynamics perspective. This approach enables us to formalize classical hallmarks of habituation that have been experimentally identified in diverse organisms and stimulus scenarios. We use this framework to investigate distinct dynamical circuits capable of habituation. In particular, we show that driven linear dynamics of a memory variable with static nonlinearities acting at the input and output can implement numerous hallmarks in a mathematically interpretable manner. This work establishes a foundation for understanding the dynamical substrates of this primitive learning behavior and offers a blueprint for the identification of habituating circuits in biological systems.
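
As a rough illustration of the kind of circuit described above, the sketch below simulates a single linear memory variable driven by a pulsed stimulus and read out through a static output nonlinearity. It is a hypothetical toy, not the authors' model: the divisive readout, the identity input map, and every parameter value (`tau`, `K`, pulse timing) are assumptions made for this example.

```python
import numpy as np

dt, tau, K = 0.01, 5.0, 0.2    # hypothetical time step, memory timescale, readout scale
period, width = 100, 20        # pulse period and pulse width, in time steps
n_pulses, rest = 10, 2000      # training pulses, then a stimulus-free gap (steps)

# Stimulus: a train of unit pulses, a long pause, then one test pulse.
u = np.zeros(n_pulses * period + rest + width)
for p in range(n_pulses):
    u[p * period : p * period + width] = 1.0
u[-width:] = 1.0

m = 0.0                        # memory variable
y = np.zeros_like(u)           # response
for k, uk in enumerate(u):
    y[k] = uk / (1.0 + m / K)      # static output nonlinearity: memory suppresses response
    m += dt * (-m / tau + uk)      # linear, driven memory dynamics (forward Euler)

# Peak response to each training pulse, and to the test pulse after the pause.
peaks = [y[p * period : p * period + width].max() for p in range(n_pulses)]
recovered_peak = y[-width:].max()
print(np.round(peaks, 3), round(recovered_peak, 3))
```

In this toy, the peak response shrinks from pulse to pulse while the memory variable accumulates, and it returns close to its initial value once the stimulus has been withheld for several multiples of `tau`, which are the two defining features of habituation named in the abstract.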

Classification-Denoising Networks

Louis Thiry, F. Guth

Image classification and denoising suffer from complementary issues of lack of robustness or partially ignoring conditioning information. We argue that they can be alleviated by unifying both tasks through a model of the joint probability of (noisy) images and class labels. Classification is performed with a forward pass followed by conditioning. Using the Tweedie-Miyasawa formula, we evaluate the denoising function with the score, which can be computed by marginalization and back-propagation. The training objective is then a combination of cross-entropy loss and denoising score matching loss integrated over noise levels. Numerical experiments on CIFAR-10 and ImageNet show competitive classification and denoising performance compared to reference deep convolutional classifiers/denoisers, and significantly improved efficiency compared to previous joint approaches. Our model shows an increased robustness to adversarial perturbations compared to a standard discriminative classifier, and allows for a novel interpretation of adversarial gradients as a difference of denoisers.
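
The Tweedie-Miyasawa formula states that for y = x + sigma * noise with Gaussian noise, the optimal denoiser is E[x | y] = y + sigma^2 * grad log p(y). The sketch below checks this on a one-dimensional, two-class Gaussian toy model, where the joint p(y, c) is known in closed form. The toy model and all its parameters are assumptions for illustration, and the score is obtained by a finite difference rather than by back-propagation through a network as in the paper.

```python
import numpy as np

# Toy joint model (assumed): class c has prior pi[c] and clean signal
# x ~ N(mu[c], tau^2); the observation is y = x + sigma * noise.
pi = np.array([0.3, 0.7])
mu = np.array([-2.0, 1.5])
tau, sigma = 0.5, 0.8
s2 = tau**2 + sigma**2            # variance of y within each class

def log_joint(y):
    """log p(y, c) for each class c; output shape (..., 2)."""
    y = np.asarray(y, dtype=float)[..., None]
    return np.log(pi) - 0.5 * np.log(2 * np.pi * s2) - 0.5 * (y - mu) ** 2 / s2

def log_marginal(y):
    """log p(y), marginalizing the joint over classes (log-sum-exp)."""
    lj = log_joint(y)
    m = lj.max(axis=-1, keepdims=True)
    return (m + np.log(np.exp(lj - m).sum(axis=-1, keepdims=True)))[..., 0]

def class_posterior(y):
    """p(c | y): classification by conditioning the joint on y."""
    return np.exp(log_joint(y) - log_marginal(y)[..., None])

def tweedie_denoiser(y, eps=1e-5):
    """y + sigma^2 * d/dy log p(y), with the score from a central difference."""
    score = (log_marginal(y + eps) - log_marginal(y - eps)) / (2 * eps)
    return y + sigma**2 * score

def exact_denoiser(y):
    """Posterior mean E[x | y] as a class-weighted average of per-class means."""
    y = np.asarray(y, dtype=float)
    per_class = (tau**2 * y[..., None] + sigma**2 * mu) / s2
    return (class_posterior(y) * per_class).sum(axis=-1)

y = np.linspace(-4.0, 4.0, 9)
max_gap = np.max(np.abs(tweedie_denoiser(y) - exact_denoiser(y)))
print(max_gap)   # discrepancy between the two denoisers
```

The same joint model yields both the class posterior and the denoiser here, which mirrors, in a much simpler setting, the unification of the two tasks described in the abstract.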

Compressing multivariate functions with tree tensor networks

J. Tindall, M. Stoudenmire, R. Levy

Tensor networks are a compressed format for multi-dimensional data. One-dimensional tensor networks -- often referred to as tensor trains (TT) or matrix product states (MPS) -- are increasingly being used as a numerical ansatz for continuum functions by "quantizing" the inputs into discrete binary digits. Here we demonstrate the power of more general tree tensor networks for this purpose. We provide direct constructions of a number of elementary functions as generic tree tensor networks and interpolative constructions for more complicated functions via a generalization of the tensor cross interpolation algorithm. For a range of multi-dimensional functions we show how more structured tree tensor networks offer a significantly more efficient ansatz than the commonly used tensor train. We demonstrate an application of our methods to solving multi-dimensional, non-linear Fredholm equations, providing a rigorous bound on the rank of the solution which, in turn, guarantees exponentially scaling accuracy with the size of the tree tensor network for certain problems.
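
To make the "quantizing" step concrete, the sketch below writes x in [0, 1) as n binary digits, x = sum_k b_k 2^(-k), and builds two elementary functions as tensor trains over those digits: exp(x) with bond dimension 1 and f(x) = x with bond dimension 2. This covers only the one-dimensional tensor-train baseline mentioned in the abstract, not the tree constructions or the interpolation algorithm of the paper, and the variable names are chosen for this example.

```python
import numpy as np
from itertools import product

n = 8  # number of binary digits; the grid spacing is 2**-n

def bits_to_x(bits):
    """Map binary digits (b_1, ..., b_n) to x = sum_k b_k 2^(-k)."""
    return sum(b * 2.0 ** -(k + 1) for k, b in enumerate(bits))

# exp(x) = prod_k exp(b_k 2^(-k)): one length-2 vector per digit (bond dimension 1).
exp_cores = [np.array([1.0, np.exp(2.0 ** -(k + 1))]) for k in range(n)]

def tt_exp(bits):
    return float(np.prod([core[b] for core, b in zip(exp_cores, bits)]))

# f(x) = x: one 2x2 matrix per digit value (bond dimension 2). The product of
# [[1, b_k 2^(-k)], [0, 1]] over k accumulates the sum in its top-right entry.
lin_cores = [
    np.array([np.eye(2), [[1.0, 2.0 ** -(k + 1)], [0.0, 1.0]]]) for k in range(n)
]

def tt_linear(bits):
    m = np.eye(2)
    for core, b in zip(lin_cores, bits):
        m = m @ core[b]
    return float(m[0, 1])

# Compare against direct evaluation on every one of the 2**n grid points.
all_bits = list(product((0, 1), repeat=n))
err_exp = max(abs(tt_exp(b) - np.exp(bits_to_x(b))) for b in all_bits)
err_lin = max(abs(tt_linear(b) - bits_to_x(b)) for b in all_bits)
print(err_exp, err_lin)   # both at the level of floating-point round-off
```

The storage cost is 2n numbers for the exponential and 8n for the linear function, compared with 2^n values for the function tabulated on the full grid.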

New Statistical Metric for Robust Target Detection in Cryo-EM Using 2DTM

Kexin Zhang, P. Cossio, A. Rangan, et al.

2D template matching (2DTM) can be used to detect molecules and their assemblies in cellular cryo-EM images with high positional and orientational accuracy. While 2DTM successfully detects spherical targets such as large ribosomal subunits, challenges remain in detecting smaller and more aspherical targets in various environments. In this work, a novel 2DTM metric, referred to as the 2DTM p-value, is developed to extend the 2DTM framework to more complex applications. The 2DTM p-value combines information from two previously used 2DTM metrics, namely the 2DTM signal-to-noise ratio (SNR) and z-score, which are derived from the cross-correlation coefficient between the target and the template. The 2DTM p-value demonstrates robust detection accuracies under various imaging and sample conditions and outperforms the 2DTM SNR and z-score alone. Specifically, the 2DTM p-value improves the detection of aspherical targets such as a modified artificial tubulin patch particle (500 kDa) and a much smaller clathrin monomer (193 kDa) in simulated data. It also accurately recovers mature 60S ribosomes in yeast lamellae samples, even under conditions of increased Gaussian noise. The new metric will enable the detection of a wider variety of targets in both purified and cellular samples through 2DTM.
