Publications
Heavy-Tailed Class Imbalance and Why Adam Outperforms Gradient Descent on Language Models
Frederik Kunstner, Robin Yadav, Alan Milligan, Mark Schmidt, A. Bietti
Good Rates From Bad Coordinates: The Exponential Average Time-dependent Rate Approach
Nicodemo Mazzaferro, Subarna Sasmal, P. Cossio, Glen M. Hocky
AstroCLIP: a cross-modal foundation model for galaxies
Liam Parker, Francois Lanusse, Siavash Golkar, Leopoldo Sarra, Miles Cranmer, A. Bietti, Michael Eickenberg, Geraud Krawezik, Michael McCabe, R. Morel, R. Ohana, B. Régaldo-Saint Blancard, et al.
High-order and adaptive optical conductivity calculations using Wannier interpolation
Lorenzo Van Muñoz, J. Kaye, A. Barnett, Sophie Beck
How Truncating Weights Improves Reasoning in Language Models
Lei Chen, Joan Bruna, A. Bietti
Contextual Counting: A Mechanistic Study of Transformers on a Quantitative Task
Siavash Golkar, A. Bietti, Mariel Pettee, Michael Eickenberg, et al.
Neurosift: DANDI exploration and NWB visualization in the browser
J. Magland, J. Soules, Cody Baker, Benjamin Dichter