Machine Learning at the Flatiron Institute Seminar: Clément Hongler
Title: Arrows of Time for Large Language Models
Abstract: Large Language Models famously predict the next token in a text. What happens if we teach them to predict the previous token instead? It turns out that some subtle differences emerge. I will discuss some empirical and theoretical results about this, and also some (hopefully exciting) consequences and perspectives suggested by our results.