#ml

Jelassi S, Brandfonbrener D, Kakade SM, Malach E. Repeat after me: Transformers are better than state space models at copying. arXiv [cs.LG]. 2024. Available: http://arxiv.org/abs/2402.01032

Not surprising at all: attention gives a transformer direct access to the whole long context, while a state space model has to squeeze it into a fixed-size state. But hey, look at this title.
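
A rough counting sketch of the intuition, not the paper's actual proof: a model that keeps only a fixed-size state of `state_bits` bits can tell apart at most `2**state_bits` inputs, so it cannot exactly copy every string once there are more possible strings than states. The function name and parameters below are made up for illustration.

```python
def max_copyable_length(state_bits: int, vocab_size: int) -> int:
    """Largest n such that all vocab_size**n strings of length n
    can be mapped to distinct states of a state_bits-bit memory.

    Hypothetical helper: a pigeonhole bound, not a model of any real SSM.
    """
    if state_bits < 0 or vocab_size < 2:
        raise ValueError("need state_bits >= 0 and vocab_size >= 2")
    n = 0
    # Integer arithmetic only, to avoid floating-point log rounding.
    while vocab_size ** (n + 1) <= 2 ** state_bits:
        n += 1
    return n


# A 16-bit state over a 4-symbol vocabulary: 4**8 == 2**16, so n = 8.
print(max_copyable_length(16, 4))
```

A model that can look back at the context has no such ceiling from this argument, since the string to copy is still sitting there in the input.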
 
 