Attention is all you get


Ginsparg, P. (2019). Attention is all you get. Perimeter Institute. https://pirsa.org/19070013


Ginsparg, Paul. Attention is all you get. Perimeter Institute, Jul. 11, 2019, https://pirsa.org/19070013


          @misc{pirsa_PIRSA:19070013,
            doi = {10.48660/19070013},
            url = {https://pirsa.org/19070013},
            author = {Ginsparg, Paul},
            keywords = {Condensed Matter},
            language = {en},
            title = {Attention is all you get},
            publisher = {Perimeter Institute},
            year = {2019},
            month = {jul},
            note = {PIRSA:19070013, see \url{https://pirsa.org}}
          }

Paul Ginsparg Cornell University


For the past decade, there has been a new major architectural fad in deep learning every year or two. One such fad for the past two years has been the transformer model, an implementation of the attention mechanism, which has superseded RNNs in most sequence-learning applications. I'll give an overview of the model, with some discussion of non-physics applications, and suggest some possibilities for physics.
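The core operation of the transformer mentioned in the abstract is scaled dot-product attention. A minimal NumPy sketch (the array shapes, variable names, and toy data here are illustrative assumptions, not taken from the talk):

```python
import numpy as np

def softmax(x, axis=-1):
    # Numerically stable softmax along the given axis.
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def scaled_dot_product_attention(Q, K, V):
    # softmax(Q K^T / sqrt(d_k)) V: each query forms a weighted
    # average of the values, with weights set by query-key similarity.
    d_k = Q.shape[-1]
    scores = Q @ K.T / np.sqrt(d_k)      # (n_queries, n_keys)
    weights = softmax(scores, axis=-1)   # each row sums to 1
    return weights @ V                   # (n_queries, d_v)

# Toy example: 3 queries attending over 4 key/value pairs.
rng = np.random.default_rng(0)
Q = rng.standard_normal((3, 8))
K = rng.standard_normal((4, 8))
V = rng.standard_normal((4, 8))
out = scaled_dot_product_attention(Q, K, V)
print(out.shape)  # (3, 8)
```

Unlike an RNN, nothing here is sequential: every query attends to every key in one matrix product, which is what lets transformers parallelize over sequence positions.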