Min-Seo Kim works at the Network Science Lab at the Catholic University of Korea. The document reviews prior work on recurrent neural networks (RNNs), long short-term memory networks (LSTMs), and gated recurrent units (GRUs) for processing sequential data. It then introduces the Transformer, an architecture that replaces recurrent layers with self-attention, and applies it to machine translation, where it outperforms earlier models. Experiments show the Transformer achieves higher translation quality (BLEU) than competing architectures on an English-to-German translation task, and it also performs well on English constituency parsing despite not being specifically tuned for that task.
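For context on the self-attention mechanism the summary refers to, the core operation is scaled dot-product attention over query, key, and value projections of the same input sequence. The sketch below is a minimal single-head NumPy illustration, not code from the document itself; the array shapes, function name, and toy sizes are assumptions made for brevity.

```python
import numpy as np

def scaled_dot_product_self_attention(x, w_q, w_k, w_v):
    """Single-head self-attention over a sequence x of shape (seq_len, d_model).

    w_q, w_k, w_v project the input to queries, keys, and values; the output
    is a weighted sum of values, weighted by softmax(Q K^T / sqrt(d_k)).
    """
    q, k, v = x @ w_q, x @ w_k, x @ w_v              # each (seq_len, d_k)
    d_k = q.shape[-1]
    scores = q @ k.T / np.sqrt(d_k)                  # pairwise position similarities
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights /= weights.sum(axis=-1, keepdims=True)   # row-wise softmax
    return weights @ v                               # (seq_len, d_k)

# Toy usage: 5 tokens, model width 8, head width 4 (illustrative sizes only).
rng = np.random.default_rng(0)
x = rng.normal(size=(5, 8))
w_q, w_k, w_v = (rng.normal(size=(8, 4)) for _ in range(3))
out = scaled_dot_product_self_attention(x, w_q, w_k, w_v)
print(out.shape)  # (5, 4)
```

Unlike a recurrent layer, every output position here attends to every input position in a single step, which is what allows the Transformer to drop recurrence entirely.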