DEGGENDORF, GERMANY -
Transformer models at a glance
Transformers are state-of-the-art neural network models for natural language processing (NLP). They apply the attention mechanism,[1] which has proven particularly good at learning correlations between tokens (parts of words) in text. The transformer architecture built around this mechanism was first published in 2017, and it has since been refined in many subsequent publications and applications. In mid-2020, OpenAI published the Generative Pre-trained Transformer 3 (GPT-3).
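To make the idea of learning correlations between tokens more concrete, here is a minimal sketch of scaled dot-product attention, the form of attention commonly used in transformers. It is an illustration under simplifying assumptions, not the implementation of GPT-3 or any other model named here: the three token vectors are made up, and the same vectors serve as queries, keys and values, whereas a real transformer derives each of these through learned projections.

```python
import math


def softmax(scores):
    """Turn a list of raw scores into weights that sum to 1."""
    peak = max(scores)  # subtract the maximum for numerical stability
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attention(queries, keys, values):
    """Scaled dot-product attention over lists of equal-length vectors."""
    dim = len(keys[0])
    outputs, all_weights = [], []
    for q in queries:
        # Similarity between this token's query and every token's key.
        scores = [
            sum(a * b for a, b in zip(q, k)) / math.sqrt(dim) for k in keys
        ]
        weights = softmax(scores)
        # Each output is a weighted average of the value vectors.
        mixed = [
            sum(w * v[i] for w, v in zip(weights, values))
            for i in range(len(values[0]))
        ]
        outputs.append(mixed)
        all_weights.append(weights)
    return outputs, all_weights


# Three hypothetical 2-dimensional token vectors, invented for illustration.
tokens = [[1.0, 0.0], [0.9, 0.1], [0.0, 1.0]]
outputs, weights = attention(tokens, tokens, tokens)
```

In this toy setup, the first token places more weight on the second token, whose vector is similar to its own, than on the third, whose vector is not. That weighting is what allows the model to relate tokens to one another.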
The term ‘generative pre-trained transformer’ may be unfamiliar, so here is a brief explanation: a GPT is a transformer that generates content, mostly text, and that was pre-trained on a large amount of text data. GPT-3 is a powerful transformer-based NLP model that supports a range of tasks, including text classification, question answering and text generation. It is available in different sizes, the largest of which has around 175 billion parameters.
During the last two and a half years, several competitors have been released.[2] One of these is Wu Dao 2.0, which has 1.75 trillion parameters and has shown extremely promising results.[3] The attention mechanism has also been applied to other kinds of data, such as images. Image models that build on it, such as DALL·E 2 or Stable Diffusion, have likewise shown promise and have further advanced the state of the art.
Ever since the release of GPT-3, rumors have been swirling about the release of its successor, GPT-4. In the meantime, ChatGPT was released in late 2022. It is a model based on the improved GPT-3.5 and has also res