Like its predecessor, GPT-2, it is a decoder-only[2] transformer model, a deep neural network that replaces recurrence- and convolution-based architectures with a technique known as "attention".
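In essence, attention lets each token in a sequence weigh every earlier token when computing its own representation. The sketch below illustrates single-head scaled dot-product self-attention with a causal mask; it is a schematic NumPy illustration of the general mechanism rather than GPT-3's actual implementation, and the dimensions and weights are arbitrary.

```python
import numpy as np

def causal_self_attention(x, W_q, W_k, W_v):
    """Single-head scaled dot-product self-attention with a causal mask,
    the core operation of decoder-only transformers.
    x: (seq_len, d_model) token representations.
    W_q, W_k, W_v: (d_model, d_head) projection matrices."""
    Q, K, V = x @ W_q, x @ W_k, x @ W_v             # project tokens to queries, keys, values
    scores = Q @ K.T / np.sqrt(K.shape[-1])         # similarity of every token to every other
    future = np.triu(np.ones_like(scores), k=1)     # 1s above the diagonal mark future positions
    scores = np.where(future == 1, -1e9, scores)    # a token may not attend to the future
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights /= weights.sum(axis=-1, keepdims=True)  # softmax over the allowed positions
    return weights @ V                              # each output is a weighted mix of values

# Toy example: 4 tokens, model width 8, head width 4 (all values random).
rng = np.random.default_rng(0)
x = rng.normal(size=(4, 8))
W_q, W_k, W_v = (rng.normal(size=(8, 4)) for _ in range(3))
print(causal_self_attention(x, W_q, W_k, W_v).shape)  # -> (4, 4)
```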
[5] According to The Economist, improved algorithms, more powerful computers, and a recent increase in the amount of digitized material have fueled a revolution in machine learning.
[6] Software models are trained to learn by using thousands or millions of examples in a "structure ... loosely based on the neural architecture of the brain".
[7] There are a number of NLP systems capable of processing, mining, organizing, connecting and contrasting textual input, as well as correctly answering questions.
[8] On June 11, 2018, OpenAI researchers and engineers published a paper introducing the first generative pre-trained transformer (GPT), a type of generative large language model that is pre-trained on an enormous and diverse corpus of text and then given discriminative fine-tuning to focus on a specific task.
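The pre-training-then-fine-tuning recipe can be sketched as follows: a model is first trained to predict the next token on unlabeled text, then adapted to a labelled task by reusing its learned representations. The PyTorch code below is a schematic illustration of that recipe with toy dimensions and placeholder data; it is not the training code from the 2018 paper.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

torch.manual_seed(0)
VOCAB, D = 100, 32      # toy vocabulary size and model width (placeholders only)

class ToyGPT(nn.Module):
    """Tiny stand-in for a decoder-only transformer language model."""
    def __init__(self):
        super().__init__()
        self.embed = nn.Embedding(VOCAB, D)
        layer = nn.TransformerEncoderLayer(d_model=D, nhead=4, batch_first=True)
        self.blocks = nn.TransformerEncoder(layer, num_layers=2)
        self.lm_head = nn.Linear(D, VOCAB)

    def forward(self, tokens, return_hidden=False):
        seq_len = tokens.size(1)
        causal = torch.triu(torch.full((seq_len, seq_len), float("-inf")), diagonal=1)
        h = self.blocks(self.embed(tokens), mask=causal)   # causal self-attention
        return h if return_hidden else self.lm_head(h)

model = ToyGPT()
opt = torch.optim.Adam(model.parameters(), lr=1e-3)

# 1) Generative pre-training: predict each next token from its prefix.
tokens = torch.randint(0, VOCAB, (8, 16))        # a batch of unlabeled text, as token ids
logits = model(tokens[:, :-1])
lm_loss = F.cross_entropy(logits.reshape(-1, VOCAB), tokens[:, 1:].reshape(-1))
lm_loss.backward(); opt.step(); opt.zero_grad()

# 2) Discriminative fine-tuning: reuse the learned representations for a labelled task.
head = nn.Linear(D, 2)                           # e.g. a two-class sentiment head
labels = torch.randint(0, 2, (8,))
hidden = model(tokens, return_hidden=True)       # (batch, seq_len, D)
task_loss = F.cross_entropy(head(hidden[:, -1]), labels)
task_loss.backward()                             # gradients flow into the head and the model
print(float(lm_loss), float(task_loss))
```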
On May 28, 2020, an arXiv preprint by a group of 31 engineers and researchers at OpenAI described the achievement and development of GPT-3, a third-generation "state-of-the-art language model".
[1][12] The team increased the capacity of GPT-3 by over two orders of magnitude from that of its predecessor, GPT-2,[13] making GPT-3 the largest non-sparse language model to date.
Sixty percent of the weighted pre-training dataset for GPT-3 comes from a filtered version of Common Crawl consisting of 410 billion byte-pair-encoded tokens.
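Corpus sizes such as the 410 billion figure above are counted in byte-pair-encoded (BPE) subword tokens rather than in words or characters. The snippet below illustrates this using the open-source tiktoken library, whose "gpt2" encoding corresponds to the BPE vocabulary introduced with GPT-2 and, reportedly, reused for the original GPT-3 models; the sample sentence is arbitrary.

```python
# Requires: pip install tiktoken
import tiktoken

enc = tiktoken.get_encoding("gpt2")        # BPE vocabulary from GPT-2, also used for GPT-3-era models

text = "GPT-3 was trained on hundreds of billions of byte-pair-encoded tokens."
ids = enc.encode(text)                     # text -> integer token ids
pieces = [enc.decode([i]) for i in ids]    # the substring each token id covers

print(len(ids), "tokens")                  # corpus sizes are reported in these units
print(pieces)
assert enc.decode(ids) == text             # BPE is lossless: decoding recovers the text
```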
[18] According to one user, who had access to a private early release of the OpenAI GPT-3 API, GPT-3 was "eerily good" at writing "amazingly coherent text" with only a few simple prompts.[1]: 34 
In their May 28, 2020 paper, the researchers described in detail the potential "harmful effects of GPT-3",[12] which include "misinformation, spam, phishing, abuse of legal and governmental processes, fraudulent academic essay writing and social engineering pretexting".
[1] In June 2022, Almira Osmanovic Thunström wrote that GPT-3 was the primary author of an article about itself, that they had submitted it for publication,[24] and that it had been pre-published while awaiting completion of its review.
The model was intended to provide developers and users with an advanced natural language processing tool that can effectively retrieve and synthesize online information.
[36] This feature allows users to ask questions or request information with the expectation that the model will deliver updated, accurate, and relevant answers based on the latest online sources available to it.
The agreement permits OpenAI to offer a public-facing API through which users can send text to GPT-3 and receive the model's output, but only Microsoft has access to GPT-3's source code.
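In practice, such an API call consists of sending a text prompt plus a few sampling parameters and receiving the model's continuation. The sketch below uses the GPT-3-era Completions endpoint of the official openai Python client; the engine name, prompt, and parameter values are illustrative only, and later versions of the client expose a different interface.

```python
# Requires: pip install openai  (legacy pre-1.0 client) and an OpenAI API key.
import os
import openai

openai.api_key = os.environ["OPENAI_API_KEY"]    # never hard-code credentials

response = openai.Completion.create(
    engine="davinci",        # one of the GPT-3 model families exposed via the API
    prompt="Summarize the main idea of the transformer architecture in one sentence:",
    max_tokens=64,           # cap the length of the generated continuation
    temperature=0.7,         # >0 allows some sampling variety
)

print(response["choices"][0]["text"].strip())    # the model's generated text
```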
[67] OpenAI's GPT series was built with data from the Common Crawl dataset,[68] a conglomerate of copyrighted articles, internet posts, web pages, and books scraped from 60 million domains over a period of 12 years.
TechCrunch reports that this training data includes copyrighted material from the BBC, The New York Times, Reddit, the full text of online books, and more.
[69] In its response to a 2019 Request for Comments on Intellectual Property Protection for Artificial Intelligence Innovation from the United States Patent and Trademark Office (USPTO), OpenAI argued that "Under current law, training AI systems [such as its GPT models] constitutes fair use," but that "given the lack of case law on point, OpenAI and other AI developers like us face substantial legal uncertainty and compliance costs."