
What GPT-4 Brings to the AI Table


Natural Language Processing

A language model and more

Image from Unsplash

The long-awaited release of the most recent Generative Pre-trained Transformer (GPT) model has finally arrived. The fourth release of OpenAI's GPT model brings improvements over its previous versions, along with extended features. GPT-4, like its predecessors, was trained and fine-tuned on a corpus of text using semi-supervised learning. The semi-supervised training used in GPT models proceeds in two steps: unsupervised generative pre-training and supervised discriminative fine-tuning. These training steps helped to avoid the language-understanding barriers that other language models faced due to poorly annotated data.
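The two-step recipe can be sketched in a few lines of PyTorch. Everything below (the tiny GRU backbone, the layer sizes, the random data) is invented purely for illustration and is nothing like OpenAI's actual architecture; the point is only the pattern of a shared backbone with two heads, one for generative pre-training and one for discriminative fine-tuning:

```python
import torch
import torch.nn as nn

VOCAB, DIM, SEQ = 50, 16, 8

class TinyLM(nn.Module):
    """Toy model: one backbone, two heads (pre-train, then fine-tune)."""
    def __init__(self):
        super().__init__()
        self.embed = nn.Embedding(VOCAB, DIM)
        self.backbone = nn.GRU(DIM, DIM, batch_first=True)  # stand-in for transformer blocks
        self.lm_head = nn.Linear(DIM, VOCAB)   # used during generative pre-training
        self.cls_head = nn.Linear(DIM, 2)      # used during discriminative fine-tuning

    def forward(self, tokens, task="lm"):
        h, _ = self.backbone(self.embed(tokens))
        if task == "lm":
            return self.lm_head(h)             # next-token logits at every position
        return self.cls_head(h[:, -1])         # classify from the last hidden state

model = TinyLM()
opt = torch.optim.Adam(model.parameters(), lr=1e-2)

# Step 1: unsupervised generative pre-training — predict token t+1 from tokens <= t.
tokens = torch.randint(0, VOCAB, (4, SEQ))
logits = model(tokens[:, :-1], task="lm")
pretrain_loss = nn.functional.cross_entropy(
    logits.reshape(-1, VOCAB), tokens[:, 1:].reshape(-1))
pretrain_loss.backward(); opt.step(); opt.zero_grad()

# Step 2: supervised discriminative fine-tuning on labelled examples.
labels = torch.randint(0, 2, (4,))
cls_logits = model(tokens, task="cls")
finetune_loss = nn.functional.cross_entropy(cls_logits, labels)
finetune_loss.backward(); opt.step()
```

The key design point carried over from the GPT papers is that the same backbone weights serve both steps; only the small task head changes between pre-training and fine-tuning.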

How GPT-4 got this far

OpenAI released GPT-4 on 14th March 2023, nearly five years after the initial launch of GPT-1. Each new release has brought improvements in the speed, understanding and reasoning of these models. Much of the improvement can be attributed to the amount of data used in the training process, the robustness of the models and new advances in computing hardware. GPT-1 had access to barely 4.5GB of text from BookCorpus during training. The GPT-1 model had a parameter size of 117 million, which was massive compared with other language models existing at the time of its release. GPT-1 outperformed other language models on the tasks it was fine-tuned for: natural language inference, question answering, semantic similarity and classification.

Those who were still uncertain about the potential for a model surpassing GPT-1 were blown away by the numbers GPT-2 posted on its release. Both the parameter count and the amount of training text were roughly ten times those of GPT-1. Scale wasn't the only new addition, either. In contrast to GPT-1, OpenAI removed the need for an additional fine-tuning step for specific tasks. Few-shot learning was used to ensure that GPT-2 could attribute meaning and context to words without needing to encounter them many times.

Just like GPT-2, GPT-3 and subsequent language models don't require additional fine-tuning on specific tasks. The 175-billion-parameter GPT-3 model was trained on 570GB of text from Common Crawl, WebText, English Wikipedia and some book corpora. The language understanding and reasoning of GPT-3 were profound, and further improvements led to the development of ChatGPT, an interactive dialogue API. OpenAI developed ChatGPT to provide a web-based dialogue environment where users could get first-hand experience of the extended GPT-3's capabilities by having the language model converse and respond based on their inputs. A user can ask a question or request detailed information about just about any topic within the training scope of the model. OpenAI also regulated the kind of information its models could provide: extra care was taken with answers to prompts involving crime, weapons, adult content and the like.

Exciting features of GPT-4

Each new release of GPT comes with a set of features that would have seemed impossible previously. ChatGPT impressed users with its level of reasoning and comprehension. Users could get accurate responses to their queries on any topic, as long as the subject matter was part of the text ChatGPT was trained on. There were cases, however, where ChatGPT struggled to respond to queries about events that occurred after the model was trained. This difficulty with novel topics is to be expected, since NLP models regurgitate text and try to map entities within their time and space of appearance to fit the desired context. Therefore, only topics present in the dataset it was trained on can be recalled, and it would be quite ambitious to expect generalization to new topics.

Not only was the reasoning of the GPT-3 model relatively limited, but the model was also unimodal: only sequences of text could be processed. The latest release of GPT improves on its predecessor. Thanks to its higher level of reasoning, GPT-4 can make better estimates of sentence context and build a general understanding based on that context. From the glimpse we have of this new model's capabilities, the other new features are as follows:

  • An increase in the word limit, to 25,000 words compared with the 3,000-word limit on ChatGPT. GPT-4 has an enlarged context window of 8,192 or 32,768 tokens, compared with 4,096 and 2,049 tokens on GPT-3.
  • Improvements in reasoning and understanding. Texts are better understood, and better reasoning is performed on them.
  • GPT-4 is multi-modal. It accepts text inputs as well as images. GPT-4 recognizes and understands an image's contents and can make logical deductions from the image with human-level accuracy.
  • Texts generated by GPT-4 are harder to flag as machine-generated. The texts read as more human-like and make use of features like emojis to feel more personal and instill a little emotion.
  • Lastly, I would like to single out the new dynamic logo that comes with GPT-4. The logo reflects how versatile this model is and the dynamism of its potential use cases. I think it is one of the best identities ever given to a model.
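The context-window numbers above are hard limits: anything beyond the token budget must be dropped or summarized before each request. Here is a minimal sketch of that truncation logic; the one-token-per-word counting and the `fit_to_context` helper are invented for the demo, since real systems count with the model's own subword tokenizer:

```python
def fit_to_context(messages, max_tokens):
    """Keep the most recent messages whose combined length fits the window."""
    kept, used = [], 0
    for msg in reversed(messages):      # walk backwards from the newest message
        n = len(msg.split())            # crude stand-in for a real tokenizer
        if used + n > max_tokens:
            break                       # older messages no longer fit — drop them
        kept.append(msg)
        used += n
    return list(reversed(kept)), used   # restore chronological order

history = [
    "system: you are a helpful assistant",
    "user: summarise the plot of Hamlet",
    "assistant: a prince avenges his father",
    "user: now compare it with Macbeth",
]
# With a 12-"token" budget, only the two newest messages survive.
window, used = fit_to_context(history, max_tokens=12)
```

A larger window, as on GPT-4, simply means more of the conversation survives this cut, which is why long-document and long-dialogue use cases benefit most from the 32,768-token variant.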

Truths and myths

Visual representation of the size of GPT-4

At some point during the wait for the release of GPT-4, this picture circulated on Twitter. The image is a visual representation of the rumoured size of GPT-4, showing a substantial increase in the parameter count of the new model compared with the parameters used in ChatGPT. While the picture this image paints might seem groundbreaking, it may not be entirely true. Even OpenAI's CEO has debunked the rumours about the size of the model. Official documentation of the architecture and the size of the parameters used in training the multi-modal language model has not been released, so we can't really tell whether this model was created by scaling up past models or by some new approach. Some AI experts argue that scaling alone won't deliver the much-needed general intelligence the AI world is striving towards.

OpenAI presented the great strengths of GPT-4 in text generation, but have we bothered to ask how good the generated texts are against some standard exams? GPT-4, while performing quite well in some exams, faltered in exams that required a higher level of reasoning. The technical report released by OpenAI showed that GPT-4 consistently landed in the 54th percentile of the Graduate Record Examination (GRE) Writing for the two versions of GPT-4 that were released¹. This exam is one of many that test the reasoning and writing abilities of a graduate. It can be said that GPT-4's text generation is only about as good as a college graduate's, which isn't bad for a "computer". We may also say that this language model doesn't like math, or rather, it doesn't do well in calculus: it performed in the 43rd–59th percentile on the AP Calculus BC exam, which is quite low compared with the high percentile scores on the Biology, History, English, Chemistry, Psychology and Statistics exams from the same board. The model falters as the level of difficulty increases. Humans still occupy the highest echelon of thinking for the moment.

Ever cared to wonder how well these language models perform at coding? GPT-4's coding abilities were checked on some LeetCode tasks. Its performance on the easy tasks was quite good, but there is a steady decline in performance as the difficulty of the tasks increases. It is also worth noting that the overall rating of GPT-4 on LeetCode tasks is almost identical to that of GPT-3. OpenAI definitely didn't do better this time, or possibly they weren't trying to turn GPT models into the next GitHub Copilot. Still, imagine a computer performing better than an average programmer on interview coding questions. Crazy!
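A LeetCode-style check of generated code boils down to running the candidate solution against test cases and counting passes. Here is a toy harness under that assumption; the `two_sum` candidate below is a stand-in for model output, not an actual GPT-4 completion:

```python
# Model-generated source arrives as a string; we compile and score it.
candidate_src = """
def two_sum(nums, target):
    seen = {}
    for i, x in enumerate(nums):
        if target - x in seen:
            return [seen[target - x], i]
        seen[x] = i
"""

# Hidden test cases: (arguments, expected answer).
tests = [
    (([2, 7, 11, 15], 9), [0, 1]),
    (([3, 2, 4], 6), [1, 2]),
    (([3, 3], 6), [0, 1]),
]

namespace = {}
exec(candidate_src, namespace)      # "load" the generated function
solve = namespace["two_sum"]

passed = sum(solve(*args) == expected for args, expected in tests)
score = passed / len(tests)         # fraction of hidden tests passed
```

The reported decline with difficulty then falls out naturally: harder problems have trickier hidden cases, so a model that pattern-matches common solutions passes fewer of them.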

While some features didn't see many improvements compared with the predecessor model, it's worth noting how well the model performs on other tasks.


This fourth release of GPT has shown that there is no limit to the scope of language models, since these models are now multi-modal and can accept inputs other than text. This could be seen as a harbinger of more advanced features in versions to come. With the image-understanding capabilities GPT-4 has shown, we may well get a language model performing as well as, or even better than, computer vision models on image recognition tasks. We are progressively moving towards artificial general intelligence. It's still a long way off, but we clearly have a direction and a sense of where we are heading.

