GPT-4 Has Arrived — Here’s What You Need To Know

Level Up Coding



Image generated by Jacob Ferus

The wait is over: GPT-4 is finally here. With increased context length, more advanced reasoning, and the ability to process visual input, we’re in for a treat.

Let’s dive in.

Access

You can try it out if you have ChatGPT Plus, or join the waitlist for the API. For now, only text input is offered publicly; image input is still in research preview, where OpenAI is collaborating with Be My Eyes, an app that assists blind and low-vision people using tech. Here’s how they use GPT-4:

Increased context size

The context size tells us how much information a GPT model can process and produce, and it was previously limited to 4,097 tokens, or roughly 3,072 words. This meant that if you wanted to process content longer than this, you had to use workarounds such as iterative summarization. In practice, though, these tricks can’t match the performance of processing everything in a single pass, both in terms of results and speed.

The new base GPT-4 model doubles this context limit to 8,192 tokens, or roughly 6,144 words. Better yet, OpenAI is also providing limited access to a model with a context size of 32,768 tokens, or roughly 24,576 words. This is big.
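To make the iterative-summarization workaround concrete, here is a minimal sketch. The token estimate uses the article’s own rough ratio (about 4/3 tokens per word), and `toy_summarize` is a hypothetical stand-in for what would, in practice, be a call to the model itself:

```python
def estimate_tokens(text: str) -> int:
    # Rough heuristic from the article's figures: ~4/3 tokens per word.
    return int(len(text.split()) * 4 / 3)

def chunk_text(text: str, max_tokens: int) -> list[str]:
    """Split text on word boundaries so each chunk fits the token budget."""
    words = text.split()
    words_per_chunk = max(1, int(max_tokens * 3 / 4))
    return [" ".join(words[i:i + words_per_chunk])
            for i in range(0, len(words), words_per_chunk)]

def iterative_summarize(text: str, max_tokens: int, summarize) -> str:
    """Summarize chunk by chunk, repeating until the text fits one window.

    `summarize` must shrink its input, or this loop will not terminate.
    """
    while estimate_tokens(text) > max_tokens:
        chunks = chunk_text(text, max_tokens)
        text = " ".join(summarize(chunk) for chunk in chunks)
    return text

# Toy "summarizer" that keeps the first ten words of each chunk;
# a real pipeline would call the model here instead.
toy_summarize = lambda chunk: " ".join(chunk.split()[:10])

long_text = "word " * 10_000  # far beyond a 4,097-token window
condensed = iterative_summarize(long_text, 4097, toy_summarize)
print(estimate_tokens(condensed) <= 4097)
```

Each round of chunking and re-summarizing loses information, which is exactly why a larger native context window matters: the model sees the original text rather than a summary of summaries.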


Image input

The AI is no longer limited to text input. It can now understand and process images together with text, generating descriptions, categorizations, and other analyses with capabilities comparable to what it does with text alone.

Here’s an example from OpenAI’s developer livestream, where the model processed a photo of a hand-written mockup of an app:

More examples:

As noted, this is still not publicly available. Moreover, while it can take images as input, it is not able to generate images; the output is still text only.

Increased capabilities

GPT-4 exhibits enhanced collaboration and creativity compared to its predecessors, as well as improved reasoning. While GPT-3.5 was impressive at many tasks, it lacked the ability to logically solve some problems that differed from its training data. I wrote an article illustrating some of these weaknesses:

During evaluations, GPT-4 has displayed clear improvements, solving harder problems than GPT-3.5. For example, it passed a simulated bar exam with a score in the top 10% of test takers, whereas GPT-3.5 scored in the bottom 10%.


Not surprisingly, the lessons learned from letting the public test ChatGPT have led to a model that is less prone to “going rogue” and acting outside its predetermined instructions. The current model is better at sticking to its initial instructions and is 82% less likely to respond to requests for disallowed content.

Not only that, GPT-4 scored 40% higher than GPT-3.5 on a set of factual evaluations. Interestingly, GPT-4 itself was used to produce training data in order to improve the model’s safety.

Applications

For most people, everything about GPT-4 is new. But OpenAI has been cooperating with various companies for a while, such as Duolingo, Be My Eyes, and Khan Academy, to make use of GPT-4.

Here are some early examples. Doing taxes:

Analyzing smart contracts:

Constructing easy games in seconds:

Weaknesses

While it is considered safer and less error-prone, the weaknesses of GPT-3.5, such as hallucinations and bias, still exist. Similarly, while it showed great results on the bar exam, it performed poorly in the Codeforces programming contest, reaching a rating of 392 (below the 5th percentile). OpenAI also states that it can be “confidently wrong in its predictions”.

Summary

There has been plenty of speculation about GPT-4, but now we finally have it in front of us. The most impressive feature is the multimodality, which is an important step toward any kind of artificial general intelligence.

The evaluations show it’s better than previous models, but it’s difficult to say by how much in practice. As the model is rolled out to the public, I think we will gradually get a better feel for how capable it is as more use cases surface. Similarly, we will also get an idea of what limitations remain, and whether there are other problems, like those we saw with the early release of Bing.


What are your thoughts on this topic?
Let us know in the comments below.

