
Introduction to Generative AI


Gen-AI, or generative AI, is the new hot topic out there. From ChatGPT writing your assignments to DALL-E creating art, its complexity is growing as rapidly as its use cases. So let’s break down the technology.

We start by understanding Artificial Intelligence (AI).

AI can be understood as a discipline or a field of study, much like physics or sociology. Like other subjects, it has a broad spectrum of topics under its belt. ML, or Machine Learning, is one of the subfields we will be zooming in on. ML is more about statistical math than computer science, but both are equally essential for its ever-growing use cases.

Consider an ML model as a magic box: you put in a carrot (input data), rotate the box once (run the function) and pull out a rabbit (output data). The carrot is like any other carrot on the farm, and the rabbit is nothing magical either; it’s the structure and making of the magic box that needs a closer inspection.

There are many different ML models (or magic boxes), but what’s common across all of them is data. An ML model needs data like a car needs fuel; data is paramount. We categorise these ML models based on the type of data and the way it is fed to the system. Machine Learning can be categorised into a few broad topics: Supervised, Unsupervised, Semi-supervised and Reinforcement learning. To put it in simple terms:

Supervised Learning Models are models that use labelled data.

A teacher shows this to her students (assuming the students know how to count):

f(0)  = 2
f(3) = 5
f(12) = 14
f(48) = 50
f(30) = 32
f(2) = 4

The students know there’s an input inside ‘ f() ’ and an output after ‘ = ’, and that for each input the teacher is adding 2, so they can perform x + 2 based on the pattern they recognised.
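
To make this concrete, here is a minimal sketch of supervised learning in Python, assuming the scikit-learn library (my choice; the article names no tools). The model is given the teacher’s labelled pairs and recovers the hidden rule x + 2.

# A minimal supervised-learning sketch, assuming scikit-learn.
from sklearn.linear_model import LinearRegression

X = [[0], [3], [12], [48], [30], [2]]   # inputs the teacher wrote down
y = [2, 5, 14, 50, 32, 4]               # the labelled outputs

model = LinearRegression()
model.fit(X, y)                 # learn the mapping from the labelled pairs

print(model.predict([[7]]))     # roughly 9.0, i.e. 7 + 2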

Unsupervised Learning Models are models where the data is not labelled as input or output.

A teacher shows this to her students (assuming the students know how to count):

1 < 2
3 > 2
9 < 11
5 > 4
1 > 0

Although the students don’t understand what the symbols mean, they recognise the pattern and put ‘ > ’ when the left number is bigger and ‘ < ’ when the right number is bigger.
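
Here is a similarly minimal sketch of unsupervised learning, again assuming scikit-learn. No labels are given; the model simply groups the numbers by the pattern it finds.

# A minimal unsupervised-learning sketch, assuming scikit-learn.
from sklearn.cluster import KMeans

points = [[1], [2], [3], [98], [99], [100]]       # unlabelled data

kmeans = KMeans(n_clusters=2, n_init=10, random_state=0)
labels = kmeans.fit_predict(points)               # the model groups the points on its own

print(labels)   # e.g. [0 0 0 1 1 1]: small numbers vs. large numbers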

Semi-supervised learning is a mixture of both.

Say the students are studying for a test; they’ll get questions like “if x > 9 and x + y = 10, solve for x”. They’ll use a little bit of supervised learning (addition) and a little bit of unsupervised learning (greater than).
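
A minimal semi-supervised sketch, assuming scikit-learn’s LabelPropagation: a couple of points are labelled, the rest (marked -1) are not, and the model uses both to fill in the missing labels.

# A minimal semi-supervised sketch, assuming scikit-learn.
import numpy as np
from sklearn.semi_supervised import LabelPropagation

X = np.array([[1], [2], [3], [98], [99], [100]])
y = np.array([0, -1, -1, 1, -1, -1])      # -1 means "no label given"

model = LabelPropagation(kernel="knn", n_neighbors=2)
model.fit(X, y)                           # learn from labelled and unlabelled points together

print(model.transduction_)                # labels inferred for every point, e.g. [0 0 0 1 1 1]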

Let’s get an overview of Deep Learning.

Deep Learning is a shade of semi-supervised learning. In fact, it’s a subset of artificial intelligence inspired by the human brain. If you know a little neurology, you’d know that our brain is wired such that several billion neurons, or nodes, form an enormous mesh network, sending and receiving data to understand and communicate better. (To be fair, it’s much more complex than that, but humour me.) Deep Learning works like that too. We can classify these models into the Discriminative kind and the Generative kind.
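
As a rough illustration (assuming PyTorch; the article names no framework), a deep learning model is just stacked layers of these artificial neurons passing signals forward:

import torch
import torch.nn as nn

# Two layers of artificial "neurons": 4 inputs -> 16 hidden -> 2 outputs.
model = nn.Sequential(
    nn.Linear(4, 16),
    nn.ReLU(),
    nn.Linear(16, 2),
)

x = torch.randn(1, 4)    # one example with 4 numeric features
print(model(x))          # the network's raw output scores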

(Image source: https://wp.nyu.edu/)

Discriminative Deep Learning Models

Discriminative models are classification or prediction models. They are typically supervised. They learn the relationship between the features of data points and their labels. The output of such models is generally a number, a category, a probability or a class. For example, a model that can predict whether an image is of a cat or a dog is a discriminative model.
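
A minimal sketch of the discriminative idea, again assuming PyTorch; the class names and layer sizes are illustrative, not from the article. The model turns the features of a data point into a probability for each class:

import torch
import torch.nn as nn

classifier = nn.Sequential(
    nn.Linear(64, 32),
    nn.ReLU(),
    nn.Linear(32, 2),        # two classes: cat and dog
    nn.Softmax(dim=1),       # turn raw scores into probabilities
)

features = torch.randn(1, 64)        # stand-in for the features of one image
probs = classifier(features)
print({"cat": probs[0, 0].item(), "dog": probs[0, 1].item()})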

Generative Deep Learning Models

Generative Models are models that generate new content based on previously fed data. They are typically unsupervised and can generate output like natural language, an image, audio or video. For instance, given enough pictures of cats, the model can generate a picture of a cat.
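
A minimal generative sketch, assuming the Hugging Face transformers library and the small GPT-2 checkpoint (both my choice of example): given a short start, the model continues it with newly generated text.

from transformers import pipeline

generator = pipeline("text-generation", model="gpt2")   # a small pre-trained generative model

result = generator("A cat sat on the", max_new_tokens=20)
print(result[0]["generated_text"])                      # newly generated continuation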

We’ll again zoom in a little more on Generative AI.

Generative AI is a subset of Deep Learning. It combines artificial neural networks with semi-structured data. Gen-AI generates new content based on the dataset provided to it. Generative AI, like any other Machine Learning model, feeds on data. There are 2 kinds of data provided to the model:

  1. Training data — This is consumed for pre-training a generative transformer model, also called a foundation model, which recognises and adapts to patterns and learns “what to generate”. Think of it like dancing; a dancer watches and learns 10 different styles from thousands of videos. They now know the fundamental pattern of each dance form and come up with their own choreography for new songs. The extensive videos used to study dance were the training data.
  2. Prompt data — This is primarily given in the form of prompts, or small texts that supply context and constraints for the new content to be generated. Going back to the dancing example, a prompt would be a person asking the dancer to perform a classical dance for two minutes (see the sketch below).
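
A minimal sketch of the two kinds of data, assuming the Hugging Face transformers library and the google/flan-t5-small checkpoint (my choice of example). The downloaded checkpoint is the product of the training data; the string we pass in is the prompt data, carrying context and constraints.

from transformers import pipeline

# The pre-trained checkpoint embodies the training data: someone already
# "watched the thousands of dance videos" for us.
model = pipeline("text2text-generation", model="google/flan-t5-small")

# The prompt data supplies context and constraints for this one request.
prompt = "Summarise in one sentence: cats are small, furry, domesticated carnivores kept as pets."
print(model(prompt, max_new_tokens=40)[0]["generated_text"])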

Just like us humans, machines too misunderstand patterns and jump to incorrect conclusions. And just like humans, when the machine answers nonsensically, we say that the machine had “hallucinations”. Hallucinations can occur largely due to 4 reasons (a short prompt comparison follows this list):

The training data was too little.
The training data was noisy or dirty.
The prompt data did not have enough context.
The prompt data did not have enough constraints.
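
As a purely illustrative comparison (the facts in the second prompt are invented for the example), here is what adding context and constraints to a prompt looks like; the first version leaves the model free to make things up, the second does not.

# Hypothetical prompts for illustration only.
vague_prompt = "Write an announcement about the launch."

grounded_prompt = (
    "Context: version 2.0 of our note-taking app launched on 1 March.\n"   # context for the model
    "Task: write a two-sentence announcement for the newsletter.\n"        # constraint on length and format
    "Use only the facts given above."                                      # constraint against inventing details
)

print(vague_prompt)
print(grounded_prompt)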

Another thing to note is that the quality of the input, or the prompt, highly influences the quality of the output. Based on the input and expected output, there are multiple kinds of Generative AI models as well. Let’s look into some of these models and their examples.

Text to Text

Generation, classification, summarisation, translation and (re)search are all text-to-text tasks, where both the input and the output are text. ChatGPT and Bard are the prime examples of these model types.
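
A minimal text-to-text sketch, assuming the Hugging Face transformers library and the t5-small checkpoint (my choice of example): text goes in, translated text comes out.

from transformers import pipeline

translator = pipeline("translation_en_to_fr", model="t5-small")

result = translator("Generative AI is the new hot topic.")
print(result[0]["translation_text"])     # text in, French text out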

Text to Image

These models take text as an input and give an image as an output; image generation and image editing are the prime use cases today. There are plenty of controversies around copyright regulations for these images, which are creating panic in both the art and technology worlds at the moment.
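
A minimal text-to-image sketch, assuming the Hugging Face diffusers library and the Stable Diffusion v1.5 checkpoint (my choice of example; it needs a GPU to run in reasonable time):

import torch
from diffusers import StableDiffusionPipeline

pipe = StableDiffusionPipeline.from_pretrained(
    "runwayml/stable-diffusion-v1-5", torch_dtype=torch.float16
)
pipe = pipe.to("cuda")                   # assumes a CUDA GPU is available

image = pipe("a watercolour painting of a cat").images[0]   # text in, image out
image.save("cat.png")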

Text to Video / 3D

Video generation and editing have been a pain for many creators; now, with text-to-video and text-to-3D gen-AI models, their lives will be a little simpler. At the same time, game developers can quickly create game assets and non-playable characters using just text prompts. Even 3D modelling and rendering animations have become easier with Gen-AI.

Text to Task

Text-to-task has been in the industry for a while in the form of virtual assistants, automation and software agents. What’s new with Gen-AI is that the tasks need not be custom-made and saved on these assistants; rather, the system should be smart enough to adapt and complete the tasks on its own.

Generative AI is the next big thing in the world. It might just burst like the .com bubble, or it might bring change like the Industrial Revolution. While we wait for either to happen, we can learn more about the What, How and Why of what’s coming next.
