The Tower of Mind, towards a better ChatGPT
Thinking Fast and Slow
The ocean beneath
I swear I'm in charge of this mess
The all-important world model
System 2, the slow and steady snail
System 1, the fast and always-ready cheetah
The snail becomes a cheetah
The memory lane
Looking ahead

How new architectural paradigms can fix the limitations of systems like ChatGPT

The Tower of Mind | Infographic by Javier ideami | ideami.com

ChatGPT has taken the world by storm. And yet, its limitations are many. It often hallucinates and presents outright lies as if they were facts.

In a way, it's like our human intuition: fast, powerful and confident, sometimes too confident for its own good!

Recently, Yann LeCun, one of the godfathers of AI, published a captivating paper titled "A Path Towards Autonomous Machine Intelligence". In this theoretical paper, he outlines a proposal for a new kind of multimodular architecture that could fix many of the limitations of systems like ChatGPT and move us closer to AGI.

I have combined parts of Yann's paper (which is rather long at 60 pages) with one of my visual metaphors in order to produce the Tower of Mind infographic, which simplifies and visually expresses some key parts of his proposal.

Throughout the infographic, we will review key elements of how our human mind works, and at the same time we'll connect those elements with this potential future AI architecture proposed by Yann LeCun. We'll see that there are many parallels and connections to be made between Yann's proposal and the way the brain works. Let's begin!

On the right side of the infographic we establish a key objective of this new architectural paradigm: to implement two different ways of processing information, two ways or modes that match the way our brain works, the modes that Nobel laureate Daniel Kahneman explains so well in his famous book "Thinking, Fast and Slow": System 1 and System 2.

The Tower of Mind | Infographic by Javier ideami | ideami.com

There's an evolutionary reason for these two ways of processing information to exist in our brain, and perhaps in most intelligent systems.

Whenever we encounter a new scenario (for which we haven't previously learned a response pattern), we need to find the sequence of steps, the algorithm, to solve that scenario. For this, we employ what Kahneman calls System-2. We slowly search for that new algorithm, paying close attention in a systematic and conscious way as we slowly reason our way to a new response pattern.

The problem with our System-2 is that it's slow and expensive. It requires a lot of cognitive effort and fuel (the glucose that powers our brain). We cannot use this mode of thinking all day. Therefore, once we learn a new response pattern, we proceed to automate it. We make it subconscious. We transfer it from System-2 to System-1.

Our System-1 is fast and much cheaper. It is also subconscious, and powers our intuition and the associative machinery that has a lot to do with what we call "creativity". It quickly connects perceptions with actions, or goals with response patterns.

Metaphorically speaking, our subconscious is like a kitchen pot that mixes and recombines the information we absorb (which we also compress and abstract).

Image AI generated by Javier ideami | ideami.com

Within the subconscious realm, System-1 processes quickly map inputs to outputs, bypassing the slow System-2.

When a tiger is about to eat us, we have no time to consciously reflect on what to do. We must map perception to action immediately.

While System-2 is usually either off or weakly active (monitoring the impressions sent by System-1), System-1 is always on and active.

Our subconscious intuition is constantly trying to fit perception to response patterns. And when it doesn't find a perfect fit, it gives us an approximation. But it always presents its conclusions as if they were facts.

It's then up to our System-2 to accept or override our System-1 impressions. The way a typical System-1 mode works is one of the sources behind the hallucinations and lies produced by systems like ChatGPT.

It's time to jump into the tower of mind. Let's first take a quick look at the interface between the world and our perception at the bottom of the tower.

The Tower of Mind | Infographic by Javier ideami | ideami.com

The ocean beneath the tower represents all the complexity around us. In contact with that ocean of complexity, our senses perceive and absorb information. However, such an amount of complexity cannot possibly fit into our brains. Therefore, we must compress it.

Within Yann's proposal, an encoder module compresses our perception and gradually abstracts it, discarding much of the detail and preserving its essence.

Something similar happens in our brains. For example, in our visual cortex, the information passes through different layers that gradually extract more refined abstractions of our perception.
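As a toy illustration of this idea of gradual abstraction, we can sketch an "encoder" whose stages each discard detail while keeping coarse structure. This is only a metaphor in code: LeCun's actual encoder would be a trained neural network, not the hand-written pooling used here.

```python
# Toy sketch of hierarchical abstraction: each stage halves the
# representation by average-pooling adjacent values, discarding
# detail while preserving the coarse shape of the signal.

def pool(x):
    """One compression step: average each adjacent pair of values."""
    return [(a + b) / 2 for a, b in zip(x[::2], x[1::2])]

def encode(signal, levels):
    """Return the representation at each abstraction level."""
    reps = [signal]
    for _ in range(levels):
        reps.append(pool(reps[-1]))
    return reps

reps = encode([1.0, 3.0, 2.0, 4.0, 5.0, 7.0, 6.0, 8.0], levels=3)
print([len(r) for r in reps])  # [8, 4, 2, 1]: detail -> essence
```

Each level is smaller than the one below it, mirroring how higher layers of the visual cortex respond to increasingly abstract features.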

It's now time to go to the very top of the tower and begin our gradual descent.

The Tower of Mind | Infographic by Javier ideami | ideami.com

Consciousness isn't necessarily a prerequisite for an advanced form of artificial intelligence. We don't really know whether future AI systems will or won't be conscious. In addition, we don't even understand what consciousness is or how it works.

In any case, in the infographic we draw a parallel between human consciousness and something similar that may exist in the AIs of the future. A horizontal line separates conscious and subconscious processes.

The comical texts at the top represent the degree of control that we typically feel we have over what goes on down below, in the rest of the tower of the mind.

However, far more of our lives than we might imagine is run by our System-1 processes.

Experts state that we make around 30,000 decisions a day, and that we are only conscious of 0.26% of them. Most of what we do is System-1.

Most of the time we run in autopilot or semi-autopilot mode. That is why, even though what AI has mastered today is mainly limited to System-1 capabilities, it is still able to impact almost every aspect of our lives. Because, in fact, most of what we do relies on that way of processing information.

Still, System-2, while used selectively and carefully, is an essential part of our thinking. It's what allows us to proactively find new algorithms to solve new scenarios, to analyze and reason in systematic ways, to monitor and, if necessary, override the impressions sent by System-1, etc.

Does ChatGPT have System-2 capabilities? The AI community is quite divided on this question, but most experts state that although it may have some kind of rudimentary or basic model of the world internally, ChatGPT is still very far from having anything comparable to our powerful System-2 capabilities.

What makes the situation confusing is that ChatGPT uses human language, and our language is one of the foundational pillars of our System-2. The combination of the use of language and its massive System-1 pattern-matching capabilities (far larger than ours) allows ChatGPT to sometimes very convincingly imitate our process of reasoning.

Also, having been trained on data that includes many reasoning processes, ChatGPT is able to sort of output reasoning steps if we either encourage it or help it with techniques such as chain-of-thought prompting.

At the same time, it is quite easy to catch ChatGPT making the kind of mistakes that reveal the absence of a sophisticated world model.

But, at a simple level, what is a world model, and why is it so important?

A world model is a simplified abstraction of how the world works. A good world model allows you to:

  • Predict a future state of the world from a previous one
  • Predict next states of the world after performing simulated actions
  • Fill in missing details in the information coming from the perception module

Most importantly, a powerful world model allows you to simulate and learn without having to do trial and error in the real world, which can be costly and dangerous.
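The core interface of a world model can be sketched in a few lines: a function that predicts the next state given the current state and an action, rolled forward to simulate an action sequence without touching the real world. The dynamics below (a point moving on a line) are a made-up stand-in; a real world model would be learned from data.

```python
# Minimal sketch of a world model as a next-state predictor.
# Here the "world" is just a position on a line and actions are
# moves; a trained model would replace this toy transition.

def world_model(state, action):
    """Predict the next state of the world from state + action."""
    return state + action  # hypothetical linear dynamics

def simulate(state, actions):
    """Roll the model forward over a proposed action sequence,
    predicting the consequences without acting in the real world."""
    trajectory = [state]
    for a in actions:
        state = world_model(state, a)
        trajectory.append(state)
    return trajectory

print(simulate(0.0, [1.0, 1.0, -0.5]))  # [0.0, 1.0, 2.0, 1.5]
```

The key point is that `simulate` only ever calls the model, never the environment, which is exactly what makes simulated trial and error cheap and safe.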

If you think about it, a great model of the world is like having common sense. As Yann LeCun explains, we can view common sense as a collection of models of the world that tell us what is plausible or likely and what is impossible.

And the kind of mistakes ChatGPT often makes reveal a lack of sophisticated common sense, that is, the lack of a sophisticated model of the world.

In order to solve these limitations, Yann LeCun proposes this new architectural paradigm. One of the first key points to note about this new paradigm is its modular structure.

Systems like ChatGPT, although they may have some internal modules, are quite monolithic. For example, they don't have a dedicated memory module. They are what we typically call end-to-end systems.

To compensate for the lack of separate key modules, and also for the lack of sophisticated System-2 capabilities, all kinds of patches and hacks are currently being used together with ChatGPT: plugins of all kinds, from Wolfram Alpha to Zapier, vector databases, and libraries that allow LLMs to become agents capable of taking (or attempting to take) autonomous decisions (see AutoGPT or BabyAGI).

These and others are temporary solutions that may eventually be replaced by robust ones like the new paradigm proposed by Yann LeCun.

Below I share another of my infographics, one that summarizes the current ecosystem around LLMs.

Every prompt matters | Infographic by Javier ideami | ideami.com

You can download the "Every prompt matters" infographic at its GitHub repo.

Therefore, in contrast to the limitations of current systems, Yann's architecture is more akin to our human brain, being composed of a number of separate modules that allow the architecture to flexibly implement processes that make System-1 and System-2 capabilities possible. We'll soon explore these modules and the way they work together.

Graphic from “A Path Towards Autonomous Machine Intelligence” by Yann LeCun | https://openreview.net/pdf?id=BZ5a1r-kVsf

Let's now review the top of the tower, continuing to draw parallels between how our mind works and this new architectural paradigm of Yann LeCun.

The Tower of Mind | Infographic by Javier ideami | ideami.com

At the top we have our attention capabilities. As humans, when we use System-2 processes to learn new algorithms, we make use of attention to illuminate and understand the context around the information we interact with.

Modern AI architectures also employ attention mechanisms, a key part of the Transformer architecture, which has revolutionized the deep learning field.

Below, you will find a detailed infographic I created about how Transformers work.

X-Ray-Transformer | Infographic by Javier ideami | ideami.com

Let's continue. At the top of the tower of our minds lies abstraction. The word "flower" at the top of the tower represents every possible flower in the universe. At the opposite end of the tower, by the ocean, we find the rich details of one specific flower.

Time to talk about goals. We humans, subconsciously and/or consciously, set goals for ourselves. We want to do more exercise, get a new job, improve our relationship with someone, or simply feel better.

We'll soon see how these goals connect with the rest of the modules, and how we may perform optimization processes to find the best actions or response patterns to satisfy such goals.

Finally, Yann proposes the existence of a configurator module: a kind of master module that connects to all the others and sets their parameters in order to adapt their functionality to the current goals.

We may have something similar to this module within the prefrontal cortex of our brain, which is involved in many of our executive functions.

Apart from the configurator, the other major modules proposed by Yann, which we'll explore in a bit, are:

  • Perception module: it takes the input to our senses and brings it up the tower, compressing and abstracting it into a latent representation of the state of the world. This latent representation may be expressed hierarchically at different levels of abstraction. In our brain this module would correspond to parts of our visual cortex, auditory cortex, etc. The process of compressing and abstracting our perception is represented in the infographic by the green encoder module.
  • World model: the most complex part of the architecture. As explained previously, it allows us to estimate missing information about the current state of the world, predict world states from previous ones or from actions proposed by the actor module, etc. The predictive capabilities of the world model are represented in the infographic by purple circles with white edges.
  • Cost module: it measures a quantity that Yann calls "energy", which expresses how far we are from "comfort", or the distance between where we are and where we want to be in relation to different drives and goals. The ultimate goal of the agent (us, in the case of humans) is to minimize this energy, this cost. The cost combines two terms, Yann explains: the intrinsic cost, which is hard-wired and computes the instantaneous present "discomfort" of the agent, and the critic, which is used to predict future intrinsic energies; soon we'll see why this is so important and useful. Some parts of the intrinsic cost module can be compared to the basal ganglia and the amygdala in humans, whereas parts of our prefrontal cortex involved in reward prediction would correspond to the trainable critic module. The cost modules are represented in the infographic by fuchsia rectangular boxes.
  • Short-term memory module: it stores useful data about past, present and future world states, together with their associated intrinsic costs. This is useful, for example, in order to train the critic module (which estimates future intrinsic costs); later on we'll explain how this would be done. The memory module can be compared to the hippocampus in humans. It's represented in the infographic by a small tower on the left side of the main structure.
  • Actor module: it creates proposals for action sequences that may be used to solve a new scenario. It also sends actions to the actuators of the system. The actor is at the center of how System-1 and System-2 processes are implemented. It has two components. One is a policy module that quickly maps world states (derived from the perception module) to actions. That is the base of System-1. The second is the action optimizer that performs model-predictive control. That is the base of System-2. Later on we'll see how this second component, combined with the world model and the cost module, can be used to learn new algorithms, by gradually finding an optimal sequence of actions that minimizes the cost associated with them. The actor modules are represented in the infographic by white circles with yellow edges (involved in System-2) and by yellow triangles (involved in System-1). In our brain, areas of the premotor cortex that deal with proposing and encoding motor plans could correspond to parts of this module.

It's now time to descend and explore the way System-2 capabilities would work in the paradigm Yann proposes.

The Tower of Mind | Infographic by Javier ideami | ideami.com

Let's review. This new architectural paradigm will fully activate System-2 processes when it needs to find the steps, the actions, the algorithm to solve a new scenario, or when it wants to override the automatic impressions sent by System-1.

In order to do that, System-2 needs to find the right actions to solve that new scenario. And to get there, the system will perform an optimization process to gradually find those actions.

What is most important to understand is that such optimization processes will happen in a hierarchical fashion, at different levels of temporal abstraction. This is because each goal the AI has can be expressed at different temporal scales. The same thing happens with us humans.

For example, say I want to learn to play a new piano piece. The response patterns I need to activate in order to play it don't yet exist in my brain, so I need to learn them slowly and consciously using System-2.

Learning to play the piece is my main goal. But I can subdivide it into subgoals that correspond to different parts of the piece. And I can take each of those subgoals and further subdivide them into other goals, like learning first the melody and then the accompaniment of each part. And so on and so forth.

So we want to optimize the actions we'll take with regard to each of the subgoals, hierarchically, at each level of temporal abstraction, including the master goal.
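The piano example above can be pictured as a small goal tree, with planning happening at every level and concrete actions attaching to the leaves. The structure below is hard-coded purely for illustration; in the real architecture the decomposition itself would be produced by the system.

```python
# Toy illustration of hierarchical goal decomposition: the main
# goal splits into subgoals, which split again, down to the
# finest-grained subgoals where concrete actions are optimized.

goal = {
    "learn the piece": {
        "part A": {"melody": {}, "accompaniment": {}},
        "part B": {"melody": {}, "accompaniment": {}},
    }
}

def leaves(tree):
    """Count the finest-grained subgoals in the hierarchy."""
    if not tree:
        return 1
    return sum(leaves(sub) for sub in tree.values())

print(leaves(goal))  # 4 leaf subgoals to optimize actions for
```

Each level of this tree corresponds to a different temporal scale: the root spans weeks of practice, the leaves span minutes.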

In this infographic, to simplify, we represent two of the levels of such a hierarchical process.

The Tower of Mind | Infographic by Javier ideami | ideami.com

And each of those levels consists of the combination of three parts:

First, an actor module proposes a sequence of actions to tackle a certain goal (the white circles with yellow edges).

Then, the world model takes the current state of the world and the action proposed by the actor and outputs its predicted next state of the world. It then takes that predicted next state and the next action proposed by the actor, and predicts again the following state of the world. And so on and so forth, gradually predicting the consequences of performing the actions proposed by the actor. These predictive processes are represented by the purple circles with white edges.

The Tower of Mind | Infographic by Javier ideami | ideami.com

Great, so we propose possible actions and we predict the consequences of performing them. But now, how do we link those two with our ultimate goal, which is to optimize those actions in order to minimize the total cost, the discomfort, the difference between where we are and where we want to get to?

We do it through the cost module. As explained previously, the cost module measures a quantity, an energy, that expresses our degree of discomfort, how far we are from "comfort", from an ideal state where our goals have been achieved.

And what the agent wants, be it an AI or a human, is to minimize this number, to decrease the cost, to decrease that difference between where we are and where we want to be, to get closer to our ideal state.

To review, Yann explains that this module consists of two parts:

An intrinsic cost, which expresses our present level of discomfort in relation to things like hunger, pain, pleasure and the like, as well as other needs that may arise from our goals (depending on how the configurator module has set up the cost module).
A critic, which is a trainable module used to predict the future intrinsic cost connected to a certain state of the world. This is very important because we are using the world model to simulate actions. That means we are not taking them for real. So the only way to know their cost in advance is to predict it. And the critic is responsible for doing that.

Each predictive process (linked to a set of actions) is connected to a cost process that estimates the associated cost. And the intermediate costs are summed to produce, in the end, one final total cost.
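A minimal sketch of that cost pipeline might look like this. Every name here is a toy stand-in: the intrinsic cost is modeled as squared distance to a goal state, and the "critic" is a hand-written placeholder for what would actually be a trained predictor of future intrinsic cost.

```python
# Hedged sketch of the cost pipeline: each predicted state gets an
# intrinsic cost (instantaneous discomfort) plus a critic estimate
# of future cost; the total is summed along the trajectory.

GOAL = 3.0  # hypothetical "comfort" state

def intrinsic_cost(state):
    """Instantaneous discomfort: squared distance to the goal."""
    return (state - GOAL) ** 2

def critic(state):
    """Stand-in for a trained critic; here it simply discounts
    the current discomfort as a proxy for future discomfort."""
    return 0.5 * intrinsic_cost(state)

def total_cost(trajectory):
    """Sum intermediate costs along a simulated trajectory."""
    return sum(intrinsic_cost(s) + critic(s) for s in trajectory)

print(total_cost([0.0, 1.0, 2.0, 3.0]))  # 21.0
```

Minimizing this single scalar is what ties the proposed actions and the predicted states together into one optimization problem.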

The Tower of Mind | Infographic by Javier ideami | ideami.com

So now we have:

  • Proposed actions
  • Predicted latest states of the world derived from those actions
  • A total predicted cost that results from simulating that sequence of actions

All that remains is to perform an optimization process, iterating through this pipeline with the objective of minimizing the value of the total cost as much as possible.

As we do this, we will be modifying and tweaking the proposed actions while we keep decreasing the total cost at the end of the pipeline.

Yann emphasizes that we have to keep in mind that this would be happening in parallel at different levels of temporal abstraction within this hierarchical architecture. And as we can see below, different levels of temporal abstraction influence one another in key ways.

The Tower of Mind | Infographic by Javier ideami | ideami.com

The specifics of the optimization algorithm that will gradually tweak the actions with the objective of decreasing the total cost depend on how continuous and differentiable the mapping from the actions, through the world model, to the cost turns out to be.

If that mapping is continuous and smooth, we can use gradient-based optimization, with the gradients computed via backpropagation.

However, if there are discontinuities in the mappings, we may have to employ gradient-free methods like dynamic programming, heuristic search techniques, combinatorial optimization, etc.
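Putting the pieces together, the System-2 loop can be sketched end-to-end: propose actions, roll them through a world model, score them with a cost, and nudge the actions downhill. The world model and cost below are the same toy stand-ins as before, and the gradient is estimated by finite differences purely for simplicity; a real system would backpropagate through a learned, differentiable world model.

```python
# Illustrative System-2 action optimization: start from a guessed
# action sequence, simulate it, score it, and repeatedly tweak
# each action in the direction that lowers the total cost.

def rollout_cost(actions, state=0.0, goal=3.0):
    """Simulate the actions with a toy world model (state += action)
    and sum a toy intrinsic cost (squared distance to the goal)."""
    cost = 0.0
    for a in actions:
        state = state + a
        cost += (state - goal) ** 2
    return cost

def optimize(actions, steps=300, lr=0.05, eps=1e-4):
    """Finite-difference gradient descent over the action sequence."""
    actions = list(actions)
    for _ in range(steps):
        for i in range(len(actions)):
            nudged = actions[:i] + [actions[i] + eps] + actions[i + 1:]
            grad = (rollout_cost(nudged) - rollout_cost(actions)) / eps
            actions[i] -= lr * grad
    return actions

best = optimize([0.0, 0.0, 0.0])
print(rollout_cost(best))  # drops from 27.0 toward 0: goal reached
```

With a smooth world model the finite differences could be replaced by exact gradients from backpropagation; with a discontinuous one, this inner loop would be swapped for a gradient-free search.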

After performing the optimization process, the system has found a set of actions that allows it to reach its goal. We have learned a new response pattern to fit the new scenario.

But let's remember that System-2 is slow and expensive. Therefore, we want to transfer these learnings to System-1, so that the next time we can execute the learned response pattern automatically, without having to engage the world model and the rest of the optimization process. Let's therefore continue our way towards System-1.

The Tower of Mind | Infographic by Javier ideami | ideami.com

The yellow ellipse represents part of our subconscious System-1, a set of quick perception-action loops that map our sensor inputs or our abstract needs to sequences of actions or response patterns.

The yellow triangles represent the part of the actor module that Yann LeCun calls "policy modules": modules that are trained to map a certain perception or need to an action or sequence of actions.

The Tower of Mind | Infographic by Javier ideami | ideami.com

Yann tells us that this new architectural paradigm would have one single world model (related to System-2), but multiple action policy modules (related to System-1).

Having a single world model in this new AI architecture allows us to reuse the related hardware as well as share knowledge between different goals and tasks.

Is the world model in our brain the final result of a voting process, as Jeff Hawkins suggests in his book "A Thousand Brains: A New Theory of Intelligence"? Hawkins talks about thousands of models within our cortical columns which, through voting, coalesce into stable predictions. It remains to be seen.

On the other hand, in this new AI paradigm the subconscious System-1 processes can employ a number of policy modules that are trained to output the different response patterns learned by System-2 processes in response to related perceptions or connected needs.

And how do we perform this transfer process, from System-2 learnings to System-1 policy modules?

The white circles with yellow edges represent the actions that System-2 has learned and optimized. Below them, we find light blue square modules that connect each of those learned actions with the System-1 policy modules (the yellow triangles).

The Tower of Mind | Infographic by Javier ideami | ideami.com

The light blue square modules estimate the distance, the difference between the actions learned by System-2 and the actions that the System-1 policy modules output in response to the related states of the world.

In order to perform this transfer process from System-2 to System-1, we need to decrease the distance between the two kinds of inputs that feed the light blue modules. As we gradually decrease that distance, the output of the System-1 policy modules gets closer and closer to the actions learned by System-2.

As that optimization process progresses, the System-1 policy modules gradually learn to output those same actions with more precision in response to the related states of the world. We have transferred the learnings from System-2 to System-1.
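This System-2-to-System-1 transfer can be sketched as a tiny imitation-learning loop: a parametric policy is trained to reproduce the actions System-2 found, by shrinking the squared distance between its outputs and those target actions. The linear policy and the (state, action) pairs are invented for illustration.

```python
# Sketch of the transfer: fit a fast policy, action = w*state + b,
# to the (state, target_action) pairs that System-2 optimized,
# by gradient descent on the squared distance between them.

def train_policy(pairs, steps=1000, lr=0.05):
    """Distill System-2 targets into a one-parameter-pair policy."""
    w, b = 0.0, 0.0
    for _ in range(steps):
        for s, a_target in pairs:
            err = (w * s + b) - a_target  # distance to System-2 action
            w -= lr * err * s             # gradient of squared error
            b -= lr * err
    return w, b

# Hypothetical actions that System-2 learned for three world states:
pairs = [(0.0, 1.0), (1.0, 3.0), (2.0, 5.0)]
w, b = train_policy(pairs)
print(round(w, 2), round(b, 2))  # 2.0 1.0: the policy now reacts instantly
```

Once trained, the policy maps a state to an action in a single cheap evaluation, with no world model and no optimization loop: the snail has become a cheetah.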

On the left of the infographic, we find a little tower that represents the short-term memory module that Yann describes in his paper.

The Tower of Mind | Infographic by Javier ideami | ideami.com

This separate memory module would correspond to the hippocampus in humans. It is in charge, among other things, of storing pairs of world states and their associated costs. Storing this information for future retrieval is vital in order to, for example, be able to train the all-important critic module. Let's review why.

Remember that in order to learn, System-2 proposes some actions, simulates them, predicts the new states of the world derived from that simulation, and then predicts the total associated cost. It then gradually optimizes that pipeline by tweaking the related actions in the direction that minimizes the total cost.

The critic module is responsible for estimating in advance the costs connected to the states that result from the simulations, and it must be trained to perform those predictions as well as possible.

By accessing the short-term memory, we can pick a state of the world and its related intrinsic cost, and compare that cost with the one predicted by the critic. We can then perform an optimization process to decrease the gap between what the critic predicts and the correct cost stored in memory. By doing this over and over, we will be training our critic to better estimate the future costs connected with simulated actions.
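That critic-training loop can be sketched in miniature: memory holds (state, observed intrinsic cost) pairs, and the critic's prediction is repeatedly pulled toward the stored cost. The one-parameter critic and the stored values below are invented for illustration only.

```python
# Sketch of critic training from short-term memory. The memory
# stores (state, intrinsic cost) pairs; here costs happen to
# follow cost = 2 * state**2, and the toy critic has one
# parameter k, predicting k * state**2.

memory = [(1.0, 2.0), (2.0, 8.0), (3.0, 18.0)]

k = 0.0                                # critic parameter, untrained
for _ in range(200):
    for state, stored_cost in memory:
        pred = k * state ** 2          # critic's current prediction
        err = pred - stored_cost       # gap between critic and memory
        k -= 0.01 * err * state ** 2   # gradient step on squared gap
print(round(k, 2))  # 2.0: the critic now matches the stored costs
```

After training, the critic can price states the agent has only simulated, which is exactly what the optimization pipeline above needs.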

In this article and infographic we have explored some of the key areas of this new architectural paradigm presented by Yann LeCun. At the same time, we have reflected on the way our mind works, which has a lot to do with this new proposal by Yann.

But Yann's paper is quite long at 60 pages and includes many more details. A big chunk of the paper centers on how to design and train the world model, which includes the Joint Embedding Predictive Architecture (JEPA), the most complex part of the system. If you want to go deeper into his research, you can explore the paper in depth here:
https://openreview.net/pdf?id=BZ5a1r-kVsf

Finally, you can download the Tower of Mind infographic in very high quality at the following GitHub repo:

This infographic was first presented during a talk of the same name, organized by Instituto de Inteligencia Artificial in Spain (iia.es), which I gave to the company Roams. Roams, directed by Eduardo Delgado, is one of the best examples of great entrepreneurship in Spain, a company that puts talent and people above everything else.

The Tower of Mind | Infographic by Javier ideami | ideami.com
Image AI generated by Javier ideami | ideami.com
