
Large language models use a surprisingly simple mechanism to retrieve some stored knowledge


Large language models, such as those that power popular artificial intelligence chatbots like ChatGPT, are incredibly complex. Although these models are being used as tools in many areas, such as customer support, code generation, and language translation, scientists still don’t fully understand how they work.

In an effort to better understand what is going on under the hood, researchers at MIT and elsewhere studied the mechanisms at work when these enormous machine-learning models retrieve stored knowledge.

They found a surprising result: Large language models (LLMs) often use a very simple linear function to recover and decode stored facts. Moreover, the model uses the same decoding function for similar types of facts. Linear functions, equations with only two variables and no exponents, capture the straightforward, straight-line relationship between two variables.

The researchers showed that, by identifying linear functions for different facts, they can probe the model to see what it knows about new subjects, and where within the model that knowledge is stored.

Using a technique they developed to estimate these simple functions, the researchers found that even when a model answers a prompt incorrectly, it has often stored the correct information. In the future, scientists could use such an approach to find and correct falsehoods inside the model, which could reduce a model’s tendency to sometimes give incorrect or nonsensical answers.

“Even though these models are really complicated, nonlinear functions that are trained on lots of data and are very hard to understand, there are sometimes really simple mechanisms working inside them. This is one instance of that,” says Evan Hernandez, an electrical engineering and computer science (EECS) graduate student and co-lead author of a paper detailing these findings.

Hernandez wrote the paper with co-lead author Arnab Sharma, a computer science graduate student at Northeastern University; his advisor, Jacob Andreas, an associate professor in EECS and a member of the Computer Science and Artificial Intelligence Laboratory (CSAIL); senior author David Bau, an assistant professor of computer science at Northeastern; and others at MIT, Harvard University, and the Israel Institute of Technology. The research will be presented at the International Conference on Learning Representations.

Finding facts

Most large language models, also called transformer models, are neural networks. Loosely based on the human brain, neural networks contain billions of interconnected nodes, or neurons, which are grouped into many layers, and which encode and process data.

Much of the knowledge stored in a transformer can be represented as relations that connect subjects and objects. For example, “Miles Davis plays the trumpet” is a relation that connects the subject, Miles Davis, to the object, trumpet.
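To make this concrete, here is a minimal illustration (not the paper’s actual dataset or notation) of how such facts can be written down as (subject, relation, object) triples:

```python
# Facts written as (subject, relation, object) triples.  The relation
# phrasings here are illustrative stand-ins, not identifiers from the paper.
FACTS = [
    ("Miles Davis", "plays the instrument", "trumpet"),
    ("Miles Davis", "was born in the state of", "Illinois"),
    ("Norway", "has the capital city", "Oslo"),
]

for subject, relation, obj in FACTS:
    print(f"{subject} -- {relation} --> {obj}")
```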

As a transformer gains more knowledge, it stores additional facts about a certain subject across multiple layers. If a user asks about that subject, the model must decode the most relevant fact to answer the query.

If someone prompts a transformer by saying “Miles Davis plays the . . .” the model should respond with “trumpet” and not “Illinois” (the state where Miles Davis was born).

“Somewhere in the network’s computation, there has to be a mechanism that goes and looks for the fact that Miles Davis plays the trumpet, and then pulls that information out and helps generate the next word. We wanted to understand what that mechanism was,” Hernandez says.

The researchers set up a series of experiments to probe LLMs, and found that, even though the models are extremely complex, they decode relational information using a simple linear function. Each function is specific to the type of fact being retrieved.

For example, the transformer would use one decoding function any time it wants to output the instrument a person plays, and a different function each time it wants to output the state where a person was born.
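As a rough sketch of the idea (not the authors’ code), you can picture one affine map per relation, each applied to the same subject representation. In the snippet below, the hidden state and the per-relation weights are random placeholders standing in for values that would come from a real transformer:

```python
import torch

torch.manual_seed(0)
HIDDEN = 512  # placeholder hidden-state size, not the real model's

# Random stand-ins for quantities that would come from a real transformer:
# the subject's intermediate hidden state, and one estimated (W, b) pair
# per relation.
subject_vec = torch.randn(HIDDEN)  # e.g. the hidden state for "Miles Davis"
relation_decoders = {
    "plays instrument": (torch.randn(HIDDEN, HIDDEN), torch.randn(HIDDEN)),
    "born in state":    (torch.randn(HIDDEN, HIDDEN), torch.randn(HIDDEN)),
}

def decode(relation: str, s: torch.Tensor) -> torch.Tensor:
    """Apply the relation's affine map W s + b to a subject representation."""
    W, b = relation_decoders[relation]
    return W @ s + b

# The same subject vector is decoded with a different function depending on
# which kind of fact is wanted (instrument vs. birth state).
instrument_repr = decode("plays instrument", subject_vec)
birth_state_repr = decode("born in state", subject_vec)
print(instrument_repr.shape, birth_state_repr.shape)
```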

The researchers developed a method to estimate these simple functions, and then computed functions for 47 different relations, such as “capital city of a country” and “lead singer of a band.”
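The paper’s estimation procedure has more moving parts than can be shown here, but the core idea is a first-order approximation: treat the model’s computation from a subject’s hidden state to its output representation as a function F, and average its local Jacobian and offset over a few example subjects. The sketch below illustrates that idea on a tiny stand-in network, omitting details such as layer choice and scaling:

```python
import torch

torch.manual_seed(0)
HIDDEN = 64

# Tiny stand-in for "the rest of the transformer's computation" that maps a
# subject's hidden state to an output representation for one relation.
F = torch.nn.Sequential(
    torch.nn.Linear(HIDDEN, HIDDEN),
    torch.nn.Tanh(),
    torch.nn.Linear(HIDDEN, HIDDEN),
)

def estimate_affine(example_states):
    """Average the first-order approximation F(s) ~ W s + b over examples."""
    Ws, bs = [], []
    for s in example_states:
        J = torch.autograd.functional.jacobian(F, s)  # local Jacobian of F at s
        Ws.append(J)
        bs.append(F(s).detach() - J @ s)
    return torch.stack(Ws).mean(dim=0), torch.stack(bs).mean(dim=0)

# Placeholder "subject representations" for a handful of example subjects.
examples = [torch.randn(HIDDEN) for _ in range(5)]
W, b = estimate_affine(examples)

# Check how closely the affine approximation tracks F on a new subject.
s_new = torch.randn(HIDDEN)
print(torch.nn.functional.cosine_similarity(F(s_new), W @ s_new + b, dim=0))
```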

While there could be an infinite number of possible relations, the researchers chose to study this specific subset because the relations are representative of the kinds of facts that can be written in this way.

They tested each function by changing the subject to see if it could recover the correct object information. For instance, the function for “capital city of a country” should retrieve Oslo if the subject is Norway and London if the subject is England.

The functions retrieved the correct information more than 60 percent of the time, showing that some information in a transformer is encoded and retrieved in this way.
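A toy version of that check might look like the following, where random placeholder vectors stand in for the model’s subject representations and candidate objects, and the decoder simply picks whichever candidate is closest to its prediction:

```python
import torch

torch.manual_seed(0)
HIDDEN = 64

# Random placeholders for subject representations, candidate object
# embeddings, and the relation's estimated decoder; real values would come
# from the model itself.
subjects = {"Norway": torch.randn(HIDDEN), "England": torch.randn(HIDDEN)}
objects = {"Oslo": torch.randn(HIDDEN), "London": torch.randn(HIDDEN)}
answers = {"Norway": "Oslo", "England": "London"}
W, b = torch.randn(HIDDEN, HIDDEN), torch.randn(HIDDEN)  # "capital city of a country"

def predict_object(subject_vec: torch.Tensor) -> str:
    """Decode the subject and return the closest candidate object."""
    pred = W @ subject_vec + b
    sims = {name: torch.nn.functional.cosine_similarity(pred, emb, dim=0).item()
            for name, emb in objects.items()}
    return max(sims, key=sims.get)

correct = sum(predict_object(subjects[s]) == answers[s] for s in subjects)
print(f"accuracy: {correct}/{len(subjects)}")
```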

“But not everything is linearly encoded. For some facts, even though the model knows them and will predict text that is consistent with these facts, we can’t find linear functions for them. This suggests that the model is doing something more intricate to store that information,” he says.

Visualizing a model’s knowledge

They also used the functions to determine what a model believes is true about different subjects.

In one experiment, they started with the prompt “Bill Bradley was a” and used the decoding functions for “plays sports” and “attended university” to see if the model knows that Sen. Bradley was a basketball player who attended Princeton.

“We can show that, even though the model may choose to focus on different information when it produces text, it does encode all that information,” Hernandez says.

They used this probing technique to produce what they call an “attribute lens,” a grid that visualizes where specific information about a particular relation is stored within the transformer’s many layers.

Attribute lenses can be generated automatically, providing a streamlined way to help researchers understand more about a model. This visualization tool could enable scientists and engineers to correct stored knowledge and help prevent an AI chatbot from giving false information.
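A minimal sketch of how such a grid could be assembled (with random placeholders in place of a real model’s activations and vocabulary) is to apply a relation’s decoder to the hidden state at every layer and token position and record the attribute it points to:

```python
import torch

torch.manual_seed(0)
HIDDEN, LAYERS, TOKENS = 64, 6, 4

# Random placeholders: hidden states for every (layer, token) position, one
# relation's affine decoder, and a couple of candidate attribute embeddings.
hidden_states = torch.randn(LAYERS, TOKENS, HIDDEN)
W, b = torch.randn(HIDDEN, HIDDEN), torch.randn(HIDDEN)
candidates = {"basketball": torch.randn(HIDDEN), "football": torch.randn(HIDDEN)}

def nearest_attribute(vec: torch.Tensor) -> str:
    """Return the candidate attribute closest to the decoded vector."""
    sims = {name: torch.nn.functional.cosine_similarity(vec, emb, dim=0).item()
            for name, emb in candidates.items()}
    return max(sims, key=sims.get)

# One row per layer, one entry per token position: which attribute the
# decoder reads out of that hidden state.
grid = [[nearest_attribute(W @ hidden_states[layer, tok] + b)
         for tok in range(TOKENS)]
        for layer in range(LAYERS)]
for row in grid:
    print(row)
```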

In the future, Hernandez and his collaborators want to better understand what happens in cases where facts are not stored linearly. They would also like to run experiments with larger models, as well as study the precision of linear decoding functions.

“This is exciting work that reveals a missing piece in our understanding of how large language models recall factual knowledge during inference. Previous work showed that LLMs build information-rich representations of given subjects, from which specific attributes are extracted during inference. This work shows that the complex nonlinear computation of LLMs for attribute extraction can be well-approximated with a simple linear function,” says Mor Geva Pipek, an assistant professor in the School of Computer Science at Tel Aviv University, who was not involved with this work.

This research was supported, in part, by Open Philanthropy, the Israel Science Foundation, and an Azrieli Foundation Early Career Faculty Fellowship.
