Research has shown that large language models (LLMs) tend to overemphasize information at the beginning and end of a document or conversation, while neglecting the middle.
This “position bias” means that, if a lawyer is using an LLM-powered virtual assistant to retrieve a certain phrase in a 30-page affidavit, the LLM is more likely to find the right text if it is on the initial or final pages.
MIT researchers have discovered the mechanism behind this phenomenon.
They created a theoretical framework to study how information flows through the machine-learning architecture that forms the backbone of LLMs. They found that certain design choices which control how the model processes input data can cause position bias.
Their experiments revealed that model architectures, particularly those affecting how information is spread across input words within the model, can give rise to or intensify position bias, and that training data also contribute to the problem.
In addition to pinpointing the origins of position bias, their framework can be used to diagnose and correct it in future model designs.
This could lead to more reliable chatbots that stay on topic during long conversations, medical AI systems that reason more fairly when handling a trove of patient data, and code assistants that pay closer attention to all parts of a program.
“These models are black boxes, so as an LLM user, you probably don’t know that position bias can cause your model to be inconsistent. You simply feed it your documents in whatever order you want and expect it to work. But by understanding the underlying mechanism of these black-box models better, we can improve them by addressing these limitations,” says Xinyi Wu, a graduate student in the MIT Institute for Data, Systems, and Society (IDSS) and the Laboratory for Information and Decision Systems (LIDS), and first author of a paper on this research.
Her co-authors include Yifei Wang, an MIT postdoc; and senior authors Stefanie Jegelka, an associate professor of electrical engineering and computer science (EECS) and a member of IDSS and the Computer Science and Artificial Intelligence Laboratory (CSAIL); and Ali Jadbabaie, professor and head of the Department of Civil and Environmental Engineering, a core faculty member of IDSS, and a principal investigator in LIDS. The research will be presented at the International Conference on Machine Learning.
Analyzing attention
LLMs like Claude, Llama, and GPT-4 are powered by a type of neural network architecture known as a transformer. Transformers are designed to process sequential data, encoding a sentence into chunks called tokens and then learning the relationships between tokens to predict what word comes next.
These models have become very good at this thanks to the attention mechanism, which uses interconnected layers of data-processing nodes to make sense of context by allowing tokens to selectively focus on, or attend to, related tokens.
But if every token can attend to every other token in a 30-page document, that quickly becomes computationally intractable. So, when engineers build transformer models, they often employ attention masking techniques that limit the words a token can attend to.
For instance, a causal mask only allows words to attend to those that came before them.
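To make the mechanics concrete, here is a minimal NumPy sketch (not the researchers’ code) of a single attention head with a causal mask; the token embeddings and weight matrices are random stand-ins.

```python
import numpy as np

def causal_attention(X, W_q, W_k, W_v):
    """One attention head over a sequence X of shape (seq_len, d_model)."""
    Q, K, V = X @ W_q, X @ W_k, X @ W_v
    scores = Q @ K.T / np.sqrt(Q.shape[-1])          # pairwise relevance scores

    # Causal mask: token i may only attend to tokens j <= i, so every
    # "future" position is set to -inf before the softmax.
    seq_len = X.shape[0]
    future = np.triu(np.ones((seq_len, seq_len), dtype=bool), k=1)
    scores = np.where(future, -np.inf, scores)

    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights /= weights.sum(axis=-1, keepdims=True)   # softmax over allowed tokens
    return weights @ V, weights

rng = np.random.default_rng(0)
seq_len, d_model = 6, 16
X = rng.normal(size=(seq_len, d_model))              # stand-in token embeddings
W_q, W_k, W_v = (rng.normal(size=(d_model, d_model)) for _ in range(3))
_, weights = causal_attention(X, W_q, W_k, W_v)
print(weights.round(2))   # upper triangle is zero: no token attends to later tokens
```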
Engineers also use positional encodings to help the model understand the location of each word in a sentence, improving performance.
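One common scheme is the sinusoidal encoding from the original transformer paper; below is a minimal sketch, assuming fixed (non-learned) encodings that are added to the token embeddings. Other schemes, such as learned or rotary encodings, work differently.

```python
import numpy as np

def sinusoidal_positional_encoding(seq_len, d_model):
    """Return a (seq_len, d_model) matrix that is added to token embeddings."""
    positions = np.arange(seq_len)[:, None]            # 0, 1, ..., seq_len - 1
    dims = np.arange(0, d_model, 2)[None, :]           # even embedding dimensions
    angles = positions / np.power(10000.0, dims / d_model)
    pe = np.zeros((seq_len, d_model))
    pe[:, 0::2] = np.sin(angles)                       # even dimensions get sine
    pe[:, 1::2] = np.cos(angles)                       # odd dimensions get cosine
    return pe

print(sinusoidal_positional_encoding(seq_len=4, d_model=8).round(2))
```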
The MIT researchers built a graph-based theoretical framework to explore how these modeling choices (attention masks and positional encodings) could affect position bias.
“Everything is coupled and tangled within the attention mechanism, so it is very hard to study. Graphs are a flexible language to describe the dependent relationships among words within the attention mechanism and trace them across multiple layers,” Wu says.
Their theoretical analysis suggested that causal masking gives the model an inherent bias toward the beginning of an input, even when that bias doesn’t exist in the data.
If earlier words are relatively unimportant for a sentence’s meaning, causal masking can cause the transformer to pay more attention to its beginning anyway.
“While it is often true that earlier words and later words in a sentence are more important, if an LLM is used on a task that is not natural language generation, like ranking or information retrieval, these biases can be extremely harmful,” Wu says.
As a model grows, with additional layers of the attention mechanism, this bias is amplified because earlier parts of the input are used more frequently in the model’s reasoning process.
They also found that using positional encodings to link words more strongly to nearby words can mitigate position bias. The technique refocuses the model’s attention in the right place, but its effect can be diluted in models with more attention layers.
And these design choices are just one cause of position bias; some can come from the training data the model uses to learn how to prioritize words in a sequence.
“If you know your data are biased in a certain way, then you should also fine-tune your model on top of adjusting your modeling choices,” Wu says.
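The article does not specify which encoding scheme is meant here; as one hedged illustration, a relative scheme in the spirit of ALiBi subtracts a penalty proportional to token distance from the raw attention scores before the softmax, which ties each word more strongly to its neighbors.

```python
import numpy as np

def distance_decay_bias(seq_len, slope=0.5):
    """Penalty matrix -slope * |i - j|, added to attention scores before softmax."""
    pos = np.arange(seq_len)
    return -slope * np.abs(pos[:, None] - pos[None, :])

# Zeros on the diagonal, larger penalties for tokens farther apart.
print(distance_decay_bias(5).round(1))
```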
Lost in the middle
After they had established a theoretical framework, the researchers performed experiments in which they systematically varied the position of the correct answer in text sequences for an information retrieval task.
The experiments showed a “lost-in-the-middle” phenomenon, where retrieval accuracy followed a U-shaped pattern. Models performed best if the right answer was located at the beginning of the sequence. Performance declined the closer the answer got to the middle, before rebounding a bit if the correct answer was near the end.
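A schematic version of such a test might look like the sketch below, where `query_model` is a hypothetical stand-in for whatever LLM interface is being evaluated, and the distractor documents are shuffled on each trial while the answer-bearing document is moved through the context.

```python
import random
from collections import defaultdict

def build_context(needle, fillers, position):
    """Insert the document containing the correct answer at a chosen slot."""
    docs = fillers[:position] + [needle] + fillers[position:]
    return "\n\n".join(docs)

def position_sweep(query_model, question, needle, answer, fillers, trials=20):
    """Measure retrieval accuracy as a function of where the answer sits."""
    accuracy = defaultdict(float)
    for position in range(len(fillers) + 1):           # beginning ... middle ... end
        hits = 0
        for _ in range(trials):
            random.shuffle(fillers)                    # vary the distractor order
            prompt = build_context(needle, fillers, position) + "\n\nQuestion: " + question
            hits += int(answer.lower() in query_model(prompt).lower())
        accuracy[position] = hits / trials
    return accuracy                                    # a U-shaped curve signals position bias
```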
Ultimately, their work suggests that using a different masking technique, removing extra layers from the attention mechanism, or strategically employing positional encodings could reduce position bias and improve a model’s accuracy.
“By doing a combination of theory and experiments, we were able to look at the consequences of model design choices that weren’t clear at the time. If you want to use a model in high-stakes applications, you must know when it will work, when it won’t, and why,” Jadbabaie says.
In the future, the researchers want to further explore the effects of positional encodings and study how position bias could be strategically exploited in certain applications.
“These researchers offer a rare theoretical lens into the attention mechanism at the heart of the transformer model. They provide a compelling analysis that clarifies longstanding quirks in transformer behavior, showing that attention mechanisms, especially with causal masks, inherently bias models toward the beginning of sequences. The paper achieves the best of both worlds: mathematical clarity paired with insights that reach into the heart of real-world systems,” says Amin Saberi, professor and director of the Stanford University Center for Computational Market Design, who was not involved with this work.
This research is supported, in part, by the U.S. Office of Naval Research, the National Science Foundation, and an Alexander von Humboldt Professorship.