Cognitive scientists develop new model explaining difficulty in language comprehension


Cognitive scientists have long sought to understand what makes some sentences harder to comprehend than others. Any account of language comprehension, researchers believe, would benefit from understanding difficulties in comprehension.

In recent years researchers successfully developed two models explaining two significant types of difficulty in understanding and producing sentences. While these models successfully predict specific patterns of comprehension difficulties, their predictions are limited and do not fully match results from behavioral experiments. Moreover, until recently researchers couldn’t integrate these two models into a coherent account.

A new study led by researchers from MIT’s Department of Brain and Cognitive Sciences (BCS) now provides such a unified account for difficulties in language comprehension. Building on recent advances in machine learning, the researchers developed a model that better predicts the ease, or lack thereof, with which individuals produce and comprehend sentences. The researchers recently published their findings.

The senior authors of the paper are BCS professors Roger Levy and Edward (Ted) Gibson. The lead author is Levy and Gibson’s former visiting student, Michael Hahn, now a professor at Saarland University. The second author is Richard Futrell, another former student of Levy and Gibson who is now a professor at the University of California at Irvine.

“This is not only a scaled-up version of the existing accounts for comprehension difficulties,” says Gibson; “we offer a new underlying theoretical approach that allows for better predictions.”

The researchers built on the two existing models to create a unified theoretical account of comprehension difficulty. Each of these older models identifies a distinct culprit for frustrated comprehension: difficulty in expectation and difficulty in memory retrieval. We experience difficulty in expectation when a sentence doesn’t easily allow us to anticipate its upcoming words. We experience difficulty in memory retrieval when we have a hard time tracking a sentence featuring a complex structure of embedded clauses, such as: “The fact that the doctor who the lawyer distrusted annoyed the patient was surprising.”

In 2020, Futrell first devised a theory unifying these two models. He argued that limits in memory don’t affect only retrieval in sentences with embedded clauses but plague all language comprehension; our memory limitations don’t allow us to perfectly represent sentence contexts during language comprehension more generally.

Thus, according to this unified model, memory constraints can create a new source of difficulty in anticipation. We can have difficulty anticipating an upcoming word in a sentence even if the word should be easily predictable from context, provided the sentence context itself is difficult to hold in memory. Consider, for example, a sentence beginning with the words “Bob threw the trash…”: we can easily anticipate the final word, “out.” But if the sentence context preceding the final word is more complex, difficulties in expectation arise: “Bob threw the old trash that had been sitting in the kitchen for several days [out].”
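
To make the “Bob threw the trash” example concrete, here is a minimal Python sketch of how forgetting can raise the difficulty of predicting a word. It is an illustration only, not the researchers’ model: the probabilities in `p_out`, the `decay` parameter, and the assumption that each word is remembered independently with a probability that shrinks with its distance from the upcoming word are all hypothetical. Difficulty is measured as surprisal in bits (the negative base-2 logarithm of a word’s probability), averaged over every possible imperfect memory of the context.

```python
import math
from itertools import product


def p_out(remembered):
    """Hypothetical probability that the next word is 'out',
    given the set of context words still held in memory."""
    if "threw" in remembered and "trash" in remembered:
        return 0.8   # both cues available: 'out' is very likely
    if "threw" in remembered:
        return 0.3   # only the verb survives: 'out' is plausible
    return 0.05      # no useful cue left: 'out' is unexpected


def expected_surprisal(context, decay=0.85):
    """Average surprisal (bits) of 'out' over all lossy memories of the context.

    Assumption: a word that is d positions before the end of the context
    is retained with probability decay ** d, independently of other words.
    """
    n = len(context)
    total = 0.0
    # Enumerate every keep/forget pattern over the context words.
    for mask in product([False, True], repeat=n):
        prob = 1.0
        remembered = set()
        for i, kept in enumerate(mask):
            retain = decay ** (n - 1 - i)
            prob *= retain if kept else 1.0 - retain
            if kept:
                remembered.add(context[i])
        total += prob * -math.log2(p_out(remembered))
    return total


short_context = "Bob threw the trash".split()
long_context = ("Bob threw the old trash that had been sitting "
                "in the kitchen for several days").split()

print(f"short context: {expected_surprisal(short_context):.2f} bits")
print(f"long context:  {expected_surprisal(long_context):.2f} bits")
```

Under these made-up numbers the longer context yields the higher expected surprisal, because “threw” and “trash” are more likely to have faded by the time “out” arrives. With `decay=1.0` (perfect memory) both contexts give the same value, which corresponds to the case in which memory limits play no role.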
 
Researchers quantify comprehension difficulty by measuring the time it takes readers to respond to different comprehension tasks. The longer the response time, the more challenging the comprehension of a given sentence. Results from prior experiments showed that Futrell’s unified account predicted readers’ comprehension difficulties better than the two older models. But his model didn’t identify which parts of the sentence we tend to forget, or how exactly this failure in memory retrieval obfuscates comprehension.

Hahn’s new study fills in these gaps. In the new paper, the cognitive scientists from MIT joined Futrell to propose an augmented model grounded in a new coherent theoretical framework. The new model identifies and corrects missing elements in Futrell’s unified account and provides new fine-tuned predictions that better match results from empirical experiments.

As in Futrell’s original model, the researchers begin with the idea that our mind, due to memory limitations, doesn’t perfectly represent the sentences we encounter. But to this they add the theoretical principle of cognitive efficiency. They propose that the mind tends to deploy its limited memory resources in a way that optimizes its ability to accurately predict new word inputs in sentences.

This notion leads to several empirical predictions. According to one key prediction, readers compensate for their imperfect memory representations by relying on their knowledge of the statistical co-occurrences of words in order to implicitly reconstruct in their minds the sentences they read. Sentences that include rarer words and phrases are therefore harder to remember perfectly, making it harder to anticipate upcoming words. As a result, such sentences are generally more challenging to comprehend.

To evaluate whether this prediction matches our linguistic behavior, the researchers utilized GPT-2, an AI natural language tool based on neural network modeling. This machine learning tool, first made public in 2019, allowed the researchers to test the model on large-scale text data in a way that wasn’t possible before. But GPT-2’s powerful language modeling capacity also created a problem: In contrast to humans, GPT-2’s immaculate memory perfectly represents all the words in even very long and complex texts that it processes. To more accurately characterize human language comprehension, the researchers added a component that simulates human-like limitations on memory resources (as in Futrell’s original model) and used machine learning techniques to optimize how those resources are used (as in their new proposed model). The resulting model preserves GPT-2’s ability to accurately predict words most of the time, but shows human-like breakdowns in cases of sentences with rare combinations of words and phrases.

“This is a wonderful illustration of how modern tools of machine learning can help develop cognitive theory and our understanding of how the mind works,” says Gibson. “We couldn’t have conducted this research even a few years ago.”

The researchers fed the machine learning model a set of sentences with complex embedded clauses such as, “The report that the doctor who the lawyer distrusted annoyed the patient was surprising.” The researchers then took these sentences and replaced their opening nouns (“report” in the example above) with other nouns, each with its own probability of occurring with a following clause or not. Some nouns made the sentences into which they were slotted easier for the AI program to “comprehend.” For instance, the model was able to more accurately predict how these sentences end when they began with the common phrasing “The fact that” than when they began with the rarer phrasing “The report that.”

The researchers then set out to corroborate the AI-based results by conducting experiments with participants who read similar sentences. Their response times to the comprehension tasks were similar to the model’s predictions. “When the sentences begin with the words ‘report that,’ people tended to remember the sentence in a distorted way,” says Gibson. The rare phrasing further constrained their memory and, as a result, constrained their comprehension.

These results demonstrate that the new model outperforms existing models in predicting how humans process language.

Another advantage the model demonstrates is its ability to offer varying predictions from language to language. “Prior models knew to explain why certain language structures, like sentences with embedded clauses, may be generally harder to work with within the constraints of memory, but our new model can explain why the same constraints behave differently in different languages,” says Levy. “Sentences with center-embedded clauses, for instance, seem to be easier for native German speakers than native English speakers, since German speakers are used to reading sentences where subordinate clauses push the verb to the end of the sentence.”

According to Levy, further research on the model is needed to identify causes of inaccurate sentence representation other than embedded clauses. “There are other kinds of ‘confusions’ that we need to test.” At the same time, Hahn adds, “the model may predict other ‘confusions’ which nobody has even thought about. We’re now trying to find those and see whether they affect human comprehension as predicted.”

Another question for future studies is whether the new model will lead to a rethinking of a long line of research focusing on the difficulties of sentence integration: “Many researchers have emphasized difficulties relating to the process in which we reconstruct language structures in our minds,” says Levy. “The new model possibly shows that the difficulty relates not to the process of mental reconstruction of these sentences, but to maintaining the mental representation once they are already constructed. A big question is whether or not these are two separate things.”

One way or another, adds Gibson, “this kind of work marks the future of research on these questions.”
