Language models can develop their own understanding of reality to improve their generative abilities, suggesting that they may one day be able to understand language at a deeper level than they do today. Credit: Alex Shipps/MIT CSAIL
Ask a large language model (LLM) like GPT-4 to smell a rain-soaked campsite, and it will politely decline. Ask the same system to describe that smell to you, and it will wax poetic about “an air full of anticipation” and “a smell both fresh and earthy,” despite having no prior experience with rain or a nose to help it make such observations. One possible explanation for this phenomenon is that the LLM is simply mimicking the text in its vast training data, rather than working with any real understanding of rain or smell.
But does the lack of eyes mean that language models can never “understand” that a lion is “bigger” than a house cat? Philosophers and scientists have long considered the ability to attribute meaning to language to be a hallmark of human intelligence, and have wondered what essential ingredients allow us to do so.
Investigating this conundrum, researchers at MIT’s Computer Science and Artificial Intelligence Laboratory (CSAIL) have discovered intriguing results suggesting that language models can develop their own understanding of reality to improve their generative capabilities.
The team first developed a set of small Karel puzzles, which involved finding instructions for controlling a robot in a simulated environment. They then trained an LLM on the solutions, but without demonstrating how the solutions actually worked. Finally, using a machine learning technique called “probing,” they examined the model’s “thought process” as it generated new solutions.
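Karel is a classic teaching language in which a robot navigates a grid by executing primitive commands; a puzzle supplies a start and goal configuration, and a solution is a short program that gets the robot there. The toy interpreter below is a minimal sketch written only to make that setup concrete; the command names and world details are assumptions and may differ from the benchmark the researchers actually used.

```python
# Illustrative only: a toy Karel-style grid world. The instruction names
# and environment details are assumptions, not the paper's exact benchmark.

DIRS = [(-1, 0), (0, 1), (1, 0), (0, -1)]  # north, east, south, west

class KarelWorld:
    def __init__(self, width=8, height=8, start=(0, 0), facing=1):
        self.width, self.height = width, height
        self.pos, self.facing = start, facing  # pos is (row, col)

    def step(self, command):
        """Execute a single instruction and return the robot's new state."""
        if command == "move":
            dr, dc = DIRS[self.facing]
            r, c = self.pos[0] + dr, self.pos[1] + dc
            if 0 <= r < self.height and 0 <= c < self.width:  # walls block movement
                self.pos = (r, c)
        elif command == "turnLeft":
            self.facing = (self.facing - 1) % 4
        elif command == "turnRight":
            self.facing = (self.facing + 1) % 4
        return self.pos, self.facing

# A "solution" is simply a sequence of such instructions:
world = KarelWorld()
for cmd in ["move", "move", "turnRight", "move"]:
    state = world.step(cmd)
print(state)  # final (position, facing) after running the program
```

The key detail in the study is that the language model only ever sees programs like the instruction sequence above, never the grid world they act on.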
After training on more than a million random puzzles, they found that the model spontaneously developed its own conception of the underlying simulation, despite never having been exposed to this reality during training. Such findings challenge our intuitions about the types of information needed to learn linguistic meaning—and whether LLMs could ever understand language at a deeper level than they do today.
“At the beginning of these experiments, the language model generated random instructions that didn’t work. After training was complete, our language model generated correct instructions at a rate of 92.4 percent,” says Charles Jin, an MIT doctoral student in electrical engineering and computer science (EECS) and affiliated with CSAIL, who is the lead author of a new paper on the work.
“This was a very exciting moment for us, because we thought that if the language model could perform a task with this level of accuracy, we could expect it to also understand the meaning of the language. This gave us a starting point for determining whether LLMs do indeed understand text, and we now see that they are capable of much more than just blindly putting words together.”
The article is published on the arXiv preprint server.
Inside the mind of an LLM
The probe allowed Jin to see this progress firsthand. Its role was to interpret what the LLM thought the instructions meant, revealing that the LLM was developing its own internal simulation of how the robot would move in response to each instruction. As the model’s ability to solve puzzles improved, this internal simulation also became more accurate, indicating that the LLM was beginning to understand the instructions. Before long, the model was consistently putting the pieces together correctly to form working instructions.
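Conceptually, a probe of this kind is a small classifier trained to read a property of the world, such as the robot’s facing direction, directly out of the language model’s hidden activations. The sketch below illustrates that general recipe with placeholder arrays and an off-the-shelf linear classifier; the paper’s actual probe architecture, inputs, and targets may differ.

```python
# A minimal sketch of probing: train a small classifier to decode a world
# property (here, the robot's facing direction) from LM hidden states.
# The arrays are random placeholders; in the actual study the activations
# would come from the trained LM and the labels from the ground-truth
# simulation, and the probe design may differ.
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)
hidden_states = rng.normal(size=(5000, 512))   # placeholder LM activations
facing_labels = rng.integers(0, 4, size=5000)  # placeholder ground truth (N/E/S/W)

X_train, X_test, y_train, y_test = train_test_split(
    hidden_states, facing_labels, test_size=0.2, random_state=0)

probe = LogisticRegression(max_iter=1000)
probe.fit(X_train, y_train)

# Decoding well above chance would suggest the hidden states encode an
# internal simulation of the robot's state (placeholder data stays at chance).
print("probe accuracy:", probe.score(X_test, y_test))
```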
Jin notes that the LLM’s understanding of language develops in phases, much like a child learns to speak in stages. At first, it’s like a baby babbling: it’s repetitive and mostly unintelligible. Then the language model learns the syntax, or rules of the language. This allows it to generate instructions that may look like real solutions, but still don’t work.
The LLM’s instructions gradually improve, however. Once the model acquires meaning, it begins to produce instructions that correctly implement the requested specifications, like a child forming coherent sentences.
Separating the method from the model: a “weird world”
The probe was only meant to “penetrate an LLM’s brain,” as Jin describes it, but there was a small possibility that it was also doing some of the thinking for the model. The researchers wanted to make sure that their model understood the instructions independently of the probe, rather than the probe inferring the robot’s movements from the LLM’s grasp of syntax alone.
“Imagine you have a stack of data that encodes the LM’s thinking process,” Jin suggests. “The probe is like a forensic analyst: You hand it this stack of data and say, ‘Here’s how the robot moves, now try to find the robot’s movements in the stack of data.’ The analyst later tells you that they know what’s going on with the robot in the stack of data. But what if the stack of data actually only encodes the raw instructions, and the analyst has found a clever way to extract the instructions and follow them accordingly? In that case, the language model hasn’t really learned what the instructions mean.”
To untangle their roles, the researchers flipped the meanings of the instructions for a new probe. In this “weird world,” as Jin calls it, instructions like “up” now meant “down” as they moved the robot around its grid.
“If the probe translates the instructions into robot positions, it should be able to translate the instructions into the odd meanings just as well,” Jin says. “But if the probe actually finds encodings of the robot’s original movements in the language model’s thought process, then it should have a hard time extracting the robot’s odd movements from the original thought process.”
It turned out that the new probe encountered translation errors: it was unable to interpret a language model whose instructions had taken on different meanings. This means the original semantics were embedded in the language model itself, indicating that the LLM understood the instructions independently of the original probing classifier.
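In rough outline, this control amounts to training a fresh probe against labels produced by running the same programs under the flipped instruction meanings, and comparing its accuracy with a probe trained against the true semantics. The sketch below illustrates only that comparison, with placeholder arrays standing in for the real activations and simulator outputs; it is not the authors’ implementation.

```python
# A rough sketch of the "weird world" check, with placeholder data.
# In the real experiment, original_labels come from running each program
# under the true instruction meanings, and weird_labels from re-running it
# under the flipped meanings; the hidden states stay the same. If a fresh
# probe decodes the weird-world states much worse, the original semantics
# likely live in the language model, not in the probe.
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score

def probe_accuracy(hidden_states, labels):
    """Mean cross-validated accuracy of a linear probe for the given labels."""
    probe = LogisticRegression(max_iter=1000)
    return cross_val_score(probe, hidden_states, labels, cv=5).mean()

rng = np.random.default_rng(0)
hidden_states = rng.normal(size=(2000, 512))     # placeholder LM activations
original_labels = rng.integers(0, 4, size=2000)  # robot states under true semantics
weird_labels = rng.integers(0, 4, size=2000)     # robot states under flipped semantics

acc_original = probe_accuracy(hidden_states, original_labels)
acc_weird = probe_accuracy(hidden_states, weird_labels)
print(f"true semantics: {acc_original:.3f}  |  weird world: {acc_weird:.3f}")
```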
“This research directly targets a central question in modern artificial intelligence: Are the surprising capabilities of large language models simply due to large-scale statistical correlations, or do large language models develop a meaningful understanding of the reality they are asked to work with? This research indicates that the LLM develops an internal model of the simulated reality, even though it was never trained to develop this model,” says Martin Rinard, an MIT professor in EECS, CSAIL member, and senior author of the paper.
This experiment confirmed the team’s hypothesis that language models can develop deeper understanding of language. However, Jin acknowledges some limitations to their paper: They used a very simple programming language and a relatively small model to gather their insights. In future work, they will look to use a more general framework. While Jin’s latest research doesn’t explain how to make the language model learn meaning faster, he thinks future work can build on this knowledge to improve how language models are trained.
“An open and intriguing question is whether the LLM actually uses its internal model of reality to reason about that reality when solving the robot navigation problem,” Rinard says. “While our results are consistent with the LLM using the model in this way, our experiments are not designed to answer this next question.”
“There’s a lot of debate these days about whether LLMs actually do a good job of ‘understanding’ language, or whether their success can be attributed to what are essentially tricks and heuristics that come from absorbing large volumes of text,” says Ellie Pavlick, an assistant professor of computer science and linguistics at Brown University, who was not involved in the study.
“These questions are at the heart of how we build AI and what we think are the inherent possibilities or limitations of our technology. This is an interesting paper that examines this question in a controlled way: the authors exploit the fact that computer code, like natural language, has both syntax and semantics, but unlike natural language, semantics can be directly observed and manipulated for experimental purposes. The experimental design is elegant and their conclusions are optimistic, suggesting that LLMs may be able to learn something deeper about what language ‘means.’”
More information:
Charles Jin et al., Emergent representations of program semantics in language models trained on programs, arXiv (2023). DOI: 10.48550/arxiv.2305.11169
Provided by the Massachusetts Institute of Technology
This article is republished courtesy of MIT News (web.mit.edu/newsoffice/), a popular site covering the latest research, innovation, and teaching at MIT.
Citation: Experiments reveal that LLMs develop their own understanding of reality as their language skills improve (2024, August 14) retrieved August 14, 2024 from
This document is subject to copyright. Apart from any fair dealing for the purpose of private study or research, no part may be reproduced without written permission. The content is provided for informational purposes only.