Uncovering the Role of Language in Machine Reasoning
by Gregory Barber (Simons Institute science communicator in residence, Spring 2025)
In the training of advanced AI systems, human language is king. Large language models (LLMs) like GPT-5 and Claude gain their encyclopedic knowledge and conversational tact by learning from an entire internet’s worth of human-generated text. But learning from language alone has shown diminishing returns. While LLMs have proved themselves to be masters of producing fluent language, their capabilities in other cognitive skills, like logic and reasoning, have lagged behind.
At the Simons Institute workshop on LLMs, Cognitive Science, Linguistics, and Neuroscience last year, neuroscientists, linguists, and computer scientists came together to explore why this is the case — and how a different model, the human brain, could point the way forward. That discussion centered on a fundamental question about humans: In our brains, what is the true role of language across the broad and diverse range of cognitive abilities that we display? The hope is that a better understanding of that role could help create machines that think more clearly.
“Right now is the most exciting time to be doing linguistic neuroscience,” said Laura Gwilliams, a neuroscientist at Stanford University who was one of the workshop speakers — for two major reasons.
The first is the ability to use LLMs as tools to model the human brain. For decades, neuroscientists interested in language have looked enviously at colleagues who study other senses, like vision, for which animals can often serve as a relevant model. Human language has had no such analogues — until LLMs. “For the first time in history, there is a system other than the human brain that is able to exhibit language-like behavior,” she said.
The second factor making this a pivotal moment in linguistic neuroscience research is a revolution in the tools available for studying the brain itself. Older tools commonly used in neuroscience, like EEGs and fMRI scans, have recently been joined by new devices like Neuropixels probes, which record the activity of thousands of individual neurons in a single cortical column. That level of precision has allowed neuroscientists to begin matching linguistic theory — how the mind translates patterns of sound into meaning — to how networks of neurons actually behave.
In one of Gwilliams’s experiments, participants undergoing other medical procedures consented to have Neuropixels probes inserted into their brains while they listened to short stories. By tracking how groups of neurons fired over fractions of a second, her team observed networks of cells working together, with different networks appearing to specialize in responding to sound at different levels of the linguistic hierarchy: phonemes, syllables, words, and phrases.
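As a rough illustration of that kind of analysis, the sketch below computes a neuron’s average firing rate aligned to stimulus events at two levels of the hierarchy. Everything in it is hypothetical: the spike train is simulated, and the phoneme and word onsets are placeholders for annotated story stimuli, not Gwilliams’s actual data or code.

```python
import numpy as np

def event_aligned_rate(spike_times, event_onsets, window=(-0.1, 0.4), bin_size=0.01):
    """Average firing rate (Hz) of one neuron around a set of stimulus events."""
    edges = np.arange(window[0], window[1] + bin_size, bin_size)
    counts = np.zeros(len(edges) - 1)
    for onset in event_onsets:
        # Shift spikes so each event onset sits at time zero, then bin them.
        counts += np.histogram(spike_times - onset, bins=edges)[0]
    return counts / (len(event_onsets) * bin_size)  # counts -> rate in Hz

# Simulated 60-second spike train and placeholder stimulus annotations.
rng = np.random.default_rng(0)
spikes = np.sort(rng.uniform(0, 60, size=3000))
phoneme_onsets = np.arange(0.5, 59.5, 0.08)   # phonemes arrive every ~80 ms
word_onsets = np.arange(0.5, 59.5, 0.40)      # words arrive every ~400 ms

# Comparing the two profiles asks whether the neuron's firing is locked to
# the fast phoneme-level rhythm of speech or the slower word-level one.
phoneme_profile = event_aligned_rate(spikes, phoneme_onsets)
word_profile = event_aligned_rate(spikes, word_onsets)
```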
Shailee Jain, a postdoc at the UCSF Weill Institute for Neurosciences and another workshop speaker, is using a similar suite of tools, from LLMs to Neuropixels, in an effort to eventually create what she calls a “silicon brain.” Comparing the behavior of LLMs’ neural networks with brain activity allows Jain to perform “in silico” experiments that would be impossible in people. Some of her work involves probing how the brain responds to specific semantic themes present in linguistic data, identifying areas that light up for certain concepts — for example, phrases tied to ideas of “politeness” or references to parts of the body.
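A standard technique in this line of work is an encoding model: a regression that predicts recorded brain activity from an LLM’s internal representation of the same language a participant heard. The sketch below is a minimal, illustrative version of that idea, not Jain’s actual pipeline; the random arrays stand in for real LLM embeddings and neural recordings.

```python
import numpy as np
from sklearn.linear_model import Ridge
from sklearn.model_selection import train_test_split

# Stand-ins: one LLM embedding and one multichannel brain response per segment.
rng = np.random.default_rng(0)
n_segments, n_features, n_channels = 500, 768, 96
llm_embeddings = rng.standard_normal((n_segments, n_features))
brain_activity = rng.standard_normal((n_segments, n_channels))

X_train, X_test, y_train, y_test = train_test_split(
    llm_embeddings, brain_activity, test_size=0.2, random_state=0
)

# Ridge regression maps the model's representation of language onto neural data.
encoder = Ridge(alpha=10.0).fit(X_train, y_train)

# Channels predicted well on held-out data plausibly carry the same information
# as the LLM, for example areas that track a semantic theme like "politeness."
predictions = encoder.predict(X_test)
per_channel_r = [
    np.corrcoef(predictions[:, c], y_test[:, c])[0, 1] for c in range(n_channels)
]
```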
But these models also raise bigger questions about where language ends and thought begins — and what that means for building more capable AI systems.
The linguist Noam Chomsky is among the many experts who have equated thinking with linguistic capabilities. “If there is a severe deficit of language,” he once argued, “there will be a severe deficit of thought.”
The findings of Evelina Fedorenko, a workshop speaker and neuroscientist at MIT who studies what she calls the brain’s “language network,” challenge that idea. Many years of neuroimaging studies have found that brain regions responsible for linguistic tasks are activated independently of those used in reasoning, math, or spatial navigation. And contrary to Chomsky’s assumption, studies have found that people can perform complex cognitive tasks even with profound language impairments, she notes. “Language and thought appear to be distinct in the mind and brain,” Fedorenko said.
At the workshop, that assertion turned out to be somewhat controversial. After all, language is the medium through which much of our conscious thought occurs, and words are difficult to disentangle from the experience of thinking. Fedorenko challenged skeptics to give her a clear example of a cognitive task where language is definitively required: “Give me something to test,” she said. Language, she argues, is often a tool we use to improve or develop our thinking, but it is not the same as thought itself.
In LLMs, the boundaries between linguistic prowess and other cognitive abilities often appear sharper than they do in humans. Although AI systems have demonstrated the ability to generate fluent text, they often struggle with other tasks, like logical inference or basic math problems.
One reason could be an overreliance on massive amounts of linguistic data to produce that diverse range of capabilities. Anna (Anya) Ivanova, a former member of Fedorenko’s lab who is now at Georgia Tech and who spoke at the workshop, has been exploring how the brain’s separation of language and thought could serve as a model for expanding machine intelligence.
Recent models, such as GPT-5, have greatly improved performance across a variety of cognitive skills. But so far, the new strategies for improving AI “reasoning,” like “chain-of-thought” prompting, in which a model writes out intermediate steps in plain text before giving its answer, still fundamentally rely on language. That may be similar to how the human brain often uses language as a tool to improve thought. But creating AI that thinks will likely require radically different strategies, she argues.
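As a concrete illustration, the difference between direct prompting and chain-of-thought prompting can be as small as one added sentence; the “Let’s think step by step” cue below is a widely used zero-shot variant, and the question is just an example. Whatever “reasoning” the cue elicits is still carried entirely in generated text.

```python
question = (
    "A bat and a ball cost $1.10 together. The bat costs $1.00 more "
    "than the ball. How much does the ball cost?"
)

# Direct prompting: the model is expected to answer in one step.
direct_prompt = f"Q: {question}\nA:"

# Zero-shot chain-of-thought: the same question, plus a cue that leads the
# model to spell out its intermediate steps in language before answering.
cot_prompt = f"Q: {question}\nA: Let's think step by step."

# Either string would be sent to an LLM; the added "reasoning" lives
# entirely in the text the model generates, not in a separate faculty.
print(direct_prompt)
print(cot_prompt)
```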
That open domain of research is part of what makes this moment exciting for neuroscientists who are just beginning to explore their new suite of tools to understand brains both human and silicon. “I really don’t understand how some people are not excited about this,” Fedorenko said, though ultimately conceding, “If we all agreed on everything, it would be no fun.”