How Neuroscience and VR Will Open a Better Way to Learn Languages

Two Things Wrong With Rosetta Stone

In another essay, I mentioned Rosetta Stone and its unique approach to language learning. The Rosetta team hired neuroscientists, psychologists, linguists, and educators to find out how best to teach language. They looked at how language is taught in schools, and found it ineffective and laborious. In contrast, when we are babies first learning to speak, we don’t read textbooks, translate articles, study flash cards, repeat phrases over and over again, or take tests. Rather, we learn in an immersive context mostly through trial-and-error and real-world associations. Rosetta Stone saw some wisdom in this “natural” method, and created software that mimicked the way we first learn language as kids.

They decided (with good reason) that text is fairly inefficient and thus eliminated all text in the native language. Instead, the software focuses on associating pictures with the sound and mouthfeel of certain words. This has proven to be an effective technique, as many Rosetta Stone users enthusiastically attest. However, many also cite complaints about the program’s effectiveness. So where did they go wrong?

Well, I believe there are two things wrong with their core assumptions:

The brain’s ability to reorganize and form new connections is called neuroplasticity

First, our adult brains are simply not as plastic (adaptive) as they were when we were kids. Children have incredibly moldable brains without as many “hard-wired” connections. Their brains can process a ton of stimuli and dedicate fertile brain areas to learning new things, whereas adult brains have formed relatively specific circuitry and cannot form brand-new connections as easily. There’s some truth to the adage, “You can’t teach an old dog new tricks.” Yet Rosetta Stone mimics childhood language learning even to the extent that it encourages contextual trial-and-error. For instance, if you get a word or sentence structure wrong, you cannot look up the explanation or grammatical rule; you must simply move forward and rely on context and your own intuition to determine the correct answer later. An approach tuned to a child’s highly plastic brain is a poor fit for adults, who often learn faster when given explicit rules and explanations.

The second deficiency is that Rosetta Stone’s approach does not go far enough. There is one method that consistently and reliably teaches language — and that is immersion. Rosetta Stone simply is not immersive enough — but that’s not the company’s fault. Cultural immersion usually requires being physically present in a new location. It’s hard to be immersed in a new language when you’re sitting stationary at a desk, looking at a 2D screen, moving nothing but your mouth and, a little, your eyes and fingers. The software tries to immerse learners in a digital environment, but the delivery interface is simply too limited to convince our brains. After all, immersion means being fully engaged with the language — and that means our bodies as well as our minds.

Neuroscience research reveals a better way

The way we process words goes beyond the definition; we understand words based on our experiences with the meaning of those words. It is not enough to sit still and think; our brains are wired to understand concepts more interactively. Simply put, experiencing words boosts comprehension.

Brain researcher Friedemann Pulvermüller and his team have spent years studying the relationship between language and action centers in the brain. They imaged people’s brains in response to action words and found something very intriguing:

“Leg areas of the motor system turned on when people heard words such as ‘kick’; arm and hand areas were activated by words like ‘pick’. Action words related to the face, like ‘lick’, activated brain areas involved in controlling tongue movements.” -Dr. Sian Beilock

When people heard action words, their language pathways started processing the words — but so did their motor cortex. Pulvermüller’s research shows that the brain areas used to move are also used to understand action language.

“Brain areas involved in language processing. Action words related to the face, arm and leg, such as ‘lick’, ‘pick’ and ‘kick’, activate the motor system in a somatotopic manner, together with specific foci in temporal cortex” (Berthier & Pulvermüller, 2011).

This is even more meaningful when we take into account Hebb’s rule: “neurons that fire together, wire together.” In other words, neurons that are activated together build stronger connections that help them co-activate in the future. Psychologist Sian Beilock writes,

“When actions and words (or even phrases) are repeatedly paired together, you can’t help but trigger one when the other occurs. And the more actions and words mingle together, the more fluent and deep is our understanding of language.”

Not only does language processing trigger motor cortex activation, but their co-activity actually deepens understanding, because we recall our experiences performing those actions. For instance, when we read or hear the word “sprint”, we understand it not just by its definition but by pulling from our past experiences — recalling the feeling of sprinting. In this way, football players internalize the word “tackle” differently than those of us who have not had that experience, or who have experienced it differently (e.g. soccer players).
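Hebb’s rule can be written as a one-line weight update, Δw = η · (presynaptic activity) · (postsynaptic activity): a connection strengthens only when the two neurons it joins are active at the same time. A toy sketch (the learning rate and activation values are purely illustrative, not taken from any model in the research above):

```python
# Toy illustration of Hebb's rule: when two units repeatedly fire
# together, the connection weight between them grows; when they do
# not co-fire, the weight stays flat. Numbers here are hypothetical.

def hebbian_update(weight, pre, post, eta=0.1):
    """Strengthen the connection in proportion to co-activation."""
    return weight + eta * pre * post

w_paired = 0.0    # hearing "kick" WHILE the leg moves (both units active)
w_unpaired = 0.0  # hearing "kick" while the leg is idle (one unit silent)

for _ in range(20):
    w_paired = hebbian_update(w_paired, pre=1.0, post=1.0)
    w_unpaired = hebbian_update(w_unpaired, pre=1.0, post=0.0)

print(w_paired)    # grows with every co-activation
print(w_unpaired)  # stays at zero: no co-firing, no wiring
```

The paired weight climbs steadily while the unpaired one never moves, which is the mechanical core of “neurons that fire together, wire together.”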

This implies that there is some sort of two-way street, a mutualistic relationship between action and language. In that case, damaging one system would also impair the other. Case in point: it has repeatedly been found that disruptions in the motor system, such as with Lou Gehrig’s disease, impair one’s understanding of language, particularly verbs.

Well, if this is true, then the converse might also hold: performing movements and actions should improve our grasp of language. This is a very familiar idea, and one that we inherently know to be true — the idea of “learning by doing.”

This means we can use action-based lessons to create really effective language learning!

But how do we know this would actually work? Because it is exactly what has been observed. As early as 30 years ago, researchers saw that pantomiming supported recall and recognition of action words later on (Engelkamp and Zimmer, 1985). Studies on children have also shown that acting out the stories they read boosts memory and comprehension. Now we are seeing real evidence from imaging inside the brain that backs up these behavioral phenomena.

Pulvermüller really put this idea to the test in his follow-up work on stroke patients, called action therapy. He hypothesized that stimulating motor areas could improve the language damage, or aphasia, that occurs in roughly one-third of stroke victims. Treatment for these language issues is limited and does not always work, resulting in chronic problems that persist for years. Fortunately, Pulvermüller found a way to leverage the brain’s support networks to alleviate this issue: while honing their language skills, his patients act out the words. This approach has shown remarkable success, even helping patients who have had chronic aphasia for years. Pulvermüller’s therapy ties the brain’s action and language centers together, and underscores their combined importance for understanding meaning.

There is hard evidence for the effectiveness of activity-based language learning, and people are starting to pay attention. Recent studies show that actions enhance foreign language learning as well, even for abstract words (Macedonia & von Kriegstein published a comprehensive review of this work in 2012, simply titled “Gestures enhance foreign language learning”). Of course, why and how the brain does this is still a debated topic with much research ahead.

So why are we not doing this yet? What are we waiting for?

Enter virtual reality.

Now that VR tech has improved, it is clear there is no better medium (except real immersion) through which to experience interactive language education — it’s a natural fit. Language learning is most effective as an immersive experience, which is exactly what VR developers are obsessed with creating (you constantly hear about “immersive VR” and which headset is more immersive than another). Rosetta Stone took a meaningful step by eliminating native-language text and encouraging people to associate words with pictures and sounds. A VR language app can go further. By creating a language learning program that takes advantage of the immersive capabilities of VR, we can accelerate the pace at which we learn languages and, fundamentally, communicate (because isn’t that what learning languages is all about?).

In VR language learning environments, people will be able to practice movements realistically and associate them with certain words. The benefits of VR are not limited to kinesthetic learning, however. VR enables users to see 3-dimensional objects spatially organized around them, moving in a coherent scene. It allows people to hear 3D audio and feel objects in the environment. The full experience brings spatial understanding, detailed scenes, movements, associations, triggers, memories, and retention. Language learning in this context will let us experience concepts and associate them with words, building more effective fluency and understanding.

Imagine learning how to order food by seeing yourself in a restaurant in Marseille, with a menu in front of you, waiters milling about, and people chattering around you. What if you learned how to talk about sports by feeling the relative size and weight of a basketball, football, hockey puck, bow and arrow, and associating those feelings with the sound and mouthfeel of the corresponding words?

Like Pulvermüller’s action therapy, virtual reality can help us interact with and experience language in a much fuller way than can the typical classroom. It can leverage the full range of our sensory spectrum to teach many new things, not just languages: math, physics, geography, history (more on these later). Arguably, experiential learning through virtual reality is especially meaningful for language, because language is the primary filter through which we perceive and communicate all concepts.

The Outlook

Some researchers are already beginning to do studies in VR to see how perception, memory, and recall of foreign words are affected (and they are finding good results). The answer to IF virtual language learning can help us is an obvious yes. Now we must research HOW we process language with VR so we can optimize VR tools and make them even more effective. By studying the neuroscience of experiential learning in the specific context of language and VR, we can vastly accelerate the rate at which we learn languages.

House of Languages demo

There are even a couple of early product attempts aimed at making language learning in VR a reality: Lingoland and House of Languages. It’s early days yet, but further iterations will only get better, and we will learn how to take full advantage of VR environments and how to optimize them in the context of language. Of course, at the end of the day, language learning takes a lot of work; without a real commitment to learning, even the best tool cannot help.

Experiential learning is incredibly valuable, and has applications in so many different areas. In math, studies have shown that kids grasp arithmetic concepts better when they see numbers as representations of real objects, such as toys or figurines. In physics, fMRI studies have shown that when students physically experience the concepts of inertia and angular momentum, they later activate their motor cortex just by thinking of those concepts.

This all reinforces the broader idea that engaging more of our sensory spectrum, by co-activating the respective brain areas, results in deeper learning and memory. The same neural structures used to process sensory information are also active when processing words and concepts that embody that sensory information. For instance, when we read the word “red,” our brain processes the word as well as the actual color (Martin 1995). Learning recruits multiple brain areas, and leveraging this phenomenon is crucial to designing better interfaces.

Designing tools that engage more of our sensory spectrum will expedite and deepen learning and understanding. And if we can improve the rate and depth at which we learn, we can improve how we grow, create, and evolve.


For an in-depth resource on experiential learning, check out How the Body Knows Its Mind, by Sian Beilock, as well as the referenced papers linked above.