Is it better to use pictures or words when learning languages?

The Rosetta Stone people are making a killing through their concept of “natural” language learning. That is, their angle is that with their product, you supposedly learn a new language in the same way you learned your first, which allegedly makes the process easier.

To accomplish this, they use pictures. So you see and hear a foreign word, and a collection of pictures, and you pick the one you think the word represents.

This makes nice, intuitive sense, although if you were skeptical you might think that this method is simply an easier way to increase the size of your product line, since pictures are universal while using words would basically mean re-writing the whole thing for each country you’re selling in. So it’d be good to see a few tests of this learning method.

You’d certainly expect pictures to be more effective, but results have been mixed. Shana K. Carpenter and Kellie M. Olson devised a few studies to tease out the answers.

A Good Old-Fashioned RCT

Using Swahili as the target language, they first did a standard randomised controlled trial. Half of the participants were shown a Swahili word paired with an English word, while the other half got the Swahili word paired with a picture, probably something like this:

kolb (Photo Credit:

(Just to clarify, “kolb” is only the relevant part. Don’t run up to four legged animals in Kenya shouting “Here Photo Credit! Heeere Photo Credit!!”)

The results did not indicate a difference in the words learned by the participants — pictures were no more effective than words. Why could this be? One reason might be that the picture wasn’t encoded into memory very well. To test this, participants were also asked to free-recall as many pictures (or English words) from the test as they could. People who were presented images rather than words remembered significantly more items. This indicated that the lack of benefit from using pictures was not caused by insufficient encoding of the picture.

So if the pictures themselves are easier to recall than plain words, why weren’t their paired Swahili words easier to remember too? On to the second experiment…

The Multi-Media Heuristic

A heuristic is a basic rule of thumb that the brain uses to save time when processing. Think of it like a stereotype: to conserve the energy that would be spent taking people as they are, it’s easier to assume people possess characteristics associated with the groups they belong to. There’s probably a survival thing going on here, since in life-or-death situations you need to respond quickly, so we have a built-in time-saving “automatic” reasoning system.

The multimedia heuristic is the assumption people have that text combined with images is easier to remember than text alone. Seems like a reasonable rule of thumb, yet evidence doesn’t support it. Maybe when people see the picture with the foreign word, the energy-conserving multimedia heuristic kicks in and the brain allocates fewer resources to processing and encoding that word. Why bother with the effort? It’s got a picture with it!

So the test was repeated, but this time participants were asked, for each item, if they thought they’d remember it in five minutes. This test was repeated three times. In the first test, the pictures group was overconfident, and as before there was no difference in performance between groups. However, in the second test both groups saw a dramatic reduction in confidence (perhaps after seeing the results of the first test), and the pictures group did indeed recall more words than the words-only group! The same was found in the third test.

So it works! Perhaps once their overconfidence was removed, the multimedia heuristic lost its grip and the brain devoted more resources to the learning.

Don’t be overconfident!

In the third experiment, participants were split into two groups, each of which was tested on both picture-Swahili word and English word-Swahili word combinations. However, one group was given a little message telling them not to be overconfident:

People are typically overconfident in how well they know something. For example, people might say that they are 50% confident that they will remember a Swahili word, but later on the test, they only remember 20% of those words. It is very important that you try to NOT be overconfident. When you see a Swahili word, try very hard to learn it as best you can. Even if it feels like the word will be easy to remember, do not assume that it will be. When you see a Swahili word with a picture, try your best to link the Swahili word to that picture. When you see a Swahili word with an English translation, try your best to link the Swahili word to that English translation.

Confidence was tested in the same way as the previous test, and indeed, the group receiving the warning reported lower confidence. Did this affect results?

Of course! People receiving the warning performed better than people who didn’t on the picture task, but not on the words task. Since overconfidence is not an issue when remembering word pairs, this both implicates the multimedia heuristic and suggests a way to improve learning of second language words — don’t be overconfident!
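To put a number on overconfidence, here is a minimal Python sketch using the figures from the warning message above (50% confident, 20% recalled). The function name `overconfidence` and the ten-item word list are my own illustrative assumptions, not anything taken from the paper.

```python
def overconfidence(predicted, recalled):
    """Mean predicted recall minus the proportion actually recalled.

    predicted: per-item confidence judgements between 0 and 1
    recalled:  per-item booleans, True if the word was later remembered
    A positive result means the learner was overconfident.
    """
    if len(predicted) != len(recalled) or not predicted:
        raise ValueError("need one confidence judgement per tested item")
    mean_confidence = sum(predicted) / len(predicted)
    proportion_recalled = sum(recalled) / len(recalled)
    return mean_confidence - proportion_recalled


# Hypothetical learner: 50% confident on each of ten words, remembers two.
confidence = [0.5] * 10
remembered = [True, True] + [False] * 8
gap = overconfidence(confidence, remembered)
print(f"Overconfidence: {gap:.0%}")
```

A gap near zero would mean the learner’s predictions matched their results; the warning message is essentially an attempt to shrink this number before the test rather than after it.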

Maybe you can now remember the Swahili word for dog, presented earlier? It’s “kolb.” If you’re thinking “Photo Credit” I apologise profusely!


Carpenter, S., & Olson, K. (2012). Are pictures good for learning new vocabulary in a foreign language? Only if you think they are not. Journal of Experimental Psychology: Learning, Memory, and Cognition, 38 (1), 92-101 DOI: 10.1037/a0024828

Open listening: a way to improve spoken language comprehension

One huge frustration I have with learning Spanish — and I understand I’m not alone on this — is missing loads of what’s being said while translating one particular word.

While listening to a dialogue, my attention latches on to words I recognise and I try to retrieve the English translation. But before I find the English word, the speaker is three sentences away and talking about something else.

This is probably a consequence of the way we tend to learn second languages — that is, using our first language as a useful intermediate between a new foreign word and a meaning we already know. But it can be a detriment in comprehension, especially in the earlier stages of learning a language, when listening is far more of a conscious process.


Generally I think conscious translating is a mistake. There are times where it’s OK to do this, such as when there’s a gap in the conversation, but I find it’s best to stay focused on what’s being said, not to “zoom in” on any particular word.

I’m hardly an expert and I don’t know what the more linguistically talented might think, but that’s my opinion. Just let go of the words you 50% understand, and keep listening.

The Cohort Model of spoken language comprehension, first proposed by Marslen-Wilson and Welsh (1978), might explain why this works:

“According to this theory, the first few phonemes of a spoken word activate a set or cohort of word candidates that are consistent with that input. These candidates compete with one another for activation. As more acoustic input is analyzed, candidates that are no longer consistent with the input drop out of the set. This process continues until only one word candidate matches the input; the best fitting word may be chosen if no single candidate is a clear winner.” (ref)

Here’s what happens, according to the cohort model. You hear a Spanish word, say “beber” meaning “to drink.” It sounds familiar but you don’t immediately get the meaning. So you try to translate it, probably rolling your eyes upwards as you do so. Behind the scenes, your brain is creating a cohort of possibilities as to what the word was. Maybe it creates a shortlist of Spanish words starting with “b,” plus a few others that rhyme, and looks up their associated meaning.
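The narrowing-down part of the cohort model can be sketched in a few lines of Python. This is only a toy illustration under loud assumptions: letters stand in for phonemes, the six-word `LEXICON` is made up for the example, and candidates are matched purely by their opening sounds (no rhymes, no competition for activation).

```python
# Toy lexicon (hypothetical): a handful of Spanish words a learner might know.
LEXICON = ["beber", "bebida", "besar", "barco", "bueno", "cerveza"]


def cohort(lexicon, heard_so_far):
    """Return the candidates still consistent with the input heard so far."""
    return [word for word in lexicon if word.startswith(heard_so_far)]


def narrow(lexicon, spoken_word):
    """Feed the word in one segment at a time and record the shrinking cohort."""
    steps = []
    for end in range(1, len(spoken_word) + 1):
        heard = spoken_word[:end]
        steps.append((heard, cohort(lexicon, heard)))
    return steps


for heard, candidates in narrow(LEXICON, "beber"):
    print(f"{heard:<6} -> {candidates}")
```

Each extra segment can only remove candidates, never add them, which is the “drop out of the set” behaviour described in the quote.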

Perhaps the reason you stop and put some conscious effort into translating this word is that you intuitively feel this is a serial process, where the brain translates words one by one and either gets the meaning or loses it forever. But it is not. The brain does not stop searching for the meaning of an unknown word while it continues to listen to other words. In fact, it uses the input from later words to help filter down to the correct meaning of previously heard words, presumably while they are held in the phonological loop.

Have you ever thought you understood what someone said, only to realise you misheard it based on something they said later? You could also deduce, then, that the brain doesn’t even have a concept of a correct word, and is always feeding back data based on probabilities; what it thinks is the most probable meaning.

So to continue the example, if you continued to listen to the speaker instead of temporarily disengaging your attention to consciously translate “beber,” you might hear “cerveza,” the Spanish word for beer, and put two and two together. The meaning of the previous word comes to you in a flash.

Open Monitoring/Listening

This method of listening is very similar to a type of meditation called open monitoring. In this you sit and just allow any thought or perception to pass through your consciousness, being fully observant of it but not holding your attention on it.

Likewise, in open listening, as it could be called, you focus on the entirety of what is being said, rather than trying to follow the dialog word by word. By not focusing on a single word, you devote more of your attentional capacity to collecting more input.

You might also reason that the more practice one has with open monitoring meditation, the better they should be at language comprehension.

If you speak a second language let me know if you found the same when you were learning. Also, if you meditate a lot, let me know how you find language learning, or comprehending people in even your native language. Do you seem to find it easier than others to understand people with strange accents? Has this improved after your meditation experiences?

The phonological loop and language comprehension

I’ve been digging into research papers again, looking for ways to enhance second language acquisition. After working through a few papers and an introductory textbook, I was left thinking “When are they going to get to the part about how to learn languages better?” Most of the research seems to be on the processes and issues surrounding second-language acquisition, rather than how to enhance it. So I went right back to the drawing board and started looking at the cognitive processes involved in language comprehension and production, looking for clues. I started with the phonological loop.

The phonological loop is the aspect of working memory that deals with auditory input. Loads of studies have been done on this which I won’t go into here, but one hugely important point is that the loop’s capacity is time-based. For example, memory for two-syllable words is worse if they take longer to say (voodoo, harpoon) than if they are quicker to say (bishop).

Something I’m particularly interested in is improving language comprehension. I often find that I can listen to something in Spanish, not understand it at all, but upon reading a transcript realise that I know all those words. I think the time-based limitation of the phonological loop might be a key to improving this.

If a person is speaking to you in a foreign language, you often find that although you’re focused on what they are saying, nothing makes sense. Then, they pause for a moment, and the meaning of the last few seconds of speech magically comes to you. It’s as though the brain is occupied with attending to the incoming speech, then when there’s a pause it takes the chance to process whatever’s in the phonological loop. Assuming this is true, there are a few things that might help with language comprehension:

1 – Increase the capacity of the phonological loop

The bigger this is, the more you can hear and keep in your working memory until the pause comes. I couldn’t find any research on actually improving the size of the loop’s capacity, but I’ll keep looking. Let me know if you know of any. Something to consider here is the difference between actual gains in the loop’s size versus the use of strategies to more efficiently store information within it (chunking, etc) — and how and if it is possible to distinguish between the two.

This is important because if capacity could seemingly be improved on some task via a strategy, but that strategy doesn’t transfer to languages, it’s a false friend, so to speak.

Presumably, tasks specific to comprehension would improve the capacity of the loop, assuming that it can be improved at all. Maybe things like the n-back task, set to audio only, or simply listening to short speech clips and trying to repeat them back straight afterwards (ideally in the target language), could be good exercises.

2 – Ability to pay attention

I’ve noted previously that attention is key to memory, and it’s pretty clear how attention fits into this model. If you keep focused on what they are saying you’ll get more of what they say into your phonological loop for later processing. Then when the conversational pause comes, your brain has more data to transform into something meaningful.

3 – Speed of processing

Note that this only applies to words you don’t know “automatically.” If they said “Gracias” or “Merci,” or something in your native language, the meaning would come to you. So at the most basic level, there’s an issue of learning the language well and knowing the words well, preferably without having to do “real-time” translations via your native language. The more of the data in the PL that you just “know,” the more time your brain has to work on the bits it doesn’t already know, and also, it can use the words it does know to narrow down the range of possibilities (more on that later).

Phonological loop in language acquisition

Just to turn everything I’ve just said upside-down, another consideration with the phonological loop is that it’s not solely applied to language comprehension. In fact, it may not even be critical for that (except perhaps in the cases I’ve noted above, during the initial stages of learning a language, where listening is an active, and draining, task!). It is strongly implicated in our ability to learn languages. For example, Gathercole and Baddeley (1989) found that children’s performance on a word-repeating task predicted their vocabulary a year down the line. We must be careful here not to confuse correlation with causation, because, for instance, we don’t know if changing our ability on word-repeating tasks would improve our language acquisition abilities. It does seem more reasonable that this would improve our ability to comprehend language, however, for the reasons I described above.

Bilinguals perform better in the false belief task

Anything you do for an extended period of time has neurological and cognitive effects. Speaking another language is one thing that seems to have a wide range of effects, one of which is better performance on tasks involving reasoning about other people’s beliefs, such as the false belief task.

The False Belief Task

The false belief task had usually been applied to samples of children (and you’ll soon understand why), but Rubio-Fernández and Glucksberg (2011) of UCL and Princeton applied it to a sample of adults, after a few modifications.

The task involves a puppet show (told you), where two puppets, Sally and Anne, are playing with a toy. They then put the toy in a box, and Anne leaves the scene. While Anne is away, Sally moves the toy to a different box. When Anne gets back, the participants are asked where she will look for the toy.

Monolingual children start getting this right at about age four on average. It’s part of the idea of a “Theory of Mind,” where you adopt the belief that other people have a mind just like yours, but separate, and that because they have different knowledge from you, they will take different actions.

However, bilingual children are better at this task, correctly guessing that Anne will look in the box where she saw it last, as opposed to where they saw Sally place it, from around age three.

The idea is that because bilingual kids have experience talking to people in one language and receiving blank stares, they learn earlier that other people have a separate mind to their own. Which is also in line with the idea that the theory of mind is just a social construct, something that people “figure out” as opposed to a module that develops.

The adult version (not what you’re thinking…)

The False Belief task. Ecological validity?

Although everybody loves a good puppet show, you might see a difficulty in applying this same task to adult bilinguals – adults are all going to answer the task correctly, regardless of their lingual status. So the researchers added an eye tracking element to the test.

Rather than using the participant’s guess as to the location of the toy as the dependent variable, they used eye movements – did the participants first look at the box where the toy actually was (using their own knowledge) before looking at the original box (reasoning about other people’s beliefs)?

Oh, and the puppet show was replaced with a cartoon on a computer (I know, I was disappointed too).

How did the adults do?

As with the children, the adult bilinguals out-performed their monolingual peers. Comparisons of gaze directions between the two groups just sneaked under the holy .05 significance level: χ²(1, N = 45) = 3.94, p < .048. So did the fixation latency (the time taken to focus the gaze on the correct box), at t(44) = 2.07, p < .045, as you might expect given that more monolinguals looked in the wrong place first.
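If you want to check how close to the line that gaze-direction result is, the p-value for a chi-square statistic with one degree of freedom can be computed with nothing but the Python standard library. This is just a sanity-check sketch: the function name is my own, and it relies on the fact that a chi-square variable with 1 df is a squared standard normal, so the upper tail is erfc(sqrt(x / 2)). It does not cover the t-test, which needs the t distribution.

```python
import math


def chi_square_p_value_1df(statistic):
    """Upper-tail p-value for a chi-square statistic with 1 degree of freedom.

    With 1 df the statistic is a squared standard normal, so
    P(X > x) = erfc(sqrt(x / 2)).
    """
    if statistic < 0:
        raise ValueError("a chi-square statistic cannot be negative")
    return math.erfc(math.sqrt(statistic / 2))


# Statistic reported in the paper for the gaze-direction comparison.
p = chi_square_p_value_1df(3.94)
print(f"p = {p:.4f}")
```

The result lands just below .048, which matches the reported “p < .048” and shows how little room there is between this finding and the .05 cutoff.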

The Simon Task

The researchers also used another test: the Simon Task. In this, a key is assigned for “LEFT” and another for “RIGHT.” The words LEFT and RIGHT flash up on the screen, and you have to press the correct key. Only, sometimes, “RIGHT” appears on the left of the screen, and vice-versa. If you want to give it a go, you can play a Java version here.

Success in this game relies on overriding your natural instinct to press the button on the right, even when the game tries to trick you into doing so. This is called “Executive Control” in cognitive psychology, after the catch-all term “Central Executive” which is used to describe pretty much anything we don’t understand yet. 🙂 See this article on working memory for more information.
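The structure of the task is simple enough to sketch in Python. This is my own minimal illustration of the version described above, not the researchers’ software: the dictionary keys and function names are hypothetical. The point it shows is that the correct response depends only on the word, while the screen position is a distraction.

```python
from itertools import product

SIDES = ("LEFT", "RIGHT")


def make_trials():
    """Build every word/position combination and flag the congruent ones."""
    return [
        {"word": word, "position": position, "congruent": word == position}
        for word, position in product(SIDES, SIDES)
    ]


def is_correct(trial, key_pressed):
    """The right answer follows the word, never the side of the screen."""
    return key_pressed == trial["word"]


trials = make_trials()
for trial in trials:
    kind = "congruent" if trial["congruent"] else "incongruent"
    print(f'{trial["word"]:<5} shown on the {trial["position"]:<5} ({kind})')
```

The incongruent trials are the ones that demand executive control, because the position pulls you towards the wrong key.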

As with the False Belief test, the bilinguals did better here too. Why would this be? It’s thought to be because this sort of executive control is old hat to bilinguals. They have to suppress the other language while speaking and thinking, and this transfers to other tasks involving executive control.

Combining the Two

The paper also reports a correlation between performance on the Simon task and performance on the False Belief task – so presumably, the same cognitive ability is involved in both tasks, and bilingualism is the cause of this improved ability – or something that goes hand-in-hand with bilingualism, at the least.

What’s a bilingual?

Bilinguals performed better, but at what point in second language acquisition does this effect occur? The authors note that “all participants were to some extent familiar with a second language.” So that includes people designated as monolingual. The actual criteria they used for bilingual status were:

  1. Learned the language before age 9
  2. Used it regularly for over 10 years

However, the bulk of the group achieved bilingual status much sooner, with a self-reported mean acquisition age of 3. The extent of foreign language familiarity of the monolinguals was not reported. It could be a few years at school, it could be they bought a Spanish CD and listened to it twice.

Presumably, this effect would occur no matter when in life the second language was acquired. This fits with the executive control explanation. I’m not sure how to explain these results without it, but if they did the same test on people who acquired their second language after, say, age 30, and didn’t get the same results, it might be interesting to try to reconcile the two.


This is one of many studies demonstrating cognitive differences between bilinguals and monolinguals. It has low ecological validity and only marginally significant results (probably due to the fairly low sample size), and it is necessarily a quasi-experiment (non-random group assignment by default). However, the results are in line with a lot of other evidence: although bilinguals perform worse on some tasks, on this one they seem to do better.


Rubio-Fernández P, & Glucksberg S (2011). Reasoning about other people’s beliefs: Bilinguals have an advantage. Journal of experimental psychology. Learning, memory, and cognition PMID: 21875251