Bundle and save: An overlooked factor in the relationship between vocabulary size and listening comprehension

Native speakers of English are able to maximize listening and speaking fluency through heavy use of multiword units and lexical bundles. Through listening instruction that emphasizes these word groups, non-native speakers of English can benefit in the same way, even with a limited vocabulary.
Keywords: adult, college, secondary, listening, vocabulary

Does more vocabulary always equal greater fluency?

Language acquisition presents a host of challenges for the non-native speaker. One of the key challenges is the acquisition of vocabulary. Research has consistently shown that a learner’s vocabulary size is a primary factor in language proficiency. This has been demonstrated so regularly, in fact, that it has become axiomatic that vocabulary size strongly correlates with language proficiency. However, when we examine this claim more closely, we find that it does not hold equally across the four primary skills of reading, writing, listening, and speaking. For example, Miralpeix and Muñoz (2018) note, “[Vocabulary] size is more closely connected to writing and reading than to speaking and listening” (p. 16). Li (2019) and Stæhr (2008) come to the same conclusion. In a study involving English language learners in Denmark, Stæhr (2008) found a correlation of 0.83 between vocabulary size and reading scores, but a correlation of only 0.69 between vocabulary size and listening scores. Why is it that a factor so strongly associated with proficiency in reading shows a weaker association with proficiency in listening? Miralpeix and Muñoz (2018) and Stæhr (2008) suggest that it might be due to the relative scarcity of research examining the correlation between vocabulary size and listening comprehension, compared with the abundance of research examining the correlation between vocabulary size and reading comprehension. Another possibility suggested by these authors is that the tests used by researchers tend to be biased toward the orthographic form of vocabulary rather than the aural form. As Miralpeix and Muñoz (2018) explain, “Interestingly, all these studies on listening comprehension abilities and vocabulary size use tests with written input: that is, students see words in the written form when tested on vocabulary. This may be due to the lack of tools to assess aural lexical recognition” (p. 5). However, the difference is more likely to derive from the very different processes involved in listening and reading.

How the brain processes listening vs. reading

According to Stæhr (2008), “the quality of a learner’s listening comprehension is strongly dependent on his ability to cope with the heavy on-line processing demands of understanding spoken input. The learner has to process the incoming stream of speech quickly and automatically” (p. 148). The cognitive demands of real-time processing necessitate that a learner utilize different strategies when listening than they would when reading (Goh, 2000; Li, 2019; Miralpeix & Muñoz, 2018; Vandergrift, 1999). Goh (2000) breaks down the listening task into three stages: perception, parsing, and utilization. As Goh describes,

Perceptual processing is the encoding of the acoustic or written message. In listening, this involves segmenting phonemes from the continuous speech stream… During parsing, words are transformed into a mental representation of the combined meaning of these words… [during utilization] the listener may draw different types of inferences to complete the interpretation and make it more personally meaningful, or use the mental representation to respond to the speaker. (p. 57) 

All of this has to happen during the moment of the listening event. Moreover, in order for the perception and parsing to be successful, “the listener must discriminate between sounds, understand vocabulary and grammatical structures, interpret stress and intonation, retain what was gathered in all of the above, and interpret it within the immediate as well as the larger sociocultural context of the utterance” (Vandergrift, 1999, p. 168). In other words, there is a lot going on, it happens very quickly, and the listener usually only gets one chance to make it happen correctly.

This contrasts quite sharply with the cognitive tasks required to read fluently. When an individual is reading, the vocabulary is represented visually, with the orthography providing clear word and sentence boundaries. Moreover, the reader can move back and forth in the text, while speeding up or slowing down the reading rate in order to facilitate comprehension (Grabe, 2008, 2010). Additionally, as Akinnaso (1982) describes, written language has the qualities of “permanency, surveyability, and (re)organization” (p. 114). In other words, in contrast to spoken language, written language has a “permanent” quality to it, in that it remains fixed on the page. This permanence makes it possible for the writer to revisit the same language product repeatedly, which also makes it possible to organize and reorganize written language in a way that is not possible with spoken language. The end result is often a text that is more carefully worded and organized than spoken language.

Beyond the physical and cognitive differences between processing listening and reading, the two skills differ in the way vocabulary and grammar are used (Drieman, 1962). For example, written language lacks the intonation, gestures, interlocutor feedback, and immediate contextual information of spoken discourse, so different vocabulary and grammar forms must be used to compensate for the missing information. According to Akinnaso (1982), “Attempts to convey prosodic and contextual information in writing often lead to lexical elaboration and syntactic complexity” (p. 112). The effect is not that written and spoken language use completely different lexis and grammar; rather, as Smith (1978, as cited in Akinnaso, 1982) observes, “they share a common vocabulary and the same grammatical forms – but they are likely to contain different distributions of each” (p. 119). One way that this difference manifests itself in the skill of listening, in terms of vocabulary, is in the use of multiword units and lexical bundles.

How the brain learns to cope with listening demands

Given the intense cognitive load that is demanded of a listener in either an L1 or an L2, the brain needs to find ways to compensate by creating mental “shortcuts.” Siyanova-Chanturia and Martinez (2014) acknowledge this when they write, “Much of the language we experience on a daily basis is largely ‘formulaic’, or ‘prefabricated’, rather than completely novel and newly assembled on each utterance, word-by-word” (p. 1). This “formulaic” or “prefabricated” language has been described by many different terms in the literature. Wray (2000, as cited in Nation, 2013) provides approximately 50 terms (including collocations, idioms, phrasal expressions, lexical bundles, and multiword units; p. 479). For the purposes of this article, we will divide them into two broad groups: multiword units (including collocations, idioms, and phrasal expressions) and lexical bundles. The reason for the distinction is that multiword units appear to encompass units of meaning, whereas lexical bundles appear to encompass functional units in discourse. As Biber and Barbieri (2007) describe, “‘Lexical bundles’, defined simply as the most frequently recurring sequences of words… are usually not structurally complete and not idiomatic in meaning, but they serve important discourse functions in both spoken and written texts” (p. 264). Yet, cognitively, both types of unit rely on Sinclair’s “Idiom Principle,” which holds “that a language user has available to him or her a large number of semi-preconstructed phrases that constitute single choices, even though they might appear to be analyzable into segments” (1991, as cited in Siyanova-Chanturia & Martinez, 2014, p. 2).

Multiword units and lexical bundles as vocabulary

Given that multiword units and lexical bundles appear to function essentially as lexis in a language user’s mind, they should represent a significant factor in the vocabulary size discussion, especially since they are more frequent in spoken language (Biber, 2009; Biber & Barbieri, 2007; Nation, 2013; Shin, 2006, 2007). For example, Shin (2007), in an analysis of collocations in a 10-million-running-word spoken corpus and a 10-million-running-word written corpus, found that even though the number of collocations was almost the same (a little over 2,000 for each), “the top 50 spoken collocations occurred almost three times as often as the top 50 written collocations” (p. 208). Similarly, when examining lexical bundles, Biber et al. (2004) found roughly the same ratio when comparing academic speech and text.
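As a rough illustration of what it means to count recurring word sequences, the following Python sketch tallies three-word sequences in a short invented transcript. It is not the procedure used by Shin (2007) or Biber et al. (2004): the sample text, the function names, and the cutoff of two occurrences are all assumptions made for this illustration, and corpus studies work with millions of running words and much stricter frequency criteria.

```python
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into words, keeping apostrophes."""
    return re.findall(r"[a-z']+", text.lower())


def count_sequences(tokens, n):
    """Count every run of n consecutive words in the token list."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))


def recurring_sequences(text, n=3, min_count=2):
    """Return the n-word sequences that occur at least min_count times.

    Simplification: punctuation is discarded, so a sequence may span a
    sentence boundary. A real analysis would likely handle this differently.
    """
    counts = count_sequences(tokenize(text), n)
    return {" ".join(seq): c for seq, c in counts.items() if c >= min_count}


# Invented sample transcript (an assumption, not data from any cited study).
SAMPLE = (
    "I don't know if you want to go. "
    "Do you want to stay? I don't know what to say. "
    "I mean, you want to be careful."
)

if __name__ == "__main__":
    for seq, count in sorted(recurring_sequences(SAMPLE).items()):
        print(f"{count}  {seq}")
```

On this sample, the only three-word sequences that recur are “i don't know” (twice) and “you want to” (three times), both built entirely from very common words.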

Perhaps this explains why traditional counting of vocabulary size does not correlate as closely with listening comprehension as it does with reading comprehension. In fact, Miralpeix and Muñoz (2018) note that, “although vocabulary size has a considerable influence on proficiency and performance in the four skills in upper-intermediate/advanced learners, the extent of this influence is reduced or restricted when vocabulary size increases” (p. 17). Typically, this restriction is attributed to the decreasing occurrence of mid- and low-frequency vocabulary in spoken and written texts. However, what if something else is going on in the spoken texts? For instance, Shin and Nation (2008) found that many of the most frequent collocations in English comprise words from the most frequent 2,000 words in English. Accordingly, if the collocations themselves were counted as individual vocabulary units, they would be highly represented among the most frequent 2,000 words. Therefore, it appears that, with regard to spoken language, speakers and listeners are reducing memory demands and cognitive load, and effectively increasing their vocabulary size, by bundling high-frequency words.

Implications for pedagogy

Seeing that multiword units and lexical bundles represent such a significant lexical factor in spoken language, it is logical that they should also play a larger role in both the assessment of a learner’s vocabulary size and language pedagogy. Regarding first language (L1) production, Pawley and Syder (1983) assert, “A minority of spoken clauses are entirely novel creations, in the sense that the combination of lexical items used is new to the speaker” (p. 205), and “memorized clauses and clause-sequences form a high proportion of the fluent stretches of speech heard in everyday conversation” (p. 208). If spoken language is being produced in this way, then it stands to reason that teaching multiword units and lexical bundles to language learners will provide them with the same benefits that an L1 listener enjoys. For example, knowing and understanding lexical bundles and multiword units will “enable learners to reduce cognitive effort, to save processing time, and to have language available for immediate use” (Shin & Nation, 2008, p. 340). Moreover, since lexical bundles help to organize discourse, signal stance, and reference other information, increased awareness of and competence with them will enable the listener to utilize listening strategies more effectively (Biber & Barbieri, 2007). However, research has shown that learners tend either to miss them or to think, incorrectly, that they understand them (Nation, 2013), possibly because many of these multiword units and lexical bundles comprise high-frequency words. Nevertheless, research has also shown that, with instruction, it is possible for learners to notice, learn, and remember them in ways similar to native speakers (Hernández et al., 2016; Siyanova-Chanturia & Martinez, 2014).

In conclusion, while research on vocabulary size and language proficiency has generally shown a positive correlation for reading, it has been less clear on listening. Many possibilities have been suggested for the weaker connection between vocabulary size and listening proficiency. However, one possibility that does not appear to be discussed much in relation to vocabulary size is knowledge of multiword units and lexical bundles. Since multiword units and lexical bundles function in both the lexical and functional realms of discourse, they are very likely an important factor in the difference between reading and listening skills with regard to vocabulary size. Therefore, they offer a promising avenue for advancement in both future research and language pedagogy.

Additional resources

This article by Shin and Nation includes in the appendix a helpful list of the 100 most frequent spoken collocations in English:

Shin, D., & Nation, P. (2008). Beyond single words: The most frequent collocations in spoken English. ELT Journal, 62(4), 339–348. http://dx.doi.org/10.1093/elt/ccm091

This article by Liu includes in Appendix B a helpful list of the most frequent spoken idioms in American English:

Liu, D. (2003). The most frequently used spoken American English idioms: A corpus analysis and its implications. TESOL Quarterly, 37(4), 671–700. https://doi.org/10.2307/3588217

This website, using the research of Simpson-Vlach & Ellis, includes the 200 most frequent academic formulas used in spoken English: https://www.eapfoundation.com/vocab/academic/afl/

This website is an excellent resource for quickly finding videos that contain the target words or phrases typed in the search box. So, for example, a teacher could search for any of the collocations, idioms, or academic formulas listed in the resources above and immediately find videos that contain the desired language (video results are automatically queued to the relevant portion of the video and videos include highlighted transcripts): https://youglish.com/


References

Akinnaso, F. N. (1982). On the differences between spoken and written language. Language and Speech, 25(2), 97–125. https://doi.org/10.1177/002383098202500201

Biber, D. (2009). A corpus-driven approach to formulaic language in English. International Journal of Corpus Linguistics, 14(3), 275–311. https://doi.org/10.1075/ijcl.14.3.08bib

Biber, D., & Barbieri, F. (2007). Lexical bundles in university spoken and written registers. English for Specific Purposes, 26(3), 263–286. https://doi.org/10.1016/j.esp.2006.08.003

Biber, D., Conrad, S., & Cortes, V. (2004). If you look at…: Lexical bundles in university teaching and textbooks. Applied Linguistics, 25(3), 371–405. https://doi.org/10.1093/applin/25.3.371

Drieman, G. H. J. (1962). Differences between written and spoken language: An exploratory study. Acta Psychologica, 10, 78–100. https://doi.org/10.1016/0001-6918(62)90006-9

Goh, C. C. M. (2000). A cognitive perspective on language learners’ listening comprehension problems. System, 28(1), 55–75. https://doi.org/10.1016/S0346-251X(99)00060-3

Grabe, W. (2008). Reading in a second language: Moving from theory to practice. Cambridge University Press.

Grabe, W. (2010). Fluency in reading—Thirty-five years later. Reading in a Foreign Language, 22(1), 71–83.

Hernández, M., Costa, A., & Arnon, I. (2016). More than words: Multiword frequency effects in non-native speakers. Language, Cognition and Neuroscience, 31(6), 785–800. http://dx.doi.org/10.1080/23273798.2016.1152389

Li, C. H. (2019). Using a listening vocabulary levels test to explore the effect of vocabulary knowledge on GEPT listening comprehension performance. Language Assessment Quarterly, 16(3), 328–344. https://doi.org/10.1080/15434303.2019.1648474

Miralpeix, I., & Muñoz, C. (2018). Receptive vocabulary size and its relationship to EFL language skills. International Review of Applied Linguistics in Language Teaching, 56(1), 1–24. https://doi.org/10.1515/iral-2017-0016

Nation, I. S. P. (2013). Learning vocabulary in another language (2nd ed.). Cambridge University Press. 

Pawley, A., & Syder, F. H. (1983). Two puzzles for linguistic theory: Nativelike selection and nativelike fluency. In J. C. Richards & R. W. Schmidt (Eds.), Language and communication (pp. 191–225). Longman.

Shin, D. (2006). A collocation inventory for beginners [Unpublished doctoral dissertation]. Victoria University of Wellington, Wellington, New Zealand. 

Shin, D. (2007). The high frequency collocations of spoken and written English. English Teaching, 62(1), 199–218. http://dx.doi.org/10.15858/engtea.62.1.200703.199

Shin, D., & Nation, P. (2008). Beyond single words: The most frequent collocations in spoken English. ELT Journal, 62(4), 339–348. http://dx.doi.org/10.1093/elt/ccm091

Siyanova-Chanturia, A., & Martinez, R. (2014). The idiom principle revisited. Applied Linguistics, 36(5), 549–569. https://doi.org/10.1093/applin/amt054

Smith, F. (1978). Understanding reading: A psycholinguistic analysis of reading and learning to read. Lawrence Erlbaum Associates, Inc., Publishers.

Stæhr, L. S. (2008). Vocabulary size and the skills of listening, reading and writing. Language Learning Journal, 36(2), 139–152. https://doi.org/10.1080/09571730802389975

Vandergrift, L. (1999). Facilitating second language listening comprehension: Acquiring successful strategies. ELT Journal, 53(3), 168–176. https://doi.org/10.1093/elt/53.3.168



Justin Petersen
Justin Petersen is an ELL Instructor at Minnesota State Community…