Talk vs. Speak: A Corpus Investigation

By Sarah Abdelbary




Linguists often identify synonyms through tests of substitution and interchangeability, while others rely on differences in meaning between the two words. However, such methods alone are not enough to provide valid evidence of the kind a corpus investigation offers. Therefore, the purpose of this corpus investigation is to highlight the differences between, and the uses shared by, the two apparent synonyms talk and speak. For instance, unlike speak and speaks, talk and talks can be nouns. Figures 1 and 2 illustrate examples of talk and talks as nouns.


Figure 1: talk in the BNC













Figure 2: talks in the BNC











This investigation relies entirely on the British National Corpus (BNC) to produce frequency lists and generate concordances for the two words, talk and speak. All verb forms of talk and speak were searched in the BNC. In addition, Microsoft Excel spreadsheets were used to organize the data into charts and tables for ease of explanation.


Results and Discussion

The first corpus technique this investigation considers is frequency list analysis. A search for talk in the BNC search engine returned 28,862 tokens, compared with 24,520 tokens for speak, so talk is roughly 18 percent more frequent. This result suggests the hypothesis that talk is used more frequently than speak. At this stage of the investigation, however, it is too early to determine the validity of this hypothesis; it needs to be explored further with more data.


First, the target item talk* was searched, which returns all words beginning with these letters. The results appear in Figure 3 below and include talk, talking, talks, and talked.

Figure 3: talk* in the BNC
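
A wildcard query such as talk* can be approximated outside the BNC interface with a simple prefix match over a tokenized text. The sketch below is only an illustration under stated assumptions: the sample sentences, the basic tokenizer, and the function name wildcard_frequencies are hypothetical, and the code does not reproduce the BNC search engine.

```python
import re
from collections import Counter


def wildcard_frequencies(text, prefix):
    """Count every word form in `text` that starts with `prefix` (like prefix*)."""
    # Very simple tokenizer: lowercase runs of letters and apostrophes.
    tokens = re.findall(r"[a-z']+", text.lower())
    return Counter(tok for tok in tokens if tok.startswith(prefix))


# Hypothetical sample text, not BNC data.
sample = (
    "They talked for hours. She talks a lot, but he was not talking. "
    "Let us talk; the talks resume."
)

freqs = wildcard_frequencies(sample, "talk")
for form, count in freqs.most_common():
    print(form, count)
```

Because the match is on the opening letters only, it returns every form that shares the prefix, nouns included, so the resulting list still has to be inspected by hand, as in the figures here.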










Now, we move to the second word, speak. A search for speak* in the BNC returned a long list similar to the one for talk*, shown in Figure 4, with a variety of forms including speaker, speaking, and speakers.


Figure 4: speak* in the BNC










Among the talk* forms, the most frequent is talk itself. Similarly, according to the lists obtained from the BNC, speak is the most frequent of the speak* forms. Figure 5 below shows in detail the frequency of each form in the British National Corpus.


Figure 5: talk* and speak* freq in the BNC








It was also noted that both talk and speak are particularly common in the fiction and spoken language domains. Talk appears 333.45 times per million in fiction and 315.25 times per million in spoken language. Speak, on the other hand, appears 202.52 times per million in fiction and 132.68 times per million in spoken language. These results suggest a second hypothesis: that the two words are almost identical and can substitute for each other without a change in meaning. In other words, talk and speak form an almost identical synonymous pair. This hypothesis can be explored further by looking at the concordances of both words.
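
The per-million figures above are normalized frequencies: a raw count divided by the size of the sub-corpus and multiplied by one million. A minimal sketch follows. The function name per_million and the raw count and corpus size in the first example are hypothetical values chosen for illustration, not BNC figures; only the four rates in the dictionary are taken from the text.

```python
def per_million(raw_count, corpus_size):
    """Normalized frequency: occurrences per one million tokens."""
    if corpus_size <= 0:
        raise ValueError("corpus_size must be positive")
    return raw_count / corpus_size * 1_000_000


# Hypothetical example: 500 hits in a 2,000,000-token sub-corpus.
example_rate = per_million(500, 2_000_000)
print(f"example rate: {example_rate:.2f} per million")

# Rates reported in the text (occurrences per million).
rates = {
    "talk": {"fiction": 333.45, "spoken": 315.25},
    "speak": {"fiction": 202.52, "spoken": 132.68},
}

# Ratio of talk to speak within each domain.
ratios = {
    domain: rates["talk"][domain] / rates["speak"][domain]
    for domain in ("fiction", "spoken")
}
for domain, ratio in ratios.items():
    print(f"{domain}: talk is {ratio:.2f} times as frequent as speak")
```

Normalizing in this way is what allows domains of different sizes to be compared directly, and the ratios give a quick view of how far apart the two verbs are in each domain.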


The second corpus technique this investigation considers is the generation of concordances. Concordancing is a valuable technique because it provides examples of items in their original contexts, making it easier to generate hypotheses and test them. Figure 6 below shows in detail what precedes and follows talk and speak.


Figure 6: talk & speak in the BNC
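
A concordance of this kind is often called a key word in context (KWIC) display. The sketch below shows one simple way such lines could be produced, assuming a plain-text sample and a fixed window of three words on each side; the function name kwic, the window size, and the sample sentences are hypothetical and are not drawn from the BNC.

```python
import re


def kwic(text, keyword, width=3):
    """Return (left context, node, right context) for each hit of `keyword`."""
    tokens = re.findall(r"[A-Za-z']+", text)
    lines = []
    for i, tok in enumerate(tokens):
        if tok.lower() == keyword:
            left = " ".join(tokens[max(0, i - width):i])
            right = " ".join(tokens[i + 1:i + 1 + width])
            lines.append((left, tok, right))
    return lines


# Hypothetical sample text, not BNC data.
sample = (
    "I want to talk to you about the plan. "
    "We need to speak to the manager. Do not talk of it."
)

hits = kwic(sample, "talk")
for left, node, right in hits:
    # Right-align the left context so the node word lines up in a column.
    print(f"{left:>20}  [{node}]  {right}")
```

Aligning the node word in a central column is what makes recurring neighbours, such as to and about, easy to spot by eye.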









According to the BNC, talk is preceded 35 times by to, which here introduces the infinitive, as Figure 7 below indicates.


Figure 7: talk in the BNC







Moreover, Figure 7 above also shows that talk tends to be followed by to, which appeared 29 times in this position as a preposition indicating the person taking part in the discourse. Other prepositions also often collocate with talk, including about (26 occurrences), used to introduce the topic, and of (5 occurrences), all shown in Figure 8.


Figure 8: talk in the BNC
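
Counts of what precedes and follows a word can be obtained by tallying its immediate left and right neighbours. The following is a minimal sketch under stated assumptions: the function name neighbour_counts and the sample sentences are hypothetical, the counts it prints are for the sample only, and the tokenizer ignores sentence boundaries, which a fuller analysis would need to respect.

```python
import re
from collections import Counter


def neighbour_counts(text, keyword):
    """Count the words immediately before and after each hit of `keyword`."""
    tokens = re.findall(r"[a-z']+", text.lower())
    left, right = Counter(), Counter()
    for i, tok in enumerate(tokens):
        if tok == keyword:
            if i > 0:
                left[tokens[i - 1]] += 1
            if i + 1 < len(tokens):
                right[tokens[i + 1]] += 1
    return left, right


# Hypothetical sample text, not BNC data.
sample = (
    "I want to talk to you. We talk about work. "
    "They like to talk about films. Do not talk of it."
)

left, right = neighbour_counts(sample, "talk")
print("preceded by:", left.most_common())
print("followed by:", right.most_common())
```

The same function can be run with speak as the keyword, which makes the two sets of neighbours directly comparable.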








Similarly, as shown in Figure 9, speak is frequently preceded by to (41 occurrences), which introduces the verb of speaking.


Figure 9: speak in the BNC








Moreover, with only minor differences from the results for talk, speak is also frequently followed by prepositions such as to (20) and of (6).


Figure 10: speak in the BNC








Taking all the concordance figures above into consideration, it appears that talk and speak have much in common. For instance, both words are frequently preceded by to, which introduces the verb of speaking or talking. Moreover, both are followed by shared prepositions such as to, of, and sometimes about. They also both appear most often in the same genres: fiction and spoken language. However, as Langendoen indicates, “SPEAK is a general term of wide application. It may on occasion differ from TALK in suggesting a weighty formality. TALK in general may suggest less formality and is likely to implicate auditors and interlocutors” (1974, p. 14). In other words, they differ only in their selectional restrictions. Figures 11 and 12 show the domains in which talk and speak are frequently used as verbs.


Figure 11: talk in the BNC











Figure 12: speak as a verb in the BNC










As shown in Figures 11 and 12, talk appears more frequently in the spoken domain because it is less formal and more casual. Speak, on the other hand, is used heavily in fiction because it is more formal and hence is used more often in writing.



This paper reported a corpus-based investigation of the differences in patterns and uses between the two verbs talk and speak, using the corpus linguistic techniques of frequency lists and concordance production. Two hypotheses were presented. The first concerned the frequency of talk in comparison to speak; the figures showed that talk is used more frequently than speak because it is less formal and refers to a casual act of speaking. Speak, on the other hand, tends to be used more in the written domain; as a result, the second hypothesis, that the two words are very similar and almost interchangeable without a change in meaning, is refuted. The investigation thus provides corpus data bearing on both hypotheses, supporting the first and refuting the second.




Langendoen, D. Terence (1974) Speak and talk: A vindication of syntactic deep structure. On Language, Culture and Religion (Festschrift for Eugene A. Nida), ed. by M. Black & William A. Smalley, 237-240. The Hague: Mouton.



Sarah Abdelbary is a senior at the American University of Sharjah majoring in English Language and minoring in Translation. She is originally Palestinian but was born in the United States and has lived her whole life in the United Arab Emirates. She would like to pursue a career rather than a higher degree for the time being. She is looking to expand her experience in translation; however, as she would not like to restrict herself to one field, she is also interested in media or marketing. She considers herself a hardcore adrenaline junkie.