Volume 41, Issue 7 pp. 1750-1774
SPECIAL ISSUE ARTICLE

English as a lingua franca: Facts, benefits and costs

Jacques Melitz

CEPII, CREST, CEPR, Paris, France

First published: 06 April 2018
This paper is a revision of Melitz (2016). Reproduction of large parts of the chapter is with the permission of Palgrave Macmillan. The paper reports on much concurrent work in Melitz and Toubal (2014) and Ginsburgh, Melitz, and Toubal (2017), and as a result, parts of Sections  and should be considered co-authored.

Abstract

How much further can we expect the spread of English to go? What are the gains? What are the costs? The paper first tries to identify the areas of life where English already serves as a lingua franca in the world and those where the language faces sharp competition. The discussion goes on to show that the future advance of English will depend heavily on the motives to learn the other major languages in the world as well. The cultural market is the single one where the extraordinary progress of English threatens to go too far.

1 INTRODUCTION

There has never been in the past a language spoken more widely in the world than English is today. How far has English already spread? How much further can we expect it to go? What are the welfare implications? On the first two questions, a popular book by the linguist Crystal (2003) titled English as a global language is extremely useful. So are two works by the sociologist Graddol (1997, 2006) that were commissioned by the British Council regarding the future of the English language. As regards the facts, I will try to move beyond these two authors mainly in connection with publishing, foreign trade and language learning. We know from economic research that common languages promote bilateral trade between countries. We also know from many sources, including survey evidence of exporting firms, that trade stimulates language learning. There is recent econometric support. To what extent does the expected growth of world trade in the future imply the further spread of English? To what extent does it instead imply limits to the expected spread of English because of the similar inducement to learn other languages as well? Do other factors besides trade also play an important role? Beyond these questions, the paper will respond to the evident concern today about the welfare impact of the spread of English on native speakers of the other major world languages. One example, of course, is the well-known French support of “l'exception culturelle” [the cultural exception] in the field of trade. It is indeed surprising to what extent English dominates in certain cultural areas, including the song, the film and the best-seller, not necessarily publishing in general. Is there possibly too much English in some areas for the good of mankind?

The next section will concentrate on the facts about the spread of English. Specifically, it will try to identify the areas of life where English already serves as a lingua franca in the world and those where the language faces sharp competition instead. One of the areas where competing languages set important boundaries to the spread of English is international trade, to which a separate section will be devoted. A further section will deal with the evidence about language learning. The final section will centre on the welfare aspects.

2 THE STATUS OF ENGLISH AS A GLOBAL LANGUAGE

2.1 The world distribution of native and spoken languages

We may begin with approximate numbers for native and total speakers of English and the other principal languages in the world. Table 1 provides such figures for the 12 largest languages in terms of total speakers. Turkish/Azerbaijani/Turkmen, Italian and Dutch/Afrikaans are added to these 12 languages even though they are not exactly the next three largest because they enter in a study of language learning on which I shall report prominently later on. The table shows two sets of estimates, one from Ethnologue languages of the world (Ethnologue) and the other from Melitz and Toubal (2014) (hereafter MT), the source of the data for the relevant study of language learning. The figures from Ethnologue (obtained on the Internet in May 2013) apply to different dates. To give an idea, they go back to 1998 for Arabic and 2001 for English and refer to census data for 2000 for Chinese, 2006 for French and data as recent as 2010 for Spanish. The data from MT (which omit two of the 12 largest world languages, Hindi/Urdu and Bengali), all collected in 2010, are mainly bunched around 2001–2007. For the moment, it is not possible to produce such a table for a uniform year. All such tables rely on the assumption that language is a slow-moving variable, which is of course more reliable for native than for spoken language.

Table 1. Worldwide speakers (millions)

Language                       Ethnologue   Ethnologue   MT         MT
                               native       total        native     total
                               speakers     speakers     speakers   speakers
                               (1)          (2)          (3)        (4)
Chinese                        1,197        >1,197       1,161      1,165
English                        335          >765         357        1,123
Spanish                        406          466          401        479
Hindi/Urdu                     324          >387         -          -
Arabic                         206          246          244        272
Russian                        162          272          184        267
French                         68           118          69         260
Bengali                        193          250          -          -
Portuguese                     202          217          209        222
German                         84           112          89         168
Malay                          23           >163         22         158
Japanese                       122          123          126        126
Turkish/Azerbaijani/Turkmen    83           >83          91         102
Italian                        61           >61          64         77
Dutch/Afrikaans                28           >28          22         37

Notes:

  • a Chinese and Arabic are “macrolanguages” in Ethnologue's terms, ones that group together native speakers of distinct and often mutually unintelligible dialects. These are single languages only on the basis of custom and the tendency of native speakers to identify themselves with the general label (Mandarin serving as the main reference point for Chinese, standard Arabic for Arabic). The 1,197 figure for Chinese combines Mandarin (population 847.8 million), Gan Chinese, Hakka, Huizhou, Jinyu, Min Bei, Min Dong, Min Nan, Min Zhong, Pu-Xian, Wu (Shanghainese, 77.2), Xiang and Yue (Cantonese, 62.2). While Ethnologue cites a figure of 178 million L2 speakers for Mandarin, the vast majority of these are native speakers of a separate Chinese dialect, and I have no way of knowing how many are so rather than native speakers of a foreign language (although I believe the latter number is small). This explains the inscription >1,197 in column (2). As regards Arabic, Ethnologue draws upon Wiesenfeld (1999), a world almanac.
  • b The “>” inscriptions in column (2) call for explanation. As regards the >387 for Hindi/Urdu, Ethnologue cites a precise figure of 120 million for L2 speakers of Hindi, but this figure includes an uncertain number of the 57 million Urdu speakers. That is why I have added a total of 63 million (120 minus 57) to 324 million in this column, and the 387 figure therefore yields a minimum. As concerns the other five “>” signs, Ethnologue explicitly says “over 430 million” for L2 for English, “over 140 million” for Malay (cited under languages of Indonesia rather than Malaysia), and does not provide any L2 figures for Turkish/Azerbaijani/Turkmen, Italian and Dutch/Afrikaans.

There are some large discrepancies between Ethnologue and MT calling for explanation. Regarding spoken English, the principal cause is that Ethnologue draws figures from an incomplete table in Crystal (2003) (Table 1) that concerns only “territories [75 of them] where English has held and continues to hold a special place” (Crystal, 2003; p. 60), by which Crystal evidently means, almost without exception, territories that were under the administrative control of English-speaking countries at some time in living memory or else where the language is official or both. Consequently, those figures do not include spoken English in places such as the Netherlands, Germany and the Scandinavian countries, where the language is widely spoken but has never been either that of the ruling political power in the country or official. The MT data largely replicate the data in Crystal's table for the same 75 territories he mentions (except for the tiny territories that are not in their study), but add data for spoken English in other parts of the world wherever the authors could find it. MT generally draw no distinction between speakers in countries where the language has “a special place” and other countries. One important source of their data is an EC 2005 survey of (self-reported) ability to hold a conversation in foreign languages in the 28 current (2014) members of the EU plus Turkey (Eurobarometer, 2006). Ethnologue generally tends to collect data for spoken language by non-natives (termed L2) from a selection of territories in Crystal's sense (although without being explicit). This explains the wide discrepancies between Ethnologue and MT for spoken French and German as well as for English. In the case of Turkish/Azerbaijani/Turkmen, Italian and Dutch/Afrikaans, Ethnologue offers no L2 data. See the note to Table 1 about Chinese and Arabic.

There are more native speakers of Chinese than any other language by far. However, based on Table 1, English is neck and neck with Chinese as world leader in total speakers. All other languages lag far behind. In fact, it would be easy to propose figures for English speakers far exceeding those for Chinese by extrapolating on the basis of attendance in English classes and/or some ability to understand. On such grounds, one could readily add an extra 300 million speakers of English for India and China alone (see, e.g., Kachru, 2010; and for India, see also Crystal, 2003). The MT data, like their basic sources (prominently including Ethnologue), are more conservative and, at least in the authors’ minds, reflect an ability to converse.

Most important of all, English is way ahead of all other languages as a learned language by non-native speakers, and it is the only one to be well represented in all five continents. In terms of geographical dispersion, only French comes anywhere close to English but is still far behind.

2.2 Areas where English serves as a lingua franca

There are situations where reliance on interpreters and translations may even be dangerous for safety. Control towers must be able to communicate with air pilots instantly. Commanders of modern vessels at sea must be able to communicate with one another rapidly. Accordingly, in recent decades, active steps have been taken towards a single world language in the field of international safety. An English vocabulary named “airspeak” has been progressively adopted by over 180 countries based on the recommendations of the International Civil Aviation Organization. There is also a limited vocabulary, “seaspeak,” based on English that has been adopted by the International Maritime Organization.

In some other cases, multiple languages are not lethal but prohibitively expensive. Meetings of organisations with large international memberships could take place with simultaneous translation or, perhaps in the future, through automatic translation based on voice recognition. For the moment, though, this would be at excessive cost. In many instances, the mere publication of all the information that international organisations generate in the respective native tongues of the members is almost unimaginable. Thus, nearly by necessity, international political organisations tend to choose a limited number of official languages. Under the UN charter, there are five: Chinese, English, French, Russian and Spanish; Arabic was added as a sixth in 1973. The official language of the IMF and the World Bank, both in Washington, is English. French is a second official language of the OECD (located in Paris) besides English, yet it is possible to get along in the organisation without French but not without English. Unsurprisingly, the only official language of the Francophonie is French. Similarly, the official languages of Mercosur are Spanish and Portuguese, not English. Yet these are exceptions. Generally, English tends to be at least one of the official languages of international political organisations. Interestingly, English is even the only official language of a couple of major regional political associations outside of Europe or North America: namely OPEC and the South Asian Association for Regional Cooperation. Only one international political organisation recognises numerous official languages, and it does so at a notoriously high cost: the European Union (Fidrmuc & Ginsburgh, 2007). Even there, the organisation adopts English, French and German as the only “working languages.”

In the case of private rather than public international associations, where the biblical problem of Babel rears up just as well, information is less readily available. Based on some sleuthing work, Crystal (2003) calculates that in 1995–96, 85% of such organisations made some official use of English. French was second with 49%, and only Arabic, Spanish and German scored over 10%. Quite tellingly, he also finds that for European associations as such, English is an “official or working language” in 99% of the cases, and English plus French plus German is “the most popular European combination.”

There are other areas where English serves largely as a lingua franca in the world. The supply of international news and international sports are two (see Graddol, 1997, 2006). People in the business of diffusing international news, or the firms active in the diffusion process itself, must obtain their information quickly. As a result, they have veered heavily towards English in data transmission among themselves. In close connection, there is a heavy concentration of providers of international news in English-speaking countries, including Reuters, the Associated Press, the BBC, NBC and the New York Times. As relevant too, Crystal (2003) traces the early development of the news industry largely to the English-speaking world. International sports also require a lingua franca for the minimal communication necessary when competing athletes and an umpire meet on a playing field.

Perhaps the most intriguing area of English as a lingua franca is science and scholarship since in this case the basic mechanism at work is the self-interest of the individual scientist and scholar. There have been studies of publications in the natural sciences from 1880 to 1996 based on American, French, German and Russian bibliographies. The results, which are drawn from original work by Tsunoda (1983) and Ammon (1998), are summarised in Hamel (2006). They show only 36% of publications in English in 1880, followed by a rise to around 50% in 1940–50 to 75% in 1980 and 91% in 1996. The 91% average for 1996 covers a top of 94%–95% for physics and mathematics and a low of 83% for chemistry. The information for the social sciences and humanities starts in 1974 and goes up to 1995, and the trend proceeds from 67% in 1974 to around 70% in 1980 and on to 83% in 1995. Therefore, by 1995 the social sciences and humanities attained the same level as chemistry in 1996. The decentralised nature of the incentives to publish scientific work in English is clear. Truchot (1996) spells out the tendency for natural scientists in France to switch from publishing in French to English since the mid-1970s in the face of political pressures to stick to the home language. In addition, all the top journals in sociology, political science and economics are in English and the social scientist confronts the same incentives to publish in English as the natural scientist in much of the world. Even if one publishes occasionally in one's home language, those who seek an international reputation mostly try to publish in English, especially their best work.

2.3 Areas of puzzling English supremacy

There are, however, some puzzling areas of English supremacy. The song, the motion picture and the best-seller are three. Native speakers of foreign languages who consume English songs do not even necessarily understand the lyrics. As for the film and the best-seller, the content must be dubbed or translated (subtitled) from the English when it is sold to foreign-language speakers. Thus, it is futile even to try to explain the extraordinary supremacy of English in these three areas on the basis of benefits of a lingua franca.

The facts are striking. The list of best-selling songs since 1942 includes only four physical singles that are not sung in English out of the 126 that sold over five million copies (one Portuguese, one Japanese, one French, one Italian) and only five not sung in English out of the 98 digital singles that did the same (one Portuguese, two Japanese, two Spanish) (personal count based on Web search). As regards Europe, of the current (late 2013) 100 best-sellers, all of the top 20 are sung in English and only six out of the top 40 are sung in different languages. Only for the rest of the top 100 does the count begin to even up between English and all other languages together. This is not true for Latin American countries where Spanish and Portuguese stand up well to English even over the top 20 or so. There are also some countries in the rest of the world, especially in Asia like Japan, where the home language dominates in the top ten.

In the case of the top-grossing films at the box office of all time, not a single non-English-language film shows up in the top 500 (Box Office Mojo). The non-English-language film that grossed the most (Crouching Tiger, Hidden Dragon) earned 62% of the receipts of the 500th on the all-time list. For each of the five most recent years (2008–12), not one non-English-language film shows up in the top 20 biggest box-office hits. In the case of books, the best-sellers of all time include exactly two out of the top 100 that were originally written in a non-English language (Web search). One is the Swedish trilogy by Stieg Larsson, published posthumously in English, The girl with the dragon tattoo, The girl who played with fire and The girl who kicked the hornets' nest. The other is Paulo Coelho's The alchemist, originally written in Portuguese. (Specialists in literature may choke.)

We will come back to the arresting dominance of English in these cultural areas in connection with welfare.

3 AREAS WHERE ENGLISH FACES SHARP LIMITS

It is important, next, to discuss areas where English, although in the lead, faces sharp competition. The relevant areas are wide and cover the daily or weekly press, television, the Web, publishing and trade. When it comes to trade, the spread of English encounters limitations not only for consumption but also for investment goods, and therefore in strict communication between firms. These areas of activity serve to underline the exceptional situation for the song, the film and the best-seller, where the goods must also meet the market test of individual consumer satisfaction and where the pressure to meet this market test is not even necessarily weaker.

3.1 The press

The English press has a wider international presence than others, but the daily presses of the world plainly reflect the native languages well. Japan boasts five or six of the ten newspapers with the top world circulation, all in Japanese; the enormous Chinese press is predominantly in Chinese; the German press is in German; the Italian one in Italian, the Czech one in Czech, etc. The newspaper with the single largest world circulation, published in India, The Times of India, is in English. Yet Hindi is also well represented in the Indian press. Admittedly, there are many English-language newspapers that are published outside the major English-speaking countries, like the China Daily (published in Hong Kong), and the single major world newspaper essentially intended for international distribution is in English: namely the International New York Times (only recently the International Herald Tribune). Still, there can be no claim of English dominance outside of native-English countries as regards the press.

3.2 Television

The story for television is more interesting. There had been a scarcity of broadcasting space for transmitting TV signals over the air in the 1960s, when technological innovations revolutionised the industry. First cable TV, next the launching of satellite TV, then the cabling of satellite TV, and most recently digital compression led to a wealth of television channels. The United States developed TV earlier than the rest of the world, partly because of its huge home market. Therefore, the country was able at first to furnish viewing content to foreign TV channels, hungry for material, at lower cost than those channels were able to produce the material for themselves. There resulted an enormous trade surplus in audiovisual material for the United States in the 1970s and 1980s (The Economist, 1997). A highly influential UNESCO study in 1983, updating a similar study in 1973, showed that the earlier one-way flow of programming material from the United States to the rest of the world had continued (Varis, 1984). The programming material consisted heavily of entertainment. Overall, a third or more of the total programming time in the rest of the world came from the United States in 1983, with important differences by country and region. For example, 75% of the viewing content in Latin America, 44% in Western Europe and 21% in the Arab countries came from the United States. The United States, in turn, hardly imported from anyone. There was talk at this time of US dominance and “hegemony” (Biltereyst, 1991; Tracey, 1985). (This was the time of the world success of the US television series Dallas.) However, studies soon showed that TV audiences everywhere preferred home-made material. As TV matured and foreign TV audiences grew, home-made material progressively replaced imported material, and several observers correctly predicted that the trend would continue in many parts of the world (Berwanger, 1987; Hoskins & McFadyen, 1991).

Graddol (1997) has extended the narrative since the development of satellite TV. Once again, the early development was largely American and fed fears of massive spread of English and US culture at the expense of other major languages and other cultures. However, the 1980s and 1990s saw the arrival of large non-US participants in satellite TV, like Arte and Euronews (which now broadcasts in 15 languages). More tellingly still, the English providers started to broadcast in other languages and to adapt the language to local preferences. Thus, after beginning broadcasting from Hong Kong exclusively in English and Chinese, Star TV began to promote local Asian languages. Likewise, CNN International launched a Spanish programme in Latin America. Writing in 1996, Graddol predicted (p. 60): “National networks in English-speaking countries will continue to establish operations in other parts of the world, but their programming policies will emphasize local languages.” On the other hand, he says (2006, p. 46): “English, however, remains the preferred language for global reach.” The several efforts under way now to broadcast internationally by non-English channels have adopted the model of the German Deutsche Welle, which decided right from the start to broadcast both in German and English.

3.3 The Internet

The story of the Internet largely follows the same script as for TV. The Internet began predominantly in the United States, and there was a strong sense in its early days in the 1980s that it would spur the learning of English. There is probably some truth to this, but what we have seen mostly since is a progressive tendency to make the Internet available in other languages. Until the 1990s, the only script available on the Internet was ASCII, the American Standard Code for Information Interchange, which allowed only 95 printable characters (plus 33 control characters such as tab and line feed). No foreign accents or diacritical marks were possible. However, Unicode followed, remedying the problem, and by September 1999 allowed 38 different scripts and a total of 49,259 characters. In its most recent version (6.2), dating from 2012, Unicode allows 100 scripts, almost all those in current use, and contains 110,182 characters. Crystal (2006) points out nicely that far from contributing to the death of languages, the Internet helps to preserve them by permitting more of them to be stored permanently.

The Internet World Stats website brings together world data about Internet usage and population statistics. Table 2, based on this site, summarises the essential failure of English to crowd out other languages on the Internet since the year 2000. The first column of the table shows the top nine languages on the Internet. The second column gives the estimated number of users for those nine languages in 2011. The results say that the top nine languages account for 80.2% of all users. (I counted a total of 163 languages on the Internet (including Latin and Esperanto) from one of the original sources, Web Technologies Surveys.) The third column gives the growth of the number of users of the Internet 2000–11. There we see the massive catch-up of English by Arabic, Russian and Chinese since 2000. Column (4) reproduces the information in column (2) as ratios of the world total of users (2.1 billion). Column (5) reproduces the same ratios instead on the basis of the last column of Table 1. This is done by dividing the numbers for the nine languages in the column by their sum, 4,082 (in millions), and then multiplying all of them by 80.2%, so that the percentage total is the same as in the preceding column (4). The ratios for the individual languages in columns (4) and (5) are remarkably close. Therefore, the relative order of the nine largest languages on the Internet now resembles their position in terms of relative numbers of speakers, even if Spanish, Arabic, French and Russian trail somewhat and English is disproportionately ahead by 4.7%. Obviously a massive adjustment of relative positions must have taken place in 2000–11 to permit this result (see column (3)), but it has already happened.

Table 2. Top nine languages used on the Web

Top 9 languages       Internet users   2000–11 growth    Internet users   MT speakers
in the Internet       by language      on Internet (%)   (% of total)     (% of total)
                      (millions)
(1)                   (2)              (3)               (4)              (5)
English               565              301               26.8             22.1
Chinese               510              1,479             24.2             22.9
Spanish               165              807               7.8              9.4
Japanese              99               111               4.7              2.5
Portuguese            83               990               3.9              4.4
German                75               174               3.6              3.3
Arabic                65               2,501             3.6              5.3
French                60               398               3                5.1
Russian               60               1,826             3                5.2
Top nine languages    1,682            421               80.2             80.2
Rest of languages     418              588               19.8             19.8
World total           2,100            482               100              100

  • Source: Internet World Stats.
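
The rescaling behind column (5) can be checked in a few lines. The following is only a minimal sketch of the arithmetic described in the text: it takes the MT total-speaker figures from column (4) of Table 1 for the nine languages, divides each by their sum and multiplies by 80.2%. The function and variable names are illustrative assumptions, not part of the original study.

```python
# MT total speakers (millions), copied from column (4) of Table 1.
mt_total_speakers = {
    "English": 1123, "Chinese": 1165, "Spanish": 479,
    "Japanese": 126, "Portuguese": 222, "German": 168,
    "Arabic": 272, "French": 260, "Russian": 267,
}

# Share of all Internet users held by the top nine languages (Table 2, column (4)).
TOP_NINE_SHARE_PCT = 80.2


def rescaled_speaker_shares(speakers, target_total_pct):
    """Scale speaker counts so that the shares add up to target_total_pct."""
    total = sum(speakers.values())
    return {
        language: round(count / total * target_total_pct, 1)
        for language, count in speakers.items()
    }


speaker_sum = sum(mt_total_speakers.values())
column_5 = rescaled_speaker_shares(mt_total_speakers, TOP_NINE_SHARE_PCT)

# Gap between the Internet share of English (26.8%, column (4)) and its speaker share.
english_gap = round(26.8 - column_5["English"], 1)

for language, share in column_5.items():
    print(f"{language:<12}{share:>6.1f}")
print("Sum of speakers (millions):", speaker_sum)
print("English gap (percentage points):", english_gap)
```

Under these inputs the sum of speakers is 4,082 million and the rounded shares match column (5) of Table 2, including the 4.7-point lead of English.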

3.4 Publishing

Walk into any bookstore almost anywhere on earth and you will find before you predominantly titles in the main local language. However, regardless of where you are, there may well be an English section, maybe prominently in sight. (Alert: the subject now is the number of titles, not the number of best-sellers, which depends on sales of individual titles.) Since people like to read in their home language for pleasure, just as they like to read newspapers in their home language and they like to hear their home language on television, the bookstore will be stacked mostly with titles in the home language. Of considerable note, though, the clientele for books is also smaller than the one for newspapers, television and the Internet; it is more educated; and it is more likely to read a foreign language in the original both for profit and pleasure. Since English is the world's most prominent lingua franca, we might therefore expect English to be disproportionately well represented in publishing.

In earlier related research, covering data from 1971 to 1991 (Melitz, 2007), I found support for this view, and in a renewed study of the same question, based on the latest information, the support appears even greater. According to the UNESCO annual Statistical Yearbook, the principal data source, the ratio of published English titles to total world titles around 1971 was 24% (1973 Yearbook). Total world population at the time was 3.8 billion, and 24% of that makes roughly 900 million, which is far above the then-current English-speaking population. Based on Table 1, the world ratio of English speakers attained around 17% in the vicinity of 2005 and must have been lower in 1971. Therefore, English was disproportionately represented.

An interesting development took place in the next two decades. As English penetrated worldwide, the ratio of English titles to the total fell to 21% around 1981 (1984 Yearbook) and 20% around 1991 (1995 Yearbook). This was probably a reflection of an improvement in education and literacy rates in the non-native-English part of the world, together with faster population growth in this part of the world than the native-English one. However, according to the latest results, for circa 2009, the forces enhancing English relative to other languages have taken over in publishing since the eighties. The ratio of publications of English titles to the world total rose to around 30% in 2009, well up from 20% in 1991, and quite high relative to the ratio of around 17% for English speakers relative to total world population circa 2005. Therefore, English does indeed now occupy a highly disproportionate place in world publishing although other languages are still well represented.

The prominence of English for translations is more pronounced than for the rest. Already for 1960 to 1987, the UNESCO yearbooks show that the share of English in translations as a source language was 41% in 1960 and then rose to 56% in 1987 even though the percentage of English in total titles fell from 24 to 20%–21%, as we saw. Correspondingly, the UK and the US publishing industry took little interest in translation. As the world ratio of translations to total titles went down from 9% to 7% during the 1960–87 stretch, the British publishing of translations rose from a bare 2.1% to 3.7% and the US one dropped from 6.6 (an exceptional post-war high) to 3% (see table 2 in Melitz, 2007). In striking contrast, European countries were and remain unusually interested in translations. A study sponsored by the Commission of European Communities (BIPE, 1993) shows a ratio of 17% for translations for the then members (incorporating a 3% figure for the UK) in 1991 (see table 3 in Melitz, 2007). For the literature classification of titles, the ratio of translations to total titles was almost twice as high, 32% (including 4% for the UK). Ginsburgh, Weber, and Weyers (2011) provide highly confirmatory findings in a study exploiting the UNESCO Web source, Index translationum. They report translations out of English 22.09 times higher than translations into English for publications in the literature classification covering 19 Indo-European languages in 1979–2002. The next highest ratio in their sample, for Swedish, is 1.17 to 1 (and clearly reflects an unusually high level of translations of Swedish into Danish and Norwegian). Russian and French are the only other two languages besides Swedish for which the ratio is even close to 1 rather than well below it. 
See the opening chapter by Allen (2007) in a collective work sponsored by the international literary association PEN, for an interesting and compelling account of the worries of specialists in literature about the low level of translations into English in the literature classification.

Nevertheless, I include translation as an area where English faces sharp limits. The reason is the visible interest in translations of original works in non-English languages outside of the native-English countries. In the aforementioned BIPE (1993) study, about 43%–49% of the translations in the literature classification in France, Italy, Spain and Greece are from other source languages besides English and the figures are in the range of one-third for Germany, the Netherlands and Denmark. To all indications, therefore, it is not translations that are nearly closed to authors who write in other languages than English but translations into English.

A partial explanation which is inspired by Ginsburgh et al. (2011) may be offered. Based on familiar reasoning in economics, suppose that all readers like a variety of titles for reading pleasure but exhibit home preference, which means a preference for reading works that were originally written in their home language. This agrees with the estimate of only 7% for translations relative to total titles for the world as a whole in 1991, 17% for the European countries in the BIPE sample in 1993 and the high of 31% in this last sample for titles belonging to the literature classification. Suppose, in addition, that the degree of home preference rises with the size of a language community. In other words, large language communities are more insular than small ones: the small ones are more interested in different cultures and the outside world. Then, the arithmetic goes in the right direction. For example, suppose that home preference is twice as high in a language community that is twice as large as another. Then, 90% home preference in the larger language community would mean only 45% home preference in the smaller one. That makes for 10% translations relative to total titles in the larger community as opposed to 55% in the smaller one. It is easy to see that this hypothesis will go a long way towards reconciling the data with consumer sovereignty. If so, then in terms of standard economic reasoning, there is no market failure.
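The arithmetic of this example can be sketched in a few lines. This is only an illustration under the stated assumptions: home preference is taken to be proportional to the size of the language community, and the share of translations is taken to be one minus home preference. The function names are hypothetical and do not come from Ginsburgh et al. (2011).

```python
def translation_share(home_preference: float) -> float:
    """Share of titles that are translations, taken here as 1 - home preference."""
    return 1.0 - home_preference


def scaled_home_preference(base_preference: float, base_size: float, size: float) -> float:
    """Home preference assumed proportional to community size (illustrative rule only)."""
    return base_preference * size / base_size


# Larger community: 90% home preference (figure from the text); it is twice the size
# of the smaller community, so the smaller one gets half that home preference.
large_pref = 0.90
small_pref = scaled_home_preference(large_pref, base_size=2.0, size=1.0)

print(round(translation_share(large_pref), 2))  # 0.1
print(round(translation_share(small_pref), 2))  # 0.55
```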

Notice, though, that the reasoning does not even begin to address the evidence for best-sellers. In the previous example of one language community twice as large as the other, if in both communities home sales are the same on average for all works in the home language that are translated into the foreign language as they are on average on all works in the foreign language that are translated into the home one, then on average the total world sales of a translated work originally written in the large language and the small language would be exactly the same. In both cases, the total of home sales plus foreign sales would be the sum of the same two numbers. To take a numerical example, suppose that the average title that is written in the home language in the large language community sells 1,000 copies at home (and this is true for those that are translated into the foreign language as well) and the average title that is translated into the home language also sells 1,000 copies in this community. Suppose next that the corresponding two numbers are 500 in the small language community. Consequently, world sales of a title that is translated and originally written in the large language would be on average 1,000 at home and 500 abroad, and world sales of a title that is translated but originally written in the small language would be on average 500 at home and 1,000 abroad, for a total of 1,500 in both cases. The issue of the best-seller concerns the distribution of sales of individual works (rather than the number of titles), and more specifically the upper tail of the distribution. The extraordinary dominance of English in best-sellers remains as puzzling as before.
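The numerical example above can be restated as a short computation. The sketch assumes, as in the text, that every title sold in a given community (whether home-written or translated) sells the same average number of copies there; the variable names are hypothetical.

```python
def world_sales(home_sales: int, foreign_sales: int) -> int:
    """Total sales of a translated title: sales at home plus sales abroad."""
    return home_sales + foreign_sales


LARGE_MARKET = 1000  # average copies per title sold in the large language community
SMALL_MARKET = 500   # average copies per title sold in the small language community

# A translated title written in the large language sells at home in the large
# market and abroad in the small one; the reverse holds for the small language.
large_origin = world_sales(home_sales=LARGE_MARKET, foreign_sales=SMALL_MARKET)
small_origin = world_sales(home_sales=SMALL_MARKET, foreign_sales=LARGE_MARKET)

print(large_origin, small_origin)  # 1500 1500
```

Since the averages coincide, differences in community size alone cannot account for a dominance of one language in the upper tail of the sales distribution.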

4 TRADE

It is intuitive that a common language would boost trade, especially for goods that are not perfectly homogeneous and demand investigation. A series of questionnaire surveys of investment in language skills by exporting firms in Europe confirms the intuition. The series began in 1996 and culminated in a large study in 2005 commissioned by the European Commission from CILT (a British organisation focused on foreign-language skills in business). This study, by Hagen, Foreman-Peck, Davila-Philippon, Nordgren, and Hagen (2006), covers a sample of 2,000 small and medium-sized exporting enterprises (SMEs) in 29 European countries, including Turkey, and 30 large multinationals (MNEs), all home-based in France (see annex 4 of the study). To the question “Has your company undertaken foreign-language training in the last 3 years?” 35% of the 2,000 SMEs answered yes. The percentage of these 2,000 firms foreseeing a need to acquire additional expertise in foreign languages in the next 3 years is higher, 42%. If anything, the MNEs are more conscious of the importance of investing in linguistic skills than the SMEs. 60% of the MNEs recognise deficiencies. This is below the 75% figure in a previous study with similar aims covering 151 multinationals with a broader international distribution of home bases, including the UK, Germany and a sprinkling of other countries besides France. See Feely and Winslow (2005), another CILT publication, and also Bel-Habib (2011).

The previous survey evidence, however, does not tell us much about the impact of a common language on bilateral trade. For such further knowledge, the gravity model of international trade is better suited. Somewhere in the early 1990s, researchers at the World Bank began introducing common language into this model—which has since become very popular (only since)—as a factor promoting trade in merchandise between country pairs along with the other two prominent factors reducing trade frictions in the model, namely geographical proximity and a common border (see Foroutan & Pritchett, 1993; Frankel, 1997; and Havrylyshyn & Pritchett, 1991). (Studies of bilateral trade in services remain a serious empirical challenge because of inadequate statistics.) This was done by introducing a binary variable equal to 1 if a country pair possesses a common language and 0 otherwise. The early uses effectively depended on peer acceptance and simply avoided ambiguous cases like France/Switzerland. The variable was highly successful. At first this was interpreted to reflect the importance of cultural proximity in trade with some acknowledged confusion between language and ex-colonial relationships. But the confusion was soon resolved by introducing ex-colonial relationships concurrently (through one or two additional binary (0 or 1) variables). Since 2000 the use of the common language variable has become pervasive because of the statistical significance of the variable and the intuitive sense of its importance. Systematic use of the variable was also made possible, almost regardless of country sample, by the adoption of a common official language as the criterion. Prominent estimates of the coefficient of common language based on this criterion are near 0.5. 
There are already two meta-analyses of results of studies using common language (most of which rest on the preceding index but not all of them): Egger and Lassmann (2012) resting on 81 studies and Head and Mayer (2013) resting on 159. Both papers come up with estimates of around 0.4–0.5 for common language. Since the dependent variable in the estimates is the log of bilateral trade, 0.4 corresponds to about 50% more trade between a country pair with a common language than a pair without one (exp(0.4) − 1 ≅ 0.5). This is a large effect. Yet upon reflection, the estimate could be too low since it depends on a flawed measure of common language, and measurement error tends to bias estimates downward.

Consider, for example, the set of 28 countries covered in the aforementioned Eurobarometer (2006) survey of language skills. In this sample, English is the official language of a single pair: the UK and Ireland. Therefore based on the measure, common English boosts bilateral trade strictly for this one pair in the entire sample whereas, of course, knowledge of English is widespread in the sample and is also uneven. A similar problem arises for common German, which, according to the measure (or at least the standard application of it), exists strictly between Germany and Austria, whereas German is widely spoken in Denmark (where it could also be considered official) and the parts of Eastern Europe in the sample. There are other reasons for scepticism. Official languages are about as common across countries as national flags. Almost inevitably, official languages sometimes arise despite moderate or low levels of speakers if linguistic diversity is high. Thus, French and English are official in a good number of African countries where the percentage levels of speakers of either language are below one-quarter and not infrequently below 10%. Spoken English is also only around 20% in some island countries where the language is official, like Fiji, Kiribati and Mauritius. Quite apart from statistical bias, the earlier issue I raised, if we consider the many reasons why a common language might promote bilateral trade, ranging from ethnicity, tastes and trust to ease of communication and facility of obtaining written or spoken translations, one can easily question that a measure based on official status alone would cover the subject. A recent study, born of such doubts, uses four separate measures of a common language simultaneously: common native language, common spoken language and linguistic proximity together with common official language (Melitz & Toubal, 2014). The resulting estimate of the impact of a common language goes up from around 0.5 to about 1.1.
On this last estimate, a common language raises bilateral trade by 200% (exp(1.1)-1 ≅ 2). All four measures are also simultaneously important.
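The conversions from coefficients to percentage effects quoted above (0.4 to about 50%, 1.1 to about 200%) follow from the log form of the dependent variable. A minimal sketch, assuming a coefficient on a variable that moves from 0 to 1 in a log-linear equation; the function name is hypothetical.

```python
import math


def pct_trade_effect(coef: float) -> float:
    """Percentage change in trade implied by a coefficient when log trade is the dependent variable."""
    return (math.exp(coef) - 1.0) * 100.0


for coef in (0.4, 0.5, 1.1):
    print(coef, round(pct_trade_effect(coef)))
# 0.4 49
# 0.5 65
# 1.1 200
```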

The natural question to ask here is whether English promotes bilateral trade more effectively than other languages. Table 3 focuses on this question based on the MT study. The relevant gravity equation covers 195 countries over the 10 years 1998 through 2007 and the dependent variable is (the log of) bilateral trade. The estimates also employ separate exporter-year and importer-year fixed effects so that the results depend entirely on the cross sections. There are bilateral controls for distance, common border, ex-coloniser/colony relationship, ex-common-coloniser relationship, common religion, common legal system and years at war since 1815. All the controls enter highly significantly different from zero in the estimates and carry the expected signs. But in the interests of space, Table 3 reproduces strictly the coefficients and t values of the linguistic variables. As evident from column (1), all four linguistic influences are highly significantly different from zero, all but common native language at the 99% significance level. In addition, the sum of their effects adds up approximately to 1.1, as mentioned above. One might worry that these estimates are subject to simultaneity bias because of the reciprocal effect of bilateral trade on common spoken language. For this reason, the paper also provides estimates founded strictly on common official language, common native language and linguistic distance (in which case those variables partly reflect common spoken language, with which they are highly correlated). The estimates are about the same.

Table 3. Linguistic influences on bilateral trade
| | (1) | (2) | (3) |
|---|---|---|---|
| Common official language (0,1) | 0.351 (7.56) | | 0.405 (5.64) |
| Common spoken language (0–1) | 0.396 (4.91) | | 1.244 (8.55) |
| Common native language (0–1) | 0.284 (2.34) | | −0.379 (−2.24) |
| Linguistic proximity (0–1) | 0.078 (4.26) | | 0.06 (2.89) |
| Common official language: English (0,1) | | 0.084 (1.42) | −0.237 (−2.66) |
| Common spoken language: English (0–1) | | −0.034 (−0.35) | −1.447 (−8.38) |
| Common native language: English (0–1) | | −0.001 (−0.01) | 0.763 (3.17) |
| Linguistic proximity to English (0–1) | | 0.092 (2.89) | 0.083 (2.32) |
| No of observations | 209,276 | 209,276 | 209,276 |
| Adjusted R2 | .757 | .755 | .758 |
  • Notes: Student ts in parentheses. ***< .01, **< .05, *< .1. Downward correction of SE for country pairs that appear twice the same year, with the opposite country as the exporter (2,850 clusters). 0–1 signifies continuous values between 0 and 1; 0,1 signifies binary values either 0 or 1. Numerous controls are not reported (see text).
  • Source: Melitz and Toubal (2014).

Columns (2) and (3) concern the separate impact of English. Column (2) does so by dropping all the languages except English from the analysis. To see the interest of taking this step, suppose that the results of column (1) depended on English alone. In that case, the measures of the linguistic variables in column (2) would be superior. They would simply remove errors of measurement and should yield higher and better estimated coefficients. However, if instead the measures of a common language in column (1) are the right ones, then to the contrary, the measures of linguistic influence in column (2) would be noisy and should yield lower and less well estimated coefficients than the previous ones. Indeed, in this last case—that is, if the broad measures of a common language are the appropriate ones—there are two reasons why the English-based measures of the linguistic variables might perform particularly badly. In the first place, an English-speaking country has a great many solutions for skirting the language barrier altogether. There are many other English-speaking countries with which it could trade. Therefore, common English could be an especially weak spur to trade with any single common-language partner. Alternatively, a country speaking Portuguese, for example, would have far fewer alternative partners with which to trade in order to avoid the language barrier and therefore might exploit those opportunities more intensely. On this ground, the coefficients of the linguistic variables based on English alone might be exceptionally low apart from measurement error. The second problem could be equally serious. Relying on English alone means drawing numerous distinctions between country pairs who share a common language other than English based upon their English, and proposing a quantitative ordering of linguistic ties between these non-English pairs based on their common English alone. Especially large distortions might arise. 
The results in column (2) confirm the broad suspicion from these considerations that basing common language on common English alone may lead to particularly poor estimates. The effect of common English does not even show up, except curiously for linguistic proximity.

Column (3) proceeds, probably more reasonably, by examining whether adding the linguistic effects of English separately improves the estimate. Once again, the result is negative. But in this case, we get very confusing outcomes except for linguistic distance. The effects of a common official language, a common spoken language and a common native language for the broad measures go in opposite directions to those for English alone, and in addition, the coefficients of the previous three measures are largely implausible, especially for spoken language where the negative effect exceeds the positive one. Furthermore, in the case of official and spoken language the signs for English alone are negative, which would imply a negative correction for English, whereas for native language the sign for English is positive, implying a positive correction for English. None of this makes sense. All in all, the clearest conclusion from column (3) is that the effort to estimate the impact of English separately fails, just as it fails in column (2). Column (1) offers the best estimate.
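To see why the column (3) coefficients are hard to interpret, one can add each broad coefficient to its English-specific counterpart. The sketch below assumes the usual additive reading of such a specification, so that the sum is the implied effect for a country pair whose common language is English (with a value of 1 on the relevant measure); that reading is an assumption of the illustration and is not spelled out in the source.

```python
# Column (3) coefficients from Table 3.
broad = {"official": 0.405, "spoken": 1.244, "native": -0.379, "proximity": 0.060}
english = {"official": -0.237, "spoken": -1.447, "native": 0.763, "proximity": 0.083}

# Implied effect for an English pair under the additive reading (an assumption).
net = {name: round(broad[name] + english[name], 3) for name in broad}

for name, value in net.items():
    print(name, value)
# official 0.168
# spoken -0.203
# native 0.384
# proximity 0.143
```

On this reading, common spoken English would lower trade on net while common native English would raise it, which is the kind of outcome described above as making no sense.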

Of course, this is not to deny that English probably has a larger effect on multilateral trade or trade with everyone than common Russian, common Spanish or any other common language. This is true simply by virtue of the language's size. World trade by native-English speakers is around 23% of total world trade. The next highest ratio, for Chinese, is 11%, followed by Spanish (10%) and German (9%). For this reason alone, English ought to contribute more to the reduction of linguistic trade frictions with the rest of the world than any other language. However, even on this reasoning, English contributes most to multilateral trade only on average. For Portuguese-speaking Brazil, situated in South America, there may well be a greater incentive to learn Spanish than English to promote multilateral trade. The same goes for Russian in Eastern Europe or around Kazakhstan, and for Chinese or Malay in South-East Asia, etc.

The earlier survey evidence supports these last inferences. Among the 2,000 small- and medium-sized exporting enterprises (SMEs) in the Hagen et al. (2006) study, of those that report foreign-language training in the last 3 years (35% of the sample, we may recall), only a quarter provided it in English. 18% provided the training in German, 15% in French. In addition, three small languages figure in the top 10: Czech (5%), Danish (3%) and Estonian (3%). The following quote (Hagen et al., 2006, p. 19) is especially noteworthy:

When companies were asked to identify the languages they used in their major export markets it was apparent there is widespread use of intermediary languages for third markets. For example, English is used to trade in over 20 different markets, including the four Anglophone countries, UK, USA, Canada and Ireland. German is used for exporting to 15 markets (including Germany and Austria), Russian is used to trade in the Baltic States, Poland and Bulgaria and French is used in eight markets (including France, Belgium and Luxembourg).

Thus, even when it is a question of a third language as an intermediary in trade, English is not necessarily the choice.

Unsurprisingly, the special role of English in international commerce emerges best in the survey evidence of the multinational enterprises (MNEs). These firms do indeed have a striking preference for English in their communication with customers and their own subsidiary companies. 63% of them prefer English and another 20% a mix of English and a home language, basically French in the particular sample (Hagen et al., 2006, pp. 42–43). This is easy to interpret. These firms have the greatest use for a lingua franca as a coordination device. They face a language problem internally and not only in dealings with customers. In addition, they confront the language problem on a world scale and therefore have a broader world view of the multilateral aspects than the SMEs. Because of their world concerns, English is their most common choice. Interestingly, though, the multinationals’ other target languages for future learning besides English also reflect their broader international concerns than the SMEs’. Spanish, Chinese and Arabic figure far more heavily than for the SMEs. French is of lesser interest and German and Italian disappear (Hagen et al., 2006, p. 45). (The same shift of orientation towards Asia, Latin America and the Middle East in language learning emerged in earlier questionnaires of MNEs.) Notwithstanding, only 29% of the language deficiency that the MNEs wish to repair concerns English, close to the 26% figure for the SMEs.

5 LANGUAGE LEARNING

Suppose we turn next to language learning and consider the subject in the same spirit as trade, that is, first in terms of learning foreign languages in general and later in terms of the difference it makes if the language is English. Economists possess a theoretical model for dealing with language learning that stems from Selten and Pool (1991) and Church and King (1993) and that was recently extended by Gabszewicz, Ginsburgh, and Weber (2011; GGW). The model uses game-theoretical reasoning but can serve to study the learning decision independently of such reasoning. It admits three fundamental influences on the learning decision: (i) total world speakers of the target language; (ii) total world speakers of the home-country language; and (iii) the cost of learning. A recent econometric study by Ginsburgh et al. (2017; GMT) extends this model beyond GGW by admitting trade with speakers of the target language as a fourth influence on learning. The trade variable has a broad interpretation in this study. It stands for all bilateral relationships between countries, not only commercial ties but also geographical distance, common borders, political agreements and common history, for example, a shared colonial past. These other variables are all highly correlated with trade. (There are problems introducing them separately and some robustness tests in the study relate to their separate influences.) In addition, since trade largely absorbs the commercial inducements to learn the target language, in its presence, the total number of world speakers of the target language reflects essentially the other inducements to learn it, for example, ease of social interaction with people from different cultures and benefits of access to their cultures and their literary and artistic heritages. GMT also introduce linguistic distance (or the inverse of the previous measure of linguistic proximity) and the literacy rate as indicators of the cost of learning. 
All five of the expected signs in the model are intuitive:

  1. a larger world population of speakers of a target language should make the language more attractive to learn;
  2. larger trade with speakers of this language should do the same;
  3. a larger world population of speakers of the home language should reduce the incentive to learn the foreign one;
  4. linguistic distance should tend to raise the cost of learning the language; and
  5. literacy should help to learn it.

GMT (2017) apply this model to the learning of 13 important world languages over 193 countries. The 13 languages are, in alphabetical order, Arabic, Chinese, Dutch, English, French, German, Italian, Japanese, Malay, Portuguese, Russian, Spanish and Turkish. These languages are important in one of two senses: sheer size or international spread. Japanese is a large language that is not widely spread. Malay and Dutch are examples of languages that are widely spread but not particularly large. Using both criteria rather than either one alone raises the sample variance. The database is the same one as in Melitz and Toubal (2014). The study centres on 2005, although the values of the language variables refer to different years over 2001–07. Importantly, the authors do not study the decision of people to learn the primary language of their country of residence on the ground that those who do not already know this language need to learn it for daily living. There is therefore no concern with the learning of German in Germany or English in the United States. But the study does consider the learning of Turkish in Germany and Spanish in the United States, for example, although there are native-Turkish speakers in Germany and native-Spanish speakers in the United States.

In testing the model, simultaneity is a problem. Learning has a reciprocal influence on three of the preceding five influences upon it: namely the world number of speakers of the target language, the world number of speakers of the home language, and trade with speakers of the target language. As regards the first two of these influences, it is possible to handle the difficulty using native speakers to measure total speakers (since native and total speakers are strongly correlated). But this remedy will not do for the third influence, or trade with speakers of the target language, since learning a language will affect trade with native speakers as well as other speakers of the language. Thus, to estimate the impact of this third influence on learning, the study uses an instrument for trade. The instrument is inspired by Frankel and Romer (1999) and obtains by constructing a value of trade with native speakers of the target language that depends strictly on so-called geographical variables, such as national land area, status as landlocked and population size. Learning of the target language should not affect a value of trade with native speakers depending strictly on those variables. The value is therefore fitting as an instrument; that is, it is so if only we remember that this value also reflects the separate effect of these “geographical” variables since the variables could affect learning independently of trade.
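The logic of instrumenting for trade can be illustrated with a toy calculation. The sketch below is not the GMT procedure: the data are hypothetical, there is a single regressor, and the endogeneity of trade is represented by an unobserved factor that moves both trade and learning. The instrument (a geography-based value of trade) is constructed to be exactly uncorrelated with that factor, so the simple instrumental-variable ratio recovers the assumed effect while least squares does not.

```python
def cov(a, b):
    """Population covariance of two equal-length sequences."""
    mean_a, mean_b = sum(a) / len(a), sum(b) / len(b)
    return sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b)) / len(a)


# Hypothetical toy data (not from the study).
geography = [-1.0, -1.0, 1.0, 1.0]    # instrument: trade predicted from geography alone
unobserved = [-1.0, 1.0, -1.0, 1.0]   # source of endogeneity, uncorrelated with geography
trade = [g + u for g, u in zip(geography, unobserved)]

TRUE_EFFECT = 2.0                     # assumed effect of trade on learning
learning = [TRUE_EFFECT * t + 3.0 * u for t, u in zip(trade, unobserved)]

ols_estimate = cov(trade, learning) / cov(trade, trade)        # biased by the unobserved factor
iv_estimate = cov(geography, learning) / cov(geography, trade)  # uses only geography-driven variation

print(ols_estimate, iv_estimate)  # 3.5 2.0
```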

One problematic aspect remains. The database contains 2,125 zeros for differences between spoken and native language, or learned languages, and only 240 positive values. There would be fewer zeros and more positive values if positive values of learning had been recorded below 1% of the population. But 1% is a lower threshold in the database. In the light of this feature, the authors offer separate estimates for the zeros and the positive values. In the case of the zeros, they substitute values of 1 for all the positive values in the database and merely study the decision to learn (a value of 1) or not to learn (a value of 0). In the other case, where they focus strictly on the 240 observations of learning and keep those observations as they are (as positive percentages), the authors effectively study the percentage of people who learn a foreign language conditional on the presence of some positive learning. In the first case, they apply probit, while in the second case, they apply ordinary least squares. In both cases, they instrument for trade. Effectively, therefore, they use probit with instrumentation in the first case and two-stage least squares in the other.
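The construction of the two samples can be sketched as follows. The observations are hypothetical; only the 1% recording threshold and the split into a binary decision and a positive sample come from the description above.

```python
THRESHOLD = 0.01  # learning below 1% of the population is recorded as zero


def recorded_share(true_share: float) -> float:
    """Share of learners as it would appear in a database with a 1% lower threshold."""
    return true_share if true_share >= THRESHOLD else 0.0


# Hypothetical (country, language) pairs: share of the population that learned the language.
raw = {
    ("A", "English"): 0.30,
    ("A", "Malay"): 0.004,   # below the threshold, so recorded as a zero
    ("B", "French"): 0.12,
    ("B", "Turkish"): 0.0,
}

recorded = {pair: recorded_share(share) for pair, share in raw.items()}

# Full sample: binary decision to learn (1) or not (0), the dependent variable for probit.
learns = {pair: int(share > 0) for pair, share in recorded.items()}

# Positive sample: shares kept as they are, conditional on some recorded learning.
positive = {pair: share for pair, share in recorded.items() if share > 0}

print(sum(learns.values()), len(learns))  # 2 4
print(sorted(positive.values()))          # [0.12, 0.3]
```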

Table 4 summarises the main results. The probit estimates are the marginal effects evaluated at the sample means of the variables. The results look quite favourable on the whole. In the absence of any correction for the endogeneity of trade, all five explanatory variables come out significantly with the right signs (columns (1) and (4)) except for literacy in the OLS result (column (4)). The results after correction for the endogeneity of trade are obviously more important. In this case, columns (2) and (5) show that in the first-stage estimates (for the correction), the output instrument is strong. After the correction (columns (3) and (6)), the results remain confirmatory with the exception of the size of world speakers of the target language in the positive-sample estimate (column (6)). Literacy, which had been insignificant before (in column (4)), is still insignificant in this last estimate. Generally, trade is the most powerful influence on learning in the results. The only problems that arise concern the world population of the target language and literacy in the positive-sample estimate.

Table 4. Foreign language learning
| | Full sample: Probit (1) | Full sample: IV probit, first stage (2) | Full sample: IV probit, second stage (3) | Positive sample: OLS (4) | Positive sample: IV 2SLS, first stage (5) | Positive sample: IV 2SLS, second stage (6) |
|---|---|---|---|---|---|---|
| Speakers of acquired languages (log) | 0.014 (4.348) | 0.005 (3.287) | 0.002 (2.701) | 0.024 (1.841) | 0.021 (2.778) | 0.006 (0.286) |
| Speakers of native languages (log) | −0.015 (−3.992) | −0.001 (−1.681) | −0.003 (−4.015) | −0.024 (−4.412) | 0.002 (0.581) | −0.025 (−4.041) |
| Trade with acquired language countries (0–1) | 0.465 (9.243) | | 0.132 (3.193) | 0.788 (4.688) | | 1.405 (3.040) |
| Distance between native and acquired language (0–1) | −0.317 (−6.966) | −0.053 (−3.032) | −0.048 (−5.555) | −0.355 (−2.197) | −0.002 (−0.035) | −0.330 (−2.139) |
| Literacy rate in learning countries | 0.249 (5.323) | −0.009 (−1.249) | 0.030 (4.029) | 0.064 (0.570) | −0.125 (−1.845) | 0.137 (0.999) |
| Instrument (GDP ratio) | | 0.585 (7.661) | | | 0.440 (2.989) | |
| No. of observations | 2,365 | 2,365 | 2,365 | 240 | 240 | 240 |
| No. of countries | 193 | 193 | 193 | 94 | 94 | 94 |

(Pseudo) R2 .234 .139 .236 .154 .150
  • Notes: Student ts in parentheses. These are based on robust standard errors clustered at country level. ***< .01, **< .05, *< .1. Intercepts are not reported.
  • Source: Ginsburgh et al. (2017).

How does the model fare for English? More specifically, do the results call for dealing with English separately as a special case? The best way to answer this question would be to introduce as many of the languages as possible simultaneously or at least a large number of them and to see if English performs differently than the rest. In such tests, English is always insignificant and so are the rest of the languages. Therefore, Table 5 displays a different set of results. The table shows the errors in the estimates in columns (3) and (6) of Table 4 after correcting for endogenous trade. More specifically, it shows the means and the standard deviations (as well as the t-statistics) of the residuals for both sets of results, language by language. This gives an idea of the direction of the errors and how statistically significant they are.

Table 5. Residuals of IV regressions by language
| Language | Full sample: Mean | Full sample: SD | Full sample: t Stat | Positive sample: Mean | Positive sample: SD | Positive sample: t Stat |
|---|---|---|---|---|---|---|
| Arabic | −0.186 | 0.632 | −0.295 | 0.036 | 0.218 | 0.166 |
| Chinese | −0.240 | 0.445 | −0.540 | −0.223 | 0.033 | −6.806 |
| Dutch | −0.182 | 0.351 | −0.518 | −0.106 | 0.168 | −0.628 |
| English | −0.572 | 1.475 | −0.388 | 0.078 | 0.319 | 0.245 |
| French | 0.033 | 0.805 | 0.041 | 0.032 | 0.176 | 0.182 |
| German | −0.046 | 0.647 | −0.071 | −0.040 | 0.130 | −0.306 |
| Italian | −0.038 | 0.651 | −0.058 | −0.070 | 0.105 | −0.668 |
| Japanese | −0.195 | 0.486 | −0.402 | | | |
| Malay | −0.061 | 0.260 | −0.236 | 0.288 | 0.299 | 0.964 |
| Portuguese | −0.172 | 0.267 | −0.643 | −0.228 | | |
| Russian | 0.083 | 0.595 | 0.139 | 0.034 | 0.206 | 0.165 |
| Spanish | −0.184 | 1.148 | −0.160 | −0.096 | 0.099 | −0.972 |
| Turkish | −0.109 | 0.325 | −0.336 | −0.047 | 0.065 | −0.733 |

Notes:

  • a Estimates of the full sample are based on Pearson residuals from the probit regression in column (3) of Table 4 and those of the positive sample are based on the IV regression in column (6) of Table 4.
  • b Japanese is not acquired. Portuguese is acquired only in Spain (no standard deviation).
  • Source: Ginsburgh et al. (2017).
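The t statistics in Table 5 appear to equal the mean residual divided by its standard deviation. The following sketch checks this for a few full-sample rows. It is an inference from the reported figures rather than a statement of the authors' procedure, and small gaps are to be expected because the table entries are rounded.

```python
# (mean, SD, reported t) for selected full-sample rows of Table 5.
rows = {
    "Arabic": (-0.186, 0.632, -0.295),
    "English": (-0.572, 1.475, -0.388),
    "French": (0.033, 0.805, 0.041),
    "Spanish": (-0.184, 1.148, -0.160),
}


def mean_over_sd(mean: float, sd: float) -> float:
    """Ratio of the mean residual to its standard deviation."""
    return mean / sd


for language, (mean, sd, t_reported) in rows.items():
    print(language, round(mean_over_sd(mean, sd), 3), t_reported)
```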

The general impression from Table 5, in conformity with the broad evidence with the use of controls for individual languages (mentioned above), is that the model performs little differently for the separate languages. Perhaps the model performs worst for English if we judge from the mean error of −0.572 for this language in the probit estimate, which is the highest in the sample in absolute terms. This error is also negative (underprediction), which can be interpreted to mean that English is a world lingua franca, since there is more learning of the language than the model predicts in-sample. However, the standard deviation for English in this estimate is also the highest and it denotes a significant percentage of cases of predicted positive learning when there is none (the t-statistic is low, 0.39). Furthermore, in the positive-sample estimate, the mean error for English is overpredicted and not particularly distinguishable from the rest (six of which are underpredicted, not counting Chinese). This goes entirely against the idea of status as a lingua franca. In the final analysis, the study says that learning English is subject to the same principles as learning other languages. If so, it is wrong to try to assess the future of English in isolation, without allowing for the similar incentives to learn other major languages.

What about the future of English? According to the model, the evolution of trade will have a profound effect. But the effects of trade are notably symmetric. Growth in Chinese/English trade should promote the learning of Chinese in native-English countries just as it should promote the learning of English in native-Chinese countries. Whether it will raise the importance of English relative to Chinese in the world will therefore depend heavily on the evolution of the share of trade with English speakers on the Chinese side relative to the evolution of the share of trade with Chinese speakers on the English side. That is what the econometric model suggests. The influence of demographic changes is simpler to analyse. Suppose, for example, that the Arabic and Spanish-speaking populations grow fast while numbers in the rest of the world remain constant. Then, the Arabic and Spanish-speaking populations will wish to learn fewer foreign languages while speakers of other languages will not wish to learn either more or less Arabic or Spanish. Thus, Arabic and Spanish will become relatively more important. In theory, of course, these same demographic assumptions should mean more learning of Arabic and Spanish in absolute terms, which would simply reinforce the rise in the relative size of those two languages. According to the results of the tests of the GMT model, however, this reinforcing effect depends entirely on a rise in the share of Arabic and Spanish trade in non-Arabic and non-Spanish-speaking countries and therefore may not materialise. But in any event, the basic demographic assumptions do not favour English. These remarks lend general support to Graddol (1997, 2006) who predicts a significant growth of Chinese, Arabic and Spanish in the future based on trade and demographic trends. In close connection, he also questions how much further the spread of English relative to other languages can be expected to go.

6 WELFARE IMPLICATIONS

There are areas of international encounter where the welfare benefits of a lingua franca are evident. For example, a lingua franca is of mutual benefit to everyone in travel and inside the world industry of news diffusion. The spread of multinational organisations promotes a lingua franca too. The political clout of the United States may be important in explaining the convergence on English as the preferred choice of lingua franca in these cases. But by and large, no other choice would bring the world population as much benefit. True, as Van Parijs (2011) stresses, from a welfare perspective, the distribution of the gains raises issues. Those who learn English as their native language obtain a certain rent. On the other hand, they are also more likely to be unilingual than hundreds of millions of other people. Consider the child born in a small Eastern European country who, by virtue of circumstance, by the age of 15 is fluent in Russian and German, two major world languages, as well as the home language, and perhaps a fourth one because of a monolingual grandparent. From the language perspective, perhaps this sort of person should be deemed especially lucky. If so, one could easily argue that native-English speakers suffer a liability—a winner's curse. The adoption of their native language as the primary lingua franca in the world only worsens the curse: it reduces their incentives to learn a second language still further. The British Chambers of Commerce (2003) and some other British organisations, like the Nuffield Foundation (see the Nuffield Report, 2000), can be understood to lean towards such a view since they deplore the lower possession of foreign languages and the lower attention to foreign languages in school curricula in the UK than most elsewhere in the EU.

It is indisputable, though, that the advantages of native-English speakers receive the most attention. True, if English is adopted as the lingua franca in an international assembly or conference room, native speakers of this language are typically in a better position than those for whom English is an inferior choice. In this respect, considerations of fairness will occasionally suggest improvements in the operating rules of organisations. Pool (1991), for example, shows that it is possible to reconcile efficiency and fairness in the distribution of the costs of translation in international organisations. There could also be derogations to the use of English (or any lingua franca, for that matter) based on axioms of fairness as a function of the circumstances (e.g., who the principal interested parties are).

What about the normative aspects of the dominance of English in science and the academic world? In 2006, a group organised by the American Council of Learned Societies (ACLS) to report on the translation of social science texts issued “A plea for social scientists to write in their own languages” at the ACLS annual meeting (see Tymowski, 2006 and also Allen, 2007). The group sees no harm if mathematicians and natural scientists worldwide write in English, but it does see harm if social scientists do the same. The idea, therefore, is that the content of social science investigation can only be fully stated in the investigator's language. If true, this would mean that the broad switch to English in social science investigation is indeed a collective drawback and, furthermore, quite unfortunately, that the social sciences depend heavily on polyglots and the skills of translators. Yet the only evidence offered is a reference to the early nineteenth-century writing of Humboldt about the relation of language to empirical perception. In embracing the same position, Hamel (2006) refers to the earlier and related philosophical writing of Herder. Many of us would prefer better evidence that the social sciences are more reliant than the natural sciences on the diversity of languages in the world in communicating their work, and that the strong personal inducements social scientists face to express themselves in English impoverish their contributions to knowledge. In a communication to the same Congress to which Truchot (1996) addressed his paper on the (then-current) political pressures facing French natural scientists to write in French, Walter (1996) notably points out that the tendency of scientists to veer towards a single language in writing traces back more than two millennia with the language of choice shifting from Sumerian to Greek to Arabic to Latin. From this perspective, the turn to English in the course of the twentieth century was a return to a longer-term trend, and the nineteenth-century condition where German, English and French shared the top spot as the preferred languages in science was an aberration. Perhaps the return to the longer-term trend is for the better: information travels faster.

This is not to deny the possible uneven impact of the central role of English on the welfare of scientists and scholars in the humanities. How far do native-English scientists and scholars in the humanities have an advantage over the rest (despite their smaller incentives to learn a second language, to their possible regret outside of their professional life and maybe inside of it too)? A study by Sandelin and Sarafoglou (2004) has some indirect bearing on the subject. The authors examine whether native-English countries (namely Australia, Canada, Ireland, New Zealand, the UK and the United States in their study) display a larger number of papers in professional publications by resident scientists or scholars per inhabitant than other countries in a sample of 30 countries, and the authors do so separately for social sciences, arts and the humanities and the natural sciences. They also control for world size of the principal home language (total world native-Spanish population in the case of Spain, for example) and GDP. The results show that the native-English countries do indeed host a larger number of papers in professional publications by resident scientists and scholars per capita (i.e., in relation to the total national population) than other countries. Interestingly, however, this is true for publications in the natural sciences as well as the social sciences and the arts and the humanities. The coefficient for the natural sciences is around a third as high as for the other two areas, but it is still statistically significantly different from zero at the 95% confidence level. On this evidence, the native-English status of a country favours a higher ratio of home-produced publications to the national population in the natural sciences as well as in the social sciences (quite apart from the world population of native speakers of the home language), if only to a lesser extent. This would suggest, without implying it, that the difference in the welfare advantage of the prominence of English to a native-English scientist or scholar in the three different spheres is only a matter of degree. But I know no study bearing directly on the question.

Let us next return to the areas where multilingualism, interpreters and translations are able to overcome the problem of Babel and English is simply “the first among equals,” so to speak. The argument in these pages has been that these areas are extremely wide and cover the press, television, the Internet, publishing (including translation where defending the theme raised some difficulties) and trade, with three exceptions to which I will return. In these areas, the evidence would show that, broadly speaking, while English is in the lead, it faces sharp limits and all other major languages are safe. The evidence concerning language learning is supportive. If English really threatened to marginalise other major languages in these broad areas of life, we would expect that a model of language learning that treated eight or so of the next ten largest languages exactly the same way as English would perform rather badly. Yet the model does fine. Does any general welfare problem arise from the eminent position of English in these cases? It would seem that the answer is negative. From the standpoint of welfare, there is no clear basis for advocating any special international effort to promote or demote English in the press, television, the Internet, publishing and trade.

In defence of this view, it is possible to argue that it might perhaps be wise to encourage foreign-language learning in general in the world to promote wages, trade and/or culture. But the place of English relative to other languages is a separate topic. Learning English comes at the expense of learning other languages. For example, the hypothetical argument (which does not seem exactly alien) that everyone should learn English as a second language is quite dubious. The argument would imply that some of the time that is currently spent on learning other second languages by people who do not know English would be better spent on learning English. Yet no one has made the case. An appeal to coordination failures would not suffice, since such failures will often argue for promoting other languages: at least half a dozen of them are as widely spoken internationally as English in large regions of the world. In addition, the principle of diminishing returns interferes. Even if English is now the most useful language to learn in one's homeland and occupation, once enough others learn it, the best language to acquire may well become another. Ginsburgh and Prieto (2011) offer labour market support. Interpreters and translations should also be kept in mind. Translation renders much of learning inefficiently costly (mercifully so), which may also argue against learning English (as well as other second languages).

The cultural areas of English supremacy provide, in my opinion, the only promising ground for the thesis of too much English. Suppose home preference worldwide in the sense that people everywhere, including those with a second language or a smattering of them, prefer to function in their own language in their private lives. The evidence is overwhelming: it comes from the newspapers people read (even if they know a second language), the television they watch, the Internet they use, and many of the facts about publishing and trade. Interestingly too, the supporting evidence covers the film and the best-seller as well if only we remember that non-native-English people mostly view English-language films that are dubbed or, in the case of the majority only when there is no choice, subtitled, and they mostly read foreign-language best-sellers in translation. How can it then be welfare-improving for English to dominate the song, the film and the best-seller as much as it does?

The argument that it is welfare-improving must be that, somehow, foreign-language speakers obtain compensation for their sacrifice of home-language benefits either through higher quality in other regards or else through lower social costs. As one possibility, the native-English countries are simply better at producing the relevant goods. As another, benefits of market specialisation are at work. Because of a mix of comparative advantage and economies of scale, all countries do best to narrow the range of their production activities. As one manifestation, native-English countries specialise in producing the relevant cultural products. In other words, just as Germany specialises in the high-quality end of many consumer durables or France does in some niches in jewellery and perfumes, the native-English countries occupy this particular turf. The argument has merit with respect to film where, in fact, the dominance is specifically American as such, and the British film industry is just as dominated by Hollywood as the French one. Hollywood also rose to world prominence under the silent film, before “talkies.” Thus, one can even make a case that Hollywood is the fundamental factor, not English, although some of us will remain sceptical because of the contribution of the French, German, Italian, Japanese, Russian, Spanish and Swedish film in the history of the cinema (see also François & van Ypersele, 2002). But in the case of the song and the best-seller, the argument seems wobbly. Where is the benefit in quality or the reduction in social costs? One particular ground for suspicion is that in other related cultural areas, like music without words, photography, painting, sculpture and architecture, the native-English contribution is closer to what we would expect on the basis of relative population size, income, wealth and tradition. There is something special going on in the cultural market where language is concerned.

I would argue that there is a possible long-term danger. Suppose that the budding author and popular singer of talent feel under the same pressure as the scientist and scholar in the humanities to make their name in English. Assume also, as seems to be true in science, that the outstanding talents are especially likely to succumb to this temptation. Their odds of success are higher but the lure is at least the same. In fact, we know of several cases of authors in the hall of literary fame who dropped their native language in favour of others with a larger and more impressive literary tradition (never any opposite examples to my knowledge since the move away from Latin and the printing press): Conrad, Kafka and Ionesco, to mention three. Gogol evidently seriously hesitated between his native Ukrainian and Russian. This is crucially relevant. The threat is not that creative writing will generally dry up in other languages but that the best work will be done in English (a tendency which, if already under way, would help to resolve the paradox I underlined before). In the case of science, where such a tendency is manifest, there is no harm and indeed even a benefit. In literature, however, we cannot reason the same. Literature is an area where language as such is a source of pleasure and enjoyment, an end in itself, rather than a mere instrument for conveying information, or as an outstanding economic contribution by Church and King (1993) has it, a “communication technology.” True, under the previous scenario, English might continue to evolve and to discover new literary veins. But how much comfort is that? If everyone wrote music for the violin, likewise music might still thrive and we might even hear new echoes of the viola and the guitar and who knows what else issuing from violin strings. Still, what about the keyboard, the winds and the percussion? It does not seem plausible to argue that there would be no loss.

Notes

  • 1 Ostler (2010) does the same. For further detail about the MT data, see Melitz and Toubal (2014). Their data by individual country are available on Toubal's website.
  • 2 These statistics come from a wide variety of sources and concern the actual usage of languages on the net, not merely the predominant languages in the location of the users.
  • 3 One source of a possible discrepancy between the last two columns, which is difficult to assess, is that the estimates of Internet users assign one language to each user, while the ratios in the last column rest on figures that admit bilingualism so that any speaker may be counted two or more times.
  • 4 The 30% figure for 2009 calls for elucidation since the information from the UNESCO yearbooks started shrinking in the nineties and this source no longer permits calculating this ratio for any date later than 1991. Happily, though, Wikipedia currently compiles a table of “books published per country per year” based on remaining UNESCO data and an assortment of other well-annotated sources. According to the Wikipedia table, publications by the United States and the UK alone bring up the total for titles in English to 24% (as it is possible to infer because these two countries publish little in foreign languages). The main difficulty in assessing how much above 24% the ratio is comes from the absence of the information that UNESCO used to provide about titles in English published by non-native-English countries. In reaching the 30% figure, I allowed for the publications in English in the Wikipedia table by India, Pakistan, Hong Kong, Canada, Australia, South Africa and New Zealand besides the United States and the UK. This brought the total up to around 27.4%. I then made an allowance of 2.6% more for published titles in English by the rest. I believe this 2.6% estimate to be a reasonable one since at the time when detailed information was available in the UNESCO Yearbooks, 3.6% of the publications of the rest were in English around 1971, 2.1% around 1981 and 2.6% around 1991.
  • 5 The trend continued beyond 1987. Table 1 in Sapiro (2010), based on the UNESCO online source Index Translationum, shows a rise in the ratio of English in translations from 45% in 1980–89 to 59% in 1990–99.
  • 6 It is also clear from Sapiro (2010) that French publishers, at least, are particularly concerned with translations of literary works from other source languages besides English.
  • 7 Of course, on average home sales of home-language works that are translated are many times larger than sales of other works that sell at home, and this is indeed likely to lead to larger world sales of translated works that are originally written in the language of the large community. But this raises a distinct consideration which, by itself, has no hope of resolving the puzzle. Translated works that are originally written in the other large languages of the world besides English also sell many more times at home than the rest, and the disproportional sales of translations originally written in English resulting from this factor are unlikely to come even close to resembling the data.
  • 8 For some interesting discussion of the varied motives for choosing an official language or a number of them when no choice is obvious, see De Swaan (2001).
  • 9 The measures of spoken and common native language concern the probability that two people at random for a country pair will share the same spoken language or the same native language, respectively, as the case may be. The measure of linguistic proximity refers instead to similarities of a limited list of words with identical meanings based on expert judgements, where these judgements come from the Automated Similarity Judgment Program, an international project by ethnolinguists and ethnostatisticians (see Bakker et al., 2009; and Brown, Holman, Wichmann, & Velupillai, 2008). The measure of common official language is the usual binary one.
  • 10 This is similar to the result in an earlier study of similar inspiration based on poorer data by Melitz (2008), which had used only two measures of a common language, one of them mostly based on official status and the other strictly on spoken language.
  • 11 This is, of course, the same point that Anderson and van Wincoop (2003) make in explaining why national trade barriers form a far weaker incentive for bilateral trade between two US states than between two Canadian provinces.
  • 12 Of course, for that very reason, people in the Portuguese-speaking country would have stronger incentives to become multilingual. This diminishes the weight of the point without denying it altogether.
  • 13 On the other hand, the result for linguistic proximity is sensible: namely linguistic proximity between languages, regardless of the pair of languages, boosts trade with a positive correction for linguistic proximity to English. The strong results for linguistic proximity for English in Table 3 (for which I have no basic explanation especially in the light of the rest of the results) find an echo in a study by Ku and Zussman (2010) of the impact of a common language on bilateral trade that focuses on English alone. These authors implicitly rest the impact of common English on bilateral trade strictly on linguistic proximity since they rely on bilateral differences of scores in different countries on TOEFL, the Princeton-administered Test of English as a Foreign Language. These international differences in test scores will be correlated with difficulties of learning English for people with different native languages, and it is difficult to ascribe the effect of the scores on bilateral trade to anything else.
  • 14 If we multiply aggregate trade by the percentage of native-English speakers, country by country, sum up the products over all countries and divide by the sum of trade over all countries, we get around 23%. This measure means double-counting, but since the double-counting is both in the numerator and in the denominator, it cancels out.
  • 15 Greater precision might also have been questionable.
  • 16 The quantitative impact of trade is also considerable. Suppose a doubling of the trade share with speakers of the destination language (a 100 percentage point increase). According to the probit result, there is a 13.2 rise in the probability of learning. In the case of the positive-sample result, which is conditional on some positive learning, a single percentage point rise in the trade share of the destination language will increase learning of the language by 1.4 percentage points.
  • 17 If instead one simply introduces a separate term for the world level of English speakers when English is the destination language (or else a dummy variable with a value of 1 when English is this particular language and zero otherwise), English turns up with a significant coefficient in one test or the other, the full-sample or the positive-value one. But the same is true for most of the other languages in similar tests. The right tests to perform are those I mention where as many languages as possible enter simultaneously.
  • 18 The large t-statistic of the mean error for Chinese in the positive-sample estimate is arresting but essentially misleading. This result depends strictly on Malaysia and Singapore, the only two countries where there is any learning of Chinese outside of China (at the 1% threshold in the study). Thus, the standard deviation rests on these two observations alone. The mean of the error for Chinese in the probit estimate, which depends on all the observations, is actually about the same as in the positive-value sample but with a much higher standard deviation.
  • 19 However, the result is entirely consistent with the evidence of English's status as such in some limited areas such as air traffic control, scientific writing and international sports.
  • 20 Of course, a spurt of teaching of English in school is well under way in China (see Yong & Campbell, 1995), whereas the teaching of Chinese in English-speaking countries still lags far behind today. It would indeed be helpful to introduce school curricula in foreign languages in the model (with the appropriate lag) if it could be done (if the data were widely enough available). However, it is not a foregone conclusion that major revision would follow: instruction in a foreign language as a child need not mean ability to converse in the language in adult life. The factors present in the model may still be the critical ones.
  • 21 See also the Dearing report to the secretary of state for languages and skills in the UK (Dearing, 2006).
  • 22 This is distinctly not the authors’ emphasis.
  • 23 Of course, thousands of minor languages are endangered. See Dalby (2002), Hagège (2009) and Diamond (2012, chapter 10). However, this is a separate point in my view, quite distinct from the spread of English and its implications.
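The trade-weighted share described in note 14 can be sketched as follows. This is a minimal illustration of the arithmetic only: the country labels, trade values and native-English percentages are hypothetical placeholders, not the data behind the figure of around 23%, and the function name is an assumption.

```python
def trade_weighted_share(trade, native_english_share):
    """Sum of (aggregate trade x native-English share) over countries,
    divided by the sum of aggregate trade over the same countries."""
    total_trade = sum(trade.values())
    weighted = sum(trade[c] * native_english_share[c] for c in trade)
    return weighted / total_trade

# Hypothetical aggregate trade and native-English population shares.
trade = {"A": 100.0, "B": 300.0, "C": 600.0}
share = {"A": 0.9, "B": 0.1, "C": 0.0}

print(trade_weighted_share(trade, share))

# Double-counting every trade flow scales the numerator and the denominator
# by the same factor, so the ratio is unchanged.
doubled = {c: 2 * v for c, v in trade.items()}
print(trade_weighted_share(doubled, share))
```

The second call illustrates the remark in the note that double-counting cancels out because it affects the numerator and the denominator alike.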