Although the claim is often made that "English has more words than any other language", it is not that easy to count the number of words in English.

Counting "words"

Different studies use differing criteria when counting the number of "words", lexemes or vocabulary items in a language. Estimates for English vary between 500,000 and 2 million words. A medium-sized dictionary may contain some 100,000 entries. The New Oxford Dictionary of English, published in 1998, is the biggest single-volume dictionary and contains 350,000 words, of which 52,000 are scientific and technical words, although it avoids over-technical terminology. On the other hand, the 20-volume OED, the definitive dictionary of the English language, contains over half a million lexemes.

Should one count conjugated verb forms, or past participles used as adjectives? Species names for flowers and insects, which are common to all languages? Chemical names? Including these can dwarf the number of "normal" words in any language, to say nothing of the 500,000 different names for fungi.

Equally difficult is the question of whether a word is actually used: it may exist but be so obsolete that it is no longer in use. Do we count it or not? Do we count slang? Do we count regional words? Do we count a word if it is used in the UK but not in the US, or not in all international varieties of English (including Indian English, which has a large selection of words from native languages)?

Defining words

A constant debate is whether facts, such as the names of people or places and other proper names, should be considered part of one’s personal lexicon when calculating its size. Undoubtedly, the name, or fact, Shakespeare is as much a part of the English language as the word literature or drama. And the fact/word London is probably used more often than the word village or town. Thus, given the overlapping criteria, calculating the size of one’s own vocabulary is complicated, and the result will vary according to many different factors.

Likewise, terms such as UNESCO and NATO, both acronyms that are well known internationally and that function as words, must surely count as part of an educated person’s vocabulary.

If a word has two spellings, does that count as one word or two? What about two past-tense forms like "lighted" and "lit" or "dived" and "dove"? Does "dove" as a bird count as a separate word?

Furthermore, given that over eighty per cent of all words in English have more than one meaning – water as a verb and noun; lock as a verb and noun related to keys, or as a construction on a canal or river to regulate the ascent or descent of boats, or as a hold in wrestling or judo, or as in a lock of hair – should one count each meaning of the same word – the same combination of letters – as a different item? Surely if a person knows five meanings of the same word, he or she has a more extensive vocabulary than another person who knows only one meaning?
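The difference between counting spellings and counting meanings can be made concrete with a minimal Python sketch. The mini-dictionary below is a hypothetical illustration built from the examples in the paragraph above, not data from any real dictionary, and the function names are assumptions made for this example.

```python
# Hypothetical mini-dictionary: each spelling maps to the senses one speaker knows.
# The entries are illustrative only and are not taken from a real dictionary.
KNOWN_SENSES = {
    "water": ["noun: the liquid", "verb: to pour water on something"],
    "lock": [
        "noun: fastening opened with a key",
        "verb: to fasten with a key",
        "noun: construction on a canal or river",
        "noun: hold in wrestling or judo",
        "noun: lock of hair",
    ],
    "village": ["noun: small settlement"],
}

def size_by_spelling(known):
    """Vocabulary size if each combination of letters counts once."""
    return len(known)

def size_by_sense(known):
    """Vocabulary size if every known meaning counts as a separate item."""
    return sum(len(senses) for senses in known.values())

print(size_by_spelling(KNOWN_SENSES), size_by_sense(KNOWN_SENSES))
```

Under the first criterion this speaker knows three words; under the second, eight items. Neither figure is "the" correct one; they simply answer different questions.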

Get and phrasal verbs

Take one of the most frequently used verbs in English: get. Should we treat the phrasal verbs get at, get away, get back, get by, get in, get off, get on, get over, get through, get up, and a dozen other uses of get plus one other word, such as get home, get fat or get fatter, or even longer combinations, such as get away with, get rid of, get over something or get your own back on somebody, as one lexeme (get), or as separate expressions, set phrases or idioms? In a dictionary, these, and many others, might all be included under the entry get. And what about the inflections: gets, got/gotten, getting? Unlike many other European languages, Modern English has very few inflections and, contrary to what many people think, is surprisingly regular, despite its many exceptions.
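How much the answer depends on the treatment of inflections can be shown with a short Python sketch. It counts the distinct word forms in a sample sentence, then counts again after folding the inflections of get into one headword. The lemma table and the sample sentence are hypothetical and hand-made for this illustration; a real study would rely on a full morphological lexicon.

```python
import re

# Hypothetical, hand-made table mapping inflected forms to a headword.
# Only the inflections of "get" are listed; everything else maps to itself.
LEMMAS = {
    "gets": "get",
    "got": "get",
    "gotten": "get",
    "getting": "get",
}

def word_forms(text):
    """Return the lower-cased alphabetic word forms in text, in order."""
    return re.findall(r"[a-z]+", text.lower())

def count_forms(text):
    """Count distinct word forms, treating every inflection as its own word."""
    return len(set(word_forms(text)))

def count_lexemes(text):
    """Count distinct headwords, folding known inflections into one lexeme."""
    return len({LEMMAS.get(form, form) for form in word_forms(text)})

sample = "She gets up early, got home late and is getting tired; I get it."
print(count_forms(sample), count_lexemes(sample))
```

The same sentence yields 14 "words" if every form counts separately and 11 if gets, got, getting and get are treated as a single lexeme. The sketch still counts get up and get home as loose sequences of separate words, which is exactly the phrasal-verb question raised above.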


Then there are words beginning with prefixes such as un-, as in unhappy, untidy and unlikely, many of which are not included in dictionaries because of their apparent obviousness. The same applies to adverbs ending in -ly, to inflections of nouns (singular and plural) and adjectives (comparison) and, as we saw above, to the past tenses of most verbs, unless they are so irregular as to cause possible confusion. Thus bad, worse and [the] worst would probably be included as three separate entries, whereas for a regular adjective such as cold, the comparative and superlative, colder and [the] coldest, would probably be included under the single entry cold.


And whilst on the subject of antonyms, what’s the difference between learning single words – big and small – and binomial expressions like black and white, thick and thin, boys and girls, ladies and gentlemen, eggs and bacon, fish and chips, socks and shoes?


One solution might be to try to estimate the vocabulary of the average native speaker, but even this presents difficulties, partly because we all have an active and a passive vocabulary, and partly because we can often "know" words we have never seen before, either because of their context or because they are made up of parts of other words we already know.


One of the consequences of this long and varied history is that English spelling no longer corresponds particularly well with English pronunciation, giving rise to calls for spelling reform.

