Difference between revisions of "Number of words in English"
Revision as of 09:05, 10 September 2009
- 1 Counting "words"
- 2 Solutions
- 3 See also
Different studies use differing criteria when counting the number of "words", lexemes or vocabulary items in a language. Depending on the criteria used, estimates for English may vary between 500,000 and 2 million words. We identify below some of the many criteria which one would need to consider.
Should we count species names for flowers and insects and the 500,000 different names for fungi which are common to all languages? What about names for chemicals? How about medical names for diseases? With these you can dwarf the number of "normal" words in any language.
Status of a word
Equally difficult is the question of whether a word is actually used – it may exist but be so obsolete that it isn't used any more. Do we count it or not? Do we count slang? Do we count regional words? Do we count a word if it is used in the UK but not in the US or in all international varieties of English (including Indian English, which has a large selection of words from native languages)?
There are a vast number of acronyms in the language. Some, such as UNESCO and NATO, are known internationally. Others, such as TEFL or CELTA, are only used by small communities. How would one decide whether to count them or not?
If a word has two spellings, does that count as one word or two? What about two past forms of the same verb, like "lighted" and "lit" or "dived" and "dove"? Does "dove" as a bird count as a separate word?
Furthermore, given that over eighty per cent of all words in English have more than one meaning – water as a verb and noun; lock as a verb and noun related to keys, or as a construction on a canal or river to regulate the ascent or descent of boats, or as a hold in wrestling or judo, or as in a lock of hair – should one count each meaning of the same word – the same combination of letters – as a different item? Surely if a person knows five meanings of the same word, he or she has a more extensive vocabulary than another person who knows only one meaning?
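To make the difference concrete, here is a minimal sketch in Python that counts the same tiny vocabulary in two ways: once by spelling and once by meaning. The `SENSES` table and both function names are hypothetical, invented for this illustration, and the table lists only the meanings of water and lock mentioned above, not the full range a dictionary would give.

```python
# Hypothetical mini-lexicon built only from the examples in the text:
# each spelling maps to the meanings mentioned for it.
SENSES = {
    "water": ["noun", "verb"],
    "lock": [
        "verb (to fasten with a key)",
        "noun (fastening opened with a key)",
        "noun (construction on a canal or river)",
        "noun (hold in wrestling or judo)",
        "noun (lock of hair)",
    ],
}


def count_spellings(lexicon):
    """Count each distinct spelling once, however many meanings it has."""
    return len(lexicon)


def count_senses(lexicon):
    """Count every meaning of every spelling as a separate item."""
    return sum(len(meanings) for meanings in lexicon.values())


print(count_spellings(SENSES))  # 2
print(count_senses(SENSES))     # 7
```

Two spellings become seven items once each meaning is counted separately, which suggests how much an estimate can move on this one criterion alone.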
Get and phrasal verbs
Phrasal verbs are verbs formed of two (or more) parts that express a single concept, such as "run away" or "wake up". Should they be counted as a single word?
Take one of the most frequently used verbs in English – get. Should we count each of the phrasal verbs get at, get away, get back, get by, get in, get off, get on, get over, get through, get up, and many more, as a word in its own right?
Then there are get forms where "get" means "become", such as get fat or get fatter. Should these be one lexeme – get – or an expression, a set phrase, an idiom? In a dictionary, these, and many others, might all be included under the entry get. And what about the inflections: gets, got/gotten, getting?
Prefixes, suffixes and inflections
How should we count words beginning with prefixes such as un-, as in unhappy, untidy, unlikely, many of which are not included in dictionaries because of their apparent obviousness? The same occurs with adverbs ending in -ly, or inflections of nouns (singular and plural), adjectives (comparison) and, as we saw above, the past tenses of most verbs unless they are so irregular as to cause possible confusion. Thus, bad, worse and [the] worst would probably be included as three separate entries, whereas in the case of more regular adjectives such as cold, its regular comparative and superlative – colder, [the] coldest – would probably be included under one single entry: cold.
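The effect of grouping regular inflections under one entry can be sketched in a few lines of Python. The `HEADWORD` table, the word list and the function names are hypothetical, chosen only to mirror the examples in this section. The irregular forms worse and worst are left out of the table on purpose, so that they remain separate entries.

```python
# Hypothetical mapping from regular inflected forms to their headword.
# Irregular forms (worse, worst) are deliberately absent, so they are
# counted as entries of their own.
HEADWORD = {
    "colder": "cold",
    "coldest": "cold",
    "gets": "get",
    "getting": "get",
}

WORDS = [
    "cold", "colder", "coldest",
    "bad", "worse", "worst",
    "get", "gets", "getting",
]


def count_forms(words):
    """Count every distinct written form as a separate word."""
    return len(set(words))


def count_entries(words):
    """Count headwords, folding listed inflections into their entry."""
    return len({HEADWORD.get(word, word) for word in words})


print(count_forms(WORDS))    # 9
print(count_entries(WORDS))  # 5
```

Nine written forms collapse to five entries (cold, bad, worse, worst, get) under this particular grouping; a different decision about irregular forms would give a different total.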
One might also ask if there is a difference between learning single words – big and small – and binomial expressions like black and white, thick and thin, boys and girls, ladies and gentlemen, eggs and bacon, fish and chips, socks and shoes?
Given the above, is there any way that we can talk about useful numbers?
Counting the number of words we use
One solution might be to try to estimate the vocabulary of the average native speaker, but even this presents difficulties, partly because we all have an active and a passive vocabulary and partly because we can often "know" words we have never seen before, either because of their context or because they are made up of parts of words we already know.
Counting the words in a dictionary
One might imagine that simply counting the words in a dictionary would provide the answer, but dictionary makers have to consider all the issues outlined above. Then the question is, "which dictionary?" A medium-sized dictionary may contain some 100,000 entries. The New Oxford Dictionary of English, published in 1998, is the biggest single-volume dictionary and contains 350,000 words, of which 52,000 are scientific and technical words, although it avoids over-technical terminology. On the other hand, the 20-volume OED, the definitive dictionary of the English language, contains over half a million lexemes, many of which are obsolete.
Counting the words in other dictionaries would give other results depending on the objectives of the dictionary creator.