Project #2413 on iSENSEProject.org
This data set is a small subset of word frequency data from Brigham Young University's massive Corpus of Contemporary American English, which contains 180,000 contemporary texts totaling some 520,000,000 words:
http://www.wordfrequency.info/
This list includes the 1,000 most frequently occurring words in the corpus, each with its frequency rank, absolute frequency, dispersion, and part of speech.
Some interesting highlights (frequency rank in parentheses):
Most frequent word, overall: "the" (1)
Top verb: "be" (2)
Top noun: "time" (52)
Top adjective: "other" (75)
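Highlights like these can be recomputed from the data by taking the lowest rank number within each part of speech. The following is a minimal sketch, not the project's own method. It assumes the data has been exported to CSV with the column names Rank, Word, and Part of Speech; the helper name `top_word_per_pos` and the spelled-out part-of-speech labels are hypothetical, and the real data set may use different codes. The sample rows contain only the four words and ranks quoted above.

```python
import csv
import io


def top_word_per_pos(rows):
    """Return {part_of_speech: (rank, word)} for the best-ranked word in each part of speech."""
    best = {}
    for row in rows:
        rank = int(row["Rank"])
        pos = row["Part of Speech"]
        # A lower rank number means a more frequent word.
        if pos not in best or rank < best[pos][0]:
            best[pos] = (rank, row["Word"])
    return best


# Illustrative sample: only the words and ranks quoted in the highlights.
# The part-of-speech labels here are assumptions, not the data set's codes.
sample_csv = """Rank,Word,Part of Speech
1,the,article
2,be,verb
52,time,noun
75,other,adjective
"""

rows = list(csv.DictReader(io.StringIO(sample_csv)))
highlights = top_word_per_pos(rows)
print(highlights["noun"])  # (52, 'time')
```

With the full 1,000-word list, the same function would be fed the rows of the exported file instead of the sample string.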
Name | Units | Type of Data |
---|---|---|
Rank | None | Number |
Frequency | None | Number |
Dispersion | None | Number |
Word | None | Text |
Part of Speech | None | Text |
Rank | Frequency | Dispersion | Word | Part of Speech |