My last post, Wordle Statistics, was already long enough, so rather than adding newly found information to it, I'm making a fresh post. 3Blue1Brown has just released a YouTube video that involves a statistical analysis of Wordle.
Figure 1
CRANE: 4,888,961 FIFTH
TRADE: 110,086,585 FIRST
ERASE: 3,086,642 SIXTH
GRACE: 17,642,126 THIRD
BRAKE: 9,321,885 FOURTH
FRAME: 46,079,991 SECOND
Searching Google for each word in quotes (e.g. "trade") yields the following result counts:
CRANE: 166,000,000 SIXTH
TRADE: 1,930,000,000 SECOND
ERASE: 242,000,000 FIFTH
GRACE: 918,000,000 THIRD
BRAKE: 503,000,000 FOURTH
FRAME: 2,350,000,000 FIRST
Interestingly, in the Google results FRAME and TRADE swap places, with FRAME markedly more frequent (in searches at least). CRANE and ERASE also swap fifth and sixth places.
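The swaps described above can be checked with a few lines of Python. This is only a minimal sketch: the counts are copied from the two lists above, the names `first_counts`, `google_counts`, `rank` and `moved` are my own, and the Google figure for FRAME is assumed to be 2.35 billion.

```python
# Counts copied from the two lists above.
first_counts = {
    "CRANE": 4_888_961,
    "TRADE": 110_086_585,
    "ERASE": 3_086_642,
    "GRACE": 17_642_126,
    "BRAKE": 9_321_885,
    "FRAME": 46_079_991,
}
google_counts = {
    "CRANE": 166_000_000,
    "TRADE": 1_930_000_000,
    "ERASE": 242_000_000,
    "GRACE": 918_000_000,
    "BRAKE": 503_000_000,
    "FRAME": 2_350_000_000,  # assumption: 2.35 billion
}


def rank(counts):
    """Map each word to its 1-based rank, most frequent first."""
    ordered = sorted(counts, key=counts.get, reverse=True)
    return {word: position for position, word in enumerate(ordered, start=1)}


first_rank = rank(first_counts)
google_rank = rank(google_counts)

# Keep only the words whose position differs between the two rankings.
moved = {
    word: (first_rank[word], google_rank[word])
    for word in first_counts
    if first_rank[word] != google_rank[word]
}

for word, (before, after) in sorted(moved.items(), key=lambda item: item[1][0]):
    print(f"{word}: {before} -> {after}")
```

GRACE and BRAKE keep third and fourth place in both rankings, so they don't appear in `moved`.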
I downloaded the CSV file of word frequencies from Kaggle (it's only 5MB) and filtered out every word that was not five letters long. Here are the five-letter words with the highest frequencies:
about 1,226,734,006
other 978,481,319
which 810,514,085
their 782,849,411
there 701,170,205
first 578,161,543
would 572,644,147
these 541,003,982
click 536,746,424
price 501,651,226
state 453,104,133
email 443,949,646
world 431,934,249
music 414,028,837
after 372,948,094
video 365,410,017
where 360,468,339
books 347,710,184
links 339,926,541
years 337,841,309
As can be seen, ABOUT comes out clearly on top with a frequency of over 1.2 billion! This might not be a bad starting word. Anyway, more food for thought when tackling Wordle.
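For anyone wanting to reproduce the filtering step, here is a rough Python sketch. It assumes the CSV has a header row with columns named `word` and `count`, which may not match the actual file. The function name `top_five_letter_words` and the file name `word_frequencies.csv` are hypothetical. The three five-letter rows in the sample use counts from the list above, while the other two rows are made-up placeholders there only to show the filter working.

```python
import csv
import io


def top_five_letter_words(csv_file, limit=20):
    """Return the `limit` most frequent five-letter words as (word, count) pairs.

    Assumes a header row with columns named `word` and `count`.
    """
    reader = csv.DictReader(csv_file)
    rows = [
        (row["word"], int(row["count"]))
        for row in reader
        if len(row["word"]) == 5
    ]
    rows.sort(key=lambda pair: pair[1], reverse=True)
    return rows[:limit]


# Small in-memory stand-in for the downloaded file.
sample = io.StringIO(
    "word,count\n"
    "of,1\n"            # placeholder count, not a real figure
    "other,978481319\n"
    "information,2\n"   # placeholder count, not a real figure
    "about,1226734006\n"
    "which,810514085\n"
)

for word, count in top_five_letter_words(sample, limit=3):
    print(f"{word} {count:,}")
```

With the real download, the same function could be called as `top_five_letter_words(open("word_frequencies.csv", newline=""))`, adjusting the file and column names to whatever the file actually uses.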