Tuesday, April 2, 2013

General Conference Word Frequencies

With General Conference coming up for The Church of Jesus Christ of Latter-day Saints, I thought I'd do some text analysis of past conferences (because that's apparently how I think).  I'll put the code (Python) up on GitHub for anyone who wants to replicate the results for themselves - feel free to ask for help if you can't figure out how to get all the pegs to line up.

So, a pretty straightforward question is "What do they talk about the most at General Conference?" And here is the straight answer I got for such a narrowly scoped question: the top 10 words across all sessions of General Conference dating back to April 1971:


  1. the
  2. of
  3. and
  4. to
  5. in
  6. a
  7. that
  8. i
  9. is
  10. we
Somewhat uninspiring.  Of course we need a baseline to compare against, because the question "What do they talk about?" needs some context: which words come up at General Conference more often than they do in English generally?

Several word frequency lists are maintained here and there on the internet (there's even a massive one at this random school).  Since I'm only looking at the most frequent words, I used this one because it was easy to hack for this project.

One robust way to go about this would be to use an adaptation of TF-IDF vectors - but this is a weeknight hack, so instead I will be content with the following question: of the 1000 most common words in General Conference, which are not among the 1000 most common English words?  I'll rank them in order of their frequency in General Conference.  Here it is:
  1. lord
  2. christ
  3. jesus
  4. world
  5. unto
  6. gospel
  7. priesthood
  8. ye
  9. years
  10. prophet
Apparently, The Church of Jesus Christ of Latter-day Saints talks about Jesus Christ (whom they call Lord, and do so a lot).  Who knew?
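
The comparison above can be sketched roughly as follows.  This is a minimal illustration, not the code from the repository: the function names (`tokenize`, `top_words`, `distinctive_words`), the regex tokenizer, and the toy inputs are all assumptions made for the example.  The tokenizer choice matters, too; this one splits "latter-day" into two words, so its output will not match the lists in this post exactly.

```python
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and return its runs of ASCII letters as tokens."""
    return re.findall(r"[a-z]+", text.lower())


def top_words(texts, n):
    """Return the n most frequent tokens across all texts, most frequent first."""
    counts = Counter()
    for text in texts:
        counts.update(tokenize(text))
    return [word for word, _ in counts.most_common(n)]


def distinctive_words(texts, common_english, n=1000):
    """Keep the top-n corpus words that are absent from the baseline list.

    The result stays in order of frequency within the corpus.
    """
    baseline = {word.lower() for word in common_english}
    return [word for word in top_words(texts, n) if word not in baseline]


# Toy stand-ins for the conference talks and the English frequency list.
talks = ["The Lord and the gospel of the Lord", "The Lord and the world"]
english = ["the", "and", "of", "world"]

result = distinctive_words(talks, english, n=10)
print(result)  # ['lord', 'gospel']
```

In practice `talks` would hold the text of every conference talk and `english` would hold the first 1000 entries of whichever frequency list is used as the baseline.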

For the curious, the next 40 results (ranks 11 through 50, numbered here from 1) are at the end of the post.  Since this analysis was pretty basic, in my next post (which might be tonight if I can't find something better to do), I'll use some topic modeling to get a better grip on the question "What do they talk about at General Conference?"

  1. lives
  2. d
  3. holy
  4. joseph
  5. eternal
  6. words
  7. savior
  8. smith
  9. temple
  10. c
  11. testimony
  12. women
  13. saints
  14. taught
  15. sisters
  16. thou
  17. heaven
  18. pray
  19. mormon
  20. blessings
  21. heavenly
  22. parents
  23. spiritual
  24. mission
  25. brethren
  26. prayer
  27. thy
  28. teach
  29. scriptures
  30. kingdom
  31. amen
  32. commandments
  33. prophets
  34. hearts
  35. families
  36. ghost
  37. latterday
  38. conference
  39. whom
  40. witness