Presidential State of the Union addresses are a format we all think we know pretty well. Every year, near the beginning of a new session of Congress, the president addresses both houses of Congress, laying out an agenda, speaking to the whole country, and, well, describing what state the Union is in. It is a highly ritualized, formal process, and it’s hard to imagine it working any other way.
However, State of the Union addresses, or more accurately State of the Union messages, have not always been spoken. George Washington and John Adams both delivered oral addresses to Congress, but from Thomas Jefferson’s first year in office in 1801 through William Howard Taft’s final message in 1912, the State of the Union was a written message delivered to Congress, not a speech. Woodrow Wilson brought back the spoken address in 1913, and most modern presidents have followed his lead. The last written State of the Union was in 1981.[1]
While State of the Union addresses have been discussed and dissected ad nauseam, digital tools give us a unique opportunity to examine the entire corpus of these speeches and messages for differences in style and content. Based on an initial scan of the historiography, it looks like some digital analysis of these messages has been done, but none of it has focused on the differences between spoken addresses and written messages. My proposal is to take up that analysis and examine how the difference in form might have affected the content of the message. Voyant Tools provides a number of useful options for this type of analysis, from overall word frequency to document length and total word count to correlations between individual words. The ready availability of the text of each of these messages should make it relatively easy to create a corpus for Voyant Tools to analyze.
Obviously, variation in language patterns across 200 years of history is going to have a significant impact on this analysis, as are major world events. For that reason, I’m planning to focus my analysis most heavily on the transitional era between 1913 and 1981, when there was some variation from year to year in whether the president gave a spoken address or delivered a written message. This will allow me to compare the stylistic differences between written and spoken States of the Union within a single presidency, which should help control for some of the other complicating factors.
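As a rough illustration of the within-presidency comparison described above, here is a minimal Python sketch. It is not part of Voyant Tools and does not reproduce its output; it only approximates a few of the same measures (total words, average length, most frequent words). The function names, the record layout (president, mode, text), and the sample sentences are all hypothetical placeholders, not real State of the Union text.

```python
import re
from collections import Counter, defaultdict


def tokenize(text):
    """Lowercase the text and split it into simple word tokens."""
    return re.findall(r"[a-z']+", text.lower())


def summarize(texts):
    """Compute basic length and frequency measures for a group of messages."""
    tokens = [tok for text in texts for tok in tokenize(text)]
    counts = Counter(tokens)
    return {
        "messages": len(texts),
        "total_words": len(tokens),
        "mean_words": len(tokens) / len(texts),
        "distinct_words": len(counts),
        "top_words": counts.most_common(5),
    }


def compare_by_president(records):
    """Group messages by president and delivery mode.

    Only presidents with messages in more than one mode are kept, so that
    each comparison stays within a single presidency.
    """
    groups = defaultdict(lambda: defaultdict(list))
    for president, mode, text in records:
        groups[president][mode].append(text)
    return {
        president: {mode: summarize(texts) for mode, texts in modes.items()}
        for president, modes in groups.items()
        if len(modes) > 1
    }


# Hypothetical placeholder records: (president, delivery mode, message text).
records = [
    ("President A", "spoken", "My fellow citizens, tonight we meet together."),
    ("President A", "written", "The following report is submitted to the Congress."),
    ("President B", "spoken", "We meet tonight."),
]

result = compare_by_president(records)
for president, modes in result.items():
    for mode, stats in modes.items():
        print(president, mode, stats["total_words"], stats["top_words"])
```

In a real run, the records would be built from the full message texts, and common function words would probably need to be filtered out before the frequency lists say much about style.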
[1] Gerhard Peters, “State of the Union Addresses and Messages,” The American Presidency Project, http://www.presidency.ucsb.edu/sou.php.
Computational analysis of the State of the Union speeches is a great idea. It’s a corpus of texts that is relatively straightforward to source, and their regularity and consistency of purpose make them an interesting index for exploring any number of issues.
Comparing the ones that were intended to be spoken with the written messages is an interesting idea. Given that this directly splits the addresses into two groups, it should be relatively straightforward to make these comparisons. It would be interesting to try these in Voyant, and beyond that it may be worth trying any number of other tools for computational analysis of texts.
All in all it seems like an interesting and nicely specific project.