How does digitizing texts change the way we conduct research? Michael Witmore and Jonathan Hope believe that a revolution in literary criticism is at hand, one in which scholars will discover new patterns and arrive at new conclusions.
Their 2007 article “Shakespeare by the Numbers: On the Linguistic Features of the Late Plays” (from Early Modern Tragicomedy) first notes that genre is a nebulous concept, one that has changed over time. Qualitative observations alone cannot accurately determine texts’ themes, since commentators have different standards and will disagree among themselves. How, then, can we create a widely acceptable means of analysis?
Witmore and Hope propose that we rely on a “quantitative analysis of linguistic features” (136). Programs such as Docuscope take literature that has been digitized and allow scholars to search for key words and verb tenses. With this raw data, scholars can more clearly decipher diction and stylistic patterns.
The article examines Shakespeare’s last seven plays, which various commentators since the 1870s have described as “romances” or “tragicomedies” (133). Yet the First Folio, published in 1623, did not break them into a distinct group. What elements within these plays caused later critics to see patterns that Shakespeare’s first editors evidently did not?
Witmore and Hope broke the plays into chunks of 1,000, 2,500, and 7,500 words (to allow for a larger sample size), ran them through Docuscope, and discovered that the later plays had unique linguistic characteristics. 1) Verb Tense: these plays more often used the past tense and referenced the past. 2) Asides: they also had more instances of characters speaking to the audience or referencing outside events. 3) Use of “to be”: characters more often used both forms of the verb “to be” and verbs ending in “-ed.”
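To make the chunk-and-count idea concrete, here is a minimal sketch in Python. It is not Docuscope and not the authors' method: the function names, the list of “to be” forms, and the crude “-ed” test are all assumptions made for illustration, and the sample sentence is invented.

```python
import re

# Assumed list of "to be" forms; a real tagger would be more careful.
BE_FORMS = {"be", "am", "is", "are", "was", "were", "been", "being"}

def tokenize(text):
    """Lowercase the text and split it into simple word tokens."""
    return re.findall(r"[a-z']+", text.lower())

def chunk(tokens, size):
    """Split tokens into consecutive chunks of exactly `size` words.
    A trailing partial chunk is dropped so all chunks are comparable."""
    return [tokens[i:i + size] for i in range(0, len(tokens) - size + 1, size)]

def feature_rates(chunk_tokens):
    """Return rates per 1,000 words for two rough features:
    forms of "to be" and words ending in "-ed" (a crude past-tense proxy
    that will also catch words like "hundred")."""
    n = len(chunk_tokens)
    be = sum(1 for t in chunk_tokens if t in BE_FORMS)
    ed = sum(1 for t in chunk_tokens if t.endswith("ed") and len(t) > 3)
    return {"be_per_1000": 1000 * be / n, "ed_per_1000": 1000 * ed / n}

# Invented sample text, chunked into 5-word pieces for demonstration only.
sample = "It was loved and it is lost. They were moved."
rates = [feature_rates(c) for c in chunk(tokenize(sample), 5)]
```

With real plays, the chunk size would be 1,000 words or more, and the per-chunk rates for the late plays could then be compared against those for the rest of the canon.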
What does this raw data suggest? The authors argue that the prevalence of the past tense reveals the past’s importance to the present, that the asides enhance the “dreamlike” ambiance of the plays, and that the “to be” usage shows a preference for telling, rather than showing, the audience about events and people. Thus, Shakespeare used these linguistic features to create “focalised retrospection” (153), and the quantitative analysis reveals specific reasons why the later plays form a distinct group.
However, Witmore and Hope are more cautious in their general conclusion. They note that such analysis complements, but does not replace, traditional qualitative commentary. The door is wide open, though, for other scholars to use quantitative analysis on myriad other works.
How did you respond to their article? Do you think quantitative analysis of the type they used on Shakespeare’s plays can tell us more about texts and authors’ intentions than we already know? Or are they over-hyping its potential?
Good questions. I think the more works are digitized, the easier it will be to use quantitative analysis, and then we'll be able to actually see the importance of particular rhetoric. I LOVE Google Books, because many of its books allow the researcher to search for terms. Some may think that it doesn't matter how works are written and which particular words are used, but in my opinion rhetoric is extremely important. In my MA thesis, I had various terms counted to see their frequency in the Federalist Papers. By running regression analysis, I was able to show how the use of the term "union" had statistical significance over time when compared to the words "empire," "nation," "republic," etc. and was clearly used intentionally with greater frequency in the first publications.
What does that tell us? "Union" was a powerful political word in early America that was used as a tool in obtaining support for the Constitution. Perhaps we should look through political rhetoric in the decades following to see how words like "union" continued to be used and where they failed and where they succeeded.
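For readers curious what a term-frequency trend like the one described for "union" might look like in practice, here is a rough Python sketch. It is only an illustration under stated assumptions: the function names are hypothetical, the three "documents" are invented strings rather than Federalist Papers text, and it computes only a least-squares slope, not the significance test a full regression analysis would report.

```python
import re

def rate_per_1000(text, term):
    """Occurrences of `term` per 1,000 words in `text` (case-insensitive)."""
    tokens = re.findall(r"[a-z]+", text.lower())
    if not tokens:
        return 0.0
    return 1000 * tokens.count(term) / len(tokens)

def ols_slope(xs, ys):
    """Ordinary least-squares slope of ys against xs."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    return sxy / sxx

def trend(docs, term):
    """Slope of the term's rate across documents in publication order.
    A negative slope means the term is used less in later documents."""
    rates = [rate_per_1000(d, term) for d in docs]
    order = list(range(1, len(docs) + 1))
    return ols_slope(order, rates)

# Invented toy documents, listed in (pretend) publication order.
toy_docs = [
    "union union nation",
    "union nation nation",
    "nation nation nation",
]
union_trend = trend(toy_docs, "union")
```

Running the same function for "empire," "nation," and "republic" would give slopes that can be compared side by side; judging whether any difference is statistically significant would need a proper statistics package.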
Dennis and Tracie: thanks for your comments! I agree that using quantitative analysis on political rhetoric offers vast potential and, Dennis, your M.A. thesis on the Federalist Papers sounds fantastic. What were your other big conclusions?
I also think that quantitative analysis would be great for newspaper articles. Publishers in the U.S. Revolutionary Era and Early Republic were not from the upper class; perhaps they used different key words in their newspapers/pamphlets/almanacs than did the political elites who wrote the Federalist Papers (Hamilton, Madison, & Jay). Does anyone know if a large-scale effort has been made to digitize newspaper articles and make them widely available (i.e. for free, unlike subscription-based businesses such as ProQuest)?
Great post! I love the title.
So, to quantify "to be," or not to quantify "to be": that is the question? Sorry, I couldn't resist using a little lame Shakespeare humor.
I have to agree with Dennis that analysis of word choice and repetition provides the historian with valuable insight into authorial intent and conscious rhetorical design. Quantitative analysis can reveal when certain word choices came into vogue, draw attention to possible changes in social psychology, and allow the historian to qualify the "why" behind the "when." I do, however, believe that this type of linguistic study is most effective with political rhetoric like that mentioned in Dennis' post, where documents were/are carefully crafted for greatest political effectiveness.