The works of Shakespeare have been performed, studied, and marveled over for long enough that you might think it impossible for scholars to find anything new to say about them. “Shakespeare by the Numbers: On the Linguistic Texture of the Late Plays,” however, is proof that modern software applications at worst give scholars new ways to validate old ideas, and at best give them entirely new ways to generate meaning from well-known sources.
In this article, Michael Witmore and Jonathan Hope describe how the textual analysis software “Docuscope” was used to test the theory, long held by Shakespeare scholars, that the Bard’s late plays are in some way stylistically different from his earlier tragedies, comedies, and histories, and so constitute a coherent genre unto themselves. To quote Witmore and Hope, this type of analysis “calls attention to a heretofore invisible set of dramaturgical strategies at work in the late plays, strategies that mobilize language so consistently and on such a pervasive verbal level that their effects have gone unnoticed by more traditional genre criticism.”
Docuscope works by first categorizing word sequences in the text of Shakespeare’s plays into distinct groupings that begin with three broad clusters, narrow to families, and finally to distinct and specific “Language Action Types”. Docuscope then compares the relative frequency of these word-pattern usages to their frequencies in other genres, providing insight into the specific language choices Shakespeare made that differentiate genres. Using this tool, Witmore and Hope are able to perceive patterns of evidence that are nearly impossible to notice without assistance.
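The two steps described above (tag word sequences with a category, then compare how often each category turns up in different groups of text) can be sketched in a few lines of Python. This is only a toy illustration, not Docuscope itself: the `PATTERNS` dictionary, its category labels, and the function names are all invented for the example, and the real tool's dictionaries are far larger and organized into clusters, families, and Language Action Types.

```python
from collections import Counter

# Hypothetical mini-dictionary mapping word sequences to a category label.
# These patterns and labels are invented for illustration; they are NOT
# Docuscope's actual Language Action Types.
PATTERNS = {
    ("i", "remember"): "Recollection",
    ("long", "ago"): "Recollection",
    ("i", "will"): "Intention",
    ("must",): "Insistence",
}
MAX_LEN = max(len(p) for p in PATTERNS)
PUNCT = ".,;:!?'\""


def tokenize(text):
    """Lowercase the text and strip surrounding punctuation from each word."""
    words = (w.strip(PUNCT).lower() for w in text.split())
    return [w for w in words if w]


def category_rates(text):
    """Return each category's frequency per 1,000 words of the text."""
    tokens = tokenize(text)
    counts = Counter()
    i = 0
    while i < len(tokens):
        # Try the longest pattern first so multi-word sequences win.
        for n in range(MAX_LEN, 0, -1):
            key = tuple(tokens[i:i + n])
            if key in PATTERNS:
                counts[PATTERNS[key]] += 1
                i += n
                break
        else:
            i += 1  # no pattern starts at this word
    total = len(tokens) or 1
    return {cat: 1000 * c / total for cat, c in counts.items()}


def compare(rates_a, rates_b):
    """Difference in per-1,000-word rates (a minus b) for every category."""
    cats = set(rates_a) | set(rates_b)
    return {c: rates_a.get(c, 0.0) - rates_b.get(c, 0.0) for c in cats}
```

Calling `compare(category_rates(late_text), category_rates(early_text))` on two bodies of text would give a positive number for categories the first group leans on more heavily. Deciding whether such a difference is meaningful takes statistical testing that this sketch leaves out.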
Docuscope helped Witmore and Hope discover that the late plays do indeed share distinct language choices that also differentiate them from earlier tragedies, comedies, and histories. More meaningfully, the features that Docuscope highlighted provide insight into which themes Shakespeare focused on in the late plays; as a group “they make way for inner life and revelation through memory and recognition … they subordinate the declaration of actions present and past to the stillness of judgment.”
Never fear: if Shakespeare or scholarly articles aren’t your bag, the radio show and podcast RadioLab featured a similar type of inquiry in May 2010. In the short titled “Vanishing Words,” Dr. Ian Lancashire describes his computer-based textual analysis of Agatha Christie’s works. In a startling conclusion, Lancashire provides evidence that Christie was suffering from Alzheimer’s later in life: her 73rd book displays a loss of a fifth of her vocabulary, along with other clues. This story runs from 2:11 to 8:28, but the episode goes on to discuss the possibility of recognizing these diagnostic clues much earlier in life. It’s extremely interesting, and I highly recommend it. Anyway…
Here’s the punchline: “Vanishing Words” and “Shakespeare by the Numbers” share a promise of possibility for scholars. Text-analysis software, when programmed to answer historical questions, can uncover hidden meaning in long-studied sources that lies beyond what a single mind can comprehend. As Witmore and Hope put it: “Docuscope may prove instructive to future scholars who want to understand the usefulness of ‘counting things’ in humanistic inquiry - quantity being perhaps one of the last concepts in the humanities which has not come in for rigorous theorization.”
Furthermore, this type of software can augment the capabilities of scholars at a time when they will sorely need it: the same technology that makes it possible to search a corpus for meaning is also enabling the creation of ever-larger digital corpora.