Reading Jockers’ Macroanalysis, in which the author analyzes a large corpus of Irish literature to draw conclusions about Irish authors, the production of Irish literature, its themes, and related topics, inspired me to consider a similar project in my own area of literary interest: mystery fiction. In 2017, I read 50 mystery novels, with publication dates ranging from 1868 to 2017 but concentrated in the Golden Age of Detective Fiction in the 1920s and 30s, and engaged in some light analysis of the texts of the kind Jockers would characterize as “close reading.” While it was an interesting way to draw conclusions about mystery fiction, it was neither efficient (it took me all year) nor truly representative (I chose books I felt I’d like and ignored those I felt I wouldn’t).
Macroanalysis inspired me to consider other tools for analyzing the world of mystery fiction. My previous analysis blended the quantitative (how many characters died?) with the qualitative (how good was the book, rated on a five-star scale?) to come up with subjective qualitative conclusions. (Seven out of 11 books to which I gave five-star reviews had 0 or 1 deaths; therefore, most great mystery novels are not bloodbaths.)
But with digital macroanalytic tools, wider questions with deeper implications become available for our consideration and study. For instance: What crimes have interested readers throughout time? Within the past few years, we’ve seen two major patterns appear in popular mystery writing: true crime (such as Serial and I’ll Be Gone in the Dark) and psychological thrillers (such as Gone Girl and The Girl on the Train). What patterns have emerged at other periods in history?
This is a huge question that would require a lot of work to answer, even with macroanalytic tools. The study faces two substantial obstacles: first, the literary mystery corpus is unmanageably vast, and second, as Jockers discusses toward the end of his book, copyright laws block us from accessing that corpus easily.
In order to combat the problem of an impracticably large corpus, I propose a more manageable project focusing solely on Agatha Christie’s novels. A remarkably prolific author, Christie published 66 novels under her own name between 1920 and 1976 and achieved an exceptionally wide readership—as the back of all recently published Christie novels will remind you, her novels have been more widely published than any works other than Shakespeare and the Bible. Although Christie cannot hope to represent the entire mystery fiction genre, if one author must be chosen as a representative, she seems the best suited to the task.
The copyright issue is a considerable stumbling block here: with some exceptions, the full text of Christie’s novels has not been digitized. Her first three books are available through Project Gutenberg, and a few others can be read in plain text through the Internet Archive’s Open Library, but owing to the state of copyright laws, the books that are available in these digitized formats are generally Christie’s earliest, which is unhelpful to a project that seeks to track change over time. To bypass this issue, I propose to focus my study on summaries. Although a mystery novel’s summary will naturally not include everything that happens in the book, it will generally identify the crime that drives the story, which is the subject of my inquiry. Synopses are included in ONIX metadata and, especially for constantly reprinted books like Christie’s works, multiple summaries for each book proliferate online on any number of sites, from WorldCat to publisher pages, Goodreads, and more.
After compiling these summaries, I propose to use MALLET’s topic modeling capabilities to examine the trends and patterns that show up in the summaries of Christie’s novels over time. Wordle could also be a useful tool, at least for determining which individual words appear most often in the summaries. Given the enduring popularity and wide readership of Christie’s works, such an analysis will offer a window into the types of mystery and crime stories that captured the public’s attention over a period of more than five decades.
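As a rough illustration of the simpler, Wordle-style word counting (not of MALLET’s topic modeling), here is a minimal Python sketch that groups summaries by decade of publication and tallies the most frequent words in each decade. Everything in it is an assumption made for illustration: the function names, the tiny stopword list, and especially the sample strings, which are made-up placeholders rather than real summaries of any Christie novel.

```python
import re
from collections import Counter

# Hypothetical, deliberately tiny stopword list; a real project would use a fuller one.
STOPWORDS = {"a", "an", "the", "of", "in", "on", "at", "is", "and", "to", "by"}


def tokenize(text):
    """Lowercase the text and return its alphabetic words, minus stopwords."""
    words = re.findall(r"[a-z]+", text.lower())
    return [w for w in words if w not in STOPWORDS]


def counts_by_decade(summaries):
    """Map each decade (e.g. 1920) to a Counter of the words in its summaries.

    `summaries` is a list of (publication_year, summary_text) pairs.
    """
    by_decade = {}
    for year, text in summaries:
        decade = (year // 10) * 10
        by_decade.setdefault(decade, Counter()).update(tokenize(text))
    return by_decade


# Placeholder strings for illustration only; NOT real book summaries.
toy_summaries = [
    (1923, "A poisoning at a country house."),
    (1928, "A poisoning on a train."),
    (1951, "A stabbing in a quiet village."),
]

result = counts_by_decade(toy_summaries)
for decade in sorted(result):
    # Show the three most frequent words for each decade.
    print(decade, result[decade].most_common(3))
```

With real summaries in place of the placeholders, comparing the top words from decade to decade would give a first, crude look at which crimes come and go, and the per-decade word totals would also show how much text there actually is to work with.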
As with all other types of media, the books that people read reflect information about their culture and the society they live in. Mystery fiction in particular offers a fascinating look into people’s fears, their concerns about their society, and the threats they perceive in the world around them. My hope is that studying the literary mystery corpus will suggest some insights into the culture of Christie’s readership.
One Reply to “Print project proposal: Shifting subjects in popular mystery fiction”
Looking at subjects in mystery fiction is a really neat idea. Focusing on Agatha Christie is a great way to narrow this down, but to your point, it presents its own challenges given that you don’t have access to the full text of the works. Shifting to synopses is a sharp idea. With that noted, my understanding is that for MALLET to be really useful, you generally want to be working from a lot of text. I’m not sure that you would get enough text to work from with that approach.
That said, I think there are a few other approaches you could explore. One would be to look more at reactions to either a specific set of authors or to their works. For example, you could take the names of the authors themselves and use the Google Books Ngram Viewer and the Time Magazine corpus as tools for exploring trends in how the authors are discussed and the terms that show up near them.
Another option could be to try to get some bulk data from Goodreads (you can see what kinds of data you can get from Goodreads here: https://www.goodreads.com/api ). That is, if you wanted to look at how contemporary audiences are engaging with mystery fiction from different periods, you could get access to a considerable number of book reviews and ratings from Goodreads and then work with that text.