As I mentioned in my post on Jockers’ Macroanalysis: Digital Methods and Literary History, the methods described in his book have interesting implications for the study of U.S. foreign policy. For example, if one were to study the Eisenhower Administration’s Middle Eastern policy from Eisenhower’s inauguration to his farewell address, the sheer volume of diplomatic correspondence alone would be mind-blowing. Even if one were to examine only the State Department’s Foreign Relations of the United States (FRUS) volumes relating to the Middle East (excluding Northern Africa with the exception of Egypt and Southeast Asia), one would have to explore thirteen different volumes. And even if a historian could closely read all the documents contained in those volumes, it would be hard to see the forest for the trees.
Computational analysis is one method of seeing that forest. Topic modeling, in particular, allows you to analyze a large corpus of texts and identify groups of words that tend to appear together across many documents. What sets topic modeling apart from programs like Ngram is that it can be done without knowing ahead of time which topics are the most important. As Cameron Blevins writes in “Topic Modeling Martha Ballard’s Diary,” topic modeling has a lot of potential as a method for analyzing historical source material. He concluded that the MAchine Learning for LanguagE Toolkit, or MALLET, did a better job of grouping words than a human reader, in some cases creating word groups that he never would have predicted. This methodology allows a historian to extract patterns that would be missed during a microanalysis of the text.
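To make the idea of "words that appear together" a little more concrete, here is a minimal sketch in Python. It is not LDA and not what MALLET does internally; it only counts how many documents contain each pair of words, which is the raw signal a topic model builds on. The four short "documents" are hypothetical stand-ins I made up for illustration, not text from FRUS.

```python
from collections import Counter
from itertools import combinations

# Hypothetical toy documents; a real run would use the text of FRUS documents.
documents = [
    "canal suez egypt nasser canal",
    "oil iran company oil",
    "egypt nasser canal aswan",
    "iran oil shah company",
]

def cooccurrence_counts(docs):
    """Count, for each pair of words, how many documents contain both words."""
    pair_counts = Counter()
    for doc in docs:
        # Use the set of words so a repeated word counts once per document.
        vocabulary = sorted(set(doc.split()))
        pair_counts.update(combinations(vocabulary, 2))
    return pair_counts

counts = cooccurrence_counts(documents)

# Show the pairs that share the most documents.
for pair, n in counts.most_common(6):
    print(pair, n)
```

Even in this toy case, the Egypt-related words and the Iran-related words pair up with each other and never across the two groups. A topic model formalizes that intuition probabilistically and at a scale no reader could manage by hand.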
Using topic modeling to analyze FRUS has the potential to reveal a great deal about U.S. foreign policy in the Middle East. What messages were diplomats most concerned about conveying to foreign dignitaries? What were the most frequent or concerning issues that faced policymakers? Did those patterns shift with each new presidential administration, and how did they correlate with events on the ground or shift during election years? These questions would be difficult for a single historian to answer through a close reading of the documents in FRUS, but they become answerable with the use of topic modeling software.
There are limits to this, of course. MALLET sorts words into a fixed number of topics using an unsupervised model. This can often create what Jockers calls a black box. The goal of a historian is to be able to interpret the results and draw conclusions, but some topics may be incomprehensible. This doesn’t necessarily have to be a bad thing. Some of the topics produced will be unclear or misleading due to the presence of ‘stop words,’ but the clear topics can still be conceptualized and interpreted. As Jockers states, “we do no disservice to the overall model, and we in no way compromise our analysis” (129).
Therefore, my print project will use MALLET to perform LDA topic modeling of the Eisenhower Administration’s foreign policy in the Middle East. Using the State Department’s publications of FRUS between 1953 and 1960 relating to Mideast policy, I will explore which topics are most prevalent throughout that period. Depending on time and the availability of resources, I will also compare topics from the Eisenhower Administration to those of the Truman, Kennedy, Johnson, Nixon, Ford, and Carter Administrations to examine how topics have shifted or remained important at different stages of the Cold War and during different presidential administrations.
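As a rough sketch of what that workflow might look like, the Python below only assembles the two MALLET commands involved (importing a folder of text files, then training a topic model) and prints them. The paths and file names are hypothetical placeholders of mine, and the flags reflect my understanding of MALLET's command-line documentation, so they should be checked against the installed version before running anything.

```python
# Assumed locations; none of these paths come from the project itself.
MALLET = "bin/mallet"                # path to the MALLET launcher (assumption)
CORPUS_DIR = "frus_eisenhower_txt"   # hypothetical folder, one .txt file per document
MODEL_FILE = "frus.mallet"           # hypothetical name for the imported corpus

def import_command(corpus_dir, model_file):
    """Build the command that converts a folder of text files into MALLET's format."""
    return [
        MALLET, "import-dir",
        "--input", corpus_dir,
        "--output", model_file,
        "--keep-sequence",     # keep word order, which topic training expects
        "--remove-stopwords",  # drop MALLET's default English stop words
    ]

def train_command(model_file, num_topics):
    """Build the command that trains a topic model with a chosen number of topics."""
    return [
        MALLET, "train-topics",
        "--input", model_file,
        "--num-topics", str(num_topics),
        "--output-topic-keys", f"keys_{num_topics}.txt",
        "--output-doc-topics", f"composition_{num_topics}.txt",
    ]

# One import step, then one training run per topic count to compare.
commands = [import_command(CORPUS_DIR, MODEL_FILE)]
commands += [train_command(MODEL_FILE, k) for k in (20, 40)]

for command in commands:
    print(" ".join(command))
```

To actually run a command, each list could be passed to `subprocess.run`. Training with more than one topic count is deliberate: the number of topics is a choice the researcher makes, and comparing the resulting keyword files is one way to judge which setting produces interpretable topics.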
One Reply to “A Macroanalysis of FRUS: Topic Modeling Middle Eastern Policy”
Applying topic modeling to the FRUS corpus seems like a promising idea. As happened with Blevins’s research, I imagine just tinkering with the way that a tool like MALLET clusters parts of the texts will open interesting questions and issues to explore. As you note, the number of topics MALLET clusters a text into is a somewhat arbitrary choice. But the good news on that front is that even just by tinkering with that number and with things like stop words, you can start to surface different kinds of patterns and trends to work from.
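As a rough illustration of the stop-word tinkering, here is a minimal Python sketch that flags words appearing in every document as candidates for a custom stop list. Diplomatic boilerplate such as "department" or "telegram" is the kind of thing I would expect it to catch. The three sample sentences are invented stand-ins, not FRUS text, and the threshold is an arbitrary starting point.

```python
from collections import Counter

# Hypothetical stand-ins for FRUS documents.
documents = [
    "the department of state telegram to the embassy in egypt",
    "the department of state memorandum on iran oil",
    "telegram from the embassy in iraq to the department of state",
]

def candidate_stopwords(docs, min_share=1.0):
    """Return words that occur in at least `min_share` of the documents."""
    doc_freq = Counter()
    for doc in docs:
        # Count each word once per document (document frequency).
        doc_freq.update(set(doc.lower().split()))
    threshold = min_share * len(docs)
    return sorted(word for word, n in doc_freq.items() if n >= threshold)

extra = candidate_stopwords(documents)
print(extra)  # ['department', 'of', 'state', 'the']
```

A list like this would still need a human look before use, since a word that appears everywhere may be exactly the one you care about. As far as I know, MALLET's import step accepts a file of additional stop words through its `--extra-stopwords` option, so the reviewed list could be saved to a text file and passed in that way.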
I think the two biggest things for pulling this off are that 1) you are going to need to get a good, high-quality copy of the text of FRUS and 2) you are going to need to get up to speed on how to use MALLET to do topic modeling. On the latter point, there is some good documentation, along with tutorials, on how to use MALLET, but it is a tool that takes some work to get used to. So you would want to make time for that. On the first point, it’s really a question of where you can get access to the data and whether you can get it in a form that works well with MALLET.