Alright kids, buckle in! Today I’m talking about the TIME Magazine Corpus, which is run by BYU and was first released in May of 2007. The TIME Magazine Corpus allows you to explore how the usage of a word has changed over time, based on 100 million words from about 275,000 articles from TIME magazine. Not only can you look at the frequency of a word’s usage, you can also investigate which words are used most often in conjunction with it. For example, if you search the word “civil,” you can see how often it was used in the magazine between the 1920s and the 2000s, and how often it appeared alongside “war,” “rights,” or a bevy of other nouns.
If you’ve never used a system like this before, it is slightly less than intuitive; a lot of my learning was through trial-and-error searches, which used up quite a few of the 50 queries a day I was allocated as a “nonresearcher.” However, scattered throughout the interface are helpful example searches, tips, and tricks to help you get the most out of your searches.
So how do you use it?
First of all, you need to make an account. The corpus lets you make 10 – 15 queries before requiring this step, but as a registered user you can make more queries. Importantly, you can also save your searches to return to later.
The home page also offers a “five minute tour” to get you situated before you begin. This is helpful and I highly recommend it, although the help box on the search page is where I learned most of my querying skills.
As you research a word, you have five different search options: List, Chart, Collocates, Compare, and KWIC.
In this post I’ll go briefly into each option. Please note that there are much more complicated searches you can do using this tool. For the purposes of this blog, and because I am learning along with you, I will be going over the most basic options.
The first is “List.” Searching a word here takes you to the frequency page, which shows you how many times a word has appeared in TIME Magazine. If you select “Context” after searching, the screen will show you how the word appeared in the sentence it was located in.
If you wish to add complexity to your search, you can try adding a search layer. For example, if you search a word plus a part of speech, such as “civil” + NOUN, it will show the frequency of the nouns that most often follow the word “civil.”
The second option is “Chart,” which acts similarly to the “List” option, but gives the researcher a chart and bar graph, and allows the researcher to break down the frequency by year within a decade.
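To make the idea behind “List” and “Chart” concrete, here is a tiny Python sketch of counting a word per decade. To be clear, I have no idea how the corpus does this behind the scenes; the `frequency_by_decade` function and the list of (year, text) pairs are my own made-up stand-ins, not anything the corpus provides.

```python
import re
from collections import Counter


def frequency_by_decade(articles, word):
    """Count how often `word` appears in each decade.

    `articles` is a list of (year, text) pairs. This toy format is an
    assumption for illustration, not the corpus's real data format.
    """
    counts = Counter()
    target = word.lower()
    for year, text in articles:
        decade = (year // 10) * 10  # e.g. 1925 -> 1920
        tokens = re.findall(r"[a-z']+", text.lower())
        counts[decade] += tokens.count(target)
    return dict(sorted(counts.items()))


# Invented sample sentences, not real TIME articles.
sample = [
    (1925, "The civil war shaped the nation."),
    (1928, "A civil servant spoke."),
    (1964, "Civil rights leaders marched; civil rights bills passed."),
]

print(frequency_by_decade(sample, "civil"))
```

The real tool layers a chart and bar graph on top of counts like these, but the underlying tally is the same basic idea.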
The third option, “Collocates,” acts similarly to the “word” + NOUN option under List. Collocates are words that often occur near other words. So in this case, a collocate of “civil” would be “war” or “rights.” This option will show you which words are most often placed before or after the word you are searching for. If you want to see how collocates have changed over the years, you can sort your results by decade and see when each collocate was most frequent.
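If it helps to see what “words that often occur near other words” means in practice, here is a minimal sketch of counting collocates in Python. It is only my guess at the general idea: the `collocates` function, the `window` parameter, and the sample sentence are all hypothetical, and the corpus itself surely does something more sophisticated.

```python
import re
from collections import Counter


def collocates(text, node, window=1):
    """Count words appearing within `window` words before or after `node`."""
    tokens = re.findall(r"[a-z']+", text.lower())
    counts = Counter()
    for i, token in enumerate(tokens):
        if token == node:
            start = max(0, i - window)
            end = min(len(tokens), i + window + 1)
            for j in range(start, end):
                if j != i:  # skip the search word itself
                    counts[tokens[j]] += 1
    return counts


# Invented sample text, not real TIME articles.
sample_text = (
    "The civil war ended. Civil rights marches began. "
    "The civil rights act passed."
)

for word, count in collocates(sample_text, "civil").most_common():
    print(word, count)
```

Notice that a naive count like this also picks up filler words such as “the,” which is one reason the option to restrict collocates to a part of speech (like NOUN) is so useful.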
The fourth option, “Compare,” allows you to compare the collocates of two words. For example, you might search “civil” and “national” to compare which words often appear nearby.
Finally, the fifth option is KWIC (Key Word In Context). As with Collocates, this shows the words that surround the word you are researching. However, KWIC shows a larger pattern and provides more context than a simple pairing of two words.
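A KWIC display is easier to picture with an example, so here is a rough Python sketch that lines up a search word with a few words of context on either side. The `kwic` function and its `width` parameter are names I invented for illustration; this is a toy version of the concept, not the corpus's actual output format.

```python
def kwic(text, node, width=2):
    """Return each occurrence of `node` with `width` words of context per side."""
    tokens = text.split()
    lines = []
    for i, token in enumerate(tokens):
        # Compare case-insensitively and ignore attached punctuation.
        if token.lower().strip(".,;:!?\"'") == node:
            left = " ".join(tokens[max(0, i - width):i])
            right = " ".join(tokens[i + 1:i + 1 + width])
            lines.append(f"{left} [{token}] {right}".strip())
    return lines


# Invented sample sentence, not a real TIME article.
sample_sentence = "After the civil war ended, the civil rights movement grew."

for line in kwic(sample_sentence, "civil"):
    print(line)
```

Widening `width` gives you more of the sentence around each hit, which is the “larger pattern” that sets KWIC apart from a plain collocate count.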
Using the TIME Magazine Corpus was both exciting and frustrating in equal measure. I could see how it could be useful: there were so many searches possible, so many combinations of parts of speech, time frames, collocates, and phrases. The examples given in the help box grab your attention. For example, if you search “*dom in 1920 – 1940s,” the corpus provides you a list of all the words ending in “dom” used during that period.
Or, you could search “nouns near chip in 1980s – 90s vs. 1940s – 50s.” (This one was a particular favorite of mine.)
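For the curious, the “*dom” wildcard search can be imitated in a few lines of Python. This is strictly a sketch under my own assumptions: the `wildcard_words` function and the (year, text) sample data are hypothetical, and I am using Python's standard `fnmatch` module for the `*` matching rather than whatever the corpus uses.

```python
import re
from fnmatch import fnmatchcase


def wildcard_words(articles, pattern, start_year, end_year):
    """List the distinct words matching `pattern` within a range of years.

    `articles` is a list of (year, text) pairs, a made-up format for
    illustration only.
    """
    found = set()
    for year, text in articles:
        if start_year <= year <= end_year:
            for token in re.findall(r"[a-z']+", text.lower()):
                if fnmatchcase(token, pattern):  # "*" matches any characters
                    found.add(token)
    return sorted(found)


# Invented sample sentences, not real TIME articles.
sample_articles = [
    (1925, "Freedom and wisdom were prized."),
    (1938, "The kingdom fell."),
    (1965, "Boredom set in."),
]

print(wildcard_words(sample_articles, "*dom", 1920, 1949))
```

Here “boredom” is left out because its made-up article falls outside the 1920 to 1949 range, which mirrors how the real search limits results to the period you pick.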
These are incredible depictions of how the words we use change over time, and how some words are more important at one time than another. In my searches I could see when the word “civil” changed from being inherently connected to “war” and started referring to “rights” and that’s really cool!
But I was also really frustrated by my inability to use the tool correctly. The TIME Magazine Corpus uses a complicated interface (for example, you can’t use the “back” button in your browser, so you need to retrain yourself to use their navigation bar), and sometimes it isn’t clear what each search option means. In addition, refining your search takes very specific query syntax that was often, at face value, unclear to me. I’m sure spending more time on the platform would make it easier to use, but with only 50 queries per 24 hours (per email – I ended up using two to get around the rules and spend more time learning) it is hard for a novice to learn the rules of the game and how to create an intricate, enlightening search. To the creators’ credit, they do offer A LOT of helpful links, instructions, and example searches, but the learning curve is definitely steep.
What do you guys think? Should I spend more time exploring the corpus, or should I switch over to the Google Ngram Viewer, where Jonah seemed to have a much more productive time? Do you have any tips on how to make the search process easier, or words that you would be particularly interested in learning about?