Alright kids, buckle in! Today I’m talking about the TIME Magazine Corpus, which is run by BYU and was first released in May of 2007. The TIME Magazine Corpus allows you to explore how the usage of a word has changed over time, based on 100 million words from about 275,000 articles from TIME magazine. Not only can you look at the frequency of a word’s usage, you can also investigate what words are used most often in conjunction with it. For example, if you search the word “civil,” you can see how often it was used in the magazine between the 1920s and the 2000s, as well as how often it was used in conjunction with the words “war” or “rights” as well as a bevy of other nouns.
If you’ve never used a system like this before, it is slightly less than intuitive; a lot of my learning was through trial-and-error searches, which used up quite a few of the 50 queries a day I was allocated as a “nonresearcher.” However, scattered throughout the interface are helpful searches, tips, and tricks to help you get the most out of your searches.
So how do you use it?
First of all, you need to make an account. The corpus lets you make 10 – 15 queries before requiring this step, but as a registered user you can make more queries. Importantly, you can also save your searches to return to later.
The home page also offers a “five minute tour” to get you situated before you begin. This is helpful and I highly recommend it, although the help box on the search page is where I learned most of my querying skills.
As you research a word, you have five different search options: List, Chart, Collocates, Compare, and KWIC.
In this post I’ll go briefly into each option. Please note that there are much more complicated searches you can do using this tool. For the purposes of this blog, and because I am learning along with you, I will be going over the most basic options.
The first is “List.” Searching a word here takes you to the frequency page, which shows you how many times a word has appeared in TIME Magazine. If you select “Context” after searching, the screen will show you how the word appeared in the sentence it was located in.
If you wish to add complexity to your search, you can try adding a search layer. For example, if you search for “civil” + NOUN, the results will show the frequency of the nouns that usually follow the word “civil.”
The second option is “Chart,” which acts similarly to the “List” option, but gives the researcher a chart and bar graph, and allows the researcher to break down the frequency by year within a decade.
The third option, “Collocates,” acts similarly to the “word” + NOUN option under List. Collocates are words that often occur near other words. So in this case, a collocate of “civil” would be “war” or “rights.” This option will show you which words are most often placed before or after the word you are searching for. If you want to see how collocates have changed over the years, you can sort your results by decade and see when each collocate was most frequent.
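If it helps to see the idea of a collocate in code, here is a minimal Python sketch. It is not how the corpus itself works (its internals aren’t described anywhere in this post); the function name `collocates`, the `window` parameter, and the sample sentences are all made up for illustration. It simply counts which words land within one word of the target, ignoring sentence boundaries.

```python
import re
from collections import Counter


def collocates(text, target, window=1):
    """Count the words that appear within `window` words of `target`."""
    # Lowercase everything and split into simple word tokens.
    words = re.findall(r"[a-z']+", text.lower())
    counts = Counter()
    for i, word in enumerate(words):
        if word != target:
            continue
        start = max(0, i - window)
        end = min(len(words), i + window + 1)
        for j in range(start, end):
            if j != i:  # skip the target word itself
                counts[words[j]] += 1
    return counts


# Invented sample text, not taken from TIME.
sample = (
    "The civil war ended. Civil rights leaders marched. "
    "A civil war is costly. Civil rights matter."
)
neighbors = collocates(sample, "civil")
print(neighbors["war"], neighbors["rights"])
```

In this toy text, “war” and “rights” each sit next to “civil” twice, which is the same kind of pairing the Collocates option surfaces across the whole magazine archive.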
The fourth option, “Compare,” allows you to compare the collocates of two words. For example, you might search “civil” and “national” to compare what words are often nearby.
Finally, the fifth option is KWIC (Key Word In Context). As with Collocates, this shows the words that are often used around the word you are researching. However, KWIC reveals a larger pattern, and provides a context larger than the pairing of two words.
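A key-word-in-context view can be sketched in a few lines of Python as well. Again, this is only an illustration under assumptions: the `kwic` function, its `width` parameter, and the sample sentences are hypothetical, and the real tool’s display is richer than this. The idea is just to show each hit with a few words on either side.

```python
import re


def kwic(text, target, width=3):
    """Return each occurrence of `target` with `width` words of context per side."""
    words = re.findall(r"[A-Za-z']+", text)
    lines = []
    for i, word in enumerate(words):
        if word.lower() == target.lower():
            left = " ".join(words[max(0, i - width):i])
            right = " ".join(words[i + 1:i + 1 + width])
            # Brackets mark the key word so it stands out in each line.
            lines.append(f"{left} [{word}] {right}".strip())
    return lines


# Invented sample text, not taken from TIME.
sample = (
    "Congress debated the civil rights bill for months. "
    "Veterans of the Civil War gathered."
)
for line in kwic(sample, "civil"):
    print(line)
```

Lining the hits up this way is what makes the surrounding pattern easier to spot than a plain list of word pairs.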
Using the TIME Magazine Corpus was both exciting and frustrating in equal measure. I could see how it could be useful: there were so many searches possible, so many combinations of parts of speech, time frames, collocates, and phrases. The examples given in the help box grab your attention. For example, if you search “*dom in 1920 – 1940s,” the corpus provides you a list of all the words ending in “dom” used during that period.
Or, you could search “nouns near chip in 1980s – 90s. vs 1940s – 50s.” (This one was a particular favorite of mine.)
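To get a feel for what a wildcard search like “*dom” over a span of years is doing, here is a rough Python sketch. It assumes a hypothetical list of (year, word) pairs, which is invented here and is not the corpus’s actual data format, and it uses the standard library’s `fnmatchcase` to handle the `*` wildcard.

```python
from fnmatch import fnmatchcase


def wildcard_matches(entries, pattern, start_year, end_year):
    """Return the distinct words matching `pattern` used between two years."""
    found = set()
    for year, word in entries:
        if start_year <= year <= end_year and fnmatchcase(word.lower(), pattern):
            found.add(word.lower())
    return sorted(found)


# Invented (year, word) pairs for illustration only.
entries = [
    (1925, "freedom"),
    (1938, "kingdom"),
    (1931, "random"),
    (1962, "stardom"),  # matches the pattern but falls outside the years
    (1929, "domino"),   # inside the years but does not end in "dom"
]
print(wildcard_matches(entries, "*dom", 1920, 1949))
```

Only words that both end in “dom” and fall inside the chosen years make the list, which is roughly the filtering the example search describes.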
These are incredible depictions of how the words we use change over time, and how some words are more important at one time than another. In my searches I could see when the word “civil” changed from being inherently connected to “war” and started referring to “rights” and that’s really cool!
But I was also really frustrated by my inability to correctly use the tool. The TIME Magazine Corpus uses a complicated interface (for example, you can’t use the “back” button in your browser, so you need to retrain yourself to use their navigation bar), and sometimes it isn’t clear what each search option means. In addition, elevating your search takes very specific programming instructions that were often, at face value, unclear to me. I’m sure spending time on the platform would ease my usage of it, but with only 50 queries per 24 hours (per email – I ended up using two to get around the rules and spend more time learning) it is hard for a novice to learn the rules of the game, and how to create an intricate, enlightening search. To the creators’ credit, they do offer A LOT of helpful links, instructions, and example searches, but the learning curve is definitely steep.
What do you guys think? Should I spend more time exploring the corpus, or should I switch over to Google nGram, where Jonah seemed to have a much more productive time? Do you have any tips on how to make search processes easier, or words that you would be particularly interested in learning about?
5 Replies to “The TIME Magazine Corpus of American English”
I’m impressed at how far you went to secure more searches on the TIME Magazine Corpus. I feel like I definitely wasted my first 15 just trying to figure out which buttons to press. This application feels very unwieldy to me as well in comparison with the Google Ngram Viewer; however, I appreciate that the TIME Corpus offers a snippet of context where Google only offers the frequency. But as Jonah points out, I guess that just facilitates your motivation to pursue deeper research. I find both applications frustrating in terms of content limits, but I suppose they’re just supposed to be jumping-off points. I’m interested in how these will be used for our projects this semester. Thanks, Katie, for a great guide!
I’m glad you pointed that out! I did find that looking at both the collocates and the word in context was potentially the most useful. Sure, it’s interesting to see how often a word is used, and that’s potentially a greater link to other cultural resources. But really it’s that context that can help us answer larger questions in our research – it helps us to figure out not only when a word is used, but how it is used, and what it means in the greater world around it.
I’m glad you told me to buckle in, because this site is kind of a trip. I wonder if the user interface on this site has been updated since May of 2007—it’s a lot more unwelcoming than I’d expect of a site made more recently. Sean pointed out the problems with unfriendly UI in his post about HistoryWired, and I think those criticisms apply here as well. Obviously the corpus has the potential to be a supremely useful tool for studying how things have been talked about and with what frequency they have been talked about over the decades, but if that potential isn’t accessible to people who don’t have a friendly classmate to show them the ropes, it’s going to go largely unrealized.
I particularly wonder at the query limits. I can see how they might have been a good idea at one point, but I’d be curious to hear how the team justifies keeping them around in 2019. Generally I’m opposed to sites that require you to make accounts for no apparent reason, but I especially question the logic of maintaining a query limit, even an extended one, for users with accounts.
This seems like a really great resource that’s been let down by some poor decisions about the user experience!
That’s exactly it. One of my biggest annoyances was the poor user experience. I didn’t mention it in the original post, but every few searches the site will freeze and prompt you to upgrade to a premium account. While the prompt is flashing you can’t search anything. It goes away in about 30 seconds, but playing the waiting game makes the whole process more frustrating. I feel like if you really used a tool like this often, you would get around many of the things I found upsetting, but like I said, I’m not sure a novice would get to that point.