Practicum: Google N-Gram

An Easy and Unique Research Tool

Google NGrams Home Page https://books.google.com/ngrams

Google N-Gram is a search engine that lets users explore how often words or phrases appear in books published between 1800 and 2019. Users can change the language and the date range they are searching within. Results are shown on a graph, giving users a visualization of how frequently a word or phrase has been used over time. Users can also toggle case sensitivity and adjust how the frequency, shown as a percentage, is measured.

Searching “Malcolm X” from 1950 to 2019 in American English gives us the following results. This search shows that, in books in the American English corpus, mentions of Malcolm X peaked in 1970, 1995, and 2012.

Users can also compare several words or phrases by separating them with commas. For example, searching “Malcolm X, Black Power” gives us the following results, showing the trends of both subjects in comparison to one another.

Both subjects follow a similar trend initially. However, “Malcolm X” spikes in the early 1990s and goes back down in the late 2010s, while “Black Power” steadily increases. After being presented with the graph, users can scroll down and choose from a selection of books grouped by date range.
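For anyone who wants to save or share a search like the ones above, the same settings can be written into a link. The sketch below builds such a link in Python. It is only an illustration: the parameter names (`content`, `year_start`, `year_end`, `corpus`, `smoothing`, `case_insensitive`) are the ones that appear in the viewer's address bar rather than a documented API, the `en-US-2019` corpus identifier is an assumption for American English, and `build_ngram_url` is a hypothetical helper name.

```python
from urllib.parse import urlencode

# Base address of the graph page (assumed from the viewer's address bar).
NGRAM_GRAPH_URL = "https://books.google.com/ngrams/graph"


def build_ngram_url(phrases, year_start=1800, year_end=2019,
                    corpus="en-US-2019", smoothing=3,
                    case_insensitive=False):
    """Return a Google Ngram Viewer link for one or more phrases.

    Phrases are joined with commas, the same way they are typed
    into the search box to compare several terms on one graph.
    """
    params = {
        "content": ",".join(phrases),
        "year_start": year_start,
        "year_end": year_end,
        "corpus": corpus,        # assumed identifier for American English
        "smoothing": smoothing,  # assumed default smoothing value
    }
    if case_insensitive:
        # Assumption: the viewer marks a case-insensitive search this way.
        params["case_insensitive"] = "on"
    return f"{NGRAM_GRAPH_URL}?{urlencode(params)}"


# The comparison described above: both phrases, 1950-2019.
url = build_ngram_url(["Malcolm X", "Black Power"], year_start=1950)
print(url)
```

Pasting the printed link into a browser should open the comparison graph, provided the assumed parameter names and corpus identifier still match what the site uses.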

Clicking on “1971-2006” for Malcolm X brings up an abundance of books on Malcolm X published during that period. Google N-Gram is a great resource for anyone researching one or more topics who wants to know when those topics were most popular. Whether you are a seasoned researcher or just getting started, Google N-Gram provides an easy and unique option.

3 Replies to “Practicum: Google N-Gram”

  1. Hey Katie! This is a pretty cool resource for people looking to examine a specific word or phrase throughout a given time frame. I am wondering if Google provides information on how often they add books and if they have openly discussed their criteria for which books get scanned. Based on this information, would the cons of this data outweigh the potential pros? Employing our Data Feminism reading, how could we apply the seven principles to this resource?

    1. Hey Joshua, thanks for your comment! I have done a little bit of digging into whether Google provides information on how often they update their corpora and what criteria they follow when adding books, and I stumbled upon this info site: https://books.google.com/ngrams/info. Towards the bottom of the page, the corpora are discussed in more detail. They provide a brief description of each corpus and state that, “All corpora were generated in July 2009, July 2012, and February 2020; we will update these corpora as our book scanning continues, and the updated versions will have distinct persistent identifiers.” In terms of pros and cons for this kind of resource/data, I think users should be aware of the potential limits of the corpora, as they may be skewed towards the interests of the people creating the resource, or towards books that are more popular or widely circulated. I imagine that applying the seven principles of Data Feminism to this resource would lead to a deeper understanding of who holds power over this resource, how they determine what is included in these corpora, and what audience they are catering to. It would also push us to think critically about how and why these power dynamics exist, and how this resource would function differently if more perspectives were included in its development.

  2. Hi Katie! I thought this was an excellent demonstration of this resource, and it was clear to me how it works. I think it would be a great way for people to find out, through a historiographic lens, who wrote about what during which period of time. I think it is interesting that the developers chose 1800-2019 for their date range, and this brings up two questions. One, have they updated their date range to include at least 2021? Two, does this resource go back any further than 1800, say to the 1700s?
