Voyeur: A Text Analysis Tool

Voyeur is a free online text analysis tool that is being developed as part of the Hermeneuti.ca project. On their site, creators Stéfan Sinclair and Geoffrey Rockwell define Hermeneuti.ca as a way to “think through some foundations of contemporary text analysis, including issues related to the electronic texts used, the tools and methodologies available, and the various forms that can take the expression of results from text analysis.”

Voyeur fits the overall mission of the Hermeneuti.ca project because it lets users explore the potential of text analysis in a free and somewhat user-friendly space. To use the program, users simply enter or upload a text into the white box on the main page and then click the reveal button. Once the text has been “revealed,” users can learn how many times a word has been used, how the use of that word is distributed across the text, the vocabulary density of the document, and the number of distinctive words used. Clicking the small arrow at the bottom left of the screen also brings up word trends and keywords in context. One of the more useful features of Voyeur is that it allows researchers to analyze either a corpus of documents or an individual source.
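For readers curious about what these measures involve, here is a minimal Python sketch of three of them: word counts, vocabulary density, and keywords in context. It is not Voyeur's own code. The function names are hypothetical, the tokenizer is deliberately crude, and I am assuming that vocabulary density means the ratio of unique words to total words.

```python
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into simple word tokens."""
    return re.findall(r"[a-z']+", text.lower())


def word_counts(text):
    """Count how many times each word appears in the text."""
    return Counter(tokenize(text))


def vocabulary_density(text):
    """Ratio of unique words to total words (an assumed definition)."""
    tokens = tokenize(text)
    return len(set(tokens)) / len(tokens) if tokens else 0.0


def keyword_in_context(text, keyword, window=3):
    """Return (left context, keyword, right context) for every occurrence."""
    tokens = tokenize(text)
    target = keyword.lower()
    hits = []
    for i, token in enumerate(tokens):
        if token == target:
            left = " ".join(tokens[max(0, i - window):i])
            right = " ".join(tokens[i + 1:i + 1 + window])
            hits.append((left, token, right))
    return hits
```

Calling `word_counts(text).most_common(10)` on a speech would list its ten most frequent words, which is roughly the first thing Voyeur shows after a text is revealed.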

To demonstrate the full usefulness of the program, the site includes a helpful article, “Now Analyze That,” which shows how researchers used Voyeur to analyze speeches on race by Barack Obama and Jeremiah A. Wright Jr. They used the program to identify each speaker’s political priorities and overall views of race relations. Examples such as this help researchers new to text analysis learn how to use this research tool effectively.

Because the program quickly counts how many times a word is used within a given text, I found Voyeur to have the most potential for historians interested in using quantitative analysis to study rhetoric in texts. For example, as a historian interested in gender, I could use Voyeur to scan a primary source and see how often and where gendered language appears. I could then use this information to see how gendered language is used in that particular text to create concepts of masculinity and femininity. While historians of gender have long sorted through sources for evidence of gendered language, Voyeur lets us do so much more quickly and efficiently. Furthermore, Voyeur’s ability to search multiple documents at a time provides historians with a convenient tool for analyzing specific themes within a group of documents.
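To make the gender example concrete, here is a rough Python sketch of the kind of scan described above: it records where each gendered term falls in a text, so a researcher can see both how often and where the term appears. The term list is a hypothetical starting point rather than a vetted vocabulary, and this is an illustration of the idea, not how Voyeur itself works.

```python
import re

# Hypothetical starter list; a real study would build its own vocabulary.
GENDERED_TERMS = {"he", "she", "him", "her", "his", "hers",
                  "man", "woman", "men", "women"}


def gendered_term_positions(text, terms=GENDERED_TERMS):
    """Map each gendered term found in the text to its word positions."""
    tokens = re.findall(r"[a-z']+", text.lower())
    positions = {}
    for index, token in enumerate(tokens):
        if token in terms:
            positions.setdefault(token, []).append(index)
    return positions
```

The length of each list gives the count for that term, and the positions themselves show whether the language clusters in one part of the source or runs throughout it.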

While Voyeur is still under construction, I found the site to have much potential for researchers. Although I had some trouble navigating all the tools of the program at first, the site as a whole offers a plentiful (sometimes overwhelming) number of tutorials and articles that help newcomers to text analysis find their way. Furthermore, while going through this site, I learned how digital media sites such as this one can both change how historians look at sources and expose scholars to new forms of research.

If you use text analysis in your research, do you find this program helpful? How do programs such as this change the way historians research? How can Voyeur be used to help us find new themes within documents?

Visualizing Your Data With IBM’s Many Eyes

Many Eyes is a powerful tool that enables a user to create visualizations from any kind of data set.

Here’s where it gets fun: while a user can upload their own data set, Many Eyes is a community-powered tool. There are over 150,000 data sets to choose from, and many are pre-visualized.

Another (seemingly underused) feature is Topic Centers. Topic Centers allow teams of people to collaborate on visualizations. They are organized around certain topics (makes sense, right?), as well as around teams of people at organizations and classes (like this one).

Here are some examples:

Average Time Spent Commuting by State

Number of arrests by age and type of crime

News Blogs Dominated By A Few Startups

But selecting a data set from the community is not always the best option, since the metadata associated with many of the data sets is inaccurate or incomplete. Fortunately, Many Eyes is versatile: it accepts any type of data, so long as it is in a structured format. The data needs to be formatted in Microsoft Excel (or similar spreadsheet software) first, then pasted into Many Eyes’ Web interface.
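If your data lives in a script rather than a spreadsheet, the same kind of structured text can be produced directly. The Python sketch below builds tab-delimited rows, which is what copying cells out of Excel puts on the clipboard. I am assuming that this is the layout the paste box expects, and the function name and the sample values are placeholders of my own, not real data.

```python
import csv
import io


def to_tab_delimited(header, rows):
    """Return the header and rows as tab-delimited text, one row per line."""
    buffer = io.StringIO()
    writer = csv.writer(buffer, delimiter="\t", lineterminator="\n")
    writer.writerow(header)
    writer.writerows(rows)
    return buffer.getvalue()


# Placeholder values only, to show the shape of the output.
print(to_tab_delimited(["Category", "Value"], [["Alpha", 10], ["Beta", 20]]))
```

The first row serves as the column headings, so it is worth giving each column a clear name before pasting.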

Then the user is presented with an array of visualization options, from tag clouds and word trees to assorted graphs and even maps.

A few potential uses for historians:

  • Take a historical text or speech (e.g. the Gettysburg Address) and create a tag cloud from it, where the more frequently a word is used, the larger it will appear.
  • Create a network diagram to visualize a historical figure’s family tree.
  • Use a map to show population trends over time.
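The tag cloud idea in the first bullet comes down to mapping word frequency to font size. Here is a minimal Python sketch of that mapping, assuming a simple linear scale between a smallest and a largest size. The function name, the default sizes, and the cutoff of twenty words are my own choices for illustration, and Many Eyes may well scale its clouds differently.

```python
import re
from collections import Counter


def tag_cloud_sizes(text, min_size=10, max_size=40, top_n=20):
    """Assign a font size to each of the most frequent words in the text."""
    tokens = re.findall(r"[a-z']+", text.lower())
    counts = Counter(tokens).most_common(top_n)
    if not counts:
        return {}
    highest = counts[0][1]   # most_common sorts from most to least frequent
    lowest = counts[-1][1]
    spread = highest - lowest
    sizes = {}
    for word, count in counts:
        # Linear scale: the rarest kept word gets min_size, the commonest max_size.
        scale = (count - lowest) / spread if spread else 1.0
        sizes[word] = round(min_size + scale * (max_size - min_size))
    return sizes
```

In practice you would also want to drop very common words such as “the” and “of” before sizing, or they will dominate the cloud.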

Over the summer, I took air traffic control data and visualized it using Many Eyes, for fun. It was easy to use every step of the way. In fact, it’s so easy to use, the hardest part should be finding the data in the first place.

Good visuals are essential when working on the Web, since readers hate long blocks of static text. Bringing a history project to the Web calls for visualizations like those that can be generated with Many Eyes. They will make your work more attractive and help your readers understand it better. At the end of the day, it’s all about them!