Option #1
For my print project I would like to focus on analyzing the makeup of published authors in history journals, such as the Journal of American History. After completing an article analysis for my research seminar, I was reminded how little diversity there was in authorship in academic articles. With this project I would like to create some kind of data visualization that shows the makeup in authorship of history journals, ideally focusing on one.
I have a lot of options in terms of what I could use as data. Some of the data is readily available, whereas other data, like race and gender identity, I would have to go looking for myself. When I first imagined this project, I wanted to focus primarily on race and gender as data points, but I am also open to other data points that may be more easily accessible, such as job title and academic affiliation or institution. In order to successfully complete this project, I would need software that lets me input all of my data and create the type of visualization I decide on. Right now I’m thinking that a bar chart could be a good idea, but I’m also looking for more creative options. The written part, of course, would consist of my analysis of the demographics of authorship and how that impacts the discipline as a whole.
Option #2
I also have an alternative option where datasets may already exist or would be more easily attainable. Another proposition I have is to analyze keywords or topics of articles in a historical journal and then create a frequency chart or table for the data. There are two ways I can approach this: I can search for data that already exists and put it into some kind of data visualization or use a program that allows me to collect my own data.
For this project I would focus on a single journal, but I would like to cover at least two to four years’ worth of issues in my analysis. I am open to suggestions about a program that would be able to read all my articles and detect the keywords I’m looking for, so if anybody has any input I would greatly appreciate it. Once I had the data and created a data visualization, I would then write up an analysis, similar to the idea in my first proposal.
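To make the keyword idea a bit more concrete, here is a minimal sketch of what "reading the articles and detecting keywords" could look like in Python, assuming the article texts are already available as plain text. The function name `count_keywords`, the article IDs, and the sample sentences are all made up for illustration; this is one possible approach, not a recommendation of a specific tool.

```python
import re
from collections import Counter

def count_keywords(articles, keywords):
    """Count how many articles mention each keyword at least once.

    `articles` maps a (hypothetical) article ID to its plain text.
    Matching is case-insensitive and on whole words only.
    """
    counts = Counter({kw: 0 for kw in keywords})
    for text in articles.values():
        for kw in keywords:
            pattern = r"\b" + re.escape(kw) + r"\b"
            if re.search(pattern, text, flags=re.IGNORECASE):
                counts[kw] += 1
    return counts

# Made-up sample texts, for illustration only.
sample_articles = {
    "article-1": "Labor organizing and gender in the textile mills.",
    "article-2": "Gender, migration, and labor markets after 1945.",
    "article-3": "Railroads and the politics of land.",
}

keyword_counts = count_keywords(sample_articles, ["gender", "labor", "land"])
for keyword, n in keyword_counts.most_common():
    print(f"{keyword}: {n}")
```

The resulting counts could then feed a frequency table or chart. Counting articles rather than total occurrences keeps one very long article from dominating the numbers, though either choice could be defended.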

I have done a preliminary search of projects that have done similar things to what I’m hoping to do, as a guide to help me understand what methods work best and how to structure my analysis around a particular data set. One of the examples that I found really helpful will be linked here, and some graphs from the article can be seen above.
As always, I would love to hear any suggestions, comments, or questions people have about my proposals.
Ava
Hi Ava,
I think both of these sound like potentially viable and interesting projects! For the first project idea, I think your biggest issue will be figuring out which journals you are going to focus on and then building out your dataset. There are really a million different directions you could go in something like this based on your interests and what you find in the literature. For example, do you focus on the most prestigious journals, or do you do comparisons between journals focused on specific topics or time periods? Planning to chart and visualize that kind of data is great, but it would also be good to think through what kinds of interpretations and analysis would result from charting such data. That is, do you think there is likely to have been some interesting or substantial change over time around race and gender of authors in some specific set of journals? If that is the case, then it would be good to go into this thinking about where you are most likely to see the most significant or important differences like that. Along with all that, it is worth noting that categories like race and gender are a place where the insights from Data Feminism are rather useful. Quantitative approaches to race and gender tend to treat these as clear-cut categories, but in reality it’s much more complex and nuanced. So it would be good to go into this kind of project with a plan for how you would deal with that complexity.
The second project idea is interesting too. The fact that you can do full text searches against full runs of journal articles makes this one potentially a lighter lift from a data perspective. On this one, I think you could likely just use the journals’ websites to do searches for terms and then count up the number of articles per year that include that term. So you could likely do this without needing to get all the data out and load it into some other system. For example, if you were looking for the term “queer” in the journal American Archivist, you would go to their online platform, search for it, and then just use the facets on the side to count up how many articles by year or by decade there are with that term in them. Just by doing those kinds of searches, you could pretty quickly get a sense of how many articles in a journal have a given term. The big question on this one is what journals would you search through and what terms would you be looking for? This could really go in any number of directions and would mostly depend on what you could build up a lit review to support looking into. This also gets into some of the nuances around specificity of terms and changes in language over time that we got into when we talked about Google n-gram. All of this is to say that you could do a lot of good work on something like this, but the main task at hand would be to zero in on what terms/subjects you would be focused on and which journals are the best places to look into for those terms.
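If it helps, the counting step described above could be sketched in a few lines of Python once you have jotted down the publication year of each search hit. This assumes you have copied the years by hand from the search results; the list `hit_years` below is invented for illustration and is not real data from any journal, and `articles_per_decade` is just a hypothetical helper name.

```python
from collections import Counter

def articles_per_decade(years):
    """Group publication years into decades and count the articles in each."""
    decades = Counter((year // 10) * 10 for year in years)
    return dict(sorted(decades.items()))

# Hypothetical years of articles that matched a search term.
hit_years = [1987, 1994, 1998, 2003, 2005, 2011, 2014, 2016, 2019]

per_decade = articles_per_decade(hit_years)
for decade, n in per_decade.items():
    # Print a simple text bar so the trend is visible at a glance.
    print(f"{decade}s: {n} {'#' * n}")
```

Raw counts like these would probably need to be read against how many articles the journal published in each decade overall, since a journal that grew over time would show more hits for almost any term.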