As I was reading for class this week, I couldn’t help but see all of the pieces through the lens of the research projects I’m currently engaged in, because these questions of access (Of what? For whom? How?) have been so central to all three projects. It might be that my projects are more focused on access, but to my mind, it’s much more likely that access is the reason for any archival endeavor. As Owens points out, “The purpose of preservation is enabling access.”
There were way too many interesting threads in the readings for this week, so I picked a few that particularly hit home for me:
Screen Essentialism
Owens points out that cultural heritage institutions often want to hew more closely to the “boutique” approach of digital access, rather than a “wholesale” one. While a “boutique,” curated approach to access is generally framed as being more user-friendly, it also comes with the risk that access will remain secondary or tertiary. The more user-friendly frame often means that collections/items aren’t made available until the institution has a sophisticated access system in place. “Screen essentialism” in this view of access refers to the fact that there is no one inherent way of accessing digital objects; Owens urges us to “get over the desire to have the kinds of interfaces for everything where one just double clicks on a digital object to have it ‘just work.’”
Padilla and Higgins, too, warned of screen essentialism and “data essentialism”: oversimplifying the nature of data and obscuring the complexities of both the systems used to locate, process, and understand data and the nature of data itself. Christen, on the other hand, describes Mukurtu as a system that does need to “just work” and have difficult computational processing happen below the surface, but in this case it’s not a matter of having a single system that works for every possible user, but of creating and implementing a system that allows for customization on an individual basis, because that is what best serves the collections Christen works with.
Collections as Data, Data as Data
Padilla and Higgins’s piece focuses on defining data, and thinking about digital library collections as pieces of humanities data, especially in how this mindset affects access to digital cultural heritage: “The authors hold that Humanities data are organized difference presented in a form amenable to computation put into the service of Humanistic inquiry.” So, practically thinking, information professionals should be considering how to make collections available and what access points would aid in these collections being “amenable to computation.” Padilla and Higgins’ emphasis on derivative (often DH) projects serving as incredibly useful access interfaces for digital collections, as well as mention of metadata as useful data in its own right, aligns well with the chapter we read this week in Owens’ book.
While I strongly agree with the ideas Padilla and Higgins are putting forth, I do harbor some concerns about how this article, and the larger research project it morphed into (Collections as Data), might be in conflict with archival practice and values. For instance, does the focus on interfaces developed outside of the archive, such as “The Real Face of White Australia” (disclaimer: I’m a Tim Sherratt stan), undercut the importance of contextual relationships between parts of an archival collection? Projects like this aid in access to and understanding of archival material, but they present parts of a wider whole, which many users may not realize. How can we maintain context (cough, the provenance debate) while also making digital archival collections “amenable to computation”? Are archivists being cut out of this information exchange, and if so, how do we re-insert ourselves?
Discussion Questions
- How does thinking through access impact processing workflows? Does MPLP work as an approach to all collections? How does prioritizing access play into undertaking documentation strategy projects?
- Do you have experience with ethical and/or privacy issues that might prevent you from batch converting and uploading immediately? What about when legal/copyright issues and ethics are at odds? When can you legally make something available but might not want to?
- Owens emphasizes that keeping any sensitive material is a risk that information professionals must seriously consider. But one of the projects I’m working on, Safely Searching Among Sensitive Content (SSaSC), has made me think about sensitivity in so many contexts – reputational harm, for instance, is incredibly broad. How do you know you’re making the right decision? In addition, during the initial work on developing an access system for email collections containing sensitive material for SSaSC, we found that our platform’s search functions work better when the algorithms have access to the sensitive material, even while accounting for the fact that that material won’t normally be shown to the user. How does work like this complicate our thoughts on collecting sensitive material?
- How does thinking about access to only metadata change or not change the way you would process and catalog collections? Does this apply to both descriptive metadata and technical metadata?
- What multimodal methods of access might work for the small institution you’re partnered with? Which would not (currently, at least)?
- In Padilla and Higgins’ piece, they posit that librarians/archivists/info professionals are well-suited to “offering training in the skills, tools, and methods needed to take advantage of Humanities data.” Is this the case, on the ground? Why or why not? What are the major challenges we need to overcome, at an institutional and field level, in order to better serve users in this way? Is training in these skills different than simply providing multimodal access?
- Have you seen the feminist HCI values of “plurality, self-disclosure, participation, ecology, advocacy, and embodiment” in practice? How do you anticipate using them?
- Christen opens her article by stating that “Archives have long been ambivalent places for Indigenous communities whose cultural materials are held in their storerooms.” (21) In what ways do we, as a profession, reinforce that ambivalence? Question it? Does multimodal access, as delineated by Owens, ameliorate this ambivalence enough?
In thinking through these discussion questions, I was continuously reminded of Miriam Posner’s blog post, “Money and Time.” Every concern about staff resources and ways to implement access seems to align with the sustainability, resources, and burnout concerns that Posner brings up in relation to DH centers and initiatives: “You can optimize, streamline, lifehack, and crowdsource almost everything you do — but good scholarship still takes money and time.” Multimodal, plural, culturally sensitive access to digital objects and collections still takes money and time.
“Feminism and the Future of Library Discovery” reminded me of an internship that I completed with the Theodore Roosevelt Digital Library. The library’s goal is to catalog every document in Roosevelt’s papers on an item level, and part of my job as an intern was to go through each document (usually letters) and pick out names, places, events, subjects, etc. that could be “tagged” along with the document. Women were frequently mentioned in Roosevelt’s letters, but almost never by their first name, even if they were the recipient of the letter. So, for example, the wife of Elihu Root would be “Mrs. Elihu Root” or “Mrs. Root,” which one could say reduces her to being an appendage of her husband and obscures her presence in history.
The archivist in charge of our cataloging project felt very strongly that in any case in which a woman was mentioned, we should try to find out her full name and tag her accordingly. So “Mrs. Elihu Root” would become “Clara Frances Wales Root, 1853-1928.” Finding this information could sometimes be time-consuming, but in doing it, I often discovered that the women Roosevelt was writing to or about were interesting persons with histories of their own. So “uncovering” women’s names could be considered a first step to giving them more representation in history. The devil’s advocate in me, however, wondered if there weren’t some benefits to tagging such documents as “Mrs. Elihu Root” if that is how people knew her at the time and how she saw herself. It might also make it easier for researchers to ascertain her relationship to Roosevelt.
So anyways…this is admittedly more an instance of feminism in cataloguing practices rather than feminism in human-computer interaction, but I think it is relevant to the themes discussed by Sadler and Bourg. What do other students in this class think about this practice? Have you ever been asked to do something similar?
The Sadler and Bourg article also reminded me of articles I’ve read recently about “sexist” algorithms.
Some people have pointed out that Google Translate shows gender bias when translating gender-neutral pronouns into English, creating sentences like “he is a doctor” and “she is a nurse.” (https://mashable.com/2017/11/30/google-translate-sexism/#inJWo38zPsqi)
There is also the example of Amazon, which created an algorithm for identifying the best candidates for jobs and had to abandon it when the company realized that the tool was discriminating against women. (https://www.reuters.com/article/us-amazon-com-jobs-automation-insight/amazon-scraps-secret-ai-recruiting-tool-that-showed-bias-against-women-idUSKCN1MK08G)
I am volunteering for a DCIC project now that is experimenting with automated metadata extraction, so I don’t think it is entirely outside the realm of possibility that algorithms may become a bigger part of what we do as digital archivists/librarians. I think it is important that if we do start working with automation in the future, we realize that machines can pick up the same biases as the society that created them.
So with regard to your question about copyright and batch accessibility to records, this is something I’m covering a lot in my Policy and Ethics in Digital Curation course. There are ways that the savvy researcher can forensically dig through your records to find deleted files, or hidden code buried within a file. I’m definitely not totally with it when it comes to understanding the ins and outs of digital files, and I can imagine a lot of older archivists aren’t either, which makes providing access to digital files potentially dangerous.
A specific case of when it was “legal” to publish records but not ethical could be either the James Joyce letters or the JD Salinger letters, which were donated legally to a historical society and made publicly accessible. Salinger uses the argument that archivists need to consider the original intent of the records when deciding to provide access. “J. D. Salinger’s insistence that ‘letters were meant for a certain pair of eyes, and those eyes alone’ is an argument that has no validity in law… although it is not illegal, it is, nevertheless, unnecessary and unethical that, despite Salinger’s outraged protests, ‘many pairs of eyes’ are now are free to examine his letters.” (Cariffe Cirasella, 2000, 90.)
What should we do with donations of hard drives with people’s personal files? What if there are PII files stored amongst an artist’s work? What if it’s just plain embarrassing poetry intended for an ex? Where is the line drawn?
I can envision that third-party donations of digital material will increase with time, and we will be in a situation similar to the one we face with our “dubiously acquired” third-party donated analog materials.
As a sort of bridge between the feminist ethics called for by Sadler & Bourg and the cultural sensitivity of Mukurtu, in my work with Latino DH projects and engagement there is an ongoing discussion on incorporating Chicana feminist theory into our DH work, with a heavy emphasis on affect theory and questioning how we archive what is not expressible in words. Ideas like post-custodial archives, decentering the power structure of a physical centralized archive and returning ownership to the record creators… things like that are seeking to use DH to increase access and the scope of collections. I think there is space for archivists to reinsert themselves in the DH conversation, but we definitely need to do a better job of it.
In response to your first question, I would say that MPLP does not always work well when providing access to certain collections, especially digitized collections. When it comes to born-digital materials, MPLP works pretty well, but I would argue that digitized materials, specifically digitized photographs, need more work than simply scanning and uploading. That approach is a good first step, but there is a need for some additional level of human work to add important, although basic, metadata to those newly digitized photos. This is on my mind as I try to determine an efficient but detailed way of digitizing my family photos. I recently used the DCIC’s negative film scanner and it worked wonderfully, creating beautiful scans that my home printer/scanner could not produce. However, if that were an archival collection, and I immediately posted those items to a digital repository, no one would know the who, what, where, when, why. I didn’t even know who was in the photos until I spoke to my mom. So going back to the original question, I think that if enough identifying metadata is already attached to digital files, or the who/what/when/where/why is evident in the material itself (like a letter), then MPLP is appropriate and those items should be accessible as soon as possible and without much initial human interaction. However, when uploading items without any metadata, or with possibly confusing metadata, there is a need for some extra work. Not that they need paragraphs of description immediately, just the basic identifying and contextual information needed to use the photograph in an appropriate and ethical manner.
I agree with Jen that MPLP does not work for all collections. For culturally sacred items, like in the Mukurtu example, the stewards of the collection may want to limit access or carefully decide who may access what items. When ethics and advocacy override the urgency for access, I think archives and other cultural institutions should take the time to carefully construct how access may be granted and limited.
For your question about DH, I do believe archivists are well-suited to aid in DH projects. I am reminded of the Dickinson Electronic Archives, which I volunteered for as an undergrad, and MITH here on campus. In another one of my classes, I worked on a group project with MITH to create metadata and an inventory for the Lakeland Community Heritage Project Digital Archive. Both of these are cases where archival theory and practice are used to structure data for humanities interests.
I think one of the greatest challenges keeping archivists from helping on these projects is lack of awareness about archives. If people working on DH projects don’t know anything about archivists or archival theory, they are not going to seek out an archivist. I am not sure how to remedy this, but one way would be to be proactive, like how the SAA archivists were in building the People’s Archive of Police Violence in Cleveland. Archivists could pay close attention to DH issues and reach out through listservs to find people interested in working together on a DH project.