Our readings this week covered description and arrangement in digital preservation and challenged the effectiveness of the archival principles of respect des fonds and provenance for new media objects.
Database Nature of New Media Objects
Lev Manovich details how new media objects are essentially databases. Digital objects are a layered collection of items. Users can interact with the same digital object in a variety of ways, meaning the objects lack a linear narrative.
Manovich introduces video games as an apparent exception. On the surface, players interacting with a game follow a narrative and pursue defined goals. However, Manovich goes on to clarify that to create a digital object is to create “an interface to a database” and that the content of the work and its interface are actually separate. Even while playing a video game, which seems to follow a narrative, players are only visiting points mapped out by the database creators. The database nature of new media objects contrasts with the linear narratives often provided by analog objects, meaning new methods for describing and arranging digital objects are needed.
Describing New Media Objects
Professor Owens details Greene and Meissner’s suggestion of More Product, Less Process (MPLP). Greene and Meissner believe that organizations should avoid putting preservation concerns before access concerns. Collections should be minimally processed so that researchers can access them sooner. Item-level description should rarely be provided. For arrangement and description, archivists should strive for the “golden minimum.”
Owens provides the 4Chan Archive at Stanford University as an example of using the MPLP approach for digital objects. The archive is available as a 4 GB download, an example of quick and easy access. Stanford opted to include limited but informative description, including the scope of the collection and metadata for the format, date range, and contributor.
Owens also states that digital objects are semi-self-describing because they contain machine-readable metadata. Owens uses tweets as an example. Beneath the surface, tweets contain a great deal of informative metadata, such as the time and time zone.
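To make the idea of a semi-self-describing object concrete, here is a minimal sketch in Python. The record below is a hypothetical, simplified stand-in for a tweet, and its field names and values are assumptions made for illustration rather than a faithful copy of any real API response. It shows how a basic description could be derived from metadata the object already carries.

```python
from datetime import datetime

# Hypothetical, simplified tweet-like record. The field names and values
# are illustrative assumptions, not taken from the readings.
tweet = {
    "text": "Reading about digital preservation this week.",
    "created_at": "Wed Oct 10 20:19:24 +0000 2018",
    "lang": "en",
}

def describe(record):
    """Derive basic descriptive metadata from fields the object already carries."""
    # Parse the embedded timestamp, including its UTC offset.
    created = datetime.strptime(record["created_at"], "%a %b %d %H:%M:%S %z %Y")
    return {
        "date": created.date().isoformat(),
        "time": created.time().isoformat(),
        "utc_offset": created.strftime("%z"),
        "language": record.get("lang", "unknown"),
    }

description = describe(tweet)
print(description)
```

Nothing in the resulting description was typed in by an archivist. It was all read out of the object itself, which is the sense in which such objects partly describe themselves.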
In an effort to describe web archives, Christie Peterson tested Archivists’ Toolkit, Archive-It, DACS, and EAD. Peterson found that the “units of arrangement, description, and access typically used in web archives simply don’t map well onto traditional archival units of arrangement and description.” Discussing Archive-It, Peterson describes how the tool is structured. Archive-It uses three categories: collections, seeds, and crawls. An accession of a collection of websites would be a crawl. Peterson found that there were no good options for describing a crawl. She could not state the scope of the crawl or explain why certain websites were left out. This means current tools and methods leave archivists unable to document their activity, creating a lack of transparency.
Challenging Archival Principles
Owens defines original order as “the sequence and structure of records as they were used in their original context.” Original order maintains context and saves the time and effort that would otherwise be spent reorganizing and arranging content, leading to faster access. However, maintaining original order can be difficult for digital objects.
Jefferson Bailey describes an issue with following traditional archival principles with digital objects. Since every interaction with a digital object leaves a trace of that interaction, there is no original order. Bailey explains that with new media objects, context can “be a part of the very media itself” since digital objects can be self-describing. Attempting to preserve original order is unnecessary as meaning can be found “through networks, inter-linkages, modeling, and content analysis.”
Bailey also gives a history of respect des fonds. This principle comes from an era of, and thus is designed for, analog materials. Respect des fonds made the organization of records focus on the creating agencies. Some critiques of the principle are that there is not always a single creator, that those who structured the documents may not be the creators, and that original order “prioritizes unknown filing systems over use and accessibility.”
Jarrett Drake asserts that provenance is an “insufficient principle” for preserving born-digital and socially inclusive records because its origins are rooted in colonialism. The provenance principle asserts that records of different origins should not mix. The principle became popular in the United States in the early 20th century, when few were able to own and control their records.
When it comes to digital objects, Drake states “the fonds of one creator are increasingly less distinct from the fonds of other creators.” He provides the example of Google Drive, which allows multiple people to collaborate on document creation. Another change in the times that affects provenance is the rise in people who are able to create and own their records. Nowadays, people are able to name and describe themselves. According to Drake, archivists should support this and name creators in archival description according to their self-assertion.
According to Owens, using community-provided descriptions is becoming popular. To create the online exhibition The Reaction GIF: Moving Image as Gesture, Jason Eppink asked the Reddit community for canon GIFs and descriptions of them. Eppink wanted to capture what the GIFs meant to those who used them, and getting the descriptions directly from the community enabled him to do that.
Our readings also assert that, when dealing with multiple copies, it’s easier to keep all of them. As Catherine Marshall states, “Our personal collections of digital media become rife with copies, exact, modified, and partial.” One copy may have better metadata, another better resolution, and so on. We have so many copies that the “archival original” is decentralized and not straightforward to determine. Marshall states that it is better to keep these copies than to delete them, because people have too many copies, storage is cheap, and people do not know which copy they will want in the future.
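As one possible illustration of why the “archival original” is hard to pin down, the Python sketch below groups files by checksum. The file names and contents are hypothetical in-memory stand-ins, and checksum comparison is my own choice of technique, not something taken from Marshall. A checksum can only confirm that two copies are exactly identical; modified and partial copies each end up in a group of their own, and nothing in the output indicates which copy is the better one to keep.

```python
import hashlib
from collections import defaultdict

def group_exact_copies(files):
    """Group names whose bytes are identical, using SHA-256 as a fingerprint."""
    groups = defaultdict(list)
    for name, data in files.items():
        groups[hashlib.sha256(data).hexdigest()].append(name)
    # Sort for a stable, readable result.
    return sorted(sorted(names) for names in groups.values())

# Hypothetical stand-ins for files in a personal collection.
copies = {
    "photo.jpg": b"image-bytes",
    "photo (1).jpg": b"image-bytes",           # exact copy
    "photo_cropped.jpg": b"image-by",          # partial copy
    "photo_edited.jpg": b"image-bytes-edited", # modified copy
}

groups = group_exact_copies(copies)
for names in groups:
    print(names)
```

Only the two byte-identical files are grouped together. The cropped and edited versions look unrelated to the checksum even though a person would recognize all four as the same photo.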
Discussion Questions
Our readings lately have been asserting the value in allowing communities to describe their records. In chapter 7, Owens points out that giving description over to the end user can “easily result in spotty and inconsistent data.” How can archives maintain a balance between empowering communities and keeping quality, consistent data?
What are your thoughts on permitting anonymity in archives? Do you think that it’ll lead to doubt over the validity of the record later on? How can archives demonstrate truthfulness in a record while protecting the creator’s identity?