We started out talking about the theory, but this week’s readings really got into the nitty-gritty of how to initiate and sustain digital preservation projects.
Where do I start? What’s involved?
Owens’ chapter points out three major elements of preservation required to save “the bits”: create and maintain multiple copies of digital objects, use fixity checks to make sure you can account for all the information in those objects, and secure those objects so they can’t be corrupted or deleted by some unsavory sort.
These are our basic elements, and the folks from POWRR (From Theory to Action: Good Enough Digital Preservation) want to emphasize that when you’re starting out, it’s best to focus on the basics. It’s easy to get overwhelmed by all the technical aspects of digital preservation, but it’s really an incremental process that you can work up to. Before maintaining those multiple copies, running fixity checks, and working on security, it’s a good idea to take stock of your institution’s strengths and abilities, consider what kind of resources it can devote to digital preservation, start planning ingest workflows, and create a basic inventory of your collection.
Owens reiterates this last suggestion: start out by creating an inventory of what you have, and start thinking about policies and practices that will help you manage that collection. (Make sure you keep that inventory up to date!)
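If you’re wondering what that starter inventory could actually look like, here’s a minimal sketch in Python that just walks a folder and records each file’s path, size, and last-modified date. (The folder and CSV names are made up for illustration; it’s a sketch, not the One True Inventory.)

```python
import csv
from datetime import datetime, timezone
from pathlib import Path

# Hypothetical folder holding the digital collection, and an output file name.
COLLECTION_DIR = Path("collection")
INVENTORY_CSV = "inventory.csv"

with open(INVENTORY_CSV, "w", newline="", encoding="utf-8") as f:
    writer = csv.writer(f)
    writer.writerow(["path", "size_bytes", "last_modified_utc"])
    for path in sorted(COLLECTION_DIR.rglob("*")):
        if path.is_file():
            stat = path.stat()
            modified = datetime.fromtimestamp(stat.st_mtime, tz=timezone.utc)
            writer.writerow([str(path), stat.st_size, modified.isoformat()])
```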
So actually, how do I start “doing” digital preservation?
You’ve got a sick inventory now, so you can get started on preserving those bits. Owens suggests running a fixity check to take stock of each digital object at the start, and then moving on to making copies. Both Owens and the NDSA indicate that it’s generally best practice to keep at least 2-3 copies, and to store each copy in a different way and location, so that each one faces a different type of disaster risk. How do you do that, though? A lot of institutions collaboratively form “consortia” like MetaArchive and Data-PASS, where “one institution [hosts] a staging server, to which the other partner institutions transfer their digital content” (From Theory to Action). So multiple organizations can help each other out with storing their digital content. Sweet. Let’s be friends. (You send them some copies.)
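For a concrete sense of what that first fixity check might involve, here’s a rough sketch that records a SHA-256 checksum for every file in the collection, so you have a baseline to compare against later. The folder and manifest names are my own assumptions, not anything prescribed by Owens or the NDSA.

```python
import csv
import hashlib
from pathlib import Path

def sha256_of(path: Path) -> str:
    """Hash a file in chunks so large objects don't have to fit in memory."""
    digest = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(1024 * 1024), b""):
            digest.update(chunk)
    return digest.hexdigest()

# Hypothetical paths: the collection folder and a manifest file to write.
COLLECTION_DIR = Path("collection")
MANIFEST_CSV = "checksums.csv"

with open(MANIFEST_CSV, "w", newline="", encoding="utf-8") as f:
    writer = csv.writer(f)
    writer.writerow(["path", "sha256"])
    for path in sorted(COLLECTION_DIR.rglob("*")):
        if path.is_file():
            relative = path.relative_to(COLLECTION_DIR).as_posix()
            writer.writerow([relative, sha256_of(path)])
```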
Oh, but that first fixity check wasn’t enough. You’re not done yet. You just made a bunch of copies of your files and transferred them to your bud to store! Run another fixity check (maybe using a sweet cryptographic hash or checksum) to make sure that all your files got copied correctly. Any time you make new copies or transfer them, you gotta check those files to see if they’re still identical to the originals! Also, it’s probably a good idea to run fixity checks periodically to make sure everything’s chill.
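And here’s the flip side: a sketch of checking a transferred copy against that baseline manifest. Again, the paths are hypothetical; the point is just “recompute the hash and compare it to the one you recorded.”

```python
import csv
import hashlib
from pathlib import Path

COPY_DIR = Path("copy_at_partner")   # hypothetical location of the transferred copy
MANIFEST_CSV = "checksums.csv"       # manifest produced from the originals

def sha256_of(path: Path) -> str:
    digest = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(1024 * 1024), b""):
            digest.update(chunk)
    return digest.hexdigest()

problems = []
with open(MANIFEST_CSV, newline="", encoding="utf-8") as f:
    for row in csv.DictReader(f):
        target = COPY_DIR / row["path"]
        if not target.is_file():
            problems.append(f"missing: {row['path']}")
        elif sha256_of(target) != row["sha256"]:
            problems.append(f"checksum mismatch: {row['path']}")

print("everything's chill" if not problems else "\n".join(problems))
```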
But say— what if everything’s not chill?
You’ve got some numbers that just aren’t adding up. Could it be that some of your files got corrupted? You gotta fix those. Using the results of your fixity check, you can identify which files aren’t totally correct and try to make new, better copies, or you can attempt to repair the file. “This is done by replacing corrupted data with the distributed, replicated, and verified data held at ‘mirroring’ partner repositories in multi-institutional, collaborative distributed networks. The consortia groups MetaArchive and Data-PASS use LOCKSS (‘Lots of Copies Keep Stuff Safe’) for this kind of distributed fixity checking and repair.” (NDSA Storage Report)
So remember those copies you sent to your friends? Because you have multiple copies of your stuff, you can use those to help fix all your broken ones! Sweet, geographic redundancy really pays off.
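LOCKSS does this at network scale, but the core move can be sketched in a few lines: when a local file fails its fixity check, replace it with a mirror copy that still verifies against the recorded checksum. The mirror paths below are hypothetical, and this is only a toy version of what those distributed networks actually do.

```python
import hashlib
import shutil
from pathlib import Path

LOCAL_DIR = Path("collection")            # our copy, where corruption was found
MIRROR_DIRS = [Path("mirror_partner_a"),  # hypothetical partner copies
               Path("mirror_partner_b")]

def sha256_of(path: Path) -> str:
    digest = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(1024 * 1024), b""):
            digest.update(chunk)
    return digest.hexdigest()

def repair(relative_path: str, expected_sha256: str) -> bool:
    """Replace a corrupted local file with a verified copy from a mirror."""
    for mirror in MIRROR_DIRS:
        candidate = mirror / relative_path
        if candidate.is_file() and sha256_of(candidate) == expected_sha256:
            shutil.copy2(candidate, LOCAL_DIR / relative_path)
            return True
    return False  # no mirror had a good copy; time to call your friends
```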
Am I done?
NO!
We still gotta think about security and access!
Security could be its own whole thing, but really this involves determining who has access to your files and controlling what they can do with them. Keep logs of who accessed files and what they did to them. If you don’t have any fancy database software to track and control access to those original files, Owens suggests you could simply keep them on a hard drive in a locked drawer, and there you go: no one’s deleting that stuff.
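Even without fancy database software, a bare-bones access log can be as simple as appending a row every time somebody touches a file. A tiny sketch, with made-up field names and file paths:

```python
import csv
from datetime import datetime, timezone
from pathlib import Path

ACCESS_LOG = Path("access_log.csv")  # hypothetical log file

def log_access(user: str, file_path: str, action: str) -> None:
    """Append one row per access: who, which file, what they did, and when."""
    new_file = not ACCESS_LOG.exists()
    with open(ACCESS_LOG, "a", newline="", encoding="utf-8") as f:
        writer = csv.writer(f)
        if new_file:
            writer.writerow(["timestamp_utc", "user", "file", "action"])
        writer.writerow([datetime.now(timezone.utc).isoformat(),
                         user, file_path, action])

# e.g. log_access("mbrown", "collection/letters/1999-05-01.doc", "viewed")
```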
And access is the whole reason we’re doing any of this! How will you provide people with those files? Will anything be restricted? Certainly, some of your digital files will have information that shouldn’t just be publicly accessible, or maybe your donor doesn’t want anyone to read those files for a while. If that’s the case, it may be a good idea to stick that material into a dark archive, which will preserve your stuff even though no one will be able to read it for now. Or, if your stuff is less sensitive, maybe it could just be made available online. Either way, your organization should develop policies specifically for security and access to your collections.
So we’ve covered maintaining multiple copies, running fixity checks, and security! I think we’re good.
Questions I guess?
So I know I really glossed over these processes, but I wanted to talk more about the preservation of specific file formats, which both Owens and the “Party like it’s 1999” reading about emulation touch on. How do you determine the feasibility of saving a particular file? Hundreds of different proprietary file formats have come and gone over the years, but how do you determine whether you should migrate a file to a more common, modern format, or whether it’s necessary to emulate an environment that lets you experience the file as it was originally intended?
Are there risks of losing some of the affordances of a specific format when migrating to a new file format? If it’s possible to preserve an original file bit-for-bit, would it be more authentic to keep it as is and provide access through an emulated environment? Or are we less concerned with the authentic, artifactual experience of that file and more concerned with the information?
I know that the answer to these questions is probably “it depends” or “it’s contextual”, but I’m more interested in people’s personal thoughts on emulation. I know it’s a complex process to create emulators, but once we’re able to successfully emulate past operating systems, can you see emulation becoming “best practice” for digital preservation and access?
Hi Maggie, So far as I can see, emulation will just be one of many options for preserving content. Thinking back on Trevor’s World of Warcraft example, emulating the environment won’t recreate the experience unless you can see how people interact with each other in the game. Even if, 50 years from now, you could emulate the environment so that people could play together, that still won’t tell you about the culture that existed when people were originally playing.
I like the idea of emulation in certain contexts because I think understanding how users were able to interact with digital objects at the time they were created or used can be informative for someone who either never experienced that environment personally or has since forgotten how to. Even if emulation isn’t used for every digital object, it might be worthwhile to emulate as many different environments as possible as a reference point for historical purposes.
I don’t know if emulation will ever become an integral part of every digital repository. I hate to take such a negative view of our archival profession, but there is still a fear of digital preservation as an act that is so … digital (for lack of a better word) and totally removed from traditional archival practice. We all know the intersections between digital preservation and traditional archival practice, but we still see it in our partner organizations in the consultant project and a little in the article from Digital POWRR: a fear of the digital object.
I think emulation will definitely continue to be an important tool, especially in repositories that focus on video games and other types of interactive digital media, but it may not become widespread. Whether it is important to emulate or to migrate depends entirely on the repository’s preservation intent. I think for ease of access and archiving, a lot of repositories will focus on preserving the information within a file rather than a purely “authentic” file.
The effort it would take to create an emulator might outweigh its research benefits. If researchers only need the object for its informational value, then archivists should not dedicate their time to creating an emulation. I think emulators should be created when the environment of the digital object is important for the archive’s preservation intent.
I honestly doubt that emulators will be considered best practice (at least in the foreseeable future) due to the lack of staffing and digital preservation expertise that many archives deal with. Migration is the easier and faster preservation method, and with many archives following the More Product, Less Process approach, quicker access might be the institution’s priority.
I don’t want this to sound like I don’t appreciate the value of emulators. I definitely do, but I feel like practicality is an important factor here.
I think emulation will serve some institutional missions and curatorial objectives some of the time but that may not be enough to sustain the enterprise, if there aren’t enough stakeholders. As a result, this might not be such a hot topic down the road in a course like this. I just can’t escape the impression that emulation appeals to a sense of nostalgia, and that’s a moving target. I’m not sure we’ll be able to keep building new time machines.
Tbh the part of this week’s readings that blew my mind the most was in Step 3 of the emulation article, when she says: “Do you remember how to use DOS?” I feel a little silly now, but I hadn’t even THOUGHT of this. Maybe because I kept thinking of emulation of video games, most of which I feel like I could comfortably figure out how to use (or remember using at some point in my life). But all this talk about old word processors and I realized that wow, there are so many things I don’t know how to use, and would probably get SUPER frustrated trying to figure out. And if that’s hard for ME, imagine teaching it to younger generations who are used to the speed and user-friendliness of iPhones that they’ve been using THEIR ENTIRE LIVES. The staff and time costs of having to help people use emulators is something I hadn’t even considered, and man… that seems rough.
Perri, I like that you touch on the economics of actually implementing emulators. I see emulators as a tool to pull people (usually new or unfamiliar visitors) into the archive; that’s how I think of any digital archive or collection, but it applies to emulators specifically. While it’s cool that we can experience a game the same way one would have back in the day, trying to teach younger generations to use these tools when they’re used to “better,” more contemporary options would be difficult with the already-existing staffing and budgetary constraints in the majority of cultural institutions.
To me, there is a time and place for emulators. Like David and Tina said, they are great for nostalgic experiences, but it seems like a lot more work to emulate an old word processor than it would be to migrate a document to a Word doc or PDF. It all goes back to what your institution wants to preserve. Similar to Maya, if you are only going for the informational value, it would be counterproductive to spend the limited time and funds of an archive on a complicated emulator if you could potentially just re-save a file in a different format.