We started out talking about the theory, but this week’s readings really got into the nitty-gritty of how to initiate and sustain digital preservation projects.

Where do I start? What’s involved?

Owens’ chapter points out three major elements of preservation required to save “the bits”: we need to create and maintain multiple copies of digital objects, use fixity checks to make sure we can account for all the information in those objects, and keep those objects secure so they can’t be corrupted or deleted by some unsavory sort.

These are our basic elements, and the folks from POWRR (From Theory to Action: Good Enough Digital Preservation) want to emphasize that when you’re starting out, it’s best to focus on the basics. It’s easy to get overwhelmed by all the technical aspects of digital preservation, but it’s really an incremental process that you can work up to. Before maintaining those multiple copies, running fixity checks, and working on security, it’s a good idea to take stock of your institution’s strengths and abilities, consider what kind of resources it can devote to digital preservation, start planning ingest workflows, and create a basic inventory of your collection.

Owens reiterates this last suggestion: start out by creating an inventory of what you have, and start thinking about policies and practices that will help you manage that collection. (Make sure you keep that inventory up to date!)
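
Just to make this concrete for myself, here’s a rough Python sketch of what a starter inventory could look like. The collection/ folder and inventory.csv names are placeholders I made up, and a real workflow might just as easily use a spreadsheet; the point is only that an inventory can start as a simple list of what you have.

```python
# A rough starter inventory: walk the collection folder and write one CSV row
# per file with its path, size, and last-modified time.
# "collection" and "inventory.csv" are made-up placeholder names.
import csv
import os
from datetime import datetime, timezone

def build_inventory(collection_dir, inventory_path):
    with open(inventory_path, "w", newline="", encoding="utf-8") as out:
        writer = csv.writer(out)
        writer.writerow(["path", "size_bytes", "last_modified_utc"])
        for root, _dirs, files in os.walk(collection_dir):
            for name in files:
                full_path = os.path.join(root, name)
                info = os.stat(full_path)
                modified = datetime.fromtimestamp(info.st_mtime, tz=timezone.utc)
                writer.writerow([full_path, info.st_size, modified.isoformat()])

if __name__ == "__main__":
    build_inventory("collection", "inventory.csv")
```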

So actually, how do I start “doing” digital preservation?

You’ve got a sick inventory now, so you can get started on preserving those bits. Owens suggests running a fixity check to take stock of each digital object at the start, and then moving on to making copies. Both Owens and the NDSA indicate that it’s generally best practice to keep at least two to three copies, and to store those copies in different ways and locations, so that each copy faces a different type of disaster risk. How do you do that, though? Actually, a lot of institutions collaboratively form “consortia” like MetaArchive and Data-PASS, where “one institution [hosts] a staging server, to which the other partner institutions transfer their digital content” (From Theory to Action). So multiple organizations can help each other out with storing their digital content. Sweet. Let’s be friends. (You send them some copies.)
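
If it helps to picture the “multiple copies, multiple places” idea, here’s a hedged little sketch in Python. Both destination paths are invented placeholders (say, an external drive and a mounted consortium staging server), not anything the readings prescribe.

```python
# A sketch of "multiple copies in multiple places": replicate the collection
# to more than one storage location. Both destination paths are invented
# placeholders (an external drive and a mounted consortium staging server).
import shutil
from pathlib import Path

DESTINATIONS = [
    Path("/mnt/external_drive/collection_copy"),
    Path("/mnt/consortium_staging/collection_copy"),
]

def replicate(collection_dir):
    source = Path(collection_dir)
    for dest in DESTINATIONS:
        # dirs_exist_ok=True lets you re-run this without failing on existing copies.
        shutil.copytree(source, dest, dirs_exist_ok=True)
        print(f"Copied {source} -> {dest}")

if __name__ == "__main__":
    replicate("collection")
```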

Oh, but that first fixity check wasn’t enough. You’re not done yet. You just made a bunch of copies of your files and transferred them to your bud to store! Run another fixity check (maybe using a sweet cryptographic hash or checksum) to make sure that all your files got copied correctly. Any time you make new copies or transfer them, you gotta check those files to see if they’re still identical to the originals! Also, it’s probably a good idea to run some fixity checks periodically to make sure everything’s chill.
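
Here’s a minimal sketch of what that kind of fixity check might look like in Python, assuming you hash everything with SHA-256 and keep the results in a simple manifest file. The collection/, collection_copy/, and manifest.json names are just placeholders.

```python
# A sketch of a fixity check: hash every file with SHA-256, save the results
# as a manifest, and later re-hash a copy to see whether anything changed.
# "collection", "collection_copy", and "manifest.json" are placeholder names.
import hashlib
import json
import os

def sha256_of(path):
    digest = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(1024 * 1024), b""):
            digest.update(chunk)
    return digest.hexdigest()

def make_manifest(collection_dir):
    manifest = {}
    for root, _dirs, files in os.walk(collection_dir):
        for name in files:
            path = os.path.join(root, name)
            manifest[os.path.relpath(path, collection_dir)] = sha256_of(path)
    return manifest

def verify(copy_dir, manifest):
    """Return the relative paths whose copy is missing or no longer matches."""
    damaged = []
    for rel, expected in manifest.items():
        path = os.path.join(copy_dir, rel)
        if not os.path.exists(path) or sha256_of(path) != expected:
            damaged.append(rel)
    return damaged

if __name__ == "__main__":
    manifest = make_manifest("collection")
    with open("manifest.json", "w", encoding="utf-8") as f:
        json.dump(manifest, f, indent=2)
    print("Files that failed the check:", verify("collection_copy", manifest))
```

Anything that comes back from verify() is a file whose bits no longer match what you recorded, which is exactly the “numbers not adding up” situation below.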

But say— what if everything’s not chill?

You’ve got some numbers that just aren’t adding up; could it be that some of your files got corrupted? You gotta fix those. Using the results of your fixity check, you can identify which files aren’t totally correct and try to make new, better copies, or you can attempt to repair the file. “This is done by replacing corrupted data with the distributed, replicated, and verified data held at ‘mirroring’ partner repositories in multi-institutional, collaborative distributed networks. The consortia groups MetaArchive and Data-PASS use LOCKSS (‘Lots of Copies Keep Stuff Safe’) for this kind of distributed fixity checking and repair” (NDSA Storage Report).

So remember those copies you sent to your friends? Because you have multiple copies of your stuff, you can use those to help fix all your broken ones! Sweet, geographic redundancy really pays off.
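
To sketch the idea (and only the idea: the real LOCKSS software does this automatically across whole networks of partners), here’s a toy repair function that reuses the sha256_of and verify helpers from the fixity sketch above. The mirror path is a made-up placeholder.

```python
# A toy version of "repair from your mirrored copies": for each file the fixity
# check flagged, pull a replacement from a partner/mirror copy, but only if the
# mirror's version still matches the recorded hash. The mirror path is invented,
# and real systems like LOCKSS do this automatically across whole networks.
import os
import shutil

def repair(local_dir, mirror_dir, manifest, damaged, hash_fn):
    for rel in damaged:
        mirror_path = os.path.join(mirror_dir, rel)
        if os.path.exists(mirror_path) and hash_fn(mirror_path) == manifest[rel]:
            shutil.copy2(mirror_path, os.path.join(local_dir, rel))
            print(f"Repaired {rel} from the mirror")
        else:
            print(f"Mirror copy of {rel} is missing or damaged too; try another partner")

# Example, reusing sha256_of, make_manifest, and verify from the fixity sketch:
# manifest = make_manifest("collection")
# repair("collection", "/mnt/consortium_staging/collection_copy",
#        manifest, verify("collection", manifest), sha256_of)
```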

Am I done?

NO!

We still gotta think about security and access!

Security could be its own whole thing, but really this involves determining who has access to your files and controlling what they can do with them. Keep logs of who accessed files and what they did to them. If you don’t have any fancy database software to track and control access to those original files, Owens suggests you could simply keep them on a hard drive in a locked drawer, and there you go: no one’s deleting that stuff.
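
Even without fancy software, “keep logs” can start out as something this simple. A hedged sketch, assuming you’re willing to jot each action into a little CSV file; the log path, the username, and the action labels are all things I invented for the example.

```python
# A bare-bones access log in the spirit of "keep track of who touched what":
# append one timestamped line per action to a CSV file. The log path, the
# username, and the action labels are assumptions for the sake of the sketch.
from datetime import datetime, timezone

LOG_PATH = "access_log.csv"

def log_access(user, file_path, action):
    timestamp = datetime.now(timezone.utc).isoformat()
    with open(LOG_PATH, "a", encoding="utf-8") as log:
        log.write(f"{timestamp},{user},{file_path},{action}\n")

# Example: record that a staff member viewed a file before sharing it with a researcher.
log_access("mmccready", "collection/box1/letter_1989.tif", "viewed")
```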

And access is the whole reason we’re doing any of this! How will you provide people with those files? Will anything be restricted? Certainly, some of your digital files will contain information that shouldn’t just be publicly accessible, or maybe your donor doesn’t want anyone to read those files for a while. If that’s the case, it may be a good idea to stick them into a dark archive, which will preserve your stuff even though no one will be able to read it for now. Or, if your stuff is less sensitive, maybe it could just be made available online. Your organization should probably develop policies specifically for security and access to your collections.

So we’ve covered maintaining multiple copies, running fixity checks, and security! I think we’re good.

Questions I guess?

So I know I really glossed over these processes, but I wanted to talk more about the preservation of specific file formats, which both Owens and the “Party like it’s 1999” reading about emulation touch on. How do you determine the feasibility of saving a particular file? Hundreds of proprietary file formats have come and gone over the years, so how do you determine whether you should migrate a file to a more common, modern format, or whether it’s necessary to emulate an environment that enables you to experience the file as it was originally intended?

Are there risks of losing some of the affordances of a specific format when migrating to a new file format? If it’s possible to preserve an original file bit-for-bit, would it be more authentic to keep it as is and provide access through an emulated environment? Or are we less concerned with the authentic, artifactual experience of that file and more concerned with the information?

I know that the answer to these questions is probably “it depends” or “it’s contextual”, but I’d really like to hear people’s personal thoughts on emulation. I know it’s a complex process to create emulators, but once we are able to successfully emulate past operating systems, can you see emulation becoming “best practice” for digital preservation and access?

Figuring out Digital Preservation

Before this week, digital preservation seemed like an insanely complicated and overly technical process that I never thought I could truly understand. My background is in history and fine arts; I never took an advanced math or science class in high school or college because I thought that kind of thing just wasn’t for someone like me. These readings proved to me that literally anyone can grasp the concepts behind digital preservation, provided that it is explained in accessible terms.

I feel like there were a couple of themes that repeated throughout these readings, namely that there is no one way to “do” digital preservation, and that digital preservation isn’t an “all or nothing” process; you don’t have to go all out, and it’s okay to start small and work your way up.

To start, Professor Owens’ chapter, The Craft of Digital Preservation, introduces the idea that digital preservation is a “craft, not a science” (Owens, 72). Meaning, there is no one set way to “do” digital preservation, no single answer; instead, it is something that requires planning and thought, that must adapt to specific situations, and that changes over time. Owens suggests that “part of the idea of digital preservation as craft is that there isn’t a single system for doing digital preservation. It is a field that requires continued refinement of craft” (Owens, 79). There will always be a need for improvement and adaptation of principles to meet a specific institution’s needs; no one framework or model can “solve” digital preservation for you. Owens goes on to warn against developing an uncritical reliance upon frameworks or models, stating that “these frameworks are useful as tools only to the extent they help do the work. So don’t take any of them as commandments we are required to live by, and don’t get too locked into how any of them conceptualize and frame digital preservation problems” (Owens, 80). Frameworks are great for guidance, but each institution needs to develop policies and practices that work best for them, their collections, and their users.

Okay, that’s great, but what do I actually need to do to start digital preservation? What should I be thinking about? Thankfully, Owens provides some guidelines and points us to the National Digital Stewardship Alliance’s Levels of Digital Preservation (NDSA LoDP) to break those processes down into digestible steps. Owens outlines four areas of digital preservation that institutions should consider when initiating preservation projects. First, preservation intent and collection development policy: as an institution, what do we want to save? What do we want to avoid? How does that reflect our mission? Second, managing copies and formats: we need systems to ensure bit preservation and the long-term usability of content. Third, arranging and describing: how are we organizing our content? What terms will we use to describe it? What kind of metadata do we want to record? Finally, multimodal use and access: what formats will we make our content available in? How will we ensure that this is accessible to users? These questions can help institutions conceptualize why and how they should approach digital preservation, and with this knowledge in mind, they can utilize frameworks like the LoDP more effectively, creating policies and practices tailored to their specific needs and capabilities.

Okay, so digital preservation doesn’t need to be an “all or nothing” endeavor; not everyone can or should attempt “four star” digital preservation right off the bat. All digital preservation programs have to start somewhere, and the NDSA’s LoDP was specifically written to be “of maximum utility to those institutions unsure of how to begin a digital preservation program” (NDSA, 2). This framework explains digital preservation in non-technical terms and breaks down five different content areas (or elements) of digital preservation that institutions should focus on: Storage and Geographic Location, File Fixity and Data Integrity, Information Security, Metadata, and File Formats. By labeling and listing out these elements of digital preservation, the LoDP helps institutions begin to conceptualize what resources and policies should be developed to support a digital preservation program. Within each element, there are also four progressive levels of quality, an approach that “is intended to allow for flexibility — users can achieve different levels in different content areas according to their unique needs and resources” (NDSA, 2). The LoDP can help institutions get started, and it continues to provide guidance as their digital preservation programs evolve, recognizing that each institution will develop differently. I think it’s super important to drive home the central idea behind this model and behind these other readings: digital preservation needs to be accessible, but it also must be flexible. There is no one way to do digital preservation, and not every institution can, will, or needs to preserve things according to best practices or four-star quality.

Chudnov’s The Emperor’s New Repository really hammers home the idea that digital preservation programs don’t need to be large or fancy right out of the gate to be effective. Chudnov advises us to “start with a small collection, minimal staff, and a short timetable, and see what you can learn by building something quickly” (Chudnov, 3). Really, Chudnov is all about just getting it done and making it accessible to users ASAP, because that’s why we’re preserving things in the first place. Adding a fancy new layer of software to manage your digital objects can actually just make things more complicated; it’s okay to just post your content on your website and allow users to interact with it that way (Chudnov, 4). Again, digital preservation is an ongoing, iterative process, and “you’re going to learn so much along the way that the details of whether that tool’s the best long term fit or not are going to become obvious to you as you build up experience loading your content and making it available” (Chudnov, 3). Over time, you’ll learn about what does and doesn’t work for you; you can make adjustments to policy, adopt new software, and make new decisions about how to store digital content.

Introductory Blog Post

Hi, (I hope I posted this in the right place!)

My name is Maggie McCready and I’m starting my second year in the MLIS program at UMD. I wanted to take this class because I feel like digital preservation isn’t something I know enough about. Most of the archives I have worked in are still focused on addressing “analog” records, and they throw around “digitization” as a buzzword to sound forward-thinking while really only approaching digitization from a very basic standpoint. I want to learn how to actually manage digital and born-digital records, and hopefully this is the right class to take for that.

With regard to the readings, I was super psyched to have such an approachable and straightforward introduction to the topic. I’m not very technically minded, so I’m glad the readings started off by introducing more of the concepts and issues related to digital preservation. Prof. Owens’ axiom that “highly technical definitions of digital preservation are complicit in silencing the past” (I think it’s #12?) really hit the nail on the head for me: using opaque and overly technical language to describe this really necessary process is exclusionary, making digital preservation seem like a complex and lofty goal that no small institution can ever achieve. I appreciate that Professor Owens stated this so plainly in our first reading, and I’m hopeful that means this course will be taught with this in mind.

The readings that covered the Digital Dark Age reminded me a lot of discussions I’ve heard about the “period of catastrophic loss” projected to happen within the next ten years for magnetic audiovisual materials. I completely disagree with Lyons’ article “There Will Be No Digital Dark Age”. Like, sure, dude, archivists are aware that we need to save this stuff, and sure, we’re trying, but the digital dark age is a very real thing that’s already happening. I think Kuny and Tansey address it well, recognizing that institutions can and do interfere with the preservation of their records in order to control the way they are remembered, sometimes purposefully deleting records, and that there is also a general lack of interest in digital preservation among records creators. Obsolescence is a huge issue too: with new storage formats being produced constantly, we are already having problems accessing files on floppy disks, Jaz drives, etc.

I did appreciate Vint Cerf’s and Kuny’s argument for preserving the digital environment or software in addition to the digital object itself, in order to ensure that the file can still be read and used in its intended way. That was something I hadn’t really thought about before, because my experience with digitization so far has been with relatively simple digital objects like scanned images. Preserving complex digital objects like programs, video games, or websites really interests me; I feel like there’s so much involved in that process, and I want to understand how institutions like the Internet Archive, MITH, or the LGBTQ Video Game Archive are approaching preserving video games. This area in particular is something I’m really interested in exploring.