I Find Your Lack of Faith… Well, Prudent.

Join the Dark Side

I couldn’t help feeling that the underlying subtext to this week’s readings was an embracing of distrust and uncertainty. Distrust in physical media, file formats, third-party cloud services… even the computer’s ability to do what many take for granted: create an exact copy. Uncertainty manifested itself in issues such as the future adoption levels of formats, the continuity of tools, and even the motives and competency of our own staff. Rather than being dismayed by this somewhat dour outlook, I found it to be a heartening confirmation of my belief that pessimism can indeed be used as a force for good.

I guess I’m weird like that.

Owens Chapter 6 kicked off this theme of distrust with the recurring phrase “hedge your bets.” This one phrase was applied repeatedly to the first of three core processes for bit preservation: 1) creating and managing multiple copies, 2) managing and using fixity information, and 3) establishing information security protocols to keep accidents and attacks from occurring at the hands of staff or users. In the context of that first process – managing multiple copies – the “hedge your bets” approach necessarily results in a proliferation of file types, storage media, and a geographically sprawling network of storage locations. The point of this push for diversity is that no single disaster, bad actor, or system failure is likely to wipe out all copies.
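Multiple copies only hedge anything if they can actually be checked against each other, which is where the second process – fixity – comes in. As a minimal sketch (purely illustrative, not a tool Owens names; the directory layout is hypothetical), here is how a fixity manifest and a copy-to-copy comparison might look:

```python
import hashlib
from pathlib import Path

def fixity_manifest(root: str) -> dict:
    """Map each file's relative path to its SHA-256 digest."""
    manifest = {}
    for path in sorted(Path(root).rglob("*")):
        if path.is_file():
            digest = hashlib.sha256(path.read_bytes()).hexdigest()
            manifest[str(path.relative_to(root))] = digest
    return manifest

def compare_copies(copy_a: str, copy_b: str) -> list:
    """Return relative paths whose content differs, or which are
    missing entirely from one of the two copies."""
    a, b = fixity_manifest(copy_a), fixity_manifest(copy_b)
    return sorted(k for k in a.keys() | b.keys() if a.get(k) != b.get(k))
```

Running `compare_copies` against two replicas surfaces exactly the kind of silent mismatch a geographically sprawling storage network needs to catch before a second copy fails too.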

The distrust also extended to seemingly mundane processes like transferring data, and to practices like minimizing the number of people capable of accessing the objects. But the issue that interested me most was the emphasis on not putting too much faith in any one tool. As Owens notes, vendor lock-in is a real concern that necessitates forming an exit strategy before acquisition is even complete (p. 115). I have seen this happen in my own career and know how dangerous it can be. Indeed, it was one of the catalysts that inspired me to seek this degree.

The theme of distrust continued in the NDSA Storage Report. The survey found that most NDSA members’ desire for control over their own holdings tended to dissuade them from embracing commercial cloud services. The perception (or reality) of greater control over their data led the majority to prefer institutional cooperatives, in which each member replicates its data with partner organizations to establish geographic diversity in its storage plan. Of particular concern among the group was the lack of transparency in the fixity checks performed behind the scenes by commercial cloud services: there was no proof that hashes provided at the time of access weren’t simply being replayed from the time of upload, providing a false sense of safety.
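The only real defense against that replay worry is to keep your own digest from ingest and re-hash the bytes the service actually returns at access time. A hedged sketch (the function names are my own invention, not any NDSA or vendor API):

```python
import hashlib

def digest_at_ingest(data: bytes) -> str:
    """At upload time, compute and store your own SHA-256 digest,
    independent of anything the storage vendor reports."""
    return hashlib.sha256(data).hexdigest()

def verify_at_access(retrieved: bytes, ingest_digest: str) -> bool:
    """At access time, hash the bytes the service actually returned.
    A vendor replaying the upload-time hash can't pass this check
    unless the retrieved content really is unchanged."""
    return hashlib.sha256(retrieved).hexdigest() == ingest_digest
```

A vendor-supplied hash only proves the vendor once saw those bytes; a digest computed locally over the retrieved copy proves the vendor still has them.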

Again, I was struck by how issues of uncertainty and distrust could be harnessed to realize positive and productive ends. Perhaps I’ve finally found my people?

A New Hope

Not all the readings were gloom and doom. “From Theory to Action” in particular revisited many of the themes we’ve touched on in previous weeks, emphasizing a simple, incremental approach to beginning a preservation program. As the subtitle of the piece indicates, the authors emphasize embracing the concept of “good enough” and then building on it. Digital preservation is not a binary status requiring that an institution either be moving at light speed or standing completely still. Institutions should focus on near-term goals that can immediately improve preservation, however small and simple they might be. But probably the biggest takeaway from this piece was the degree of confidence and self-efficacy the POWRR group members instilled in each other simply by choosing to assess and tackle their collective issues cooperatively. Communities of practice are particularly effective at helping an entire group identify solutions to common problems.

Discussion questions

In Chapter 6, Owens notes the importance of differentiating working files from files that have become static and are thus ready for long-term storage. I have found that in practice this is more difficult than it would seem, particularly for video. In our multimedia department the concept of finality has been elusive at best, to the point that our manager gets angry if a file is actually labeled “final,” because it becomes untrue almost the moment it’s saved. Our company’s trend of editing-by-committee basically guarantees at least a few more rounds of edits no matter what. Even an extended passage of time is no indication of finality. Customers will come back and ask for changes to a video many years after the first version, usually because they want to remove a staff member who has left or change someone’s job title. Saving an editable version requires saving the non-linear editor project files and all associated source files. This is the most complicated version to save and the first to become obsolete. So, my question for the class is: how should we as archivists respond to such a dynamic situation, where the concept of finality is tenuous and fluid?

And lastly, I didn’t discuss Dietrich’s “Emulation for Everyone” above because it seemed like something of an outlier relative to the others. I find myself fascinated with emulation as a practice, but I wonder about its feasibility for all but the most extreme cases. For example, the piece mentions at the end that researchers consulting the Jeremy Blake Papers actually preferred using the modern OS and were primarily interested in the informational value of the objects; authenticity and fidelity were less of a priority. That seems like a lot of effort to have gone to for an experience that no one really needed. So my question for the class is: where do you see emulation fitting into the list of preservation options? Should it require a more rigorous examination of preservation intent to make sure the considerable extra effort is justified?

I’m also curious to what extent an emulated system becomes a digital object in itself, one that is exponentially more complicated to preserve. At what point do we decide that these towering platform stacks held together with scotch tape and shoe goo are no longer worth the expense of maintaining?

 

I formally apologize* for all Star Wars references.

*Not really.

7 Replies to “I Find Your Lack of Faith… Well, Prudent.”

  1. “…how should we as archivists respond to such a dynamic situation, where the concept of finality is tenuous and fluid?”

    Hi Andy. This question prompted me to think of two scenarios.

    I have often thought about the fluidity of digital objects as a subscriber to digital newspapers. Different users can read the same article at different times of day and come away with different content. The time stamp is supposed to indicate that something has changed, but there’s usually no way to figure out what has changed. For example, The Washington Post’s policy page indicates that users should approach news articles as if they’re developing stories, but that the paper does not need to document its changes unless it is correcting a “significant mistake” or “substantively correcting an article, photo caption, headline, graphic, video or other material.” See “Updating a digital report” and “Corrections”:

    https://www.washingtonpost.com/news/ask-the-post/wp/2016/01/01/policies-and-standards/?utm_term=.1a2130dd672b

    I think updates without documented corrections are so much the norm now that newspapers are running the risk of misleading readers. After all, who goes back to reread an article looking for differences? If you look at the page’s source code, you can see separate time stamps for when something was published versus modified, but this distinction isn’t visible on the web page itself. An article published yesterday with an update from an hour ago looks like it was just published an hour ago.

    All this is to say that, as a subscriber, I would appreciate more transparency in the lifecycle of a digital article once it is made public. However, in the policy statement from above, The Post suggests that this is what subscribers expect from digital content. If this is the expectation, then it encourages rapid reporting and editing that most readers will not be able to keep up with. To me, this is a strong case for having versioning in the articles to document changes as they transpire. To be fair, I understand how that transparency can prompt a lot of questions – some possibly more nitpicky than others – but I still think there should be a user-friendly way to navigate through the changes.

    Alternatively, I also thought about library catalogue records that undergo revisions. My only experience looking at bibliographic records is with the Voyager ILS, and while it can document when a change was made and by whom, the details of those changes would need to be recorded by other means. This information would not be available to users and, in theory, revisions could alter search results. As far as I know this has been the norm, and exposing the changes could end up causing more confusion for users. Some of you more familiar with cataloging may disagree.

    Using the old fallback: I guess it depends. I’m not sure of the particulars in your example, but I think your scenario would affect a few decisions – how frequently you update your copies, whether you would want to overwrite older versions or save all of them, and whether it would be more cost-effective to just redo the video rather than save project files that you might not be able to open when needed. I wonder if it would be worthwhile to make the project less open-ended and just indicate to the customer that, after a certain point, a new version is a new video.

    I also just wanted to say that I appreciate your pessimism. I think it’s important to look at limitations to put the work in context and to strive to make things better. I think the problem sometimes comes down to selling the value of your department. If you talk about the negatives too much, then people who don’t spend all their time thinking about this stuff think you just can’t do the work. Alternatively, they don’t have time for anything more than a nutshell explanation which can oversimplify the process.

  2. I, for one, thoroughly enjoyed all of your Star Wars references. I was actually watching The Empire Strikes Back a few days ago and thought about this class. I still have a VHS copy from 1995, and the TV I have doesn’t know how to play it properly, so it kept flipping the aspect ratio. It made me think about dependencies in a digital environment. The tape still plays for now, but something has already been lost in the viewing, and who knows, a few years from now it might be completely unwatchable. But I digress.

    I agree with your assessment that a certain amount of pessimism is necessary in this line of work. Part of preservation work, for any medium I think, means thinking realistically about the worst that can happen so you can build redundancies into your policies and plans to account for disaster. Thus, a certain level of distrust in specific tools or vendors is healthy to avoid complacency. However, I was surprised by the NDSA report and the level of mistrust its respondents had for commercial storage systems. The report was released in 2011, so I wonder if things have changed since then as these services have become more mainstream.

  3. I think we can all relate to your pessimism! I mean, isn’t the base mission of every sort of repository “we’re all gonna die so we need to make sure that stuff exists so people understand what we did?”

    In response to your question about emulation, I see it as part of the Access section of that rainbow continuum in the “Good Enough Digital Preservation” article, rather than as another type of digital object (though I can almost see how it could be!). Emulation is not completely necessary, but depending on the access required by a repository’s designated community, it can be helpful. You were right in that the collection at NYU wasn’t really served by the emulation, but that’s one collection at one repository. It’s entirely possible that they’ll have a collection that will require emulation in the future, and they already have experience with it. I became disenchanted with emulation after reading the article and seeing all the problems that could happen. I didn’t know emulation existed until this class, so it was really only a matter of time before I realized it wasn’t the perfect solution I had created in my mind.

  4. @ Gwen: I too wondered how acceptance levels of cloud services may have changed since 2011. Seven years is a long time. I also wondered if, in the interim, any cloud services had begun to implement remedies for the issues outlined in the NDSA report as points of dissatisfaction – specifically, the desire for greater insight into, and control over, any built-in fixity checks. Despite my pessimism I want to believe they would eventually address this need… shhh, don’t tell anyone I had a happy thought.

    @ Margaret Rose: The more I think about it, emulation almost feels like building a theme park, similar to New Salem. It is almost a fabrication in a sense, yet the goal is to get as close as possible to the original experience despite the smoke and mirrors. I’m thinking of things like the CRT simulator that deliberately degrades the image of a video game that renders too perfectly on an LCD. It’s just one more tiny lie told for the sake of painting a larger truth.

    And I agree that the complexity of these systems feels a little daunting. One question that popped into my head was what kind of cybersecurity vulnerabilities might be introduced by resurrecting an operating system that is no longer patched. I’m sure there are remedies, but they’d probably require an IT specialist that the “Lone Arranger” at a tiny institution may not have access to. It makes me wonder if emulation is exclusively the realm of large institutions.

  5. @Tina: Not to digress too far, but I absolutely love your idea of tracking change histories to establish transparency in news articles. For as much as people make fun of Wikipedia one of the things I appreciate about it is the level of transparency they attempt to maintain. News organizations could easily use similar mechanisms to make change histories available without hurting the readability of the article. It wouldn’t surprise me if many news organizations wouldn’t want to. And it wouldn’t surprise me if most readers don’t even bother to avail themselves of it. But in an age of “alternative facts”, when journalists are fighting for every last shred of credibility, I think such a gesture would be a powerful statement that they wish to hold themselves accountable.

  6. I also am a Star Wars fan, so keep the references coming.

    It seems like in your last two paragraphs you are arguing that emulators take a lot of time and effort and maybe aren’t really worth it except in special cases, but that wasn’t the lesson I took from the “How to Party Like It’s 1999” article. I thought the most useful part was the Figure 5 table, which gave a list of pre-built emulators that could be used to reproduce “extinct” computer systems like old Macintoshes, DOS, or Commodore operating systems. Using pre-built emulators eliminates a lot of the work that would go into creating them, and it seems like at least some (possibly all?) are open source and can be updated as needed, so there won’t be a constant need to reinvent the wheel.

    To address your idea that different emulators could need to be preserved and the whole thing could become “towering platform stacks held together with scotch tape and shoe goo,” I don’t really foresee that as a problem if the goal is just to recreate a previous operating system. Sure, a DOS emulator will likely be different ten years from now to be compatible with more sophisticated operating systems than we have now, but I see emulators more as tools for extracting digital objects rather than digital objects that themselves need to be preserved. (Although I expect at least someone in this class might be able to come up with a good example to contradict me.)

    Now granted, I am still a little unsure of how the emulators mentioned in the article work and what their capabilities are, but I can envision scenarios in which they might be used often. For example, I currently own at least one external hard drive that I purchased over ten years ago, and for whatever reason, it is no longer recognized by the operating system I have now. It would be nice to have a tool that could “turn back the clock” ten years on operating systems and allow me to access the files on that hard drive. I imagine there are lots of other media items that could benefit from turning the clock back even further, maybe fifteen, twenty, or twenty-five years. If the emulators don’t work like that… well, someone should invent ones that do.

    1. Tracee, my apologies for not responding to your comment earlier. For some reason I just now received an email notifying me of your response, but I see you left it two weeks ago… weird.

      You raise excellent points.

      I’m entirely willing to concede that all the examples we’ve discussed outlining very complex emulations (video games, iPad apps, etc.) may have colored my view on the feasibility of emulation. But you’re absolutely right… those complex situations are the exception, not the rule. There are far more simple and ubiquitous objects to rescue. Enough to make emulation-as-a-service for commonly needed environments a reasonable solution. I’m sure I’ll be super grateful for them when I finally get around to migrating all of my Quark files from 1998 (*cries a little on the inside).
