Reflecting on Digital Preservation is Still Digital Preservation

A teacher friend of mine once said that anyone can teach anything if they stay one chapter ahead of the class. My only real teaching experience has been as a native speaker teaching English to non-native speakers, a knowledge gap that gave me a lot of wiggle room, even with my more advanced students. My friend wasn’t talking about digital preservation and this project wasn’t exactly teaching, but you can see where I’m headed….

I’ve been fortunate enough to be partnered with two dedicated women at WheatonArts and Cultural Center who are only about a chapter behind me in digital preservation. It’s been fun and challenging. I won’t use the past tense because we’re still in communication and I hope to follow their preservation efforts for as long as they’ll put up with me.

One of the more satisfying aspects of digital preservation, if you’re partial to theory, is the way theory and practice are interwoven in this work. When I sat down with the staff at WheatonArts, they didn’t just want to talk about bit-level preservation…. When an artist says, “don’t digitize my work; don’t preserve it,” what’s a responsible curator to do? Does suggesting a more sustainable file format to artists constitute interference with the creative process? WheatonArts’ mission is to draw visitors into an exploration of creativity. So far, artists guide much of that exploration by providing their own documentation of their creative processes. Resources notwithstanding, should the Center take a more active role in that documentation process? How do you represent creativity faithfully? Is a video good enough? Our early readings in this course, like Documenting Dance, ask these kinds of questions.

But what about the bits? Shouldn’t we talk about storage first? Toggling between philosophy and hurricane recovery is the joy of this work. There’s real immediate work to be done but you’re never too far from epistemology or aesthetics, either.

Museums welcome not only the opportunity but the responsibility to think through these puzzles. And it is a responsibility. Enduring (and expanded) access will mean that eventually, most visitors to museums and other cultural heritage institutions will never set foot in their buildings. Visitors won’t have the chance to reckon with an analog artifact. As objects are increasingly born-digital, the idea of faithfulness, of whether an analog object could be reassembled, will drift further and further to the backs of visitors’ minds. So, part of the job of digital preservation must be to ask these same questions that thoughtful curators have always asked.

WheatonArts and Cultural Center’s Digital Preservation Project Report

WheatonArts and Cultural Center’s Digital Preservation Project: Policy

Introduction

WheatonArts and Cultural Center (WACC) doesn’t have a digital preservation policy apart from its general collections policy. Collecting, creating and preserving digital objects is intrinsic to the Center’s mission to “engage artists and audiences in an evolving exploration of creativity.” Patrons’ rising expectations of institutions’ digital presence mean that even well-established and beloved institutions must take digital collections and preservation seriously in order to maintain, much less expand, their reputations. Here’s a policy option for how WheatonArts might translate staff enthusiasm and commitment into consistent preservation action.

1) Responsibilities

The curatorial staff have expressed their commitment to and knowledge of digital preservation. This should be written into the formal job description of the Curatorial Assistant who has been digitizing the WACC archives. This change should include a time commitment of at least 4 hours per week, with flexibility to account for the current arrearage. Furthermore, the Curatorial Assistant should be entrusted with training and supervising interns and volunteers in digitization.

Administration should also guarantee technical support to the curatorial team. Information Technology staff (IT) should be responsible for reporting on backup schedules and details from the WheatonArts server to the NAS and cloud. Furthermore, IT must establish and maintain fixity information for all stored data on the server and in the cloud, confirm that all data is backed up, and determine which files can subsequently be safely deleted from the server, while ensuring that PastPerfect remains fully functional.* Curatorial staff should assist IT by regularly exporting and spot-checking files from the server, NAS, and if possible, cloud storage, for integrity.

*There is approximately 250 GB of potentially redundant data on the server at present. This must be addressed if digitization efforts are to continue in the near term.

2) Storage

The curatorial staff will be responsible for saving one copy of all preservation and access files to a portable hard drive and partnering with another institution (in a different geographic area with different disaster risks) for a data “swap,” wherein each institution agrees to secure the other’s hard drive against catastrophe at home. This would ideally be a reciprocal arrangement, but it needn’t be. Drives should be replaced, and files transferred to new drives, at least every two years.

If this is accomplished, along with maintaining copies on the server (for PastPerfect), the NAS and cloud backup, WheatonArts will have a minimum of four copies, not all collocated and not all in the same geographic area. If the registrar continues to maintain copies on her office desktop hard drive and on a portable hard drive that she takes home, that’s six copies and more than enough. Two sets of copies could safely be eliminated, perhaps the NAS copies and the registrar’s portable drive.

3) Integrity

IT will be responsible for running checksums and reporting on file fixity (or ensuring that this is done) on a schedule, on the server and in the cloud (see Responsibilities above). This information will be available to curatorial staff who will be responsible for establishing and maintaining file fixity information for the copies in their care (e.g., the registrar’s desktop hard drive and the portable drive to be swapped). In order to do this, staff will download and run AVP’s Fixity Utility.
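For anyone curious about what a fixity check actually does, here is a minimal Python sketch. It is not how AVP’s Fixity utility is implemented. It is only an illustration that assumes SHA-256 checksums, and the function names (build_manifest, verify_manifest) are hypothetical ones of my own.

```python
import hashlib
from pathlib import Path


def sha256_of(path, chunk_size=1024 * 1024):
    """Return the SHA-256 hex digest of one file, read in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def build_manifest(root):
    """Map each file's path (relative to root) to its checksum."""
    root = Path(root)
    return {
        p.relative_to(root).as_posix(): sha256_of(p)
        for p in sorted(root.rglob("*"))
        if p.is_file()
    }


def verify_manifest(root, manifest):
    """Compare the files under root with a previously saved manifest."""
    current = build_manifest(root)
    return {
        "missing": sorted(set(manifest) - set(current)),
        "new": sorted(set(current) - set(manifest)),
        "changed": sorted(
            name for name in manifest
            if name in current and current[name] != manifest[name]
        ),
    }
```

The idea is to build the manifest once, when files are created or ingested, and to re-run the verification on a schedule. An empty report means the bits are unchanged since the manifest was made.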

4) Security

Curatorial staff and IT will limit personnel with edit/delete access to files as much as possible. Actions on files will be tracked and logged from creation (or ingest).
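IT will have its own tools for this, but as a rough illustration of what “tracked and logged” could mean for the copies in curatorial care, here is a small Python sketch of an append-only action log. The record fields, the JSON-lines file layout and the function names are all assumptions of mine, not anything WheatonArts or PastPerfect currently uses.

```python
import json
from datetime import datetime, timezone


def log_action(log_path, file_name, action, staff_member):
    """Append one timestamped record; earlier records are never rewritten."""
    record = {
        "timestamp": datetime.now(timezone.utc).isoformat(),
        "file": file_name,
        "action": action,
        "by": staff_member,
    }
    with open(log_path, "a", encoding="utf-8") as log:
        log.write(json.dumps(record) + "\n")
    return record


def history_for(log_path, file_name):
    """Return every logged record for one file, oldest first."""
    with open(log_path, encoding="utf-8") as log:
        records = [json.loads(line) for line in log if line.strip()]
    return [r for r in records if r["file"] == file_name]
```

Because the log is only ever appended to, the history of a file can be read back from ingest onward.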

5) Metadata

Staff will continue to maintain inventories of digital objects and storage locations along with metadata not captured in PastPerfect. This will include fixity information and actions on files (see Integrity and Security above). These inventories should be backed up like the files they describe, with one copy included on the swapped hard drive.
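To make the inventory idea concrete, here is a hedged Python sketch of one possible layout: a plain CSV file with one row per digital object. The column names and the helper functions are hypothetical, chosen only for illustration. The staff’s actual inventories may look quite different.

```python
import csv

# Hypothetical columns; "locations" holds every place a copy is stored.
FIELDS = ["object_id", "file_name", "format", "sha256", "locations"]


def write_inventory(csv_path, rows):
    """Save inventory rows; storage locations are joined with ';'."""
    with open(csv_path, "w", newline="", encoding="utf-8") as out:
        writer = csv.DictWriter(out, fieldnames=FIELDS)
        writer.writeheader()
        for row in rows:
            writer.writerow({**row, "locations": ";".join(row["locations"])})


def read_inventory(csv_path):
    """Load the inventory back, splitting locations into a list again."""
    with open(csv_path, newline="", encoding="utf-8") as src:
        rows = list(csv.DictReader(src))
    for row in rows:
        row["locations"] = row["locations"].split(";") if row["locations"] else []
    return rows


def under_replicated(rows, minimum_copies):
    """List files stored in fewer places than the policy calls for."""
    return [r["file_name"] for r in rows if len(r["locations"]) < minimum_copies]
```

A plain CSV is easy to open in a spreadsheet and easy to copy onto the swapped drive alongside the files it describes.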

6) Digitization and Formats

While WheatonArts cannot dictate file formats to artists, staff will stay educated about best practices and obsolescence risks. Object inventories will include file formats.

Currently, staff creates scans of archival materials at 300 PPI. Preservation and access files are TIFFs and JPEGs, respectively, which is a sustainable practice for the foreseeable future.

Conclusion

A policy like the one outlined here could be implemented rather quickly without expanded resources. In fact, it calls for some contraction (at least in the case of backup copies). Efforts are well under way at WheatonArts to collect and preserve digital objects. What is required of any institution is a paradigm shift from viewing organized digital preservation as a wishlist item to viewing it as a necessity. This policy draft, beginning with Responsibilities, suggests a way to codify that shift.

WheatonArts and Cultural Center’s Digital Preservation Project: Next Steps

WheatonArts and Cultural Center’s (WACC) digital collection is wide-ranging in terms of content and origin. Digital photographs of over 15,000 3D artifacts in the permanent collection of the Museum of American Glass have been taken by volunteers and interns on their own equipment. Born-digital artworks are being accessioned now as well. There is 165 GB of digitized video in the collection already and artist fellows are encouraged to provide documentation (often video) of their creative process, which can be born-digital or analog, according to the artist’s preference. Meanwhile, the staff has been digitizing photographs, negatives and other papers in the WheatonArts archives. What appears at first glance to be a wild and woolly collection is in fact quite organized, thanks to the hard work of a curatorial team who appreciate WACC’s digital materials as worthy of collection status.

With an administration that is keen to realize the benefits of a well-managed and preserved digital collection and an enthusiastic staff, WheatonArts is poised to move forward. Here’s how they could proceed….

The five general categories of the National Digital Stewardship Alliance (NDSA) Levels of Digital Preservation are 1) Storage and Geographic Location, 2) File Fixity and Data Integrity, 3) Information Security, 4) Metadata and 5) File Formats. For each category, institutions can improve their preservation program by advancing through Levels 1 to 4. The levels are 1) Protect your data, 2) Know your data, 3) Monitor your data and 4) Repair your data. It is inevitable that organizations will find themselves at differing levels for each category. How is WheatonArts doing so far?

Well, on the first and most urgent category, storage and geographic location, WheatonArts is approaching Level 2. WACC currently stores at least four copies of preservation and access files and that’s really going above and beyond what’s required of even the highest level. So actually, there’s some redundancy that could be eliminated. But, while one copy is not always collocated, by virtue of being taken home by a member of staff on a regular basis, none is in a different geographic region with different disaster threats. A clear next step is to correct that, and a simple solution is to swap hard drives with a buddy institution. Staff have suggested the Museum of Glass in Tacoma, Washington. The relationship doesn’t have to be reciprocal for it to work just as well for WheatonArts but ideally, arrangements like these provide further occasion for dialogue between two institutions with overlapping missions. As for the cost, beyond the hard drive and secure shipping, it’s non-existent for either organization. An aggressive replacement schedule of every two years or so would safeguard against bit rot.

Cloud storage options, like Amazon’s, have been discussed, but after further discussion with WheatonArts IT, it turns out that cloud backups of data on the server are already happening. Assuming that orderly and regular backups to a cloud are occurring, and that the curatorial staff will have more control over that process in the future, geographic concerns might be settled without the buddy system. In the short term, however, and for the opportunity for intermural dialogue on digital preservation, I’d still recommend the swap.

Artist requirements and rights considerations (in addition to demands on resources) create obstacles to transferring all WACC’s digital material to one storage system, but within limits, the staff intend to do so. Furthermore, this project is an occasion to better understand and document that system. WheatonArts can achieve Level 2 on storage and geographic location.

The staff has made a start on the second category, file fixity and data integrity, by habitually counting files for agreement across multiple storage locations. An easy (and free) tool for checking this more deeply and systematically is AVP’s Fixity Utility. WheatonArts can download the software for their operating system and run it against any hard drive in their possession. Until server/cloud backups are better understood by staff, establishing fixity on local storage should be a good enough next step. It might also be a good idea to try exporting some files from the server to see how straightforward that is and to do some spot checks against fixity information for the same file backed up elsewhere. WACC can start working toward Level 2 by checking fixity on ingest (or creation) for all digital files once they’ve downloaded the AVP utility.
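To show why counting files is a good start but not the whole story, here is a minimal Python sketch that compares two copies of a collection by checksum. It is my own illustration, not the AVP tool, and it assumes both copies are reachable as folders (say, the office desktop and the portable drive). The function names are hypothetical.

```python
import hashlib
from pathlib import Path


def checksums(root):
    """Map relative file paths under root to SHA-256 digests."""
    root = Path(root)
    result = {}
    for path in sorted(root.rglob("*")):
        if path.is_file():
            # read_bytes loads the whole file at once: fine for a sketch,
            # but large video files would call for reading in chunks.
            result[path.relative_to(root).as_posix()] = hashlib.sha256(
                path.read_bytes()
            ).hexdigest()
    return result


def compare_copies(copy_a, copy_b):
    """Report how two storage locations disagree, if at all."""
    a, b = checksums(copy_a), checksums(copy_b)
    return {
        "same_count": len(a) == len(b),
        "only_in_a": sorted(set(a) - set(b)),
        "only_in_b": sorted(set(b) - set(a)),
        "different_content": sorted(n for n in set(a) & set(b) if a[n] != b[n]),
    }
```

Two locations can hold the same number of files, with the same names, and still disagree at the bit level. That is the case the "different_content" list is meant to catch.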

There is no reason, with the cooperation of technical staff, why WheatonArts can’t eventually achieve Level 4 for the third NDSA category, information security. Access is limited to the curatorial and IT staff and the latter should be able to advise on how to restrict unauthorized access and to efficiently track and log who performed what actions and when. The next step on information security is probably to make that an item on the agenda for the next meeting with technical support.

With “a breadcrumb trail of our digital image and multimedia files through all the various storage places since we started having digital image and multimedia files to store,” WheatonArts has worked hard to inventory and connect files to their metadata in PastPerfect, the collections database since 2014. Since much of the metadata required for the higher levels in the NDSA model is generated automatically with digital objects and we’ve discussed adding fixity information to that array already, the biggest challenge for WACC will be logging any metadata not captured by PastPerfect. The very next step, however, will be to make sure that the inventory of objects and storage locations is up-to-date and that the inventory itself is backed up like the data it describes. One backup should be to that buddy’s hard drive. This will ensure that WACC satisfies the requirements of the first level.

The staff has already reached Level 2 on file formats. While they can’t dictate formats to artists, they have a limited set of formats they actively preserve (TIFFs and JPEGs for digital photos, for example), and they have a current inventory of files and their formats. The next step here is for staff to stay educated about obsolescence risks for the formats they’re preserving. That would place the program at Level 3 and, while unlikely with the formats in question, migration en masse to newer or more sustainable formats could then be performed if necessary.
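A format inventory doesn’t need special software. As a rough Python sketch, assuming a plain list of file names exported from the inventory, staff could tally formats by file extension and flag anything outside the set they actively preserve. The function names and the idea of a “preferred” set of extensions are my own illustration.

```python
from collections import Counter
from pathlib import Path


def format_counts(file_names):
    """Count files by lowercase extension, e.g. '.tif' or '.jpg'."""
    return Counter(Path(name).suffix.lower() or "(none)" for name in file_names)


def outside_preferred(file_names, preferred):
    """Return files whose extension is not in the preferred set."""
    return [n for n in file_names if Path(n).suffix.lower() not in preferred]
```

An extension is only a hint about what is inside a file, so a tally like this is a starting point for watching obsolescence risks, not a substitute for proper format identification.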

WheatonArts does not have a digital collections policy. The curatorial staff recognize their responsibility to care for and interpret this new type of collection for their patrons. They have made significant progress on digital preservation. There’s plenty of room to grow, but the relatively simple recommendations outlined here, in accordance with NDSA guidance, should stabilize the program, help staff secure broader support for their efforts within their organization and provide focus and credibility for grant applications.

WheatonArts and Cultural Center’s Digital Preservation Project: A Solid Start

Introduction to WheatonArts

WheatonArts and Cultural Center (WACC) in Millville, New Jersey, was founded as Wheaton Village in 1968. The 45-acre campus is home to the Museum of American Glass, the Creative Glass fellowship program for artists and the Down Jersey Folklife Program. WheatonArts’ mission is to “engage artists and audiences in an evolving exploration of creativity.” The mission is “advanced through the interpretation of collections and exhibitions; education initiatives and culturally diverse public programs; residencies and other opportunities for artists.” It’s unsurprising that WheatonArts would be eager to partner in a project to improve and advance their digital preservation program. The Center’s underlying vision is to make creativity more approachable and accessible. Digital preservation ensures that collections are interpretable long term and available to more students, scholars and of course, artists.

The (Digital) Collection

The Museum of American Glass (MAG) at WheatonArts presents the history of American glass from the colonial period forward and much of its contemporary glass art is the work of artists during their time as fellows in WACC’s Creative Glass fellowship program. Included in the collection is documentation of their process. The permanent collection of 3D artifacts exceeds 22,500 pieces, of which over three quarters have been photographed and saved as preservation and access files.

So far, digitized artifacts (objects) account for nearly 5 GB of content. Through digitization of photographs, negatives and other papers (photos and archives in the parlance of PastPerfect), the staff has produced an additional 33 GB of digital content. Nearly half the photographs and negatives have been digitized but together with other records, the Museum has processed 11 linear feet of physical archives that are currently being digitized. There are also 116 digitized videos which, at 165 GB, claim the lion’s share of the museum’s current storage needs. That’s over 200 GB created through in-house digitization and growing. These figures do not account for born-digital artworks and documentation of creative work being done at the Center.

The Museum uses PastPerfect as its collection database and digital objects are stored there on WACC’s server and backed up to an NAS hard drive. Objects are also stored on the registrar’s office hard drive and an additional hard drive she takes home. That’s four copies in different locations in the same geographic region. Space issues on the server have created backup failures between PastPerfect and the NAS, but it’s still unclear what types of files are backed up from PastPerfect when things run smoothly. Hopefully some of these questions can be answered in a future meeting with software support staff and WheatonArts IT. Whether file fixity information is being generated as part of institutional storage (on the server) still needs to be determined. Prior to storing files in PastPerfect, objects were kept on multiple media and platforms and staff have made heroic efforts to unify the collection. Only a limited number of museum staff have access to the digital objects: four, excluding IT.

What’s left to be collected?

Most preservation projects involve a bit of catch-up on digitization and MAG is no exception. The museum will want to represent its permanent collection of 3D objects and make them more findable but there’s a world in the archives that could set the digital collection apart and bring WheatonArts attention from more disparate groups than it already enjoys.

One intriguing aspect of WACC’s mission that informs its collection policy is the accumulation of documentary evidence of creative work done by artist fellows. Making this material cohere in a digital collection will be the digital surrogate for the approach and access to creativity that a visit to the studios or participation in the onsite educational programs offered by the Center is meant to provide.

A member of staff suggested that the “real work” will begin when they’re ready to tackle the attic trove. In addition to the Center’s history, the undigitized material in storage documents the history of glass art and craft and pottery in New Jersey. Alongside the archives of the Wheaton Glass Company (WheatonArts is its namesake), the Whitall Tatum, Stangl and Fulper archives are all in the care of WACC. A significant aspect of the labor, economic and cultural history of the state can be understood through this collection. Books, maps and surveys of Cumberland County and the City of Millville in the Center’s collection will be a boon to local historians and genealogists but haven’t been inventoried due to lack of resources for such an undertaking.

One thing at a time, but it’s clear that eventually, WheatonArts can expand its reach with these records and have a digital collection that not only represents the objects in the Museum’s physical collection, but through the inclusion of these other objects, actually complements that collection. That’s significant. In my experience with two organizations and their digital collections, neither does much to truly expand the profile of the institution and bring a new type of user into the community. Their collections, while exciting, are too circumscribed for that. WACC has a real opportunity here.

Moving Forward

Resources are limited, as they are most everywhere, but there is enthusiasm for a strong digital preservation program at WheatonArts. The curatorial staff commenced digitization with best practices in mind and have a good working inventory of their digital collection. They are willing to commit regular staff to the work for at least a few hours a week (including, perhaps, some job re-description) but can rely on volunteers and unpaid interns as well. Furthermore, leadership is keen to realize the potential benefits, such as social media outreach.

The staff are humble about their efforts but if we attempt to map their program, so far, onto the National Digital Stewardship Alliance (NDSA) levels of digital preservation, we see that WheatonArts is already engaging with all five categories of Level 1 and at least the File Formats requirements of Level 2. That’s a great place to start and proof of their commitment but somewhat uneven, because that’s the nature of the beast. Digital preservation begins, at best, with a well-reasoned game of catch-up. To make digital preservation truly sustainable at WACC, we’ll be looking to advance the program with concrete storage and file integrity options for budgeting purposes.

Digital preservation is a big ball of wax and the staff recognize that. Here are some of the “grey areas” they described to me: “ownership/control, sharing and access, whose digital space is being used, how information should/could be organized, clarifying copyright, etc.” In my museum job, I encounter the same concerns and they cover the spectrum of digital preservation/curation priorities. Hopefully this project will help sharpen the contours of those issues at WheatonArts and even resolve a few.

Digital Preservation from Both Sides

I haven’t done much digital preservation. I’ve been in the bit trenches. The work I’ve done is to digital preservation as community sandbagging is to the Army Corps of Engineers. I need to reword my resume.

Bit preservation is our most urgent set of tasks: managing multiple copies, managing and using fixity information and ensuring our data is secure. All these activities are directed at long-term usability, but digital preservation is broader. It’s concerned with the future viability of file formats and software; with future renderability. Having done the basic bit-level work, we might consider migrating files en masse to more sustainable formats or “leave the bits alone” by emulating or virtualizing earlier computing environments.

Our digital preservation decisions will not be identical across institutions. Best practices are like recipes; they’re frameworks. “Approaches to copies and formats should fit the overall contours of your institution’s collecting mission and resources. This is about the digital infrastructure you establish to enable the more tailored work that is done around individual collections” (Owens, p. 105). A video art or game collecting museum might attempt emulation to preserve as much of the experience of the work as possible; seeing the artifactual aspect as more intrinsic than the informational to its mission. But if those bits aren’t safe, these considerations will never arise. Sandbag first.

Much of our study this week is focused on fixity and storage. In my bit preservation experience, fixity checks can get lost in the mix. 80% of NDSA member organizations reported that they use some sort of fixity checking. I think that members and non-members alike have mostly heard the urgent call to action to get the bits off the floor and maybe while they’re at it, make multiple copies. Then, of course, they have to store them somewhere, so they’re forced to make storage decisions. But I’m not so sure that organizations often understand the necessity of maintaining bit-level integrity and how they’d go about it. Then again, they could be taking it for granted that their storage solution is a fixity solution as well. And it might be, but I think that’s something of an afterthought.

We’re going to talk about access later in this course, but my impression has been that access, as a buzzword, can cloud our perspective on preservation. I’m concerned that when that scan hits the web, it’s tempting to feel that our preservation work is done. We’d never take that approach to analog media. We wouldn’t hang a painting in a gallery, throw our hammer in the truck and head home. Well, we might, but we’d be invested in maintaining the integrity of that work for future shows. Accessibility now isn’t access. This might be obvious to us, but just this week I was speaking to someone about born-digital material and they asked me if I was also interested in endangered media. It’s still a hard sell.

That brings me to one final thought I had while reading the case studies in the POWRR group white paper. The experience at Chicago State University was illuminating. “The defining moment when several library staff members recognized the importance of digital preservation activities occurred when they realized that grant activities digitizing library collections included no provision for storage or preservation” (p. 21). That might be because the grants themselves don’t allow for appropriate storage solutions. I was investigating grants for an in-house digitization project I was working on and had determined that cloud storage was the best, and most affordable, offsite storage solution for my organization. Once I found a grant that wouldn’t exclude in-house digitization projects, I realized it had seemingly arbitrary restrictions excluding “subscription-based” or simply, “cloud” storage.

Our reading this week helped me recontextualize my own work as a novice in this field. I wonder if anyone else had a similar experience.