End of the Semester Reflection & Draft of the Final Report for the College Park Aviation Museum

***The report included in this post should be considered a draft until it has been reviewed by the institution. The post will be updated if edits are made.***

At the end of the semester, here are three ideas that I am going to take away:

To paraphrase Dr. Owens, digital objects are not preserved; they are being preserved. This came from one of the chapters assigned at the beginning of the class, but it has stuck in my mind ever since because it’s such a pithy way to sum up one of the most important ideas in digital preservation. There is no “one and done” method of preserving digital objects; rather, it’s an iterative process in which we need to always stay vigilant. This means following established procedures, updating inventories, checking file fixity, monitoring obsolescence, ensuring that files are backed up, and other related tasks. It also reminds me of an episode of The Keepers podcast that interviewed the Internet Archive’s Jason Scott. The interviewer asked him something like “How do you know that the file formats you are choosing are going to still be around and be accessible?” and he said (paraphrasing), “We don’t, and that’s kind of the point.” In other words, an important part of digital preservation will be monitoring evolving technologies and trying to keep up.

My second take-away is that you do not need a computer science degree or advanced technical knowledge to do digital preservation. It seems like most people in the class (including myself) have backgrounds in the humanities and do not consider ourselves to be IT experts, so we approached the class with hesitations because we didn’t really believe that we were qualified to do the complicated, highly specialized work we imagined would be necessary. When I was meeting with the curator of my organization, I remember constantly apologizing for not being an expert, and because of my own insecurities, I name-dropped Dr. Owens repeatedly as a way of saying “at least I’m associated with this other person who knows what they are doing.” But now that we are at the end of the class, I believe that we all overestimated the difficulty level of digital preservation and didn’t give ourselves enough credit. The basic steps of digital preservation are simple and can be learned by almost anyone. There is no reason to shudder in fear just because someone said “checksums” or “file fixity.”

On a related note, I told a friend of mine about this project. He works for Google and has degrees in computer science and advanced mathematics. When I told him I felt underqualified, he said that when he was hired at Google he didn’t know a lot of the technical skills that were necessary for his job, which really surprised me. It wasn’t that he was hired by mistake or was underqualified; Google saw that, based on his previous work, he was capable of learning the new skills that would be necessary. This was kind of a revelation to me, because I think that humanities majors tend to assume that IT experts know everything there is to know about computers, but a lot of the time they are learning as they go along too. Knowing this gave me more confidence about my own abilities. I am also proud of the final product for this class, and I no longer feel like a digital preservation “imposter.”

A third takeaway from this class is that digital preservation can take multiple forms. I did not expect to start this class with philosophical discussions of what it means to say that something is the “same” as something else, but it is really important to understand what you want to preserve and why before taking any actions. I was intrigued by parallels to the art world, where conservators have to make decisions based on the artist’s intent, and I enjoyed learning about all the possibilities of simulations, like the recreation of old video games or Salman Rushdie’s laptop.

The draft of my final report is available here:

College Park Aviation Museum Digital Preservation Report

 

Draft of the Digital Preservation Policy for the College Park Aviation Museum

***This document should be considered a draft because it has not been reviewed by the organization. It will be updated if I receive any edits or feedback.***

Purpose

As the College Park Aviation Museum (CPAM) seeks to expand its collections and facilitate greater access to its historic materials, a growing portion of its holdings will be either born-digital or digitized material. This document was created to establish guidelines on how to manage and organize these materials and to protect them against the risks of loss or technological obsolescence. In following these guidelines, the museum will be actively working to ensure that its digital collections continue to be available for future generations.

Scope

This policy covers both born-digital and digitized material. Born-digital refers to material originally created in digital form, such as photos uploaded from a digital camera or text files created using Microsoft Word. Digitized materials refer to any material that has been converted to a digital format, such as scanned photographs or documents.

While many of the museum’s digital files are on the common drive, this policy also encompasses material currently stored in other formats such as USB-drives, floppy disks, CDs, and DVD+Rs. Although certain media items like VHS and cassette tapes are not typically put into the same category as digital files, these are “endangered” formats that are obsolete and have limited lifespans, so the best way to ensure ease of access and continued preservation is to convert them to digital files. With the expectation that their conversion will be a priority, they should be considered to be within the scope of this policy whenever applicable.

Standards

This policy draws from the National Digital Stewardship Alliance (NDSA)’s Levels of Digital Preservation. The NDSA Levels were chosen as a basis for recommendations because they provide succinct, clearly articulated standards and encourage an incremental, scalable approach to digital preservation. They recognize the need to provide realistic options to institutions with limited time and resources, which is the goal of this policy as well.

Storage

CPAM will consult with their parent organization, the Maryland-National Capital Parks and Planning Commission (MNCPPC)’s Office of the Chief Information Officer (OCIO) to determine what measures are already in place to back up the contents of their common drive and how often these back-ups are performed. CPAM will ensure that there are at least two complete copies of the contents of the common drive and that these copies are not located in the same place.

CPAM will make it a priority to get the images, videos, textual files, and other data that now exists in various media formats onto its common drive, which will simplify its care and management. It will phase out the use of formats like CDs, floppy disks, DVD+Rs, and USB-drives as storage, because those formats have a limited lifespan and are prone to damage or loss.

If CPAM runs out of storage space on its common drive and MNCPPC is unable to provide additional resources, CPAM will continue to use the common drive for administrative files and will use a secondary cloud storage system like Dropbox for media files like photographs, videos, and scans.

CPAM will continue to pursue a partnership with Digital Maryland (DM), which will enable it to keep expanding its digital holdings while also making them more accessible to the public. The files hosted by DM can serve as additional copies for preservation purposes, but CPAM will also keep two copies of all digitized files in its own storage systems.

File Fixity and Data Integrity

CPAM will use a tool like AVP’s Fixity to perform data integrity checks on all its digital files at least once annually, and will also aim to perform checks after large ingests or transfers of files. If the checks indicate that files are missing, unintentionally altered, or corrupted, the files will be restored using one of the museum’s back-up copies.
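As a rough illustration of what such a check reports, the sketch below compares two checksum lists from successive runs and sorts files into missing, altered, and added. It is not how AVP's Fixity works internally; the function name compare_manifests and the sample paths and checksum values are hypothetical, and the manifests are assumed to be simple mappings of relative file path to checksum.

```python
def compare_manifests(previous, current):
    """Compare two {relative_path: checksum} mappings from successive fixity runs.

    Returns a dict listing paths that disappeared, paths whose checksum
    changed, and paths that are new since the previous run.
    """
    missing = sorted(p for p in previous if p not in current)
    added = sorted(p for p in current if p not in previous)
    altered = sorted(
        p for p in previous if p in current and previous[p] != current[p]
    )
    return {"missing": missing, "altered": altered, "added": added}


# Hypothetical example data: paths and checksum strings are placeholders.
last_run = {
    "photos/airfield_001.tif": "aa11",
    "photos/airfield_002.tif": "bb22",
    "oral_history/interview_01.wav": "cc33",
}
this_run = {
    "photos/airfield_001.tif": "aa11",
    "photos/airfield_002.tif": "ff99",  # checksum differs: file was altered
    "scans/newsletter_1998.pdf": "dd44",  # new since the last run
}

report = compare_manifests(last_run, this_run)
for status, paths in report.items():
    for path in paths:
        print(f"{status}: {path}")
```

Any file reported as missing or altered would be a candidate for restoration from one of the back-up copies.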

Information Security

CPAM will identify who has the ability to read, write, move, and delete files and restrict those authorizations when appropriate. Restrictions will be recommended for all historic collections, although there may be a desire for more flexibility with administrative files. These restrictions will be documented on a spreadsheet and periodically reviewed so that changes can be made if necessary.

Metadata

CPAM will create a master inventory of CPAM’s digital files that are deemed worthy of preservation, including those on the common drive and on various other formats such as DVD+Rs, CDs, floppy disks, or USB-drives. CPAM will also inventory all media formats such as VHS tapes or cassettes which are either obsolete or in danger of obsolescence, so that these materials can be prioritized for digitization. This can take the form of a single spreadsheet or multiple spreadsheets that are organized within one folder.

CPAM will choose the level of description most appropriate to the material being described and the amount of resources it is able to devote to the task, which realistically may not always be a file-level description. For the initial inventory, it is sufficient to say “1 CD containing 150 images of CPAM events, circa 2005,” which can be expanded in more detail at a later date if desired. At this stage, it is more important to get a general understanding of the scope and contents of the collections (including file formats) than it is to create detailed file-level metadata.

CPAM will agree on naming conventions for all born-digital and digitized files that will be added to its collection in the future. If time and resources allow, it will also standardize its legacy holdings.

File Formats

CPAM will encourage the use of preferred file formats for the creation of new digital materials. Preferred formats are ones that are commonly used, widely accessible, and open-source. A guideline for determining what digital preservationists consider to be a preferred format is the Smithsonian Institution Archives (SIA)’s Recommended Preservation Formats for Electronic Records.

For text documents, spreadsheets, and presentations that are in their final form (and will not be edited), the preferred formats are PDF or PDF/A. For images, the preferred format is TIFF (uncompressed), for audio it is BWF-Broadcast WAV (.wav extension), and for video it is Motion JPEG 2000, MOV, or AVI. For a more complete listing of both preferred formats and acceptable formats, CPAM employees are encouraged to consult the table in the SIA guidelines.

CPAM will create a list of file formats in its collection and monitor each for obsolescence issues. Converting files to the preferred formats is only recommended for files in formats that are not considered acceptable by SIA.
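One possible starting point for that list is a simple count of file extensions on the drive. The sketch below is only an illustration: the function name count_formats is hypothetical, and a file extension is just a rough proxy for the actual format, so the results would still need to be checked by a person.

```python
from collections import Counter
from pathlib import Path


def count_formats(root):
    """Count files under `root` by lowercase extension.

    Files without an extension are grouped under "(no extension)".
    """
    counts = Counter()
    for path in Path(root).rglob("*"):
        if path.is_file():
            extension = path.suffix.lower() or "(no extension)"
            counts[extension] += 1
    return counts


def print_format_list(counts):
    """Print extensions from most to least common, one per line."""
    for extension, total in counts.most_common():
        print(f"{extension}\t{total}")
```

Calling print_format_list(count_formats(folder)) on a folder of the common drive would print each extension with the number of files that use it, which can then be compared against the SIA table.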

Review

Digital preservation is not something that is done once and then forgotten about, but rather it is an on-going, iterative process. In recognition of this fact, the digital preservation policy should be reviewed annually by staff members of the CPAM and revised when necessary. The museum should anticipate that evolving technology and standards may create the need to amend the policy. It may also need to be adapted in response to changes in the museum’s goals or priorities.

Center on Contemporary Art (CoCA) Next Steps

Organization

The Center on Contemporary Art (CoCA) is a small non-profit art gallery in Seattle, Washington, founded in 1980 with “the intention to foment and create contemporary art in Seattle.” (CoCA Archives Project “About” page) As I wrote in the survey, CoCA’s main preservation issues stem from the fact that they have no regular, paid archives staff, which poses challenges when it comes to what the organization has the time and budget to implement. On the positive side, the digital collections are relatively small, and well-cataloged.

This next steps evaluation will establish danger areas for CoCA’s digital collections and provide suggestions at varying levels of sophistication and resources needed. I measured resources needed mainly by time estimated to complete, as time to do digital preservation is the most precious resource in CoCA’s situation.

A digital preservation policy for CoCA will need to be flexible enough to be practical for a small organization with very limited staff, but provide enough information and sources to aid archival consultants and interns as they try to effectively steward the CoCA Archives Project.

Storage and Geographic Location

NDSA Level: 1

Description: While storage of digital objects is split between a local server maintained by the past board president, local computers, Google Drive, and external media, all digitized objects have copies stored on Google Drive and a hard drive backup. However, there are some floppy disks and other external media that may contain content that has not been transferred to computer/drive/HD backups as of yet.

Low resource recommendation: Identify external media that has not yet been looked at or digitized; for the formats that can be read by CoCA owned machines, download content and add to stable, centralized storage. Make a list (including any information written on the exterior of the floppy disk etc.) of external media types that CoCA does not have a drive for, with an eye towards exploring how these can be read and converted at a later date (potentially as a grant-funded project).

Medium resource recommendation: Explore options of converting external media not able to be read/accessed by CoCA computers. Compile documentation about storage systems, mediums, and locations of all digital objects. Verify that each digital object has a minimum of two copies stored in different locations, and explore possibilities for third copy of objects.

High resource recommendation: Explore another cloud storage option that is not Google Drive, potentially something with version control (especially for the born-digital administrative files and current documentation of exhibitions), such as Dropbox or Box. Potentially create third copy of each digital object to be stored in a different location/medium. Look into establishing a partnership with another organization to store backup files.

File Fixity and Data Integrity

NDSA Level: Below 1

Description: Fixity has not yet been actively addressed. Establishing data integrity will be an important step for CoCA to ensure that their digital files are unchanged. One of CoCA’s Archives Project’s advantages is that their archival consultants are very knowledgeable, and interns also come from the University of Washington’s iSchool. While time and money are challenges for CoCA, the archivist volunteers and interns are creative and often have knowledge of digital preservation.

Low resource recommendation: Create an internal document/spreadsheet that lists file inventory, current location, and file size. Check file sizes by folder every few months or when moving location or storage system to monitor for any changes, which would indicate problems.
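A minimal sketch of this folder-level size check, assuming the collections sit in an ordinary directory tree, might look like the following. The function names (folder_sizes, write_inventory, changed_folders) are hypothetical. Note that a size check is a coarse signal only: a file can be altered without its size changing.

```python
import csv
import os


def folder_sizes(root):
    """Return {folder relative to root: total bytes of files directly in it}."""
    sizes = {}
    for dirpath, _dirnames, filenames in os.walk(root):
        total = sum(
            os.path.getsize(os.path.join(dirpath, name)) for name in filenames
        )
        sizes[os.path.relpath(dirpath, root)] = total
    return sizes


def write_inventory(sizes, csv_path):
    """Save the folder sizes as a two-column spreadsheet (CSV)."""
    with open(csv_path, "w", newline="") as handle:
        writer = csv.writer(handle)
        writer.writerow(["folder", "total_bytes"])
        for folder in sorted(sizes):
            writer.writerow([folder, sizes[folder]])


def changed_folders(previous, current):
    """List folders whose total size differs, or that appeared or vanished."""
    all_folders = set(previous) | set(current)
    return sorted(f for f in all_folders if previous.get(f) != current.get(f))
```

Running folder_sizes every few months and passing the old and new results to changed_folders would flag the folders that deserve a closer look.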

Medium and high resource recommendation: Begin generating MD5, SHA-1, or SHA-256 cryptographic hashes to produce fixity information for existing and newly created digital objects in the archives. AVP’s Fixity tool is an excellent choice for this, as it is a free tool that will email a report on file changes.
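For anyone curious about what generating these hashes involves, here is a small sketch using Python's standard hashlib module. It is an illustration of the general technique rather than a description of how AVP's Fixity is implemented; the function names file_digest and build_manifest are hypothetical.

```python
import hashlib
import os


def file_digest(path, algorithm="sha256", chunk_size=1024 * 1024):
    """Return the hex digest of one file, read in chunks to limit memory use.

    `algorithm` can be "md5", "sha1", or "sha256".
    """
    digest = hashlib.new(algorithm)
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def build_manifest(root, algorithm="sha256"):
    """Return {path relative to root: hex digest} for every file under root."""
    manifest = {}
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            full_path = os.path.join(dirpath, name)
            relative = os.path.relpath(full_path, root)
            manifest[relative] = file_digest(full_path, algorithm)
    return manifest
```

Saving the manifest alongside the collection gives a baseline: if a later run produces a different digest for the same file, the file's contents have changed.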

Information Security

NDSA Level: Below 1

Description: Access to files is not restricted at this point, and IT assistance is sometimes provided by friends or partners of volunteers and interns, which could lead to security concerns or accidental modification or deletion of files.

Low resource recommendation: Determine who should have access to storage and software of digital collections, and restrict Google Drive, TinyCat, and Weebly logins to that list.

Medium and high resource recommendation: Evaluate computers used by CoCA staff, consultants, and volunteers. What machines are owned by CoCA, and what information is stored on personal computers? Given the nature of the organization, limiting volunteers using personal computers may not be practical, but having a list of who is doing what work on what machine or platform can provide more information to evaluate security risks.

Metadata

NDSA Level: 2-ish

Description: CoCA’s Archives Project has a detailed finding aid that covers the full archives collection of the organization, including exhibition-related materials and organizational files. Digitized materials, available through TinyCat, have well-formed MARC records and are organized by related exhibition as well as being discoverable through keyword search.

Low resource recommendation: Add information on which materials have been digitized (including file formats, when appropriate) to the finding aid for the archival collections, to aid in identifying existing digital materials, their place within the larger collection, and what materials will be prioritized for upcoming digitization projects.

Medium resource recommendation: Establish documentation on digitization and descriptive metadata in order to aid in standardization. Store transformative metadata on digitization within TinyCat records.

High resource recommendation: CoCA may want to look into a more robust content management system, which could aid in uniformity of metadata and allow for growth of collections in the future. However, TinyCat seems perfectly adequate for CoCA’s current and near future activity, so I would not recommend considering a CMS change and migration unless the organization significantly expands.

File Formats

NDSA Level: 2

Description: CoCA’s digital archival holdings are primarily digitized from physical holdings, so archival consultants and interns have been able to control file formats. A rough inventory of all file formats currently in the collections already exists. New incoming material documenting exhibitions is likely to be born-digital (photographs, documents), so establishing a standardized policy on acceptable formats is likely to be helpful on this front.

Low resource recommendation: Place inventory lists of existing file formats into documentation for future archival staff, volunteers, and interns.

Medium resource recommendation: Establish a preferred formats list with a particular eye to born-digital material and share this list with CoCA’s executive director, board, and other departments’ volunteers to aid in documenting exhibitions and transferring that future documentation to the archives. Identify file formats in the archives that are in danger of obsolescence.

High resource recommendation: Begin format migration for materials on the danger-of-obsolescence list.

Conclusion

CoCA’s Archive Project benefits from knowledgeable and passionate consultants and volunteers, but the lack of regular funding makes it difficult to plan for a sustainable future. However, there are several low resource baseline steps that archival consultants and interns can take on to help secure digital collections and plan for their continued preservation.

Putnam County Museum Plan

I joined this class thinking it would be about digitization and the efforts to preserve physical objects digitally, but the class was so much more than that. We did discuss the realities of digitization projects and how that relates to the broader problem of digital preservation, but I learned more than I anticipated about the different ways that professionals are working to preserve all their important digital records.

 

  1. The most useful idea I learned from this class, I think, was the levels of preservation. This will come in handy no matter what type or size of institution we work in. A large organization could always do better in one of the areas. A smaller organization will likely need the help to move through the levels towards full digital preservation.

  2. I also found the process of interacting with an external organization a useful experience. I expected the interactions to be seamless and easygoing; however, the process became strained quickly and I had to work to find an organization willing and able to work with me on a more active basis. While I don’t believe consulting is the future career for me, I now understand the effort the person receiving the consultation would have to expend.

  3. The final useful aspect to come out of this class is the importance of learning from others in a similar situation as you. I haven’t been able to attend many conferences, so this was a new experience for me to learn from my colleagues’ experiences. For instance, class sessions where we shared the progress we had made on our consultation projects routinely gave me ideas for how to best help my organization.

 

Open Question for the class:

I would like to know who (if anyone) in the class is now considering a career as a digital preservation consultant. Many of us began the semester anticipating a career in a cultural institution, so I wonder if the real-world project we worked on this semester has had a significant impact on anyone’s career goals.

 

Finally, I want to thank my classmates and Professor Owens for a great, thought-provoking semester discussing various digital preservation challenges and solutions ranging from the “in a perfect world” to the “this is the simplest, quickest, and most practical for this particular organization/challenge.”

End of Semester Reflection

Before taking this course, I thought you had to have a certain amount of technical know-how to be able to do digital preservation and I didn’t realize how much planning and policy work was involved. Being able to work with a real organization was definitely useful for me, and hopefully they’ll be able to use some of the work from this semester to guide their efforts moving forward.

The Importance of Positive Thinking

I was intimidated by digital preservation at the start of the semester and I think my organization was also feeling like they didn’t have the knowledge or resources to improve their practices. For example, they originally told me that they didn’t have extra copies or backups of their files and they felt like they needed a fancier storage system. However, after some additional conversations, they mentioned that they had access to Google Cloud and Dropbox. These are actually good options for a small organization and the museum has enough storage space available to last for a while! But the staff didn’t think of cloud storage as a valuable resource until we progressed through the various assignments. I feel that part of my job as a consultant was helping them to see that they weren’t as doomed as they thought they were. Staying positive doesn’t mean that we should ignore problems where they exist, but sometimes it just takes a little creative thinking to figure out a solution.

Sometimes the Rules Don’t Matter

It’s nice to be able to consult models like OAIS but organizational context also matters. Even with the Levels of Digital Preservation, I felt that some of the higher level steps weren’t as applicable to my particular institution. For instance, I didn’t want them to worry as much about file formats because they tend to use very common formats and don’t have the resources to migrate or emulate files. I gave them a range of options to consider, but I know that they’re probably going to focus on the easiest actions. Even if the museum only reaches Level 1 or Level 2, this will still be an improvement. It’s important to avoid getting caught up in things like SIPs and DIPs if it’s just going to confuse people.

Theory and Practice

There’s no one way to do preservation and it’s important to clarify your intentions for preservation before you start trying to take action. I came into this class thinking I needed a list of steps to follow and an understanding of different tools for digital preservation. But it turns out that I enjoyed the more theoretical readings and I found that an understanding of the different frameworks and lineages for preservation was just as important.

Baltimore Community Museum_Final Report