The Baltimore Community Museum documents the history of the small town of Baltimore, Ohio and the surrounding areas. The museum’s collections include documents from the township, papers of prominent citizens, photographs, and historical artifacts. The museum’s collections are extensive for a town with a small population and staff are currently working to gain better control over the museum’s holdings through an inventory project. The director of the museum and her interns are scanning noteworthy items as they come across them while working on this inventory. The Baltimore Community Museum is primarily scanning documents and photos that are damaged and items that are of great importance to the history of the community. Staff estimate that they have created about 600 files thus far. Issues of a local newspaper, the Twin City News, have also been previously digitized. Staff would like to ensure that they don’t lose access to this valuable content and hope to be able to make the scanned items available on the museum’s website in the future. The museum is particularly interested in developing resources for genealogists.
The National Digital Stewardship Alliance Levels of Digital Preservation provide tiered recommendations for organizations interested in preserving digital content. The NDSA divides its recommendations into five categories and suggests four levels of action for each category (Phillips, Bailey, Goethals, & Owens, 2013). By analyzing the Baltimore Community Museum’s current practices and comparing them to the suggested practices in the NDSA levels, we can identify opportunities for future growth. The Baltimore Community Museum is approaching Level 1 for the Storage and Geographic Location category. The museum uses Google Cloud for photographs and Dropbox for documents, but the materials do not overlap. Staff are not sure how to conduct fixity checks on the files they are creating, so the museum does not currently meet the requirements for Level 1 of the File Fixity and Data Integrity category. The museum director and her interns are the only people who can read, modify, and delete files. While members of the community are able to come in and view the files on the staff laptop, this does not happen frequently. Staff supervise visitors who are using the laptop, so it is unlikely that visitors would accidentally delete or change files. Therefore, the Baltimore Community Museum meets the requirements for Level 1 of the Information Security category. While staff generally know what has been scanned thus far, there is no formal inventory of the digital content, so the museum is not quite at Level 1 for the Metadata category. The museum typically creates JPEG files and has also used PDFs in the past. Because this is a limited set of formats, the museum has reached Level 1 in the File Formats category. Fortunately, there are some simple steps that the museum can follow to achieve higher NDSA levels. In the following sections of the plan, I will describe a range of actions that staff could take to better manage the Baltimore Community Museum’s digital content.
The National Digital Stewardship Alliance Levels of Digital Preservation stress that organizations should keep multiple copies of their content to prevent loss due to bit rot or storage system failure. Files should also be stored in multiple geographic locations to protect against a natural or man-made disaster in a particular region. Google Cloud and Dropbox are both solid options for the Baltimore Community Museum for the time being. These cloud storage providers can sync files to the museum’s staff laptop. It is also likely that these systems are storing files in a location outside of the Ohio region, though this process is not entirely transparent. Currently, the museum saves some files in Google Cloud and others in Dropbox because the free version of Dropbox has limited storage. One option might be to combine all of the files in Google Cloud in order to simplify tasks like fixity checks and completing inventories.
It will be essential for the Baltimore Community Museum to begin checking the fixity of its files. Fixity refers to “the property of a digital file or object being fixed or unchanged” (National Digital Stewardship Alliance, 2014, p. 1). Fixity can also be thought of as a “digital fingerprint” that serves as evidence that the museum’s files are the same as they were before (Owens, 2018, p. 60). The simplest way to monitor fixity is to keep track of the number of files being created and their expected file size. If the file size or file count unexpectedly changes, this can be a sign that there is a problem. In general, it is a good idea to check fixity information when the content is first created and before and after it is transferred to a different storage system (National Digital Stewardship Alliance, 2014, pp. 3-5). If museum staff do not have time to produce a full inventory of the digital content with detailed metadata, keeping track of the file size and file count numbers will still be better than nothing. One way to simplify this process would be to schedule a regular time each month to update the inventory and check fixity information. This would be a quick way to meet the Level 1 requirements in the File Fixity and Data Integrity category.
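For staff who are comfortable running a short script, the file count and file size check described above could be sketched as follows. This is only an illustration under stated assumptions: the function names summarize_folder and compare_summaries are hypothetical, and the folder passed in is assumed to be the copy of the scans that the cloud provider syncs to the staff laptop.

```python
import os


def summarize_folder(folder):
    """Return (file_count, total_bytes) for every file under folder."""
    file_count = 0
    total_bytes = 0
    for dirpath, _dirnames, filenames in os.walk(folder):
        for name in filenames:
            path = os.path.join(dirpath, name)
            file_count += 1
            total_bytes += os.path.getsize(path)
    return file_count, total_bytes


def compare_summaries(previous, current):
    """List differences between two (file_count, total_bytes) records."""
    problems = []
    if previous[0] != current[0]:
        problems.append(
            f"file count changed from {previous[0]} to {current[0]}"
        )
    if previous[1] != current[1]:
        problems.append(
            f"total size changed from {previous[1]} to {current[1]} bytes"
        )
    return problems
```

Staff could write down the two numbers each month and compare them with the previous month's figures. An empty list from compare_summaries means the count and size match; anything else is a prompt to investigate, bearing in mind that newly scanned items will also change both numbers.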
Currently, access to the Baltimore Community Museum’s digital content is restricted to the museum’s director and interns. The museum is also using a limited set of common file formats. Because these are stronger areas for the organization, staff may not need to make as many changes to improve their practices. The museum should consider creating formal documentation that describes the current access restrictions to reach Level 2 in the Information Security category of the NDSA levels. When interns leave the organization, any shared passwords should be changed. It would also be useful to maintain a log that employees update whenever they delete or move files. This would help the museum to meet the Level 3 requirements for Information Security. To reach Level 4 in this area, the museum’s director can perform audits of the security logs.
The museum’s director should also keep track of the file formats that the museum uses and encourage new staff members to continue creating JPEGs when scanning items from the collections. Normally, an organization would need to create an inventory of all the file formats that are in use to reach Level 2 in the File Formats area. Right now, the museum mainly uses JPEGs and PDFs, but they could create an inventory in the future if they start using a larger set of formats. To arrive at Levels 3 and 4, organizations need to monitor file formats for obsolescence and should be prepared to migrate or emulate files. However, since JPEGs and PDFs are so commonly used, these additional steps might not be necessary for the Baltimore Community Museum. The staff should use their limited time to strengthen practices in other areas.
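If the museum does decide to produce a file format inventory, a rough count by file extension may be enough to start with. The sketch below assumes that the extension reflects the actual format, which is not guaranteed, and the function name count_file_formats is hypothetical.

```python
import os
from collections import Counter


def count_file_formats(folder):
    """Count files under folder by lowercase extension, e.g. '.jpg'."""
    counts = Counter()
    for _dirpath, _dirnames, filenames in os.walk(folder):
        for name in filenames:
            extension = os.path.splitext(name)[1].lower()
            # Group files without an extension under a readable label.
            counts[extension or "(no extension)"] += 1
    return counts
```

An unexpected extension in the result would be a signal that a new format has entered the collection and that the director may want to document it.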
Some National Digital Stewardship Alliance members have expressed concern that cloud storage systems do not allow organizations to maintain full control over their content (Altman et al., 2013). In addition to storing files in a service like Google Cloud, the Baltimore Community Museum may want to create an additional copy of its files that is not housed in a third-party system. The museum’s files will likely fit on a USB drive or external hard drive. This additional copy could be kept in a place that is only accessible to the museum’s director for further security. The museum could also consider finding a “backup buddy” in another part of the country. This would involve trading external drives with another institution to minimize the risk of losing all of the museum’s copies in a regional disaster. No matter how the organization chooses to proceed, it will be important to document the storage system and to ensure that staff members know how to access all of the various copies. If the museum is able to maintain access to three complete copies in different geographic locations, they will satisfy the Level 2 requirement for the Storage and Geographic Location category.
In addition to keeping track of the expected file size and file count, the Baltimore Community Museum can also use cryptographic hashes to monitor file fixity. Cryptographic hash functions like MD5, SHA-1, and SHA-256 are algorithms that “[take] a given set of data (like a file) and computes a sequence of characters that then serves as a fingerprint for that data” (Owens, 2018, p. 109). This may sound daunting, but it is possible to automate this process. AVP’s Fixity tool is a free service that can scan folders or directories and check for fixity issues. The museum can ask Fixity to monitor files on a monthly basis and send email reports when it detects changes to the files (AVP, 2018).
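To make the idea of a hash-based fingerprint more concrete, the following minimal sketch uses Python's standard hashlib module to compute a SHA-256 value for each file and to compare two sets of results. It is not a description of how AVP's Fixity tool works internally; the function names sha256_of_file, build_manifest, and compare_manifests are hypothetical and chosen only for illustration.

```python
import hashlib
import os


def sha256_of_file(path, chunk_size=65536):
    """Return the SHA-256 hex digest of one file, read in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def build_manifest(folder):
    """Map each file's path (relative to folder) to its SHA-256 digest."""
    manifest = {}
    for dirpath, _dirnames, filenames in os.walk(folder):
        for name in filenames:
            path = os.path.join(dirpath, name)
            manifest[os.path.relpath(path, folder)] = sha256_of_file(path)
    return manifest


def compare_manifests(old, new):
    """Report files that are missing, added, or changed since old."""
    return {
        "missing": sorted(set(old) - set(new)),
        "added": sorted(set(new) - set(old)),
        "changed": sorted(
            path for path in old if path in new and old[path] != new[path]
        ),
    }
```

A file listed under "changed" has the same name as before but different contents, which is the kind of silent alteration that file counts and sizes alone can miss.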
In addition to maintaining multiple copies of its files, the Baltimore Community Museum could also consider uploading items to the Internet Archive, which offers free storage and some support for digital preservation. This option would be appropriate for content that is in the public domain and would require museum staff to add some metadata to the items it uploads (Schumacher et al., 2014). The Internet Archive could facilitate public access until the museum is able to add more collections to its website. While more advanced software solutions like Preservica would provide greater functionality, they would also be more expensive. Because the museum is not creating a massive number of files, free options like Google Cloud and Dropbox should serve the organization’s needs for the time being. An option like Preservica may be worth considering if the museum greatly expands its digitization program.
One of the Baltimore Community Museum’s biggest challenges is figuring out how to organize and keep track of its digital content. Establishing regular file naming conventions could help to solve this problem. In order to improve practices in the Metadata category of the NDSA levels, organizations are supposed to store administrative, transformative, technical, descriptive, and preservation metadata. However, many of these types of metadata do not need to be created manually. For now, the museum could focus on creating an inventory of its files with the file location, fixity information, and descriptive metadata. Maintaining a log of fixity information is also one of the requirements for Level 3 of the File Fixity and Data Integrity category. If staff have time, they could begin adding individual scanned items to PastPerfect to gain better intellectual control over the museum’s digital content.
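A simple inventory of this kind could be kept as a spreadsheet. As a sketch only, the script below writes one row per file with its location, size, and SHA-256 checksum, and leaves a blank description column for staff to fill in by hand. The column names and the functions build_inventory and write_inventory are assumptions made for illustration, not part of the NDSA recommendations.

```python
import csv
import hashlib
import os

# Hypothetical column layout for the inventory spreadsheet.
FIELDS = ["file_location", "size_bytes", "sha256", "description"]


def build_inventory(folder):
    """Return one inventory row (a dict) per file under folder."""
    rows = []
    for dirpath, _dirnames, filenames in os.walk(folder):
        for name in sorted(filenames):
            path = os.path.join(dirpath, name)
            # Reading the whole file at once is fine for typical scans.
            with open(path, "rb") as handle:
                checksum = hashlib.sha256(handle.read()).hexdigest()
            rows.append({
                "file_location": os.path.relpath(path, folder),
                "size_bytes": os.path.getsize(path),
                "sha256": checksum,
                "description": "",  # descriptive metadata, added by staff
            })
    return rows


def write_inventory(rows, csv_path):
    """Write the inventory rows to a CSV file with a header line."""
    with open(csv_path, "w", newline="", encoding="utf-8") as handle:
        writer = csv.DictWriter(handle, fieldnames=FIELDS)
        writer.writeheader()
        writer.writerows(rows)
```

The resulting CSV file can be opened in any spreadsheet program. Saving it outside the folder being inventoried keeps the inventory itself from appearing as a new file the next time the script runs.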
Some of these options are more labor-intensive than others. Even if staff only have enough time to pursue the simplest recommendations, this would still be an important step towards actively managing the Baltimore Community Museum’s digital content. Once the museum establishes basic procedures for making copies and checking file fixity, it will likely become easier to implement some of the other suggestions.
Altman, M., Bailey, J., Cariani, K., Gallinger, M., Mandelbaum, J., & Owens, T. (2013). NDSA Storage Report: Reflections on National Digital Stewardship Alliance Member Approaches to Preservation Storage Technologies. D-Lib Magazine, 19(5/6). http://doi.org/10.1045/may2013-altman
AVP. (2018). Fixity User Guide Version 1.2. Retrieved from https://www.weareavp.com/wp-content/uploads/2018/07/Fixity_v1.2_UserGuide.pdf
National Digital Stewardship Alliance. (2014). What is Fixity, and When Should I be Checking It? Washington, D.C. Retrieved from http://www.digitalpreservation.gov/documents/NDSA-Fixity-GuidanceReport-final100214.pdf
Owens, T. (2018). The Theory and Craft of Digital Preservation. Baltimore: Johns Hopkins University Press.
Phillips, M., Bailey, J., Goethals, A., & Owens, T. (2013). The NDSA Levels of Digital Preservation: An Explanation and Uses. IS&T Archiving, Washington, USA. Retrieved from http://www.digitalpreservation.gov/documents/NDSA_Levels_Archiving_2013.pdf
Schumacher, J., Thomas, L.M., VandeCreek, D., Erdman, S., Hancks, J., Haykal, A., …Spalenka, D. (2014). From Theory to Action: Good Enough Digital Preservation for Under-Resourced Cultural Heritage Institutions (Working Paper). Retrieved from http://commons.lib.niu.edu/handle/10843/13610