To recap briefly, my project focuses on the management and preservation of the StratComm digital image collection. StratComm is an in-house communications department for a not-for-profit company with offices in the Washington, D.C. and Boston areas.
In most cases the current preservation state of StratComm’s digital image collection displays aspects of Level 1 requirements but does not meet them all. For example, under the category of Storage and Geographic Location, StratComm does meet the requirement for having at least two copies of its files; however, those two copies are collocated, making them both vulnerable to the same threat profile. In the category of Information Security, StratComm does require a password to log on to the multimedia server; however, once logged on, all users possess read, write, and delete access. This makes the original files vulnerable to accidental deletions or changes that overwrite the original.
The two areas in which StratComm earns drastically different rankings are File Fixity and Metadata. StratComm currently does not have any tools for performing fixity checks to ensure data integrity during any stage of use or storage. As such, they do not meet any requirement at any level in this category. However, in the category of Metadata, StratComm has been applying standard technical and descriptive metadata embedded in the images for a number of years, albeit with differing approaches that have evolved with the introduction of new tools and new guidance from changing staff. The application of this type of metadata satisfies the requirements for Level 3 of the Metadata category.
The following digital preservation plan is broken down according to the categories detailed in the National Digital Stewardship Alliance’s Levels of Digital Preservation rubric. While these requirements may seem prescriptive, it should be noted that effective digital preservation is performed in the context of the organization’s preservation intent, and that it is an incremental process (Schumacher et al., 2014, p.15). That is to say that choices that might make sense for the Library of Congress may not make sense for a multimedia production house like StratComm, and vice versa. It also means that organizations are best advised to focus on the next logical and achievable steps to improve preservation rather than stress about trying to meet the highest standard all at once. Toward that end, each section below includes recommendations that require low, medium, and high resource expenditures. I recommend concentrating on the low and medium recommendations as more immediate actions to take. The recommendations requiring higher resource expenditures are included for purposes of long-term organizational planning and to highlight opportunities to partner with other groups whose functional mandate may be more appropriate for addressing the preservation need in question.
Storage and Geographic Location
At present, StratComm does not quite meet the criteria for Level 1. In most cases, two distinct copies of all files do exist, but they are collocated on a backup server that resides in the same location as the original server. While this does protect StratComm from the possibility of hard drive failure, it leaves them vulnerable to disasters that could potentially affect both systems, such as floods or fires. Washington’s photographer recently completed a migration of all data residing on optical media for the 1995–2009 segment of the image collection. This is a positive step; however, that data too is now collocated on the same servers. Technically the copy on optical media does mean that three distinct copies exist for that segment of the collection, but this media should not be relied on due to its age. It is also physically located on the same floor of the same building as the multimedia server that houses the other two copies, giving it the same threat profile.
Low resource measures. The shortest path to achieving Level 1 status in the Storage and Geographic Location category would be for StratComm to take advantage of their bifurcated office configuration and begin to back up each location’s multimedia server to the other location’s multimedia server. This would establish a third copy of every file, but more importantly, at least two of those copies would exist in different locations with different threat profiles. In a single stroke this would satisfy the requirements for number of copies and location of those copies through Level 3. Level 4 additionally calls for all three copies to reside in locations with different disaster threats.
The remaining requirements for Level 2 call for an audit of the current storage system and what is required to maintain access to it (Phillips, Bailey, Goethals, and Owens, 2013, p.3). This is a simple matter of a small amount of staff time to perform the inquiry and requires a fairly low expenditure of resources.
Medium resource measures. The remaining requirements for Levels 3 and 4 require ongoing, but still minimal, amounts of staff time for monitoring and planning activities. For example, meeting Level 3 requires the establishment of a process for monitoring obsolescence issues associated with the organization’s storage systems (Phillips et al., 2013, p.3). This could come in the form of a simple annual audit of StratComm’s current storage status and an analysis of current trends in storage options.
Meeting Level 4 requires the creation of a comprehensive plan to maintain both files and metadata on accessible media or systems (Phillips et al., 2013, p.3). While this is not a terribly time-intensive effort, it probably represents the highest degree of time spent in active monitoring of current technological and industry trends and careful analysis of how those trends fit into the context of StratComm’s preservation intent and workflow. The inter- and intra-departmental collaboration required to form this plan would likewise require a moderate expenditure of staff time.
High resource measures. StratComm could further diversify the threat profile of its copies by placing an additional full copy of all files with a commercial cloud service such as Amazon Web Services. This would put a copy in a third location with a threat profile different from those of the two office sites. This diversified threat profile also establishes a new controlling organization as an additional level of potential protection. For example, if some catastrophic network event were to cripple all of StratComm’s computing power, the copy in the cloud would likely remain accessible.
I have this option listed as a high resource expenditure due to the complications surrounding StratComm’s public release process and the effects that process has on the company’s approval of cloud storage. To make this option happen would likely require a fee for the commercial service as well as the engagement of multiple employees across many departments including StratComm, IT, and Information Security to gain corporate approval and establish non-disclosure agreements with the vendor.
File Fixity and Data Integrity
At present StratComm does not employ any measures for checking file fixity or data integrity. As such they do not meet the requirements for any level of the National Digital Stewardship Alliance’s Levels of Digital Preservation rubric.
Low resource measures. StratComm could avail itself of free cryptographic hash generators such as onlinemd5.com both to create checksums for their data and to verify existing checksums to ensure data integrity has not been compromised. Each hash could be stored in a simple text file inside the folder of final images. One consideration is that the high volume of images produced by StratComm may preclude performing fixity checks at the item level. It may be more time efficient to package all final files into a zip file and perform the fixity check on that. Aside from the staff time involved in manually performing the tasks, all of these efforts employ freely available or already existing tools.
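As an illustration of how little tooling this step needs, the following is a minimal sketch of creating and later verifying a checksum text file for one folder of final images. It uses only Python's standard library. The manifest file name, the function names, and the choice of SHA-256 rather than MD5 are my own assumptions for the example, not an existing StratComm practice.

```python
import hashlib
from pathlib import Path

MANIFEST_NAME = "manifest-sha256.txt"  # hypothetical file name, chosen for this sketch


def file_checksum(path, algorithm="sha256", chunk_size=1024 * 1024):
    """Return the hex digest of one file, reading it in chunks to limit memory use."""
    digest = hashlib.new(algorithm)
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def write_manifest(folder):
    """Write one 'checksum  relative/path' line per file found under the folder."""
    folder = Path(folder)
    lines = []
    for path in sorted(folder.rglob("*")):
        if path.is_file() and path.name != MANIFEST_NAME:
            relative = path.relative_to(folder).as_posix()
            lines.append(f"{file_checksum(path)}  {relative}")
    manifest = folder / MANIFEST_NAME
    manifest.write_text("\n".join(lines) + "\n", encoding="utf-8")
    return manifest


def verify_manifest(folder):
    """Return the relative paths that are missing or whose checksum has changed."""
    folder = Path(folder)
    failures = []
    text = (folder / MANIFEST_NAME).read_text(encoding="utf-8")
    for line in text.splitlines():
        if not line.strip():
            continue  # skip blank lines, e.g. the manifest of an empty folder
        expected, relative = line.split("  ", 1)
        target = folder / relative
        if not target.is_file() or file_checksum(target) != expected:
            failures.append(relative)
    return failures
```

Running `write_manifest` once when a shoot is finalized and `verify_manifest` at any later date would report files that have changed or gone missing. Passing `algorithm="md5"` to `file_checksum` would produce values comparable to those from an MD5 tool, if that is preferred.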
Medium resource measures. StratComm might be able to use corporate initiatives such as the “Meaningful Work” program to engage with programmers in other areas of the company who are still waiting for sponsor tasking. These programmers could automate certain tasks, such as the regular intervals of fixity checks required at Level 3, or perhaps the checking of fixity before and after transfers. Transfers of data from one system to another are one of the biggest points of vulnerability during which degradation can occur (National Digital Stewardship Alliance, 2014, p.3). Checking the fixity before and after such events is a Level 4 requirement (Phillips et al., 2013).
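To give a sense of the scale of such a programming task, a check of fixity before and after a transfer might look roughly like the sketch below. The function names are hypothetical, and a local file copy stands in for whatever transfer mechanism StratComm actually uses between systems.

```python
import hashlib
import shutil
from pathlib import Path


def sha256_of(path, chunk_size=1024 * 1024):
    """Return the SHA-256 hex digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def transfer_with_fixity(source, destination):
    """Copy one file and confirm the copy matches the original.

    The checksum is taken before the transfer and again on the copy afterward.
    A mismatch raises an error instead of silently leaving a damaged copy.
    """
    source, destination = Path(source), Path(destination)
    before = sha256_of(source)
    destination.parent.mkdir(parents=True, exist_ok=True)
    shutil.copy2(source, destination)  # copy2 also preserves file timestamps
    after = sha256_of(destination)
    if before != after:
        raise IOError(f"Fixity check failed for {destination}")
    return after
```

The returned checksum could be logged with the date of the transfer, which would also start building the kind of record that scheduled fixity checks can later compare against.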
Another option would be to approach EnterMedia’s software engineers about the possibility of building fixity checks into StratComm’s present DAM system. StratComm already owns this tool, and the vendor is particularly engaged and open to producing custom solutions to such problems.
High resource measures. There are tools available, such as AVP’s Exactly, which will help with the automation of many of the tasks associated with checking fixity. These are open source options, and many are even free of charge; however, as Owens notes in The Theory and Craft of Digital Preservation, these should be regarded as “free puppies”, not “free beer” (Owens, 2018, p.116). Open source solutions may not cost anything to download, but they require a great deal of setup, care, and maintenance, often from multiple departments across the organization. Additionally, since these solutions are often the product of non-profit, cooperative industry partnerships, it is important to factor in the active role that StratComm may need to assume in the community stewardship of that tool in order to help ensure the tool’s continued existence and make sure that it continues to serve StratComm’s goals (Owens, 2018, p.116).
A tool such as AVP’s Exactly may be overkill for StratComm to procure and stand up on its own; however, it might make sense for StratComm to partner with Corporate Archives to lobby for the resources necessary to implement such a system. This shared capability would benefit the collections held by both departments. The significant need for a budgetary increase has been hidden for many years by the organizational gap between the two departments. On one side, StratComm recognizes the vast scale and importance of their collection yet tends not to view preservation as a role within their purview, so no formal process for transitioning content to Corporate Archives exists. On the other side, Corporate Archives does view preservation as their function, but they have no current foothold in any stage of StratComm’s digital asset lifecycle, so they are in no position to properly assess or appropriately plan for the care of such a large collection. A StratComm partnership with Corporate Archives would present a unified voice to articulate the true scope of the problem and the resources needed to address it.
Information Security
StratComm does presently require a password to access the multimedia servers; however, no additional security provisions are in place to restrict access to the photography portions of each server, or to restrict changes and deletions to the files that reside there. For the most part the only individuals who access these servers with any regularity are the photographers and other multimedia staff, but login information has occasionally been provided to design staff in the past with deleterious effects. For example, it was discovered a few years ago that someone had moved an entire folder of images to a new location on the same server, but could not remember the new location. This broke the link that Lightroom had with the original files, making them unavailable to photographers for further use, effectively deleting them.
While the incident described above was the only example of data loss that I could find it does highlight a significant vulnerability as well as a gap in understanding among the multimedia and design staff about how to safely interact with the files on the server.
Low resource measures. The requirements for meeting Levels 1 and 2 of the Information Security category involve simple OS-level changes to read/write settings and perhaps the creation of a department policy that stipulates safe avenues of access for design staff (Phillips et al., 2013). Such efforts are free of charge and would satisfy all requirements for Levels 1 and 2.
Medium resource measures. Meeting the requirements for Levels 3 and 4 would require the acquisition of new software, or the augmentation of current software, to monitor changes and produce activity logs (Phillips et al., 2013). One example of such an option would be to engage with EnterMedia’s software engineers to customize a solution that would generate audit logs.
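For illustration only, the sketch below shows one form such an OS-level change could take: clearing the write permission on every file under a folder of originals. The function name is hypothetical. On a shared multimedia server the equivalent change would more likely be made by IT through share or account permissions, so this should be read as a picture of the idea rather than a recommended procedure.

```python
import stat
from pathlib import Path


def make_read_only(folder):
    """Clear the write permission bits on every file under the folder.

    Returns the number of files changed. Folders themselves are left alone.
    """
    write_bits = stat.S_IWUSR | stat.S_IWGRP | stat.S_IWOTH
    changed = 0
    for path in Path(folder).rglob("*"):
        if path.is_file():
            current_mode = path.stat().st_mode
            path.chmod(current_mode & ~write_bits)
            changed += 1
    return changed
```

Because only the files are changed, this guards against overwriting an original but not against a folder being moved or renamed, which is the type of incident described above. On POSIX systems, preventing moves depends on the permissions of the containing folders, which is a further reason to pair the technical change with a department policy.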
While this is an option of medium-level cost, it is worth mentioning that it may provide very limited return on investment. EnterMedia does not provide access to originals; it simply allows the user to download a copy. Thus the original is never in any particular danger. In addition, StratComm’s desire to shift the primary access point for all users to the EnterMedia DAM system provides its own level of protection for the original files that reside on the multimedia server. For the assets on the multimedia server it might be worth exploring the option of write-blocking software, on the condition that the implementation of such a tool does not present any undue complications to StratComm workflow.
High resource measures. Following on the discussion from the medium resource measures above, it bears mentioning that a significant restructuring of interdepartmental processes and staff roles should be considered. The functions outlined in Levels 3 and 4 would most logically reside with Corporate Archives. This is another example of where it would be beneficial for StratComm to partner with Corporate Archives to lobby upper management for an increase in resources to build a shared, corporate-level ability to manage and preserve digital assets.
Metadata
Photographers in both Boston and Washington currently apply standard technical and descriptive metadata to all images, albeit with some variations in method. This satisfies the requirements for Level 3 of the Metadata category, but leaves the requirements for Levels 1, 2, and 4 unmet. It is worth noting that several aspects of the requirements for Levels 1 and 2 overlap with the Storage and Information Security categories discussed above. As such, measures taken to satisfy those requirements would perform double duty in satisfying requirements in this category as well.
Low resource measures. Performing a formal inventory of content and its storage location would be a relatively low expenditure of effort for a staff member. It could even be argued that the Survey and Report that preceded this document already satisfies the requirement. Since these locations rarely change, the inventory would need to be repeated annually at most. This action, combined with another previously mentioned low resource measure, namely the trading of backup copies between the branch offices, would quickly achieve Level 1 status for StratComm.
Medium resource measures. StratComm could approach EnterMedia’s software engineers to inquire about the platform’s ability to track transformative and preservation metadata. If this customization to an existing StratComm platform is possible, it would perform the necessary functions for Levels 2 and 4. However, these changes might be outside the realm of possibility for EnterMedia, and it is also worth considering whether EnterMedia occupies the optimal point in department workflow to act as a gatekeeper for tracking change history and version control. It may be more logical to explore options that integrate into StratComm’s workflow management tool, Wrike.
High resource measures. There is a dearth of good options that meet both the strict archival standards outlined by the NDSA and the needs of a high-volume production house like StratComm, whose high rates of transformative activity create version control issues that would overwhelm archival tools meant to catalog static files. Open source archival platforms such as DKAN could easily meet all of the standards outlined in each level of the Metadata category. However, such tools are not very interoperable with the graphics platforms used by StratComm, making it likely that design staff would simply sidestep the DKAN interface and change histories would be lost. There are proprietary commercial tools such as Adobe Experience Manager (AEM) designed to perform all of these functions, but AEM’s cost is extreme (in the millions), and the proprietary nature of the platform makes the formation of an exit strategy to a subsequent tracking system exceptionally difficult. Considering these factors, StratComm’s acquisition of either DKAN or AEM, while technically an option, is not recommended.
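A formal inventory of this kind could be produced with very little tooling. The sketch below, which assumes a hypothetical function name and column layout of my own choosing, walks a storage location and writes one row per file to a CSV file that can be opened in any spreadsheet program.

```python
import csv
from datetime import datetime, timezone
from pathlib import Path

FIELDS = ["relative_path", "extension", "size_bytes", "modified_utc"]


def build_inventory(root, output_csv):
    """List every file under root and save the listing as a CSV inventory."""
    root = Path(root)
    rows = []
    for path in sorted(root.rglob("*")):
        if path.is_file():
            info = path.stat()
            modified = datetime.fromtimestamp(info.st_mtime, tz=timezone.utc)
            rows.append({
                "relative_path": path.relative_to(root).as_posix(),
                # Files with no extension are flagged so they can be reviewed.
                "extension": path.suffix.lower() or "(none)",
                "size_bytes": info.st_size,
                "modified_utc": modified.isoformat(),
            })
    with open(output_csv, "w", newline="", encoding="utf-8") as handle:
        writer = csv.DictWriter(handle, fieldnames=FIELDS)
        writer.writeheader()
        writer.writerows(rows)
    return rows
```

Saving the resulting file somewhere other than the server it describes would help address the rubric's expectation that the inventory itself is backed up and not collocated with the content.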
A more logical and cost-effective option to explore would again be to partner with Corporate Archives to advocate for the acquisition of a shared archival tool to enhance the corporation’s capability to preserve digital object metadata. Part of this would necessarily involve a focus on the creation of preservation policy to formalize a pipeline for content to flow from StratComm to Corporate Archives.
File Formats
StratComm currently satisfies portions of each level of the File Formats category, but does not completely satisfy any one of them. For example, it could be argued that StratComm has chosen their currently used file formats based on an informed assessment of user community needs and long-term access issues. However, all of the formats they use are technically proprietary. It could also be argued that the Survey and Report document that preceded this paper could be considered an inventory of formats in use. However, the scope of that document was more of a broad overview than an in-depth analysis. Achieving Level 2 will likely require a more dedicated audit of file formats, as well as a deeper examination of the specific issues associated with each one, particularly the use of DNG and the larger role that RAW files play in StratComm’s preservation intent and service model.
StratComm comes closest to meeting Level 3. Both photographers remain abreast of current industry trends in digital imaging, and both make long-term access a key driver for their format choices. The largest point of divergence is in RAW formats: a class of files for which no non-proprietary choice exists and longevity of access is an open question for all. The photographers’ diverging approaches in this area are each supported by thoughtful reasoning, but are equally debatable and based on significantly different preservation intents. This lack of consistency is another argument for the need to perform an in-depth inventory of file formats, a re-examination of StratComm’s preservation intent, and a subsequent synchronization of policy.
Low resource measures. The actions required for meeting Levels 1, 2, and 3 are all fairly modest efforts. In fact, the in-depth inventory of file formats recommended above would satisfy Level 2, and go a long way towards informing the best choices available for satisfying Level 1.
High resource measures. StratComm’s legacy holdings are a complex panoply of image file formats, made all the more challenging by the occasional lack of file extensions and a high prevalence of an obsolete Kodak RAW format. Tools such as DROID (Digital Record Object Identification) will help identify files without extensions, and tools such as Preservica or Rosetta would help facilitate format migrations to maintain functional access to files. However, such commercially available tools require ongoing fees and a high degree of effort to install, configure, and maintain. Additionally, since the last known RAW converter for the Kodak DCS465 required an obscure plug-in for Photoshop 5.5 on Mac OS 9, the effort and expense of emulating that environment for the purpose of migrating those files may be considerable.
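To illustrate the principle behind signature-based identification, the sketch below checks the first few bytes of each file lacking an extension against a short list of well-known image signatures. It is not a substitute for DROID, which draws on a far larger signature registry, and the function names and the selection of formats are assumptions made for the example.

```python
from pathlib import Path

# A handful of widely documented file signatures ("magic numbers").
SIGNATURES = [
    (b"\xff\xd8\xff", "JPEG"),
    (b"\x89PNG\r\n\x1a\n", "PNG"),
    (b"II*\x00", "TIFF (little-endian)"),
    (b"MM\x00*", "TIFF (big-endian)"),
    (b"8BPS", "Photoshop PSD"),
    (b"GIF87a", "GIF"),
    (b"GIF89a", "GIF"),
]


def sniff_format(path):
    """Guess a file's format from its leading bytes, or return 'unknown'."""
    with open(path, "rb") as handle:
        header = handle.read(16)
    for magic, label in SIGNATURES:
        if header.startswith(magic):
            return label
    return "unknown"


def files_without_extension(root):
    """Map each extensionless file under root to its guessed format."""
    root = Path(root)
    return {
        path.relative_to(root).as_posix(): sniff_format(path)
        for path in sorted(root.rglob("*"))
        if path.is_file() and not path.suffix
    }
```

One limitation worth noting is that some RAW formats, DNG among them, are built on the TIFF structure and would be reported here simply as TIFF. Distinguishing them requires the deeper inspection that dedicated identification tools perform.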
Due to the high level of effort involved with these measures, it is again recommended that a partnership with Corporate Archives be considered to build out a shared capability for digital preservation that meets the needs of both departments.
Conclusion
As stated in the introduction, digital preservation is a process that is necessarily informed by the context of the organization’s preservation intent, and it is incremental in approach (Schumacher et al., 2014, p.15). As a high-volume production house, it may not make sense for StratComm to strive for the highest levels of the NDSA Levels of Digital Preservation rubric because it is not within the scope of their corporate function. At the same time, the interests served by striving for those upper levels of the rubric do represent a critical value to the company StratComm serves. As such, it is recommended that StratComm focus the bulk of their efforts on the curatorial and preservation activities that directly enable their business goal of improved image search. Activities outlined in this report that most directly serve that goal are improvement of metadata and the generation of metrics on image usage, including change histories and version control. A reappraisal of the collections featured on EnterMedia is also warranted to bring them into alignment with current corporate structure in a way that new employees would find logical. One possible means of doing this would be to switch to a system that tags images according to sponsor rather than by center. The centers change every time the company goes through a reorganization, whereas the sponsor remains much more persistent.
Addressing the more complex preservation needs really deserves a more concerted approach that answers the needs of the company as a whole, not just the needs of any one department. StratComm has an opportunity in this area to help advance the Corporate Operations center by lending its considerable voice to the call for increased resources dedicated to the long-term management and preservation of digital objects. And the most logical place for this capability to reside is Corporate Archives.
References
National Digital Stewardship Alliance. (2014). What is Fixity, and When Should I be Checking It? Washington, D.C. Retrieved from http://www.digitalpreservation.gov/documents/NDSA-Fixity-Guidance-Report-final100214.pdf
Owens, T. (2018). The theory and craft of digital preservation. Baltimore: Johns Hopkins University Press.
Phillips, M., Bailey, J., Goethals, A., & Owens, T. (2013). The NDSA Levels of Digital Preservation: An Explanation and Uses. IS&T Archiving, Washington, USA. Retrieved from http://www.digitalpreservation.gov/documents/NDSA_Levels_Archiving_2013.pdf
Schumacher, J., Thomas, L. M., VandeCreek, D., Erdman, S., Hancks, J., Haykal, A., … Spalenka, D. (2014). From Theory to Action: Good Enough Digital Preservation for Under-Resourced Cultural Heritage Institutions (Working Paper). Retrieved from http://commons.lib.niu.edu/handle/10843/13610