If your oral history project includes the creation of any sort of digital audio or video files—and it almost definitely will—you are going to need to make some informed decisions about what file formats you are going to use to store your data.
Kara Van Malssen’s article, “Digital Video Preservation and Oral History,” offers a highly practical introduction to how you might begin to make those decisions. If Van Malssen leaves you wondering what the big deal is about formats anyway, the “Format Theory” chapter of Jonathan Sterne’s MP3: The Meaning of a Format offers an interesting historical look at what formats mean and how they develop.
Digital video preservation
When you’re creating digital video files, it’s not great form to just pick up a camera and jet off to the races. The decisions that you make in the earliest stages of a video’s creation have lasting implications for its preservation later on.
Van Malssen provides a very helpful in-depth look at these decisions, but for the purposes of this blog post, we’ll content ourselves with understanding some basic information about file formats and reviewing some of Van Malssen’s overall recommendations.
Anatomy of a video file
The important components of a digital video file are the file wrapper and the encoded video and audio tracks.
The file wrapper dictates what we’d think of as the format, which gets represented as an extension. The file wrapper binds the video and the audio tracks together and stores metadata. Some common extensions for video files include:
When we talk about encoded tracks, we’re acknowledging that within the file wrapper, the audio and video tracks are created using different codecs. These codecs encode the tracks for storage and then decode them at the moment of playback.
Van Malssen offers several examples of popular codecs:
- DV (Digital Video)
- Apple ProRes
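To make the wrapper-versus-codec distinction concrete, here is a minimal Python sketch that reads a report and lists the wrapper and the codec behind each track. It assumes a report in the JSON shape that FFmpeg's ffprobe tool prints; ffprobe is our choice for illustration and is not a tool Van Malssen names. The sample report and the `summarize_probe` helper are hypothetical.

```python
import json

# Hypothetical sample, shaped like the JSON printed by:
#   ffprobe -v quiet -print_format json -show_format -show_streams FILE
SAMPLE_REPORT = """
{
  "format": {"filename": "interview_001.mov",
             "format_name": "mov,mp4,m4a,3gp,3g2,mj2"},
  "streams": [
    {"index": 0, "codec_type": "video", "codec_name": "prores"},
    {"index": 1, "codec_type": "audio", "codec_name": "pcm_s16le"}
  ]
}
"""


def summarize_probe(report_json):
    """Return the wrapper name and the codec used by each track."""
    report = json.loads(report_json)
    summary = {"wrapper": report["format"]["format_name"], "tracks": []}
    for stream in report["streams"]:
        # Each track records what kind it is and which codec encoded it.
        summary["tracks"].append((stream["codec_type"], stream["codec_name"]))
    return summary


summary = summarize_probe(SAMPLE_REPORT)
print("Wrapper:", summary["wrapper"])
for kind, codec in summary["tracks"]:
    print(f"  {kind} track encoded with {codec}")
```

The point of the sketch is that one file answers two separate questions: what the wrapper is, and which codec sits behind each track inside it.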
Understanding the makeup of your digital files is key to preserving them. Now, let’s review some of Van Malssen’s best practices for preserving your files.
Recommendations for digital video preservation
- Choosing a recording device: Get one that uses one of the codecs listed above (others might be hard to support, and may not even be playable one day) and that produces video at the highest bit rate you can possibly support. You can always compress your video to reduce its file size, but you can never restore bits that weren’t recorded in the first place. It’s like the opposite of seasoning while cooking.
- Transcoding: This means converting a file from one encoding format to another. Unless the target codec is lossless, it results in a loss of quality, so transcode judiciously! What does that look like? Basically, make sure you keep different versions of your file for your different purposes.
- Creating a preservation master file: The point of a preservation master is to keep your original footage intact at the highest possible resolution. You can use it to create new versions of your file, but you want to preserve the original file’s integrity as much as possible. Store this safely, and don’t replace it with any of your edits!
- Creating a mezzanine file: This will be your working copy, which you use to create new edits and proxies as you need them. If you don’t need to make any dramatic changes to your file size, you may not need a mezzanine.
- Creating a proxy file: This is your low-resolution file that you use for distribution, especially online.
- File naming: Use a clear, consistent file name convention to make managing your collection easier.
- Metadata: Similarly, use consistent, descriptive metadata. Van Malssen recommends using a tool like MediaInfo to collect the technical metadata attached to your files, and using standards such as the Library of Congress’s VideoMD or the Corporation for Public Broadcasting’s PBCore to keep it consistent.
- Storage: Store at least two copies of your preservation master in two different storage locations, ideally in two separate geographic locations. Your files can degrade over time as your storage media decay, and having more than one copy of your master on more than one storage medium is a good way to safeguard against that! Storing smaller mezzanine or proxy files in the cloud can be a good idea, but your preservation masters should be stored on hard drives, data tapes, or both.
- Preservation planning: Use open-source, standard file formats and codecs, like those listed above, to keep your files accessible long-term. Keep up with the technological landscape so that you know if the file formats you’re using are at risk of becoming obsolete, and keep your original files at the highest quality possible to ensure the best possible outcome if you do need to transcode them.
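Two of the recommendations above, file naming and storage, lend themselves to a little automation. The Python sketch below is one possible approach, not Van Malssen's method. The naming convention, the `build_name` and `copies_match` helpers, and the use of SHA-256 checksums to confirm that stored copies are still identical are all our own assumptions.

```python
import hashlib
import re
from pathlib import Path

# Hypothetical convention: <collection>_<interview number>_<YYYYMMDD>_<role>.<ext>
ROLES = ("master", "mezzanine", "proxy")
NAME_PATTERN = re.compile(
    r"^[a-z0-9]+_\d{3}_\d{8}_(master|mezzanine|proxy)\.[a-z0-9]+$"
)


def build_name(collection, number, date, role, extension):
    """Build a file name that follows the convention above."""
    if role not in ROLES:
        raise ValueError(f"unknown role: {role}")
    return f"{collection.lower()}_{number:03d}_{date}_{role}.{extension.lower()}"


def follows_convention(name):
    """Check an existing file name against the convention."""
    return NAME_PATTERN.match(name) is not None


def sha256_of(path, chunk_size=1024 * 1024):
    """Hash a file in chunks so large video files never have to fit in memory."""
    digest = hashlib.sha256()
    with Path(path).open("rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def copies_match(paths):
    """Return True when every stored copy has the same checksum."""
    return len({sha256_of(path) for path in paths}) == 1
```

For example, `build_name("OHP", 7, "20240115", "master", "MOV")` produces a name for the preservation master of interview 7, and running `copies_match` periodically on the two stored masters would flag a copy that has silently changed.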
After all that discussion of the practical implications of formats, Jonathan Sterne’s “Format Theory” chapter interrogates the idea of a format. The MP3 is the most common audio storage format used today, but, as anyone who’s ever spoken to a person who really cares about headphones knows, it’s certainly not the audio storage format that allows for the highest quality. So what gives?
Put simply, the MP3 is small. It compresses recorded audio and uses significantly less bandwidth than other formats, which is ideal for transferring files and communicating. In contrast with Van Malssen’s advice to keep your files in the largest format you can, the proliferation of the MP3—a super-compressed, lossy format—puts a premium on distribution over preservation of quality.
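A rough calculation shows how large the gap is. The Python sketch below compares one hour of uncompressed CD-quality audio with one hour of MP3. The 128 kbps MP3 bit rate is an assumed, commonly used setting and not a figure from Sterne, and `size_in_megabytes` is a hypothetical helper.

```python
def size_in_megabytes(bits_per_second, seconds):
    """File size = bit rate x duration, converted from bits to megabytes."""
    return bits_per_second * seconds / 8 / 1_000_000


# Uncompressed CD-quality audio: 44,100 samples/s x 16 bits x 2 channels.
PCM_BPS = 44_100 * 16 * 2
# Assumed MP3 setting; 128 kbps is a common choice, not a figure from Sterne.
MP3_BPS = 128_000

ONE_HOUR = 60 * 60  # seconds
pcm_mb = size_in_megabytes(PCM_BPS, ONE_HOUR)
mp3_mb = size_in_megabytes(MP3_BPS, ONE_HOUR)

print(f"Uncompressed: {pcm_mb:.0f} MB")
print(f"MP3 at 128 kbps: {mp3_mb:.1f} MB")
print(f"Ratio: {PCM_BPS / MP3_BPS:.1f}x")
```

Under these assumptions, an hour-long interview drops from roughly 635 MB to under 60 MB, which is the kind of saving that made the format so easy to share.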
Sterne explains this by contextualizing the MP3 within a history of compression. “As people and institutions have developed new media and new forms of representation, they have also sought out ways to build additional efficiencies into channels and to economize communication in the service of facilitating greater mobility,” he writes. Over the course of time, so many of our attempts to make media more widespread and easier to share have resulted in compressing the media.
When he argues for the importance of format theory, Sterne encourages us to view formats as a part of history, entrenched in a context that reflects the cultural moment in which they become popular, as well as the operational and industrial needs of that moment.
Both of these pieces of context inform which formats become popular. Culturally, in the case of the MP3, many people prefer distorted audio over verisimilitude, and many prize easily sharable audio over very high-quality audio. As for operational needs, I remember an instance years back in which a friend shared an album with me in AAC format, which allows for higher-quality sound at about the same bandwidth as an MP3 file. I was frustrated because I wanted to burn a CD to listen to in my car, but the software I had available would not let me burn AAC files to a disc.
Considering this outside context complicates the idea of formats progressing in a linear fashion to higher and higher quality and explains why some formats succeed and others don’t.
How does understanding format theory enhance our understanding of digital file preservation? What are the implications of the proliferation of the MP3 on the prospect of preserving modern files?