In creating the archival information package (AIP) for ThruYou, I had several types of material to collect: the final ThruYou videos, the source videos used to create each one, the YouTube webpage for each ThruYou video, the ThruYou webpage, screenshots of each YouTube and ThruYou page, the YouTube comments for each video, the Researching ThruYou webpage, and two commentaries from source video creators (a Reddit thread and a YouTube video).
For the eight ThruYou project videos, I used Complete YouTube Saver, a Firefox extension, to download the videos in MP4 format, along with each YouTube page, its description, and its annotations. This was a slight modification of my original plan, which was to use the youtube-dl tool for the video downloads. Because the extension offered similarly extensive download quality options and could also save the pages, I used it for the ThruYou videos instead.
I had originally intended to use the same extension to collect all of the YouTube comments for each ThruYou video. Unfortunately, recent changes in the way YouTube loads its pages broke this functionality in the extension. A smaller selection of the most popular comments does appear in the screenshot of each YouTube page, so this aspect is preserved to some degree; however, I would ultimately prefer to find a way to pull the full comment list via the YouTube API.
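As a rough illustration of what pulling the full comment list might look like, here is a minimal sketch. It assumes the YouTube Data API v3 `commentThreads` endpoint and a valid API key, neither of which I used for this project; the function names and the `VIDEO_ID` and `API_KEY` placeholders are hypothetical. It gathers top-level comments only, so replies would need separate handling.

```python
import json
import urllib.parse
import urllib.request

# Assumed endpoint: the commentThreads resource of the YouTube Data API v3.
API_URL = "https://www.googleapis.com/youtube/v3/commentThreads"


def fetch_page(params):
    """Request one page of comment threads and return the parsed JSON."""
    url = API_URL + "?" + urllib.parse.urlencode(params)
    with urllib.request.urlopen(url) as response:
        return json.load(response)


def collect_comments(video_id, api_key, fetch=fetch_page):
    """Follow nextPageToken until every top-level comment has been read.

    `fetch` is injectable so the paging logic can be tried without network access.
    """
    comments = []
    page_token = None
    while True:
        params = {
            "part": "snippet",
            "videoId": video_id,
            "maxResults": 100,
            "textFormat": "plainText",
            "key": api_key,
        }
        if page_token:
            params["pageToken"] = page_token
        page = fetch(params)
        for item in page.get("items", []):
            top = item["snippet"]["topLevelComment"]["snippet"]
            comments.append({
                "author": top["authorDisplayName"],
                "published": top["publishedAt"],
                "text": top["textDisplay"],
            })
        page_token = page.get("nextPageToken")
        if not page_token:
            return comments
```

Calling `collect_comments("VIDEO_ID", "API_KEY")` for each ThruYou video and saving the result as JSON alongside the page screenshots would be one way to fold the comments into the package.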
For the source videos, I did use youtube-dl, a command-line tool that offers a wide range of options. To streamline the process of downloading over 100 videos, I created a batch file containing the URLs for all of the source videos, which allowed me to download the full collection without restarting the process for each video.
Arranging and organizing these files was a challenge. The source videos are linked from the description section of each ThruYou video, each with a brief descriptive note (“drums 1”, “toy piano”, etc.). However, these notes do not match the video titles, which could make it difficult to tell which video is which once the YouTube links no longer function. In addition, some of the source videos have been removed from YouTube and are missing from the collection. I wanted a researcher to be able to connect each ThruYou video to the source videos listed in its YouTube description, so I numbered the source videos 1-109 in the order they are listed in the descriptions (which is roughly the order of their appearance in each video).
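The batch download of the source videos can be sketched roughly as follows. This is an illustration rather than the exact file and command I used: the URLs, file names, and output template are placeholders, and the helper functions are hypothetical. It relies on youtube-dl's `--batch-file`, `--ignore-errors`, and `--output` options, and on the fact that lines beginning with `#` in a batch file are treated as comments.

```python
import tempfile
from pathlib import Path


def write_batch_file(entries, path):
    """Write one URL per line, each preceded by a '#' comment holding its label."""
    lines = []
    for label, url in entries:
        lines.append(f"# {label}")
        lines.append(url)
    path = Path(path)
    path.write_text("\n".join(lines) + "\n", encoding="utf-8")
    return path


def build_command(batch_path, out_dir):
    """Return the youtube-dl invocation as an argument list (it is not run here)."""
    return [
        "youtube-dl",
        "--batch-file", str(batch_path),
        "--ignore-errors",  # keep going when a video has been removed
        "--output", str(Path(out_dir) / "%(title)s-%(id)s.%(ext)s"),
    ]


if __name__ == "__main__":
    # Placeholder entries; the real list came from the ThruYou descriptions.
    entries = [
        ("drums 1", "https://www.youtube.com/watch?v=EXAMPLE_ID_1"),
        ("toy piano", "https://www.youtube.com/watch?v=EXAMPLE_ID_2"),
    ]
    work_dir = Path(tempfile.mkdtemp())
    batch = write_batch_file(entries, work_dir / "source_urls.txt")
    print(" ".join(build_command(batch, work_dir / "downloads")))
```

Passing the returned list to `subprocess.run` would start the download. Because some source videos have been removed from YouTube, `--ignore-errors` matters here: without it, the run stops at the first unavailable video.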
I then separated the source videos into one folder per ThruYou video, so that each folder contains the relevant source videos in the order they appear in the YouTube description. To keep all of this straight, and to note missing videos, I created a “key” to the source videos that lists each one's original YouTube link, its video title, and the label Kutiman used for it in his description.
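The numbering and the key could be generated along these lines. This is a minimal sketch under assumptions of my own: the column names, the folder naming pattern, the sample titles, and the choice to give removed videos a number (so the sequence still follows the description order) are all illustrative, not a record of the actual key.

```python
import csv
import io

# Hypothetical column names for the key.
FIELDS = ["number", "folder", "label", "source_title", "url", "status"]


def build_key(descriptions):
    """Number source videos in description order and record where each belongs.

    `descriptions` is a list of (thruyou_title, sources) pairs, where each
    source is (label, url, source_title) and source_title is None when the
    video has been removed and could not be collected.
    """
    rows = []
    number = 0
    for video_index, (thruyou_title, sources) in enumerate(descriptions, start=1):
        for label, url, source_title in sources:
            number += 1  # the count runs across all ThruYou videos
            rows.append({
                "number": f"{number:03d}",  # zero-padded so files sort correctly
                "folder": f"{video_index:02d}_{thruyou_title}",
                "label": label,
                "source_title": source_title or "",
                "url": url,
                "status": "collected" if source_title else "missing",
            })
    return rows


def key_to_csv(rows):
    """Serialize the key as CSV text with a header row."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=FIELDS, lineterminator="\n")
    writer.writeheader()
    writer.writerows(rows)
    return buffer.getvalue()


if __name__ == "__main__":
    # Placeholder data; titles and URLs are invented for illustration.
    sample = [
        ("ThruYou video A", [
            ("drums 1", "https://www.youtube.com/watch?v=EXAMPLE_ID_1", "Example drum lesson"),
            ("toy piano", "https://www.youtube.com/watch?v=EXAMPLE_ID_2", None),
        ]),
    ]
    print(key_to_csv(build_key(sample)))
```

Prefixing each downloaded file name with its `number` value would keep the files in description order within their folder, while the key preserves the link between Kutiman's label and the original title.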
I divided the documents into four series folders: 1. ThruYou videos, 2. Source videos, 3. Web pages and screenshots, and 4. Contextual information. As a final step, I included a PDF finding aid briefly describing the collection and the arrangement. In the long run, I would want to prepare a more detailed finding aid with a full series and file listing and more detailed information on each type of file and on the project as a whole.
Overall, the process of finalizing the AIP was a useful reminder of how often unforeseen challenges appear once you begin to work with a collection in practice rather than in theory. However, I found that thinking through the project in such detail beforehand not only minimized issues but also gave me a framework to turn to when I needed to change the original plan.
My AIP can be viewed and downloaded here. Note that this is not the full collection: although I downloaded everything, the final zipped file was well over 1.5 GB, so I created a version that includes the source videos for the first ThruYou video only. (I kept the folders for the other ThruYou videos but deleted the files inside them.) This shows the intended structure in a much smaller file, although it is still around 400 MB. In practice, I would envision my collecting institution maintaining an AIP similar to this for preservation purposes, and then producing an access copy for research use and general access.