Hopes and Dreams: An Undertale AIP

In my last post, I outlined some of the ways I hoped to structure an Undertale collection. Here, I have an actual image of what that might look like. Because there are a number of issues dealing with ownership here, I only have a mock-up with a few files, rather than the whole collection.

Undertale Model Archival Information Package

The three main folders are as follows:

  • “Datamined”: Information gathered from datamining, as presented on Mirrawr’s website. This includes a number of HTML pages and at least one PDF.
  • “Videos”: These are videos of Let’s Plays of Undertale. This includes a full Neutral/Pacifist run and a full Genocide run, with selected and comparative clips in the “Consequences” folder.
  • “Wiki”: Contains all pages from the Undertale Wiki.

Aside from the issues I raised in my last post, here are some considerations I made about this material.

File Types and Acquisition

For the purposes of this AIP, I have acquired the files simply by going to the page, right-clicking, and hitting “Save Page as…” While this may have its problems (and indeed certain elements like ads broke), it saves the pages quickly and seemingly with a good bit of extra data. Because I am not concerned here with the look of the wiki as much as the text on it, these HTML documents do well even in a text editor to give users/readers the information needed to understand Undertale. I think software like Archive-It would be ideal, but it is not crucial. PDFs could also be obtained, but I rather like having a bit more of the HTML stuff available for readers, even if that isn’t my purpose in creating this.

A point raised in a number of the other posts was about how often such a website would need to be checked, to make sure the archive had current information. Although the game is still somewhat new, much of it has been rabidly consumed by fans, and so I do not anticipate great changes in coming years. So, perhaps once a year for the next five years, and then once every five years after that, an archivist could check for changes and update if necessary.
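To make that schedule concrete, here is a minimal sketch of how the check-in years could be worked out. The acquisition year of 2016, the function name, and the cut-off of three later checks are all my own assumptions for illustration, not part of any existing tool.

```python
def review_years(acquired_year, annual_checks=5, later_interval=5, later_checks=3):
    """Return the calendar years in which the saved pages should be re-checked.

    All parameter names and defaults are illustrative assumptions.
    """
    # Phase 1: one check per year for the first `annual_checks` years.
    annual = [acquired_year + n for n in range(1, annual_checks + 1)]
    # Phase 2: one check every `later_interval` years after the annual phase ends.
    last_annual = acquired_year + annual_checks
    later = [last_annual + later_interval * k for k in range(1, later_checks + 1)]
    return annual + later


if __name__ == "__main__":
    # Assumed acquisition year of 2016.
    print(review_years(2016))
    # prints [2017, 2018, 2019, 2020, 2021, 2026, 2031, 2036]
```

An archivist could extend `later_checks` for as long as the collection is maintained.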

For the videos, I used a website to download the .MP4 of a Let’s Play. While this isn’t the best quality, the graphics of Undertale are MEANT to be seen as a somewhat lo-fi/throwback style, so the lack of video quality is not a huge issue. Ideally, I would get the original files from the gamer who recorded them.

Link to Other Places

One great point Amy raised in a comment was how this could be a great opportunity to link from one section of the package to another, and you’ll see I did that here. In the Genocide video folder, I linked to the wiki entry on a Genocide run of the game. That way, some of the materials in the collection can more easily serve as documentation for other bits of the collection without taking up more storage space. Primarily, I see the videos having links to materials from datamining and the wiki, not the other way around. The exception to this would perhaps be videos that show all the fates that could befall one character (put in the “Consequences” folder); those could be linked to the wiki entries on individual characters.

This allows viewers to see the collection’s connections between different elements, rather than requiring that they make such connections themselves.


I have a documentation Word document in each folder. This would likely include a few things: an index of sorts, notes on authorship, and dates of access/acquisition of the materials, to start. The last is particularly important for the websites, which could change at any time; if they did change, the documentation could help archivists keep track of how many versions they have.

This kind of data would be crucial for the videos, which when downloaded in the way I did, lose much of the metadata YouTube stores about them (user who uploaded them, date of upload, etc). Having some idea of when the videos were uploaded to YouTube helps place the LPer within the span of “Undertale’s history” as well: how they play the game depends a bit on how well they know it; if they made the LP in early days, then they might have been shocked/surprised by things, and that can be good data for the user to have, even if the LP does not contain commentary.
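As a rough sketch of what one of these documentation files might record, the example below builds a small index as JSON. The field names, file names, URLs, and dates are all hypothetical placeholders I made up for illustration; the real documentation could just as easily stay a Word document.

```python
import json
from dataclasses import dataclass, asdict
from typing import Optional


@dataclass
class ItemRecord:
    """One entry in a folder's documentation index (all fields are assumptions)."""
    filename: str
    source_url: str
    author: str                      # wiki community, dataminer, or LPer
    accessed: str                    # ISO date the file was saved for the archive
    uploaded: Optional[str] = None   # original upload date, mainly for videos
    notes: str = ""


def build_index(records):
    """Serialize the records to JSON, sorted by filename so the index is stable."""
    ordered = sorted(records, key=lambda r: r.filename)
    return json.dumps([asdict(r) for r in ordered], indent=2)


if __name__ == "__main__":
    # Hypothetical entries; none of these values come from the real collection.
    records = [
        ItemRecord("genocide_run.mp4", "https://example.org/video", "ExampleLPer",
                   accessed="2016-04-01", uploaded="2015-10-15",
                   notes="No commentary; early playthrough."),
        ItemRecord("Genocide_Route.html", "https://example.org/wiki/Genocide_Route",
                   "Wiki contributors", accessed="2016-04-01"),
    ]
    print(build_index(records))
```

Keeping both `accessed` and `uploaded` lets a later archivist tell when a copy was made apart from when the original first appeared.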

The set of items that needs documentation the most is the Mirrawrs datamined materials. While some of this is self-explanatory, much is not, and it seems imperative to have a certain level of description about how the data/numbers presented here work. While I do not have such information myself, it might be worth doing a quick interview with Mirrawrs, including it in this section, and using what is learned from that interview to shape the documentation for the datamining section.

Conclusion: Is this the final battle?

I think there’s a good bit of room to tweak elements here. I like that this structure allows for users to enter into the videos or wiki section without knowing much about Undertale, and they’ll quickly get a good understanding of the material, but the datamined section is much more challenging to interpret, and even with detailed documentation, I think it would be a challenge. Still, I think this is a place to start documenting the video game known as Undertale, and hopefully additional resources will contribute to popular and scholarly understandings of this game.

The Lizzie Bennet Diaries- a Social Preservation

The TLBD end


The preservation plan for The Lizzie Bennet Diaries will consist of the main characters’ social media accounts, including, but not limited to, Twitter, Facebook, Tumblr, and other related accounts for each character. I decided not to preserve the videos themselves, because they have already been properly preserved in the form of YouTube, streaming, and DVD formats. In addition to creating videos that were directed to their audience, the “characters” interacted with their audience in real time. The majority of the social media accounts are no longer active; however, they still remain live. I have chosen to use the company ArchiveSocial to capture and preserve the social media accounts.


Bot Preservation: Two Headlines AIP

To Begin…

I’m going to try not to repeat myself too much, as I’ve already written a lot about Two Headlines here and here. So, before I get into the archival information package, a quick recap: Two Headlines is a small bit of programming that combines two headlines from the Google News API and posts them to Twitter through the Twitter API, with the help of some bits of code that are freely accessible to programmers through Node.js. Two Headlines has been used to teach programmers about creating Twitter bots, and it is a form of social commentary. Its tweets are also funny and entertaining.

As the package involves software that needs to be installed, readmes with instructions on how to install and operate the programs should be created. It does no one any good to include software without instructions, especially as the software is not designed to be used by people with little to no programming experience.

Since this is just a model AIP, there are only a few files represented. The AIP will consist of three main folders: one for the bot’s source code and the software and documentation to edit that code, one for any interviews or comments about the bot, and a final one for the tweets themselves and the software to read them, along with its documentation. While the file types for things like the bot’s source code and the installer files are already dictated by their creators, any new files created will follow current preservation best practices: PDF/A for the text files and .tiff for images.

  1. Code
Folder structure of the AIP, highlighting the source code for Two Headlines

This folder contains the source code of the bot, downloaded from GitHub, along with the software used to create and edit the code. The documentation for Two Headlines’ code and for the software that created it will also be included. Additional documentation for the Google News API and the Twitter API will also be added, as both APIs are used in the running of the code. A readme file with instructions on installing and using the various software, created mostly from the instructions and other readme files associated with the programs, is also in the folder.

Folder structure of the AIP, highlighting the software


  2. Interviews

Anyone who responds to questions about their interactions with the bot, or about its significance and influence, will have their responses preserved in this folder. News articles and blog posts will also be included here.

  3. Tweets

The tweets will have the third and final folder. As archiving the tweets will require special software to collect them and different software to read them, both the programs and their documentation will also be included. Another readme file will be added so that users know how to install and use the included software to view the tweets. It will also include metadata about the collection of the tweets, including the time of collection and the code that collected them, and a record of any modification done to them after collection. A few screenshots will also be provided to show the original Twitter interface, which will not be archived with the tweets themselves.

Folder structure of the AIP, highlighting the tweets


Moving Forward

While this is a good start to preserving an entertaining bot, there is more work that could be done. The next steps for this project would be to actually conduct the interviews and acquire permissions for the news articles and blog posts and submit the AIP to the Internet Archive. There would also need to be a mechanism in place to collect the new tweets from the bot, as it posts every few hours, and add them to the preserved files.
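One possible shape for that mechanism is sketched below. It only covers the step of folding a newly collected batch into the preserved set without duplicating anything; it does not call the Twitter API, and the tweet structure (a dictionary with an "id" and "text") and the function name are my own assumptions.

```python
def merge_tweets(preserved, new_batch):
    """Add tweets from `new_batch` that are not already in `preserved`.

    Both arguments are lists of dicts with a numeric "id" key (an assumed
    structure). Returns the merged list in id order and the number added.
    """
    by_id = {tweet["id"]: tweet for tweet in preserved}
    added = 0
    for tweet in new_batch:
        # Skip anything already archived so repeated collection runs are safe.
        if tweet["id"] not in by_id:
            by_id[tweet["id"]] = tweet
            added += 1
    merged = sorted(by_id.values(), key=lambda tweet: tweet["id"])
    return merged, added


if __name__ == "__main__":
    # Made-up example tweets, not real output from the bot.
    preserved = [{"id": 1, "text": "first headline mashup"},
                 {"id": 2, "text": "second headline mashup"}]
    new_batch = [{"id": 2, "text": "second headline mashup"},
                 {"id": 3, "text": "third headline mashup"}]
    merged, added = merge_tweets(preserved, new_batch)
    print(added, [t["id"] for t in merged])  # prints 1 [1, 2, 3]
```

Because already-archived tweets are skipped, the collection could be run on overlapping time windows without harming the preserved files.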

Homestar Runner Archive AIP

Introduction

Since my core content is already well preserved on the Internet Archive, YouTube, and the original site, since the contextualizing information has been meticulously captured by dedicated fans and posted to the HR Wiki, and since the community remains active through the Homestar subreddit, doing the actual preservation work was in the end highly redundant. The work has been done and well. So instead, here is the intellectual exercise of how to organize the data were it to be captured and preserved by a different kind of organization: a major research university.

While the argument has been made for preservation by original order, homestarrunner.com is organized in categories and then largely chronologically, but not purely so. The content here is organized to reflect as closely as possible the organization of the content as it appeared on the site (with major groups first and then release order) while still making it easy to navigate for novices and scholars. In addition, I’ve separated the creative content from the auxiliary content discussed below.


Archive Organization

Part 1: Administrative Data

The first part of the Archive is the Administrative Data, organized under the ReadMe folder. This folder contains the information that captures intellectual control of the materials contained in parts 2-4 of the collection.

The files in this section include: [Figure: HRA2]

Parts 2 & 3: Content

Sections Two and Three house the creative content of the Homestar Runner Archive.

Part 2: Primary Content. This folder includes the featured content from homestarrunner.com, the areas of the site that were updated most often, featured most heavily, or most often seen by new fans:


Each of the individual folders for a primary video contains both a master copy of the video and a data file (to accompany the embedded metadata). The Master Video file will be the best available copy of the most recent and accessible version, the least compressed; in short, the version from which future migrations should be made. The Data File will contain up-to-date curated content from the HR Wiki, plus additional information as relevant for each file. The Data Files will contain transcripts, references, links, Easter egg lists, and routing. For example, Teen Girl Squad #1 is actually a Strong Bad Email, so instead of duplicating the content, the folder for TGS #1 will route to the appropriate SBE folder.
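The routing idea can be sketched in a few lines. This is only an illustration under my own assumptions: the routing table is a plain dictionary, and the folder names below are hypothetical placeholders rather than the archive's actual paths.

```python
def resolve_folder(folder, routes):
    """Follow routing entries until reaching the folder that holds the master copy.

    `routes` maps a folder that only points elsewhere to the folder it points at.
    """
    seen = set()
    while folder in routes:
        # Guard against a routing loop (A routes to B, B routes back to A).
        if folder in seen:
            raise ValueError(f"routing loop detected at {folder!r}")
        seen.add(folder)
        folder = routes[folder]
    return folder


if __name__ == "__main__":
    # Hypothetical folder names for illustration only.
    routes = {"TeenGirlSquad/TGS_01": "StrongBadEmail/SBE_example"}
    print(resolve_folder("TeenGirlSquad/TGS_01", routes))
    # prints StrongBadEmail/SBE_example
    print(resolve_folder("TeenGirlSquad/TGS_02", routes))
    # prints TeenGirlSquad/TGS_02
```

A folder with no routing entry resolves to itself, so the same lookup works for every item in the collection.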

Part 3: Secondary Content. This folder contains everything else produced for the site.

Like the Primary content, this will consist mostly of master copies of AV content, such as the website homepages or music videos, and the associated data files. However, it also includes free downloads, collaborative materials, the live-action puppet videos, and merchandise lists to help collectors track licensed products.


Part 4: Auxiliary Materials

Section Four comprises materials relevant to a full understanding of the Homestar Runner Archive, but not a part of the creative content.




Next Steps

With the content so well preserved in so many locations, the next logical step for a research institution interested not only in the social impact but also in the research potential of this collection would be to make overtures to the creators. The aim would be to learn what interest exists in preserving the tools used to create the work, what oral histories could be captured about the creation and creative process, and what original media still remains that might bolster a digital collection for the edification of future generations.