Saving XKCD for the Future: Statement of Preservation and Acquisition

After taking into account the cultural importance of the web comic XKCD, created and authored by Randall Munroe, it has been concluded that an effort should be made to preserve the web comic and related materials. The following statement of preservation and acquisition plan have been created to clarify and guide the preservation process.

Statement of Preservation

The purpose of this preservation project is to preserve as much of the comic XKCD as possible, at the best quality we can ensure. It has been concluded that XKCD should be preserved because of its significant cultural value. XKCD’s unique content provides valuable insight into multiple communities in addition to capturing certain facets of internet culture, making it valuable to both future researchers and the community it serves. It has been decided that the best way to preserve the comic is to work with other groups that are already working towards preserving it. The group the project has decided to work with in order to best preserve XKCD is the Internet Archive.

The Internet Archive and XKCD

The Internet Archive is an organization whose goal is to preserve as much of the Internet as possible for cultural reasons, through a variety of methods. The method that this project is concerned with is their general web archive, which they call the ‘Wayback Machine’. The Wayback Machine is an archive of web pages that records what a site looked like on a certain day. For example, I could look up what the XKCD webpage looked like on November 1st, 2010 in the archive. This is accomplished by taking the webpage’s URL and making a permanent copy of the page. As a result, this system allows the Internet Archive to give users a reasonably authentic experience of the website at that period of time, meeting the quality standards for the project.
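As a rough illustration of the date lookup described above, the sketch below builds a Wayback Machine address for a given page and day. It assumes the https://web.archive.org/web/<timestamp>/<url> address pattern; the function name snapshot_url is made up for this example. The archive serves the capture closest to the requested date, so the page returned may not come from that exact day.

```python
from datetime import date

# Assumed address pattern for Wayback Machine captures.
WAYBACK_PREFIX = "https://web.archive.org/web"


def snapshot_url(page_url: str, day: date) -> str:
    """Build the archive address for page_url as captured near the given day."""
    timestamp = day.strftime("%Y%m%d")  # e.g. 20101101
    return f"{WAYBACK_PREFIX}/{timestamp}/{page_url}"


# The example from the text: the XKCD homepage on November 1st, 2010.
print(snapshot_url("http://xkcd.com/", date(2010, 11, 1)))
# prints https://web.archive.org/web/20101101/http://xkcd.com/
```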

In addition, the Internet Archive has already archived a significant portion of the XKCD webcomic. This includes over 800 saved pages and counting. However, this collection is not complete and is missing a number of entries. There are numerous comics missing from the Internet Archive, the most notable gap being a period of three months in 2009 during which no comics were recorded and entered into the archive. For this reason the goal of this project is to fill in any gaps in the XKCD collection at the Internet Archive, ensure that any future missed content is swiftly added to the Archive, and make sure the entries function properly. Doing this would successfully preserve XKCD for the future and fulfill the original intent of the project.
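One way to picture the gap-filling goal is as a comparison between the full run of comics and the set already held by the archive. The sketch below assumes that comics are numbered consecutively starting from 1; the function name missing_comics and the sample numbers are invented for illustration and do not reflect the archive's real holdings.

```python
def missing_comics(latest: int, archived: set[int]) -> list[int]:
    """List comic numbers from 1 to latest that have no archived copy."""
    return [n for n in range(1, latest + 1) if n not in archived]


# Made-up example data, not real archive holdings.
archived_numbers = {1, 2, 3, 5, 6, 9}
print(missing_comics(10, archived_numbers))  # prints [4, 7, 8, 10]
```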

If issues arise while entering the missing XKCD comics into the Internet Archive, the project will preserve those pages using the ‘Archive-It’ service. Archive-It is a sister program of the Wayback Machine and is also operated by the Internet Archive. It is stronger, more compatible, and more secure than the Wayback Machine; however, it is a paid service. If it becomes necessary to use Archive-It, the project will seek the required funds in order to preserve the problematic entries.

Acquisition Plan

The project’s plan for acquiring permission to preserve is rather simple: we operate on the assumption that we already have it. Because the web comic is already in the Internet Archive’s collection and the XKCD site notes a permanent URL for each comic, it is safe to assume that Munroe has already decided to permit people to archive the comic. This is doubly so when you consider how the Internet Archive actually acquires things. The Internet Archive acquires webpages in two ways: crawlers and personal submission. The Internet Archive uses crawlers to regularly crawl both the internet at large and the websites selected for preservation. When a crawler encounters a webpage that is not in the Internet Archive, it will submit the page’s URL automatically. Personal submission works just like it sounds: people directly submit a site’s URL to the Internet Archive, which preserves the page by making a permanent copy. For this reason it can be concluded that the project has ethical and moral permission to submit XKCD webpages to the Internet Archive, since literally anyone is able to do so. However, if it becomes necessary to use the Archive-It service provided by the Internet Archive, explicit permission from Munroe will be sought.

With regard to how the project will acquire the actual comic, that too is rather straightforward. Because the goal of the project is to ensure the Internet Archive’s collection of XKCD is complete and has no gaps in content, the method of acquisition is the same as the Archive’s but focused solely on the web comic itself. The project would set up a dedicated crawler that will regularly crawl the XKCD website and compile a list of new URLs as they appear. Additionally, one or more people chosen by the project will go over both the website and the crawler-generated list in order to make sure no entries were missed. The results will then be compared to the Internet Archive’s collection, and if any comics are missing we will submit the appropriate pages to the archive. Finally, if any of the entries do not function properly in the Internet Archive, they will be submitted to Archive-It. Overall, using this method should guarantee the complete preservation of the web comic.
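The comparison step in this plan can be sketched in a few lines. The example below merges the crawler's list with the human reviewer's list, removes anything the archive already holds, and turns the remainder into submission addresses. It assumes comic pages follow the https://xkcd.com/<number>/ pattern and that submissions go through the https://web.archive.org/save/<url> address; the function names and sample lists are hypothetical, and nothing here actually contacts either site.

```python
# Assumed address pattern for submitting a page to the Internet Archive.
SAVE_PREFIX = "https://web.archive.org/save/"


def comic_url(number: int) -> str:
    """Assumed permanent address of a comic page."""
    return f"https://xkcd.com/{number}/"


def plan_submissions(crawler_urls, reviewer_urls, archived_urls):
    """Return a submission address for every known comic URL the archive lacks."""
    known = set(crawler_urls) | set(reviewer_urls)   # crawler list plus human check
    missing = sorted(known - set(archived_urls))     # drop pages already archived
    return [SAVE_PREFIX + url for url in missing]


# Made-up example lists, not real crawl or archive data.
crawler = [comic_url(1), comic_url(2), comic_url(3)]
reviewer = [comic_url(3), comic_url(4)]   # the reviewer spots one the crawler missed
archived = [comic_url(1), comic_url(3)]

for address in plan_submissions(crawler, reviewer, archived):
    print(address)
```

Keeping the comparison separate from any network code means the same function could be reused for the fallback step, with pages that fail to display correctly collected into a second list for Archive-It.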

Conclusion

In conclusion, XKCD is considered worth preserving for the future, and that is best done by working with and assisting preexisting efforts to do so. Not only is the web comic rich with the culture of its community, but it also acts as an excellent record of that community’s values and interests, making it very valuable to future researchers. This makes the comic worth preserving, and the best way for the project to accomplish that is to work with the Internet Archive. Not only is the Internet Archive already trying to preserve the web comic, it also has all of the tools, services, and permission to do so. This project can assist in this effort by acting as both a back-up and a form of quality control, catching and submitting any missing entries and ensuring they function properly. Overall, this project fills a niche in the effort to preserve XKCD that needed to be filled.

2 Replies to “Saving XKCD for the Future: Statement of Preservation and Acquisition”

  1. It is great that you have started exploring the Internet Archive to get a sense of what they do or don’t have. With that said, it isn’t completely clear to me which parts of it they don’t have. That is, can you identify some individual pages that they don’t have copies of? The link you point to illustrates that there are a lot of dates on which they haven’t crawled, but given that XKCD posts stay up (more or less) in perpetuity, it would seem like you should be able to see nearly all of the comics from one of the more recent crawls of the site. Don’t hesitate to ask if you have more questions on that or to push back if I’m missing part of what you are saying.

    To Smackow’s point, the comics are available under Creative Commons, so the rights perspective is likely all clear.
