Google's home page, November 1997

Retro Google?! This is one of the first Google home pages, circa November 1997, courtesy of the Wayback Machine. The Wayback Machine is an archive of web pages from 1996 to the present. It is run by the Internet Archive, a non-profit organization that began archiving web pages in 1996. The Internet Archive collects pages through web crawling: crawlers make copies of web pages, and the Internet Archive stores those copies. The public can access this archived material through the Wayback Machine. The name comes from Mr. Peabody’s WABAC machine in the Rocky and Bullwinkle cartoon show. Today, the Wayback Machine contains two petabytes of data, more text than is held at the Library of Congress.

The greatest asset of the Wayback Machine is that it is extremely easy to use. On the home page, type in the URL of the website you wish to see. You are then taken to an interactive calendar. Pick a year at the top, then click on a month and day in the corresponding calendar to go back in time and view what the web page looked like on that date. An interactive calendar at the top lets you move between archived versions quickly and easily. One drawback of the Wayback Machine is that you cannot search the full text or search by keyword; you can only look up a specific URL. The Internet Archive hopes to implement these features in the future.
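For anyone who would rather skip the calendar, the same lookup can be written as a web address. The sketch below assumes the commonly seen address pattern `https://web.archive.org/web/<date>/<page URL>`, with the date written as YYYYMMDD; the function name `wayback_url` is my own invention for illustration, not something the Internet Archive provides. As far as I know, the Wayback Machine redirects to the archived copy closest to the date you ask for, so an exact match is not required.

```python
from datetime import date

# Assumed address pattern for archived pages (not taken from the post itself).
WAYBACK_PREFIX = "https://web.archive.org/web"


def wayback_url(page_url: str, day: date) -> str:
    """Build a Wayback Machine address for page_url on (or near) the given day."""
    timestamp = day.strftime("%Y%m%d")  # e.g. 20021031 for October 31, 2002
    return f"{WAYBACK_PREFIX}/{timestamp}/{page_url}"


# Example: Google's home page on Halloween 2002.
print(wayback_url("http://www.google.com/", date(2002, 10, 31)))
```

Pasting the printed address into a browser should land on the same kind of page you would reach by clicking through the calendar.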

The Wayback Machine is an invaluable source for historians. In fact, the mission of the Archive is to preserve digital artifacts for future use by researchers, historians, and scholars. The general public can also use and research this archive because it is extremely accessible and easy to use. Visitors to the site can look through hundreds of web pages, if only to gawk at how far the internet has come (just look at how far Google graphics have come with their Halloween images!).

Google's home page, October 31, 2002
Google's home page, October 31, 2008

Nostalgia aside, the Wayback Machine is not only a great asset for current research, but will also be a wonderful source of material for future researchers. However, after reading Roy Rosenzweig’s “Scarcity or Abundance,” we have to be wary of these web archives. In particular, these web crawls archive sites in their original format. If technology evolves too quickly, will future historians even be able to access these pages?

Barackobama.com home page, December 12, 2007

There is so much source material on the Wayback Machine that future historians can use. For example, you can see what President Obama advocated for in his 2008 campaign by looking at his campaign website. The Wayback Machine has also compiled collections of archived material on specific events, such as Hurricane Katrina. Public historians, then, can also use this archive to display certain materials. How else can historians use these digital archives?

4 Replies to “Go Wayback with the Wayback Machine”

  1. Meghan, I think you raise an interesting point when you point to the Obama campaign site from four years ago. With information increasingly being digitized, it becomes harder to track changes over the years, considering the easily alterable and non-tangible nature of the web. I guess the Wayback Machine makes that tracking possible.

    It’s also interesting to see how digital archives like the Wayback Machine are actually put to use in the real world. I know that the Wayback Machine in particular is used in IP lawsuits quite frequently; I dug up an old NYT article that covers this very issue. Always interesting to see how various tools can be used beyond their stated purpose.


  2. A while back I came across a spoof (probably on The Onion) asking people to print webpages and mail them in to a company creating a “book version” of the internet. The internet by its very nature is extremely ephemeral (which is sometimes a good thing), so it’s interesting to see the Internet Archive actually backing up copies of previous web pages. I’ve often been annoyed at news websites that publish “breaking news” containing information that is simply wrong, only to come back later and find the article completely changed, with no explanation or note that the previous version was incorrect. (Print newspapers, by contrast, would have to print a retraction and an explanation of the mistake in the following issue.) Perhaps the Wayback Machine, if it became more widespread, could serve to make the internet a little more honest?

    Even more interesting to me is that the same archive is still printing a copy of each book they archive: http://www.treehugger.com/clean-technology/internet-archive-begins-backing-up-books-on-paper-huh.html

    They cite many of the same reasons Meghan does: formats change so quickly, digital storage isn’t perfect, etc. Since it is readily admitted that digital storage isn’t perfect – and especially troublesome in terms of long term storage – I really like this idea of keeping a paper copy of every archived book as a sort of “seed bank” just in case.

  3. Meghan,

    While the inability to search text is frustrating, I think this site can be useful in looking at how the pages were formatted: why were certain topics or images placed in particular places? Who or what is at the top? In public history we’ve been talking a lot about people not reading text from top to bottom, and about the top being the most important part of the print in an exhibit. What do these websites tell us about what was at the top of web pages? Since web pages can offer links to more information, how does that influence the decision to put certain things in certain places on the page? More succinctly, what type of hierarchy exists on the page? What is driving that hierarchy?


  4. I agree that the Wayback Machine is an interesting tool that could help scholars in certain areas of research. As Meghan mentioned, however, one of the potential problems with the site is that technology continuously evolves, rendering certain features of old websites unusable. I looked at pages from cnn.com and whitehouse.gov from 1999 and 2000, for example, and many of the features (images, certain links, etc.) were unusable. In a time span of less than fifteen years, technology has changed to such an extent that these older websites are now essentially useless. This raises the question: what other sites or digital tools will be inaccessible fifteen, twenty-five, or even fifty years from now? Simply because of the changing nature of technology, I hesitate to say that the Wayback Machine will be a useful tool for historians years down the road.

