An Emulator Within an Emulator

If you feel compelled at any point while reading, click the image for the Inception 'bwaahhhh' sound.

Totally relevant to yesterday's show-and-tell about game emulation and data forensics: some fans of GoldenEye for the N64 found a working emulator of a ZX Spectrum 48K hidden inside the game. Apparently the developers wanted to see whether emulating an older machine was possible on the N64, and they left the files in the GoldenEye ROM.

For anyone who's ever played Donkey Kong 64: the ZX Spectrum 48K emulator makes an appearance in that game as a playable arcade cabinet; both games were made by the development house Rare.

Really interesting to know that this is an actual functioning in-game emulator.

So to recap, people running an N64 emulator on their computer discovered a ZX Spectrum 48k emulator in their emulated game.

Let's Weigh The Internet (Or Maybe Let's Not) [via NPR]


It’s an odd question, but the answer is startling.

A few years ago, a physicist named Russell Seitz asked himself, "How much does the Internet weigh?" By which he meant: the whole thing, this vast interlocking web of content pulsing through 75 to 100 million servers scattered all over the world, what's its total weight?

The Internet is not something I would ever think to weigh. It’s like weighing a radio program. Why bother?

But Seitz did the math, and discovered that while the Internet sucks up gobs and gobs of energy, something like “50,000,000 horsepower,” if you put it on a scale, it does have a weight. The whole thing, he says, weighs “two ounces.”

Yup. That’s all.

Seitz says it weighs about as much as a fat strawberry. Others, recalculating, say the Internet’s even lighter, more like a teeny grain of salt.

How could something so huge in our lives weigh so little?

The answer is that the Internet runs on electrons. That's how the information is stored. And electrons are very, very small. But they do have mass. Einstein taught us that. So it's possible to take all the energy (E) powering the Internet and, using Einstein's equation (E = mc²), turn that energy into something we can weigh.

And it turns out a lot of energy doesn’t weigh very much. [See my footnote for the mathematical details.] Consider, for example, an email message.

How Much Does An Email Weigh?

Make it an ordinary email (50 KB), like the one you wrote to a friend today. According to a new video from Vsauce making the rounds, storing a typical email takes about 8 billion electrons.

Eight billion sounds like a big number, but put them on a scale and they weigh only about "two ten-thousandths of a quadrillionth of an ounce."

One Email = 'Two Ten-Thousandths Of A Quadrillionth Of An Ounce'
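As a rough back-of-the-envelope check (my own, assuming the figure is simply the 8 billion electrons multiplied by an electron's rest mass), the arithmetic does land right around that number:

```python
# Back-of-the-envelope check: 8 billion electrons times the electron's rest mass.
ELECTRON_MASS_KG = 9.109e-31   # rest mass of a single electron
GRAMS_PER_OUNCE = 28.35

electrons_per_email = 8e9      # the Vsauce estimate quoted above
mass_grams = electrons_per_email * ELECTRON_MASS_KG * 1000
mass_ounces = mass_grams / GRAMS_PER_OUNCE

print(f"{mass_ounces:.1e} ounces")   # ~2.6e-19 oz, i.e. a couple of
                                     # ten-thousandths of a quadrillionth of an ounce
```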

Sure, sure, but remember, the Internet contains gazillions of emails and videos and music files, and porn and libraries, and e-chat, and photos of grandchildren, boyfriends, girlfriends, and endless meditations on Justin Bieber. If we add all that up, how many electrons does it take to store the entire Internet? Hmmm?

Well, if we weigh everything, all five million terabytes of information, what do we get?

Again, not a lot: About 0.2 millionths of an ounce.

Minus the servers, information on the Internet is practically weightless.

And yet, does weight matter? Not really. Look what weightlessness can do.

When those electrons produce an image of a young woman lying shot in the street in downtown Tehran, shot by a sniper, falling to the ground, dead, that picture may weigh next to nothing, but the hundreds of thousands of people who see it are altered, literally changed, by what they’ve seen.

[Image: Neda Agha-Soltan lying on the ground after being shot in Tehran in 2009. Credit: YouTube/Reuters/Landov]

That photo creates a flood of electricity in their heads, altering brain cells, which sprout new dendrites that spit infinitesimally small chemicals from cell to cell, so now that woman in Tehran is locked in memory, etched into all those brains.

And the bearers of those brains want to share what they’ve seen, so the image gets passed to more and more people, and is copied, brain to brain to brain, machine to machine to machine.

This happened not just to the image from Tehran, but also to that picture of the cop in California zapping kids with pepper spray, to Moammar Gadhafi surrendering, to women being stripped and beaten in Cairo's Tahrir Square, to all those folks challenging Putin in Moscow.

[Image: A protester holds a portrait of Putin with words reading "We are going different ways" during a mass rally against alleged vote rigging in Russia's parliamentary elections, Moscow, Saturday, Dec. 10, 2011. Credit: Mikhail Metzel/AP]

Once things are seen and shared, people react, people gather, people march, people fight, and sometimes figures of enormous weight, a Gadhafi, a Mubarak, even a Putin can be toppled, or shaken.

And the electrons that make that happen, what if their weight is 0.0000000000000000001 grams? So what? You can weigh the Internet till you are blue in the face, but the grams won’t tell you anything important.

The Internet connects people. What it is doesn’t matter. What it carries, that matters. Ideas aren’t like chairs or tables. They have their own physics. They make their own weight.

So the Internet weighs about as much as a strawberry? It can still stop tanks.

Ask yourself: How much does “of the people, by the people, for the people” weigh?

Documenting the American South project

So far this semester we have discussed many digital history topics: the digitization of Civil War records in the article "Crowdsourcing the Civil War," our trip to the library to hear a lecture from the university's archivist on searching its digital collections, and Rosenzweig's article "Scarcity or Abundance? Preserving the Past in the Digital Era," which looks at digital collections such as the September 11th archive and the Wayback Machine. We have learned a lot about digitization when it comes to certain collections. One digital collection that I would like to share with everyone is the Documenting the American South Project, sponsored by the University Library at the University of North Carolina at Chapel Hill.

Documenting the American South (DocSouth) is a digital publishing initiative that provides Internet access to texts, images, and audio files related to southern history, literature, and culture. Currently DocSouth includes sixteen thematic collections of books, diaries, posters, artifacts, letters, oral history interviews, and songs. The project has been in development for over a decade, with the aim of gathering and digitizing all materials related to Southern culture. Most of the collections come from Southern holdings.

The project dates back to 1996, with a pilot project to digitize a half dozen highly circulated slave narratives. It is designed to provide digitized primary materials to researchers, scholars, and students. These sources offer a Southern perspective on many parts of American history. The collections included in the project are: The Church and the Southern Black Community, The Colonial and State Records of North Carolina, Driving Through Time: The Digital Blue Ridge Parkway in North Carolina, The First Century of the First State University, First-Person Narratives of the American South, Going to the Show, The James Lawrence Dusenbery Journal (1841-1842), Library of Southern Literature, North American Slave Narratives, The North Carolina Experience, North Carolina Maps, North Carolina and the Great War, Oral Histories of the American South, The Southern Homefront (1861-1865), Thomas E. Watson Papers, and True and Candid Compositions: The Lives and Writings of Antebellum Students at the University of North Carolina.

As a personal note, I wrote my undergraduate thesis on Sherman's March to the Sea during the Civil War largely with the assistance of the primary sources available in the Documenting the American South Project. This brings up a question that we have discussed in class: if primary sources are digitized these days, can serious researchers and scholars base their research solely on these digital sources, or does historical research still require scholars to do in-person research? This is definitely something that we have to think about in the digital era.

Show and Tell: Google Docs Text Conversion

Aside from doing (and knowing) everything else, Google Docs has a neat feature that you might find immensely helpful if you find yourself doing archival research.

When researching at the archives, most people take digital photographs of the documents they’re looking at. At the end of a day you’re left with hundreds of photographs that you now have to sort through, catalog, read, and take notes on. Personally, I used to print all of my images and then spend hours going through them highlighting the key points and making general notes based on what the documents contained.

Google Docs now has a feature where you can upload a photograph of a document and it will convert it into keyword searchable text. It can be done completely automatically in two easy steps.

Step One: Select your image file to upload (it must be less than 2 MB) and make sure you select "Convert text from PDF and Image files."

Step Two: There is no step two. Google does everything for you. What you end up with is your uploaded image file like so:

And directly below the image will be your converted text:

The text conversion is never 100% perfect and the accuracy depends on how clear the original document is, but it is still far more efficient than manually transcribing or taking notes on each document. An additional benefit is that you now have an archived copy of all your files in case something happens to the originals. Plus you can also use the search feature on Google Docs to keyword search all of your documents at once in case you need some information you remember seeing but can’t remember what document it was in.
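As an aside, Google Docs does this OCR in the cloud; if you ever bump into the upload limit or want the same effect on your own machine, an open-source OCR engine can do something similar. Here is a minimal sketch in Python, assuming the Tesseract engine plus the pytesseract and Pillow packages are installed (the folder names are hypothetical):

```python
# Local OCR sketch: turn a folder of archive photographs into searchable text files.
# Assumes Tesseract, pytesseract, and Pillow are installed; folder names are hypothetical.
from pathlib import Path

import pytesseract
from PIL import Image

photo_dir = Path("archive_photos")   # your document photographs
out_dir = Path("transcriptions")
out_dir.mkdir(exist_ok=True)

for photo in sorted(photo_dir.glob("*.jpg")):
    # image_to_string runs OCR on the image and returns plain text
    text = pytesseract.image_to_string(Image.open(photo))
    (out_dir / (photo.stem + ".txt")).write_text(text, encoding="utf-8")
    print(f"{photo.name}: {len(text.split())} words recognized")
```

As with the Google Docs conversion, the output is never perfect, but it is keyword searchable.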

Show and Tell: The National Archives – DocsTeach

As we've seen throughout this course, there are various ways to connect the public to history using digital resources. Along with this, teachers are increasingly acknowledging the significance of using online tools in the classroom to keep up with today's generation of tech-savvy students. One resource they may find useful is the National Archives' online teaching tool for educators, DocsTeach.

DocsTeach allows teachers to create their own interactive activities, using primary sources from the National Archives digital vault to do so. Teachers are encouraged to create interactive maps, make sequential timelines based on primary documents and images, build connection-strings, match certain documents to a specific concept or topic, etc. These activities are especially valuable because teachers can assign activities for students to do themselves and later share with the classroom.

DocsTeach also offers multiple existing lessons created by the National Archives. Each lesson pertains to a certain historical era in American history, ranging from 1754 to the present. Within each are several activities relating to that lesson. For instance, the lesson Civil War and Reconstruction (1850-1877) contains six different activities in which students can compare Civil War recruitment posters, assess the effectiveness of the Freedmen's Bureau, use the Emancipation Proclamation and historical congressional records to solve problems, and search other documents to find out what else was happening during this period. Each activity is entirely primary source-oriented. Teachers are also given the option to reformat, or skin, each existing activity to create their own prototypes.

Each primary source is also categorized under a certain historical era and by format, to help teachers find sources to include in their activities. The formats consist of written documents, images, maps, charts, graphs, audio, and video. There are thousands of sources to choose from, giving teachers an even wider range of material at their disposal for creating activities.

Thanks to the creation of DocsTeach, the National Archives has given educators an invaluable resource to connect students to history using digital tools. This is a fantastic resource that anyone hoping to enter the academic field should consider using in the future.

 

Kirschenbaum's Mechanisms: Chapters 1 & 2

The first two chapters of Kirschenbaum’s book engage in an in-depth study of not only how digital storage works, but more importantly, how we interact with it. Although highly technical, Mechanisms outlines much interesting information and raises many thought provoking questions.

Ever since the UNIVAC computer made its public debut (tipping the scales at 13 metric tons and entering the public mind by correctly predicting the outcome of the 1952 presidential election), we have dealt with information in a new way. Kirschenbaum is interested in how digital technologies have progressed from the public spectacle of UNIVAC to being so ubiquitous that digital information storage can be carried around in your wallet (credit cards, Metro passes, etc.).

To most people, digital storage is something we know exists but never see. (Unless you've had a reason to take apart your computer, you have probably never physically seen a hard drive.) Kirschenbaum traces the evolution of digital storage in order to make an interesting point. Despite the "advanced" nature of even early storage media, such as the floppy disk, the physical media still needed to be readable by both machine and human: one needed a hand-written label on the disk in order to know what was contained within. Although later iterations such as the CD could store immense amounts of information, it was still necessary to label them. Once copied and labeled with a marker, CDs were then treated tenderly and stored carefully in cases to avoid damage.

Kirschenbaum points out that the physicality of storage determines its value. CDs are fragile and are treated delicately; no one would ever think of throwing an uncovered CD into a bookbag and carting it around all day. Yet we routinely do that with USB drives without ever worrying about the information on them getting damaged (nor do we ever label them, because their contents are so easily manipulated). The information contained in these two storage media could be identical, yet our interaction with them is very different.

Kirschenbaum is interested in the physical materiality of digital media. It used to be that digital files were physically "yours," in that you would save them to a floppy disk and carry them around with you. Today, all the files we are using for this class reside on the hard disk of a public computer somewhere. We don't keep them in our possession when we leave this room, and we trust that they will be there when we return next week. More importantly, digital technologies have changed the very way in which we conceive of certain processes. Take typing, for example: when typing on a typewriter there is a physical process that you are involved in. You push the key, and you can see parts of the typewriter move and work to produce what you asked it to do. With a computer, all of these physical actions are hidden away and we remain only on the surface of the screen. The act of "writing" has become completely immaterial. We experience "digital writing" fundamentally differently than we do "analog writing."

Getting into different types of storage, Kirschenbaum points out that although they serve the same purpose, storage technologies such as the floppy disk and the CD are immensely different from the hard drive, not only in their physical properties but in the way we conceive of them. The hard drive resides in a sealed case deep inside your computer, and aside from the occasional clicking sounds, you interact with it only through icons on a screen. The hard drive was revolutionary in that it replaced the earlier magnetic tapes and punch cards. Tapes were useful for running programs that always ran in sequence (such as a weekly payroll, with its alphabetical list processed in the same order each week) but could not process user-generated information randomly. The invention of the hard drive allowed for much more complex, variable processes, such as inventory control, where different items get sold randomly and in different quantities each time. Computers with hard drives could randomly access different items without running through the same entire sequence each time.
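To make the sequential-versus-random distinction concrete (this toy sketch is mine, not Kirschenbaum's): a tape-style payroll run touches every record in a fixed order, while a disk-style inventory update jumps straight to whichever records changed.

```python
# Toy illustration of sequential vs. random access (not from Mechanisms).
employees = ["Adams", "Baker", "Chen", "Diaz"]        # fixed weekly sequence
inventory = {"widget": 40, "gadget": 12, "gizmo": 7}  # looked up by key, in any order

# Sequential processing: the whole list, in the same order, every week.
for name in employees:
    print(f"pay {name}")

# Random access: update only the items that happened to sell this time.
for item, sold in [("gizmo", 2), ("widget", 5)]:
    inventory[item] -= sold

print(inventory)
```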

The proliferation of computers throughout everyday life has led people to interact with information in a very different way than in the past. We seem to have a notion that digital information is ephemeral and untrustworthy. However, one of Kirschenbaum’s chapters outlines how digital information is much more resilient than we might expect. Every email you send leaves a copy of itself across countless different servers as it makes its way to its destination. Every Word document that you delete still has multiple “autosave” copies that remain hidden on your hard drive.

Kirschenbaum’s Mechanisms: Chapters 3 & 4

In presenting the complex discourse, terminology, and debates surrounding the issue of text, Matthew Kirschenbaum's purpose in writing this monograph is didactic: he hopes to add "to the repertoire of activities we are able to perform as scholars of electronic literature and digital culture" (115).

Chapter 3 is a case study of the game Mystery House, its disk image, and the entirety of its digital information. Mystery House, created in 1980, is a computer game in which the player is locked inside an old mansion with the goal of finding a supply of jewels. However, as the game progresses, the bodies of other people inside the mansion start appearing and foul play is clearly at hand. It is up to the player to figure out who the killer is before being killed himself.

As Kirschenbaum notes, the space of the disk is finite and capable of holding only a certain amount of textual information. Within this world of text, Mystery House offers its players a multi-layered environment that helps to engineer the codes that drive the game itself. This balance and interplay of textual forensics at work is one of the focal points of Kirschenbaum's argument. For example, when the player is inside the mansion gathering clues and notes about the committed murders, the game's instructions point out: "a note of caution: carrying more than one note may be confusing as the computer will arbitrarily decide which one to read or drop" (131). As Kirschenbaum points out, this interaction created through the programming of the notes' behaviors is a basic component of all digital media (132).

The illusory world created by the computer itself in this game is also Kirschenbaum's focus. He calls this "formal materiality." Through this process, computers generate many signals that get processed in a few mere milliseconds, before viewers can detect glitches, errors, or misunderstandings in the computer's and the game's functions. Through formal materiality, computers give viewers the illusion of perfection when in fact there are many digital misunderstandings occurring within the computer itself. The roots of this illusion go back to the 1930s, when the British mathematician Alan Turing described the universal machine that underlies the modern computer. Kirschenbaum differentiates the formal materiality of digital media from the forensic when he claims that computers and their processes can be proven to be identical, while forensically they are more individualistic (157).

Chapter 4 deals with Kirschenbaum's case study of Michael Joyce's work of hypertext fiction entitled Afternoon. According to Wikipedia, Afternoon tells the story of Peter, a recently divorced man who witnessed a car crash that may or may not have involved his ex-wife and their son.[1]

Through studying the digital aspects of this electronic text, Kirschenbaum reveals how Joyce's work allows him to analyze the "material negotiations" that comprise the text of Afternoon. Kirschenbaum points out that Joyce's Afternoon walks a fine line: it depends on the reader's active engagement, yet Joyce controls that engagement, enveloping the reader in a vast array of choices, deceptions, and vagaries (165).

Kirschenbaum goes into detail about the hypertext writing environment of Storyspace. This complex atmosphere emerged from “computer fiction, artificial intelligence and story generators, word processing, desktop publishing, hypertext systems research, and interactive videodisc technology” (177). Because it leaves behind such a trail of evidence, Kirschenbaum argues that Storyspace is highly accessible for further study.

Kirschenbaum does a fine job of describing the digital nuances, methods, and features of both Mystery House and Afternoon in chapters 3 and 4. What would have been preferable would be for Kirschenbaum to give an overview of both works and to describe how the electronic nuances functioning internally during these games and texts would impact the viewer or reader on the outside. In other words, I found Kirschenbaum's chapters too internally focused, without implications for how these internal digital functions affect the players of these games. More examples from these games from a player's perspective would have been a welcome addition.


[1] Wikipedia.

Google Custom Search Engine

This tool is pretty self explanatory.

In a couple of easy steps, you can customize a Google-powered search engine to a set of criteria specifying which sites you would like to search.

To start, go to www.google.com/cse/

You'll be presented with a very clean-looking three-part form with the following steps:

  1. Naming, Describing, & Specifying Sites
  2. Testing it
  3. Getting the code
This is all you need to fill out to create your search engine.

 

Adding a CSE to a blog or website that searches that same site is even easier.  You just go here (http://www.google.com/cse/tools/create_onthefly), copy the code, paste it into your HTML, and BAM… you're done.  From there, of course, you can further customize it.

The best resource I found was their CSE for Educators guide.

So again, you can just create your own engine that you can use and share with others, and it would look like this…

Or embed it in your site/blog

Another good example of a website that uses a CSE is www.american.edu: if you search for anything, you will notice the Google logo on the results page.
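One more option the post doesn't cover: if you would rather query your engine from a script than from a web page, Google also offers a Custom Search JSON API. A rough Python sketch, assuming the requests package is installed and that the API key and engine ID below (both placeholders) have been created in your Google account; note the API has daily query limits:

```python
# Sketch: querying a Custom Search Engine from a script via the JSON API.
# API_KEY and ENGINE_ID are placeholders; real values come from the Google
# API console and the CSE control panel.
import requests

API_KEY = "YOUR_API_KEY"
ENGINE_ID = "YOUR_CX_ID"

def cse_search(query, num=5):
    """Return (title, link) pairs from the custom engine for a query."""
    resp = requests.get(
        "https://www.googleapis.com/customsearch/v1",
        params={"key": API_KEY, "cx": ENGINE_ID, "q": query, "num": num},
        timeout=10,
    )
    resp.raise_for_status()
    return [(item["title"], item["link"]) for item in resp.json().get("items", [])]

for title, link in cse_search("primary sources"):
    print(title, "-", link)
```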

Something that I find interesting, though, is customizing the Google search engine in a reverse-engineering way, which can be especially useful when you're searching for something just once.  The infographic below explains it in detail; click on it for more information.

Despite the annoyance caused by the length of the infographic, I feel like these skills are good to know.

Show and Tell: Google Trends

Since we are on the topic of searching databases this week, I thought I would share another Google database: Google Trends.

Google Trends is a database of Google web searches and of Google News.  Type in a word or words into the search engine and Google generates a chart, similar to Google Ngram, that displays how often these terms have been searched over time on Google. The bottom of the chart shows how often these terms have appeared in Google News.

Since Google Ngram and Corpus Time focus on books and the written word, Google Trends is a nice complement for seeing what ordinary people (i.e. non-authors) have been interested in over the years.  Another great aspect of Google Trends is that it ranks the regions, cities, and languages in which people have searched for certain terms the most.

I did a Google Trends search on "Barack Obama," and most people searched for "Barack Obama" during the 2008 election. Since his inauguration, interest in Obama has steadied, with a peak around the time Bin Laden was killed. After Americans, the Irish search the most for "Barack Obama."  Interestingly, Swedish is the second most common language in which people search for Obama.
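For anyone who wants to pull these numbers into a spreadsheet or script rather than read them off the chart, there is an unofficial Python wrapper called pytrends (it did not exist when this post was written, and since it talks to Google's private endpoints the calls below may change). A minimal sketch:

```python
# Sketch: fetching Google Trends data with the unofficial pytrends library.
# Assumes `pip install pytrends`; an unofficial wrapper, so behavior may change.
from pytrends.request import TrendReq

pytrends = TrendReq(hl="en-US", tz=360)
pytrends.build_payload(["Barack Obama"], timeframe="2004-01-01 2011-12-31")

interest = pytrends.interest_over_time()   # search interest by week, scaled 0-100
by_region = pytrends.interest_by_region()  # which countries search the most

print(interest.tail())
print(by_region.sort_values("Barack Obama", ascending=False).head())
```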

Google Trends is relatively new and you can only see trends in Google searches since 2004.  As time goes on, it will be interesting to see how this tool evolves and what trends in web searches develop over time.  Do you think Google Trends is a valid historical methodology? Do people’s web search interests merit historians’ attention?

 

 

The Corpus of Historical American English and Google Books/Culturomics

 

Let me first start off by defining a corpus. A corpus is a large or complete collection of words and writings. This blog post is about three recently created resources that serve as corpuses of American English words. The Corpus of Historical American English (COHA), the standard Google Books interface, and the advanced Google Books interface are the three resources compared here.

Let me start by giving a little background on each corpus. COHA was created by Mark Davies of Brigham Young University with funding from the US National Endowment for the Humanities and was released in 2009. COHA contains 400 million words from 1810 to 2009 and is one of the largest structured corpuses of historical English. This online resource allows you to search through more than 400 million words of American English text. You can see how words, phrases, and grammatical constructions have increased in frequency, how words have changed meaning over time, and how stylistic changes have taken place in the language. You can also download an offline interface to use.

The standard Google Books interface was also created by Mark Davies of BYU and was released in October 2010. It contains 155 billion words but does not offer as wide a range of searches as COHA. In May 2011, the Google Books BYU/Advanced version was released. This new interface allows you to search the same amount, 155 billion words of American English, including 62 billion words from 1980-2009. It is a hybrid of COHA and the standard Google Books version, and it is much more advanced than the original Google Books interface. You can search by word, phrase, substring, lemma, part of speech, synonyms, and collocates. You can also easily compare the data in two different sections of the corpus. Although this corpus is based on Google Books data, it is NOT an official product of Google or Google Books.

COHA is a lot smaller than both Google Books interfaces but offers an extremely wide range of searches. In terms of exact words and phrases, all three resources give nearly the same results, and COHA is probably sufficient for such searches.  However, with the standard Google Books interface you get less information on frequency and related phrases than with the other two.  The standard Google Books interface is also limited when it comes to related words and cultural insights, whereas COHA and the advanced Google Books interface allow you to do more interesting and useful searches, like finding all words with the suffix "-ism."
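To give a sense of what that kind of structured query involves, here is a toy sketch of the idea in Python; it is only an illustration of the kind of search, not how COHA or the BYU interfaces are actually built, and the miniature corpus is made up:

```python
# Toy sketch: count words ending in "ism" per decade in a tiny made-up corpus.
import re
from collections import Counter

documents = [
    (1851, "The abolitionism debate filled the papers."),
    (1923, "Modernism and realism competed for attention."),
    (1968, "Critics argued about structuralism and formalism."),
]

counts = Counter()
for year, text in documents:
    decade = (year // 10) * 10
    for word in re.findall(r"[a-z]+", text.lower()):
        if word.endswith("ism"):
            counts[decade] += 1

for decade in sorted(counts):
    print(f"{decade}s: {counts[decade]} '-ism' tokens")
```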

You can also search for concepts, not just exact words and phrases. With COHA and the Advanced Google Books interface you can use built-in synonyms to search for the frequency of concepts. But with the standard interface you can only look for exact words and phrases. COHA and Advanced Google Books also allow you to search changes in meanings, collocates and natural shifts, function of words, grammatical change, and language change and genre.

Overall, the standard Google Books interface is very neat, but all it does is allow you to search the frequency of words or exact phrases over time, whereas COHA and the advanced Google Books interface allow for much broader and more interesting searches. Why are these interfaces important? A comparison of words, phrases, and so on gives us, as historians, great insight into cultural, social, and historical changes in American English throughout different periods of history.  These interfaces are very interesting and provide us with a valuable source for a part of history that many people ignore: the history of words, a.k.a. the history of American English. Check out the interfaces and you might just be surprised at what you find!