Digital Preservation Policy: Web Archiving for the Washingtoniana Collection

Introduction:

In my previous posts on this blog I have surveyed the digital preservation state of the District of Columbia Public Library’s Washingtoniana collection. This survey was performed via an interview with Digital Curation Librarian Lauren Algee, using the NDSA Levels of Digital Preservation as a reference point.

In our survey we discovered that the DCPL Washingtoniana collection has very effective digital preservation, which, through a combination of knowledgeable practices and the Preservica service (an OAIS-compliant digital preservation service), nearly reaches the 4th level in every category of the NDSA Levels of Digital Preservation. With this in mind, my next-step plan for the archive looks at a number of areas the archive has been interested in expanding into and presents some thoughts on where it could begin taking steps towards preservation of those materials.

Of particular interest in this regard is the collecting of website materials. Because websites are dynamic objects of a relatively new medium, collecting them can be fairly complex, as it is hard to pin down precisely when a website has been sufficiently collected. Websites may appear differently on different browsers, they may contain many links to other websites, they change rapidly, and they often contain multimedia elements. Outlined below is a policy that discusses these issues and offers a digital preservation plan specifically for websites.

Website Digital Preservation Policy for the Washingtoniana collection

The Washingtoniana collection was founded in 1905 when library director Dr. George F. Bowerman began collecting materials on the local community. The collection stands as one of the foremost archives on the Washington, D.C. area, community, history, and culture. With the increasing movement of DC social life and culture to online or born-digital platforms, it makes sense that the Washingtoniana collection would consider collecting websites.

Selection

The same criteria that determine the selection of other Washingtoniana materials should apply here. Websites should be considered if they pertain to Washington, D.C. or its surrounding areas, to events that take place in or discuss that area, to prominent Washington, D.C.-related persons or institutions, or otherwise to the Washington, D.C. community, arts, culture, or history.

As with any physical preservation decision, triage is an essential process. Websites that are likely to be at risk should be high priority. In a sense all web content is at risk. Websites that serve a specific purpose or pertain to a specific event may have a limited operational window. Websites for defunct businesses, political election sites, and even an existing website as it appears on a specific day may be vulnerable and thus candidates for capture. In addition, the materials in question should not be materials that are being collected elsewhere, and they should be considered in relation to the rest of the collection.

Although automation tools may be used for identification, selection remains at the librarian’s discretion. In addition, suggestions from patrons relevant to the collection should be considered, and a system for managing and encouraging such suggestions may be put in place.

Metadata

A metadata standard such as MODS (Metadata Object Description Schema) should be used to describe the website. MODS is a flexible schema expressed in XML, is fairly compatible with library records, and allows more complex metadata than Dublin Core, and thus may work well. Metadata should include, but not be limited to, website name, content producers, URL, access dates, and fixity, as well as technical information that may be generated automatically by web crawlers, such as timestamps, URI, MIME type, size in bytes, and other relevant metadata. Extraction information, file format, and migration information should also be maintained.
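
To make the metadata recommendation a little more concrete, here is a minimal sketch of how a staff script might assemble a small MODS record for one captured website. The function name, the choice of fields, and the sample values are my own assumptions for illustration; this is not a complete or validated MODS profile, and a real record would need checking against the MODS schema and local cataloging practice.

```python
import xml.etree.ElementTree as ET

MODS_NS = "http://www.loc.gov/mods/v3"
ET.register_namespace("mods", MODS_NS)


def q(tag):
    """Return a tag name qualified with the MODS namespace."""
    return f"{{{MODS_NS}}}{tag}"


def build_mods_record(title, producer, url, accessed, mime_type, captured):
    """Build a small MODS record (hypothetical field selection) as an XML string."""
    mods = ET.Element(q("mods"))

    # Website name
    title_info = ET.SubElement(mods, q("titleInfo"))
    ET.SubElement(title_info, q("title")).text = title

    # Content producer
    name = ET.SubElement(mods, q("name"))
    ET.SubElement(name, q("namePart")).text = producer

    # Capture timestamp reported by the crawler
    origin = ET.SubElement(mods, q("originInfo"))
    ET.SubElement(origin, q("dateCaptured"), {"encoding": "iso8601"}).text = captured

    # Technical description
    physical = ET.SubElement(mods, q("physicalDescription"))
    ET.SubElement(physical, q("internetMediaType")).text = mime_type
    ET.SubElement(physical, q("digitalOrigin")).text = "born digital"

    # URL and access date
    location = ET.SubElement(mods, q("location"))
    ET.SubElement(location, q("url"), {"dateLastAccessed": accessed}).text = url

    return ET.tostring(mods, encoding="unicode")


# Sample values are placeholders, not a real collection item.
record = build_mods_record(
    title="Example DC Neighborhood Blog",
    producer="Example Author",
    url="https://example.org/",
    accessed="2016-11-17",
    mime_type="text/html",
    captured="2016-11-17T12:00:00Z",
)
print(record)
```

Fixity values and migration history are left out here; they could be carried in a companion record or an extension element, depending on what the repository accepts.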

Collection

A variety of collection tools exist for web archiving. The tool selected should be capable of the tasks below, as outlined by the Library of Congress web archiving page:

  • Retrieve all code, images, documents, media, and other files essential to reproducing the website as completely as possible.
  • Capture and preserve technical metadata from both web servers (e.g., HTTP headers) and the crawler (e.g., context of capture, date and time stamp, and crawl conditions). Date/time information is especially important for distinguishing among successive captures of the same resources.
  • Store the content in exactly the same form as it was delivered. HTML and other code are always left intact; dynamic modifications are made on-the-fly during web archive replay.
  • Maintain platform and file system independence. Technical metadata is not recorded via file system-specific mechanisms.

A variety of tools are capable of this task; a web crawler such as Heritrix, the open source archival web crawler, or the subscription service Archive-It should be used. Both come from the Internet Archive; the first is an open source tool, while the second is a subscription-based service that offers storage on Internet Archive servers.

Upon initial collection, fixity information should be recorded using checksums. This can be automated either with a staff-written script or with a BagIt tool, which automatically generates fixity information. This information should be maintained with the rest of the metadata for the digital object.
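
As an illustration of the "staff-written script" option, the sketch below computes a SHA-256 checksum for every file in a folder and later re-checks the files against that list. The function names and the choice of SHA-256 are assumptions on my part; the policy itself does not prescribe an algorithm, and this is not how BagIt tools or Preservica do it internally.

```python
import hashlib
from pathlib import Path


def sha256_of_file(path, chunk_size=1024 * 1024):
    """Return the SHA-256 hex digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def build_manifest(directory):
    """Map each file's path (relative to the directory) to its checksum."""
    root = Path(directory)
    return {
        path.relative_to(root).as_posix(): sha256_of_file(path)
        for path in sorted(root.rglob("*"))
        if path.is_file()
    }


def verify_manifest(directory, manifest):
    """Return the relative paths that are missing or no longer match."""
    root = Path(directory)
    changed = []
    for relative_path, expected in manifest.items():
        target = root / relative_path
        if not target.is_file() or sha256_of_file(target) != expected:
            changed.append(relative_path)
    return sorted(changed)
```

The manifest returned by `build_manifest` is what would be stored alongside the descriptive metadata; running `verify_manifest` later against the same folder returns an empty list when nothing has changed.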

Websites should be kept in the most stable web archival format available. At the time of this post’s writing, that format is the WARC (Web ARChive) file format. This format allows the combination of multiple digital resources into a single file, which is useful as many web resources are complex and contain many items. Other file formats may be accepted if archived webpages are received from donors.

Preservation

Upon initial ingestion, items may be kept on internal drives and copied to at least one other location. Before an item is moved into any further storage system, the file should be scanned for viruses, malware, or any other undesirable or damaging content, using safety standards agreed upon with the division of IT services. At this point fixity information should be taken as described above and entered into the metadata record.

Metadata should be created as soon as possible, at which point the object with its attached metadata should be uploaded into the Washingtoniana’s instance of Preservica.

Although Preservica automates much of the preservation process, a copy of the web archive should be kept on external hard drives. Once a year, a selection of the items on the hard drives should be checked against the items in Preservica to ensure that the Preservica fixity checks and obsolescence monitoring are working as desired.
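
The yearly spot check could look something like the sketch below. It assumes staff already have two checksum lists in hand, one computed from the external hard drive and one reported by the repository; how Preservica exposes its checksums is not covered here, so both inputs are plain dictionaries and every name is hypothetical.

```python
import random


def audit_sample(local_checksums, repository_checksums, sample_size, seed=None):
    """Pick a random sample of items and report which ones disagree.

    Both arguments map an item name to a checksum string. An item that is
    missing from the repository list counts as a mismatch.
    """
    rng = random.Random(seed)
    names = sorted(local_checksums)
    chosen = rng.sample(names, min(sample_size, len(names)))
    mismatches = [
        name
        for name in chosen
        if repository_checksums.get(name) != local_checksums[name]
    ]
    return sorted(chosen), sorted(mismatches)
```

Passing a fixed `seed` makes a given year's sample reproducible, which may be useful if the audit itself needs to be documented.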

References

Jack, P. (2014, February 27). Heritrix-Introduction. Retrieved November 14, 2016, from https://webarchive.jira.com/wiki/display/Heritrix/Heritrix#Heritrix-Introduction
Web Archiving-Collection development. (n.d.). Retrieved November 16, 2016, from https://library.stanford.edu/projects/web-archiving/collection-development
The Washingtoniana Collection. (n.d.). Retrieved November 16, 2016, from http://www.dclibrary.org/node/35928
Web Archiving at the Library of Congress. (n.d.). Retrieved November 16, 2016, from https://www.loc.gov/webarchiving/technical.html
Niu, J. (2012). An Overview of Web Archiving. Retrieved November 16, 2016, from http://www.dlib.org/dlib/march12/niu/03niu1.html
AVPreserve » Tools. (n.d.). Retrieved November 17, 2016, from https://www.avpreserve.com/avpsresources/tools/
Kunze, J., Bokyo, A., Vargas, A., Littman, B., & Madden, L. (2012, April 2). Draft-kunze-bagit-07 – The BagIt File Packaging Format (V0.97). Retrieved November 17, 2016, from http://www.digitalpreservation.gov/documents/bagitspec.pdf
MODS: Uses and Features. (2016, February 1). Retrieved November 17, 2016, from http://loc.gov/standards/mods/mods-overview.html
About Us. (2014). Retrieved November 17, 2016, from https://archive-it.org/blog/learn-more/

 

The Three C’s of Digital Preservation: Contact, Context, Collaboration

Three big themes I will take from learning about digital preservation: every contact leaves a trace, context is crucial, and collaboration is the key.

“Every Contact leaves a trace”

Matt Kirschenbaum and an optical disk cartridge in 2013.

Matt Kirschenbaum’s words (or at least his interpretation of Locard’s words) will stick with me for a long while.  When we look at a digital object for preservation, we need to consider what it is we are looking at, and know that what we see is not necessarily all that there is.  Behind the screen there is a hard drive, and on that hard drive are physical traces of that digital object.  There is a forensic and a formal materiality to digital objects: what is actually going on in the mechanical/physical sense versus what we see and interpret from those mechanical processes as they are converted to digital outputs.  We cannot fall into the trap of screen essentialism, of focusing only on the digital object as it is shown on our screens without taking into consideration the hardware, software, code, etc. that runs underneath it.

Which leads into my next point, about platform studies.  I am really intrigued by this idea that as digital media progresses we are seeing layers and layers of platforms on top of platforms for any given digital object.  The Google Doc that I wrote this blog draft in is written using Google Drive (a platform), which is running on my Chrome browser (a platform), which is running on Windows 7 (a platform).  These platforms can be essential to run a particular digital object, and yet with platforms constantly obsolescing or upgrading or changing, they cannot be relied upon to preserve all digital objects, especially since most platforms are proprietary and able to disappear in an instant.  For example, my Pottermore project was spurred by the fact that the original website (hosted on the Windows Azure platform as well as PlayStation Home) had vanished and was replaced with a newer version.  If I had more time I would have liked to further develop the project by exploring the natures of the different platforms used by Pottermore, like Windows Azure and PlayStation Home, and how those platforms influenced the experience of the game.

Context is Crucial

If content is king, context is queen!

There’s no use in saving everything about a digital object if we don’t have any context to go with it.  Future researchers who have access to the Pottermore website files can examine them thoroughly and still have no idea why Pottermore was so important.  For this reason it is important to capture the human experience with digital objects.  Whether using oral history techniques or dance performance preservation strategies, there need to be records that try to capture the experience of using the digital work.  This can include interviews with the creators, stories from the users, Let’s Play videos, and the annotated “musical score” approach, so that a work can be re-run in a different setting.

This is really what the Pottermore project was about: providing context to the website that is all but lost to us.  In case the game does reappear, there will now be materials like the Pottermore Wiki and the Let’s Play videos that can explain how the game was played.  Furthermore, these materials can help future researchers appreciate the sense of community among the Pottermore users, and why they reacted so negatively when the old website was replaced.

Collaboration is the Key

Pottermore was a collaboration of many different entities, including JKR, Sony, and Microsoft.

There are a number of roles played by different people in digital preservation, and these roles are blending and overlapping.  The preservationist may be the user who is nostalgic for an old game and so creates an emulation program for it.  The artist may use feedback from the users and incorporate it into their next work.  The technological expertise of IT folk may need to be enlisted in order to understand how to best save some works: in what formats, on which storage devices, etc.  Archivists and librarians may be the fans themselves, contributing to the fanfiction community that they are trying to preserve.  With funding only getting tighter and tighter and the digital world growing more complex, collaboration is going to become essential for a lot of digital preservation projects.

What next?

Best practices, next exit sign
We’ll get here eventually… right?

Of course this leaves us with many unanswered questions.  How do we balance out the roles of different experts? How do we match the large scale of digital works on a limited budget? How much context do we need to give a certain work? In almost all cases the answer is going to be “it depends.” But these are questions that I am excited to figure out as I go on in the field.  

Pottermore – the Archival Information Package

I was able to put my Preservation Plan into action by uploading a Pottermore Collection to the Internet Archive in addition to saving the collection on my laptop. Here’s a brief recap of my Preservation Plan:

  • Capture this YouTube video that announced the launch of Pottermore in 2011, saved by the youtube-dl downloader.
  • Archive the Pottermore Wikia, using their own archiving tools to download the xml files.
  • Download the images from the Pottermore Wikia separately, since the xml files don’t include them.  This was going to involve the command line method, or if that didn’t work, to curate a selection of images from the collection.
  • Save this Pottermore entry from the Harry Potter Wikia, which details the description and history of the site.
  • Save Let’s Play videos that can be found on YouTube to capture the interactivity of Pottermore, using the youtube-dl downloader.

I’ve officially uploaded what I’ve collected so far to the Internet Archive; check it out here: https://archive.org/details/Pottermore.

Internet Archive Pottermore
What my Internet Archive collection looks like!

The first file I included was a PDF of the Pottermore entry from the Harry Potter Wikia.  This entry gives a full description and history of Pottermore.  I concluded that since it was only one entry, and the text is what matters more than anything else, a PDF would suffice.  The next folder includes a selection of images from the Pottermore Wikia.  This is what I was really happy about, since this is a feature that a lot of people enjoyed from the first Pottermore that isn’t as present in the newer version.  Since I couldn’t figure out the command line method that I had written about in my Preservation Intent Statement, which was supposed to capture all of the images from a wiki, I had to go through the Pottermore Wikia image directory one by one and download them.  Since there are 51 pages of images, with each page containing at least 40 images, I will be uploading one page’s worth of images at a time (as of this post, I have two pages’ worth of images uploaded to the Internet Archive).  I saved all of the images in their original format, either .jpg or .png.  The final folder contains the XML files of the Harry Potter Wikia, which I had downloaded using the tools provided by the Wikia itself.

What I did not upload to the Internet Archive (due to copyright uncertainties) but have saved to my Pottermore folder on my computer are the videos.  I used the youtube-dl downloader to save the Pottermore launch video from 2011 as well as some Let’s Play videos to capture the experience of playing Pottermore.  All of the videos were saved in .mp4 format.

Below is a screenshot of the collection I have on my computer:

screenshot of my Pottermore collection
Screenshot of my Pottermore collection on my laptop.

I arranged the folders according to the different aspects of Pottermore that were saved.  The first folder contains the history of Pottermore, which includes the Harry Potter Wikia entry.  The second folder holds the Let’s Play videos, which capture the experience of playing Pottermore.  The next folder contains the Pottermore images, which are in either .jpg or .png format.  Some of the images are labeled with descriptions, usually the names of the characters in the images (for example, “Hokey” or “Hooch”).  However, most of the images are named after their location within Pottermore.  For example, B1C11M1 = Book 1 (Harry Potter and the Sorcerer’s/Philosopher’s Stone), Chapter 11 (“Quidditch”), Moment 1 (“Charms Homework”).  This will help orient the viewer as to the order of images within Pottermore.  The next folder is Pottermore Launch, which includes the 2011 YouTube video that announced the coming of Pottermore.  The final folder contains the Pottermore Wikia pages in xml format.
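
Because the location-based names follow a regular pattern, a future user could decode and sort them with a few lines of code. The sketch below is only an illustration of that idea; it assumes every such name has the exact form B, C, and M each followed by a number, as in the B1C11M1 example, and treats anything else (like “Hokey”) as a descriptive label.

```python
import re

# Assumed pattern: B<book>C<chapter>M<moment>, e.g. "B1C11M1".
MOMENT_PATTERN = re.compile(r"^B(\d+)C(\d+)M(\d+)$")


def parse_moment_name(name):
    """Return book, chapter, and moment numbers, or None for descriptive names."""
    match = MOMENT_PATTERN.match(name)
    if match is None:
        return None
    book, chapter, moment = (int(part) for part in match.groups())
    return {"book": book, "chapter": chapter, "moment": moment}


def story_order_key(name):
    """Sort key: location-named images in story order, descriptive names last."""
    parsed = parse_moment_name(name)
    if parsed is None:
        return (1, name)
    return (0, parsed["book"], parsed["chapter"], parsed["moment"])


names = ["B1C11M1", "Hokey", "B1C2M3"]
print(sorted(names, key=story_order_key))
```

Sorting numerically matters here: a plain alphabetical sort would place chapter 11 before chapter 2.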

What this collection really comes down to is trying to capture the essential elements of a website that, for our present purposes, no longer exists.  I am hoping that with the xml files of the wiki, the images that provided the interactive layers, and the Let’s Play videos that show how the game was played, this goal was accomplished.

Pottermore: A Statement of Preservation Intent

It’s time to collect all of the horcruxes that remain of the old Pottermore.  Not to destroy them, but to save them.

Not that I’m saying that the old Pottermore was evil or needed to be killed off.  In fact, the situation is rather the opposite of Voldemort’s, in that here the main character (the old website) has been “killed off,” but pieces of it are still left behind.  And these are the pieces I think are worth saving.

In a magical world, I would save the original files of the website, make bitstream copies of them and save them in different places, including open source cloud storage and on hard drives.  I would interview the original Pottermore team, including JK Rowling, Sony, TH_NK the UK digital company, and Windows Azure in order to document the creation of such a unique project.  Also in this magical world I could pull a Fawkes and resurrect the old Pottermore, by bringing it back under another URL and hosting it on the same Windows Azure platform (or ideally an open-source platform) so it could coexist with the new version.  Old users can finish up their journey through the books, and new users can begin theirs, and Pottermore could live on longer than Nicolas Flamel.

Unfortunately, no Alohomora spell is going to unlock the old Pottermore website anytime soon; it seems to be kept under tight lock and key by JKR and her Pottermore team, with very little chance of ever seeing it again.  Several snapshots of the website are preserved on the Internet Archive using the Wayback Machine.  However, the functionality and interactivity are gone from them.  So someone can see what the website looked like, but even then sometimes it doesn’t load properly.  Therefore, I have decided to go after the “horcruxes”: the magical traces of Pottermore’s soul left scattered across the internet.  And thus follows a plan…

My ultimate goal is to collect the pieces together in one place, not to destroy them (as Harry did to the horcruxes), but to preserve them.  So in the future, when fanatical Harry Potter historians like myself want to study all things Harry Potter, this will be available to them.  Especially since it is JK Rowling’s first contribution to the online world of Harry Potter.  What I’m especially trying to capture is the context surrounding Pottermore, including the user’s perspective and the users’ reaction to the disappearance of Pottermore.  This way in case the old website is ever resurrected, there will be enough materials to show future users/researchers how it was played and experienced.

The first step is to save this video released by JK Rowling announcing the launch of Pottermore in 2011.  The original video released by Pottermore is no longer available (it has been turned to “Private”) but this was the highest quality one I could find.  This was the first peek into what Pottermore was: a hint about “a reading experience unlike any other” involving the author and the reader. I have actually already saved this by downloading it using ClipConverter, which allowed me to download it in .mp4 format, and I now have it saved in a folder called “Pottermore” on my desktop.

The next step is to archive the Pottermore Wikia.  This was pretty much a step-by-step guide to everything that could be found on the old Pottermore: you can follow moment by moment to see all that can be collected and done on the site.  Luckily they have their own archiving tools that I can use under a Creative Commons license.  These archiving tools include the current pages and the history of each page.  The wiki is downloaded into a compressed XML file, which I can then decompress with a tool like 7-zip.
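
Once the XML file is decompressed, a quick way to see what it contains is to list each page title with its number of saved revisions. The sketch below assumes the file follows the usual MediaWiki export layout (a root element containing page elements, each with a title and one or more revision elements); the sample text is made up for illustration and is not taken from the Pottermore Wikia.

```python
import xml.etree.ElementTree as ET


def local_name(tag):
    """Strip the XML namespace, since its version differs between exports."""
    return tag.rsplit("}", 1)[-1]


def summarize_export(xml_text):
    """Return a dict mapping each page title to its number of revisions."""
    root = ET.fromstring(xml_text)
    summary = {}
    for page in root:
        if local_name(page.tag) != "page":
            continue
        title = None
        revisions = 0
        for child in page:
            name = local_name(child.tag)
            if name == "title":
                title = child.text
            elif name == "revision":
                revisions += 1
        summary[title] = revisions
    return summary


# Made-up sample in the assumed export layout.
SAMPLE = """<mediawiki xmlns="http://www.mediawiki.org/xml/export-0.10/">
  <page>
    <title>Example Moment</title>
    <revision><text>first draft</text></revision>
    <revision><text>second draft</text></revision>
  </page>
</mediawiki>"""

print(summarize_export(SAMPLE))
```

For a full wiki dump, which can be large, reading the file incrementally with `ET.iterparse` would be kinder to memory than loading the whole text at once.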

The end of year feast and the house points from the beta version of Pottermore, taken from the Pottermore Wikia, 2011.

The images from the archive would need to be archived separately.  There are 51 pages of images, which adds up to over 2000 images, so I haven’t decided how to go about doing this.  I have found one wiki page that seems helpful but in case that doesn’t work out my plan is to make a selection of the highest resolution images from a variety of Pottermore moments and save them in JPEG format.

I would also like to save this page, an entry on Pottermore from the Harry Potter wiki, which gives a very detailed history on the launch of Pottermore, the revisions and changes done over the years, the full results of all eight House Cups, and the change from the old Pottermore to the new one.  Essentially, it provides all of the background context I need to support the other materials in this collection.  Since there is only one page that I want to save (as opposed to an entire Wiki) I have saved it as a PDF and have added it to my Pottermore folder.

Next would be to capture the interactivity of Pottermore.  Fortunately there is a lot of documentation out there that records people’s experience with Pottermore.  This includes Let’s Play videos and subreddit posts.  I will archive Let’s Play YouTube videos like this one in the same manner I used for the Pottermore Announcement video, downloading them as .mp4’s and saving them to my Pottermore folder.

There is also an entire subreddit r/Pottermore that was full of posts with troubleshooting questions, favorite moments, glitches, etc. that I would like to capture.  I have posted in this subreddit asking everyone what was important and/or special to them about Pottermore.  I would then save the replies to this post, probably as a PDF.

The final step: once I’ve downloaded all that needs to be downloaded and have all of the files saved on my computer (and probably on a USB drive), I will upload them to the Internet Archive as a Pottermore collection.  I probably won’t include the YouTube videos due to copyright issues, but the Wiki pages, images, and the Reddit posts will be saved there.  I’ve just signed up for an account with the Internet Archive, so this week I will try to become more familiar with the platform as I save/download all of the materials for my future collection.  Additionally, I’m working out how to include annotations or metadata to give more context to the materials I’m uploading – descriptions for the images and the videos specifically.  Now if only I had a magic wand that could do all this work for me… 

Why is who saving what, and how?

It seems that when it comes to preserving born digital works, certain questions need to be raised.  In fact, a lot of questions need to be raised since there is no established consensus on which formal framework to use.  There’s the question of “who,” involving the roles different people play in the lifetime of a work.  This includes the artist, the curator, the preservationist, and the consumer/audience. Next there’s the “why”: what makes this work worth saving, and why did we choose certain components of the work to save? Next comes the “what” part: what exactly do these groups decide to save, and what is it that we are actually saving about this work? And finally there’s the “how”—putting a preservation plan into action.

The “who”: Creators, Curators, Conservators, and Consumers

First comes the artist, who creates the work.  The artist makes the initial creative decisions that make his/her work unique, whether intentionally or incidentally. Next comes the curator, who decides that the work is worth collecting and exhibiting and defends the work’s significance.  After that is the preservationist or conservator, who determines what to preserve and how.  Finally there is the audience/consumer and their role in supporting the work.

What makes born digital works so complex is that the roles of these various groups are often bleeding into each other: the artist creates an interactive work that allows the consumer to feel a sense of authorship in making unique decisions that affect the work; the conservators are now asking for statements of intent from the artists to hear their feedback on what’s significant about the work; and fans of a work can prove crucial in providing the emulation software necessary for preserving that work.

Furthermore, as Dappert and Farquhar insist, different stakeholders place their own constraints on a work.  For instance, Chelcie Rowell discusses how Australian artist Norie Neumark used a specific software called Macromedia Director for her 1997 work Shock in the Ear. The audience who experienced it originally had to load a CD-ROM into their computer, which could have been a Mac or Windows.  The preservationists chose emulation as the best method to save works like this one, and these emulators were created by nostalgic enthusiasts.  So each of these people involved placed constraints on the original work, in terms of hardware, software, and usage.  And these constraints changed from its creation to preservation. Dianne Dietrich concludes with this in regards to digital preservation:

“As more people get involved in this space, there’s a greater awareness of not only the technical, but social and historical implications for this kind of work. Ultimately, there’s so much potential for synergy here. It’s a really great time to be working in this space.”

For this reason, it is becoming more important than ever to document who is doing what with the work, increasing accountability and responsibility. Which leads to…

The “why”: Preservation Intent Statements

As Webb, Pearson, and Koerbin express, before we make any attempt to preserve a work we need to answer the “why”.  Their decision to write Preservation Intent Statements is a means of accomplishing this. For, as Webb et al. say, “[w]ithout it, we are left floundering between assumptions that every characteristic of every digital item has to be maintained forever.”

And nobody has the time or resources to save every characteristic of every digital item.  At least I don’t.  To try and do this would be impossible and even undesirable for certain works, where the original hardware and software become too costly to maintain.

This leads to a discussion of authenticity. As Espenschied points out in regards to preserving GeoCities, with increased authenticity comes a lower level of access, but with a low barrier to access comes a low level of authenticity and a higher degree of lossiness. In the case of GeoCities, Espenschied says,

“While restoration work must be done on the right end of the scale to provide a very authentic re-creation of the web’s past, it is just as important to work on every point of the scale in between to allow the broadest possible audience to experience the most authentic re-enactment of Geocities that is comfortable for consumption on many levels of expertise and interest.”

And that gets at the heart of why we should bother to create Preservation Intent Statements before implementing any actual preservation actions.  We need to establish the “bigger picture,” the long-term vision of a particular work’s value.  Rowell also points out that there are different kinds of authenticity: forensic, archival, and cultural.  Forensic and archival authenticity deal with ensuring the object preserved is what it claims to be (if you’ve read Matt Kirschenbaum’s book Mechanisms, you know that this can be harder than you think to achieve).  Cultural authenticity, however, becomes a much more complex issue, and explores how to give respect to the original context of the work while still ensuring a wide level of access.

And once we have decided on the best strategy, we then get into…

The “what” and the “how”: Significant Properties Characteristics

Now that we’ve established the “bigger picture,” we get into the details of exactly how to capture the work for preservation.  This is where Dappert and Farquhar come back in.  Dappert and Farquhar really get technical about the differences between “significant properties” and “significant characteristics.”  Their definition of significant characteristics goes like this:

“Requirements in a specific context, represented as constraints, expressing a combination of characteristics of preservation objects or environments that must be preserved or attained in order to ensure the continued accessibility, usability, and meaning of preservation objects, and their capacity to be accepted as evidence of what they purport to record.”

Sounds confusing, right? The way I understood it was that properties can be thought of like HTML properties for coding.  In coding, properties are simply a means of using a logical system language to define certain attributes of the website/game/whatever we are coding.  Similarly, for a digital work, the property itself is abstract, like “fileSize” or “isVirusScanned.”  We aren’t trying to preserve those properties; rather, it is the pair of the property with its value (like “fileSize=1MB”) that we want to capture, and this is what a characteristic of the work is.  You wouldn’t save a property without its value, nor would you save the value without attaching it to a property.  And significant characteristics go beyond the basic forensic/archival description of the object by capturing the context surrounding the object.  Thus, significant characteristics can evolve and change beyond the original work as the preservation environment changes and as different courses of action are taken.  And all of these changes should be documented along the way through these significant characteristics, prioritized and listed by order of importance.
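
To make the property/value idea concrete, here is a tiny sketch of how I picture it. This is only my reading of the distinction, not Dappert and Farquhar’s formal model; the class, the priority field, and the sample values are all made up for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Characteristic:
    """A property paired with its value, plus a priority (1 = most important)."""
    prop: str
    value: object
    priority: int


def by_importance(characteristics):
    """List characteristics in order of importance."""
    return sorted(characteristics, key=lambda item: item.priority)


# Hypothetical characteristics for a single preserved file.
recorded = [
    Characteristic("isVirusScanned", True, 2),
    Characteristic("fileSize", "1MB", 1),
]

for item in by_importance(recorded):
    print(f"{item.prop}={item.value}")
```

The property name alone (“fileSize”) says nothing about the work; it is the pair, ranked against the others, that would end up in the documentation.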

The last question that remains is… is anyone else’s mind boggled by all this?

There’s an App for That, But Why?

Stories from Main Street and The Will to Adorn are projects created by the Smithsonian Institution that are very different in subject matter and in execution but which share the element of encouraging members of particular marginalized groups to contribute their own stories to the endeavors. Both projects have websites and accompanying apps for mobile devices, and it is in these mobile apps where the two projects are most similar.

The Stories from Main Street website and app are offshoots of the Smithsonian Institution Traveling Exhibition Service’s Museum on Main Street (MoMS) program. MoMS works to bring the Smithsonian’s traveling exhibitions to cultural institutions serving the small towns (defined as having an average population of 8,000 people) of rural America. The Smithsonian staff envisions that their programs help to bring together the residents of such towns to share their stories with each other, fostering community pride. The MoMS website allows people from anywhere in the country to contribute photos, videos, audio recordings, and written stories pertaining to their experiences in rural America and to experience the content contributed by participants.

Stories from Main Street Website

The Will to Adorn project, begun in 2010 by the Smithsonian Center for Folklife and Cultural Heritage, “explores the diversity of African American identities as expressed through the cultural aesthetics and traditional arts of the body, dress and adornment.” The project appears to have culminated with an exhibit, demonstrations, workshops, performances, hands-on visitor participation activities, and daily fashion shows at the 2013 Smithsonian Folklife Festival held on the Mall in Washington, D.C. The website seeks to provide an explanation of the questions and goals addressed by the project and provides some sample photo and video content, but it does not offer a means of exploring the full content of the project.

Will to Adorn Website

While both websites are rather celebratory in the sense of bringing to prominence topics that have generally been excluded from mainstream historical and cultural practice, the projects and websites are very different in tone. Unlike Stories from Main Street, Will to Adorn projects itself as a scholarly endeavor, with researchers actively seeking to distill meaning from the evidence that they gather through the project. Whereas I did not find any user participatory element on the Will to Adorn website, collecting user content and allowing site visitors to explore it is the raison d’etre for Stories from Main Street, which to me has a very haphazard feel to it. Specific geographic location at the level of the town is also an important aspect of the Stories from Main Street content whereas local geography does not appear to figure significantly into the Will to Adorn website.

Main Screens of Both Apps Compared

Despite the stark differences between the two websites, the mobile apps for these projects are actually quite similar. Both apps allow the user to record their own stories related to the topic of the project and also to listen to stories that other people have contributed. Aside from imagery, presentation-wise, the apps are pretty much identical. The Stories from Main Street app was built using Roundware, which bills itself as “an open-source, participatory, location-aware audio platform” that does pretty much exactly what both of these apps do in terms of recording audio, being able to add some metadata, uploading content, and being able to select, to a certain extent, the content that will be streamed to the listener. Will to Adorn was almost certainly also built using Roundware, but I did not see a credit for it in the app.

Recording content to contribute is (theoretically) easy with these apps. Start by pressing the “Add a Story” button on the main page. On Stories from Main Street, you then have a choice of six general topics: Life in Your Community, Sports – Hometown Teams, Music – New Harmonies, Food – Key Ingredients, Work – The Way We Worked, and Travel – Journey Stories. You then identify yourself as a man, woman, boy, or girl, and finally you are asked to choose one specific question (from a provided list of four to six questions) about the subject you selected. Doing so brings you to the recording page, where your question is displayed for you at the top. When you’re ready, press the record button (I recommend the large button at the bottom; I had trouble with the smaller buttons in the middle of the page) and there will be a three-second countdown. Then you will have a minute and a half to discuss your chosen question. When you’re done, press stop, and you will then have the option of listening to what you recorded, rerecording it, or uploading it (or you can exit the recording section without posting by hitting the cancel button at the top of the screen, which takes you back out to the main menu).

Stories from Main Street - Screens to Add Story

I chickened out at the point of actually uploading content. I’m not from a small town, and although I did record an answer to one of the Travel section questions, I was afraid of sounding like I was an Easterner mocking something from Midwestern culture that I don’t understand. I gather that the app uses your phone’s GPS to attach location information to your recording when you upload it. That is a curious choice, because geography is such an important part of the Stories from Main Street website. A person may be inspired to record something about their town while away from home, or conversely may wish to talk about a small town they’ve visited from the comfort of their own home, so the content may carry an inaccurate geolocation if it’s based solely on where the phone was at the moment of recording. On the website, users are able to type in the appropriate location for their content.

Will to Adorn - Some Metadata Choices

Will to Adorn works similarly to Stories from Main Street, although the metadata Will to Adorn collects is a bit more nuanced. After pressing “Add a Story,” the app asks for your age (15-19, 60+, and then each decade in between). It asks for gender, but in addition to the expected male and female, there are also options for “trans” (with an asterisk that goes unexplained) and “other” (which could mean all sorts of things). You then select from one of six broad geographic areas (Alaska and Hawaii, I guess, have to content themselves with being from the West). Will to Adorn only gives you the choice of a total of five questions to answer. However, and this is kind of key, once I made all of these selections, the screen looked like it was going to send me to a recording screen similar to Stories from Main Street. Nope.

Will to Adorn Recording Screen - Um... Do not get your eyes checked.  The screen is indeed all black.

Black empty screen of doom. I have to presume that the app was tested before it was released, so maybe it’s just not compatible with my iPhone 6, because not being able to record on an app whose whole purpose is to be able to record is rather a problem. And I was more willing to answer and submit to this site (“What are you wearing?” seems like a mostly harmless question). At any rate, images on the Will to Adorn website show recording pages nearly identical to those in the Stories from Main Street app, although you may get up to two minutes to discuss your clothing choices. Website text also indicates that you can attach photos to your story submission, but the app does not show user images anywhere, and I did not see on the website either the archive of user submissions or a way to record and upload stories, so I cannot verify this aspect of the app’s functionality.

In terms of the listening aspect of these apps, after pressing the “Listen” button on the main page and waiting for what seemed like a rather long period of time in both apps for content to load, the app will start playing recordings from the collection. Stories from Main Street defaults to the recordings in the “Life In Your Community” section. Users can flag the content or like it, and if they’re inspired to record their own story, there’s a record button there too.

Listening Screens - Both Apps

The user does have the option to choose to a certain extent which stories they will hear on the app. On Stories from Main Street, the “Modify” button at the top of the screen allows you to select one of the six content areas and to further narrow down by what specific question(s) you want to hear about. The “Refine” button in the same spot on Will to Adorn allows the user to narrow by age, gender, region, and specific question. No audio played for the first two questions that I selected to listen to on Stories from Main Street, so perhaps no one has actually contributed stories related to those particular topics, but I did have success on my third try. Interestingly enough, on the sports section, there were more question options to listen to than there were to record on your own. And in the Travel section, the “favorite journey” answers were mostly about going to a large city rather than a small town.

I’m not sure that anyone’s actively curating the user responses. There was a recording on the Stories from Main Street app that I heard where some kids were messing around doing a recording and one of them used a slur. In another one, a young man discussed how he and his friends as teenagers would go to the river, drink moonshine, get high, and watch alligators. One snippet was simply “[town name] sucks.” And a recording I heard on Will to Adorn started out as a heartfelt commentary about a certain style of dress but then suddenly turned into a profanity-laden tirade on the subject. I’m not sure if it’s a matter of not wanting to censor what people say or if the Smithsonian is just relying on the community to use that flag button to police the content. There also doesn’t actually appear to be very much content to curate on either site. According to the Stories from Main Street website, there are 519 contributions in the archive. Will to Adorn appears to have far fewer stories than that, as I heard much of the content at least twice while listening to the stories.

While some of the stories contained in the Stories from Main Street and Will to Adorn archives are genuinely interesting, honestly, I really don’t get the point of either of these apps. The stories are snippets of two minutes or less that are for the most part divorced from context. Neither app displays any metadata about the audio that’s playing, so if particular facts are known about the contributor of a recording, the listener won’t have that information. And the contributors don’t always give you much information in their recordings. For example, if a person opens up their recording in Stories from Main Street with “In my town…,” well, which town? How would I know that if the subject doesn’t actually say it in their recording? Even assuming the geolocation attached to the recording is correct (an issue with Stories from Main Street that I discussed earlier), the listener doesn’t know what it is and doesn’t have a great way of determining if the speaker is talking about life in Boise, Birmingham, or Burlington (and Wikipedia tells me that there is a Burlington in 24 of the U.S. states!). Maybe I’m missing the forest for the trees, but I’m a details kind of person.

Many of the recordings on Will to Adorn sound like they were made at the Folklife Festival, and the participants there were generally asked by volunteers about their name, age, and location and were sometimes asked to elaborate on their responses. But the following is the extent of one non-Folklife Festival story on Will to Adorn: “How I feel when I have it on—it makes me feel beautiful.” Have WHAT on? Disembodied from all context, this particular snippet doesn’t seem to me to add much to the conversation about creating meaning and forging identity through one’s attire.

Another interesting context issue with Will to Adorn concerns race. The project as explained on the Will to Adorn website specifically concerns how African Americans express themselves through dress and other adornment. The app invites anyone to contribute their story, which is perfectly fine. But the app does not provide a way to self-identify by race or ethnic/cultural background unless you choose to speak to that issue in your recording. So I guess I don’t understand how any user contributions added to the project’s database from the app could be marshaled as evidence for the original conception of the project.

Context for these stories aside, what I don’t understand is not why “there’s an app for that” but rather why the public would download either of these apps and use them over and over again. Sure, one’s smartphone provides a really convenient way to record very short stories, but I don’t really see much of a reason for an individual to do this more than once or twice. There is no essential tie to a physical place for either of these apps that would prompt a user to open up the app and learn something about that location through the project’s content. There could have been on Stories from Main Street, but there’s no way on the app to search for a particular location to find content related to a place where you happen to be or might be interested in knowing more about. Stories from Main Street does provide a link to the project’s website on the main page (Will to Adorn does not) where visitors can search for audio on a map. Similarly, given the limited amount of content in these collections, I’m not sure why anyone would use the listen function on either app more than a couple of times, particularly on Will to Adorn. I’m not saying that the effort to collect and share people’s thoughts on these apps is uninteresting or completely devoid of value; I’m just struggling to see why someone might keep these apps on their phone and use them more than a very few times.

What do you think? How might these apps be improved to increase their current interest and/or enduring value? Without a great deal of context, what can we learn about the subject matter of the projects by listening to these recording snippets?

A Glass Case of Emotion: User Motivation in Crowdsourcing

The web is inherently made up of networks and interactions among its users. But what is the nature of these interactions – participatory? collaborative? exploitative? These questions play out when cultural heritage institutions take to the web and attempt to engage the vast public audience that is now accessible to them. Crowdsourcing is a means to allow everyday citizens to participate and become more involved with historic materials than ever before. Similarly, these volunteer projects can overcome institutional monetary and time constraints to create products not possible otherwise. The motivations of those involved in these projects interested me most in the readings. Why do citizens choose to participate? Why are institutions putting these projects out there? How do they play on the motivations of their users? These questions link back to the overarching general ideas about the nature of interactions on the web.

Why Wasn’t I Consulted?

Paul Ford describes the fundamental nature of the web with the phrase “Why wasn’t I consulted?” or WWIC for short. Ford claims that feedback and voice on content are what the web runs on. By giving people a voice, even through the basest form of expression in likes, favorites, +1’s, or “the digital equivalent of a grunt,” users are satisfied that they were consulted and that they can give their approval or disapproval.

User experience, in Ford’s mind, is centered on users’ emotional need to be consulted. Additionally, the expression of approval is what drives other users to create content, since they receive a positive emotional response from those who consume their work. Organizations create spaces that shrink the vast web down into communities where the WWIC problem can be solved. Essentially, these structures create a glass case of emotion.

Ron Burgundy in a Phone Booth

Libraries, archives, and museums have to deal with users’ emotions when creating their crowdsourcing ventures. How do we create places where the users will feel consulted and desire to participate? Like Ford, Causer & Wallace, in describing the Transcribe Bentham project of University College London, and Frankle, in the article on the Children of Lodz Ghetto project at the United States Holocaust Memorial Museum, emphasize that understanding users and volunteers, as well as finding the appropriate medium, is important in these undertakings.

Causer & Wallace identify a much more detailed set of motivations among their user groups than Ford’s WWIC idea. Many of their participants claimed they had interests related to the project such as history, philosophy, Bentham, or crowdsourcing in general. Other than these categories, the next biggest reason for joining the project was a desire to be a part of something collaborative. The creators of Transcribe Bentham failed to create an atmosphere where users felt comfortable collaborating, which may have been why the project decreased in popularity over time. The Children of Lodz Ghetto project, on the other hand, is much more collaborative, with administrators guiding researchers through each step of the process. Eventually they hope to have advanced users take over the role of teaching newcomers. The Holocaust Museum’s project is a much more sustainable model that could lead to lasting success.

Crowdsourcing (For Members Only)

While collaboration and an interesting topic are key factors in motivating participation, how do online history sites get the attention of the public to join in the first place? The push for the openness of both the internet and cultural institutions is something I greatly support, but I think motivating the populace to get involved in these projects needs a return to exclusivity. There is still a prevailing notion that archives and other cultural organizations are closed spaces that only certain people can access. In many European institutions this is still the case. Why don’t we use the popular notions of exclusivity to our own benefit?

Hear me out. What these articles lacked was the idea that many people desire what they cannot get or what only a few can. I’m not advocating putting collections behind a paywall or keeping collections from being freely available online. Instead, I think participation in crowdsourcing projects should be competitive or exclusive in order to generate the initial excitement needed to gain a following and spur desire for inclusion.

Social media platforms such as early Facebook and, more recently, Ello, or new devices such as Google Glass, have made membership or ownership limited, creating enormous desire for each. In these examples, the majority of the populace is asking “why wasn’t I consulted?” and therefore wants to be included. Thus, having the initial rounds of participation be limited to a first-come, first-served, invite-only platform would spark desire for the prestige of being among the few to have access to the project.

In Edson’s article, he wrote about the vast stretches of the internet that cultural institutions do not engage, what he called “dark matter.” While there are huge numbers of people out there who are “starving for authenticity, ideas, and meaning,” I think the first step should be creating a desire to participate and then growing the project. Without something to catch the public’s attention, create a community, and grow an emotional desire to participate, another crowdsourcing website would simply be white noise to the large number of internet users in the world. Users who visit the websites looking for a way into the projects but are turned away could discover the free and open collections that are available right now. After this first limited period, once the attention is there, I think scaling up would be easier. Of course these ideas will only work if the institution has created a place that understands the emotional needs of its users and provides a collaborative and social environment where users are comfortable participating.