Our cultural gut tells us we should start saving software…
That software preservation is a relatively new endeavor shows in the somewhat intuitive rationale used to push its agenda. Writing about the preservation of Planetary, a visual music player developed for the iPad, Sebastian Chan and Aaron Cope describe their motive in their 2013 “Collecting the Present: Digital Code and Collections” as follows:
The benefits accrued by the ability for software and hardware industries to frequently “shed their skin” and start anew still outweigh the costs, and that is the landscape in which museums will continue to try and preserve objects in for the foreseeable future. To that end we see the ability and freedom for third parties to play and experiment with—to become comfortable and familiar with—Planetary’s source code as integral to any efforts to recruit developers in our preservation aims. Will some of what we see be still-born or not in line with the museum’s thinking? Probably. Will they be worth it in the long run? We choose to believe so.
Matthew Kirschenbaum, in his 2013 “An Executable Past: The Case for a National Software Registry” (published in “Preserving.exe”), acknowledges a similar condition surrounding software preservation today:
For decades the Library of Congress has also been receiving computer games, and in 2006 the games became part of the collections at the Culpeper campus. But while the Library registers the copyrights, what it means to preserve and restore vintage computer games—or any kind of computer software—is less clear. As yet there is only the beginning of a national agenda for software preservation, and precious little in the way of public awareness of the issue.
So we think we should save software, but what exactly should we save?
Given the pace of software obsolescence, and how strongly our cultural practices are shaped by daily interactions with software, it is not difficult to imagine varying understandings of software preservation. Questions that concern archivists include: To what end do we preserve software? What is the scope of software preservation? Chan and Cope treat the functionality of the software as its most important property. For Planetary, therefore, it was the features of the software, “the interaction design and experience of manipulating and affecting a dynamic three dimensional system using a touchable interface,” that needed to be saved and reconstructed.
For Kirschenbaum, it is the historical context in which a piece of software was conceived that demands documentation. Taking the example of Microsoft Word 2.0, released in 1991, Kirschenbaum describes how a hidden, mischievous feature, the “WordPerfect monster,” alludes to the then-ongoing rivalry between the competing word processing applications.
Henry Lowood, in “The Lures of Software Preservation” (also published in “Preserving.exe”), considers an alternative preservation method: verifying data files. Lowood’s approach questions “screen essentialism” (the preservation of the look of software) and encourages preserving the integrity of software by using signatures such as hashes and checksums.
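To make the idea of hashes and checksums a little more concrete, here is a minimal sketch of a fixity check in Python. It is not taken from Lowood's essay; the function names `sha256_of_file` and `verify_fixity` are hypothetical, and the choice of SHA-256 is an assumption (archives use various algorithms). The idea is simply that an archive records a digest when a file is acquired and later recomputes it to confirm that the bytes have not changed.

```python
import hashlib


def sha256_of_file(path, chunk_size=65536):
    """Return the SHA-256 hex digest of the file at `path`.

    The file is read in chunks so that large disk images do not
    have to fit in memory at once.
    """
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        # iter() with a sentinel keeps reading until read() returns b"".
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_fixity(path, recorded_digest):
    """Return True if the file still matches the digest recorded earlier.

    `recorded_digest` is assumed to be a hex string stored by the
    archive at acquisition time (hypothetical workflow).
    """
    return sha256_of_file(path) == recorded_digest.strip().lower()
```

A matching digest only says that the stored bytes are intact. It says nothing about whether the software can still be run or what it looked like on screen, which is presumably why the approach sits alongside, rather than replaces, the other views above.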
The more thorough, the better?
Despite the different emphases on which properties of software are significant, one thing is certain: software needs to be preserved as a whole package. The package may include, as in the case of Planetary, the software’s early versions, change logs, and bug reports. The suggestions made in Eric Kaltman et al.’s 2014 “A Unified Approach to Preserving Cultural Software Objects and Their Development Histories” go as far as to include paper prototypes, the type of IDE, email correspondence, and interpersonal relationships in the documentation related to the development of the academic gaming software Prom Week. Mind you, however, while thorough documentation and preservation of relevant files is essential to saving software, such work calls for professional judgment and managerial strategies. In the words of Chan and Cope:
Because digital works exist in an equally digital life-support system, or ecosystem, absent preserving the entire dependency chain for a single digital object museums need to be able to conceptualize and articulate a strategy for demonstrating some kind of tangible proof for those objects in their collection which lack the physicality typically associated with our collections.
Keep it open!
Another point of consensus between Chan and Cope and Kaltman et al. is the benefit of open source for preserving software. Chan and Cope kept Planetary’s original open source policy and describe its advantage for preservation as follows:
The choice to both enable and encourage derivative works of Planetary was, and continues to be, seen as fundamental to our efforts to preserve “Planetary the software system” as opposed to “Planetary the iPad application.” Because of all the complexities inherent in preserving hardware and software systems, we are hoping that third party developers will translate the design and intent of Planetary to other systems, for example Google’s Android mobile operating system or to the JavaScript and WebGL programming environments found in modern web browsers.
Kaltman et al. write in a similar vein:
Most scientific progress is based on reproduction and extension of others’ work, making that work easier to access and more transparent would open up more research avenues and increase scientific output. […] The more open a platform is, the better a chance it has for migration to other systems, and the more open a development process, the easier it will be for future researchers and students to understand and further the work. (7)
The benefit of open source seems to resonate with the idealistic rationale of the library. I wonder what the potential drawbacks of this approach are. If we use this rationale, do you think we can persuade stakeholders as the amount of software to be preserved grows? Especially when, as Chan and Cope mentioned earlier, developing new software is much cheaper than tinkering with the existing one? Do we envision something equivalent to “a law library” or “a business library” with collections of software?
I think that one of the challenges of moving towards open source is the fact that, at the end of the day, people are trying to make a living on their software. Allen and Teuben quoted Ben Balter of GitHub from the Preserving.exe summit as saying that 75% of the internet runs on open source, and they cite Facebook and other large websites as being the ones that still maintain some level of closed-source material. I wonder though about smaller websites and creators; when they are small, if their product becomes popular, they risk larger companies coming in and attempting to create analogous pages (somewhat akin to what Google+ attempted to do to Facebook, although Facebook was too big by the time Google’s social network came about).
It seems that the best choices for this would be for companies to keep records about their source code and revisions (something we probably can’t expect), OR for archives to receive the source code early, and release it after a specified amount of time. The latter seems more realistic, but it also means that archives would need to spend more money on security and legal fees (to prevent hacking, and to be prepared for lawsuits should their resources get hacked).
That’s a good point! In fact, companies often do have their own corporate archives, right? The media studies scholar Lisa Nakamura conducted archival research on a Silicon Valley electronics company, Fairchild Semiconductor, based on its archive. The article is called “Indigenous Circuits: Navajo Women and the Racialization of Early Electronic Manufacture” (https://lnakamur.files.wordpress.com/2011/01/indigenous-circuits-nakamura-aq.pdf).
I agree with the route of clearing the legal arrangements, since many parts of digital infrastructure involve proprietary software. I wonder whether there is any law that requires companies to disclose their software’s source code after a certain period of time? If not, do you think a law of this kind would be helpful?
I can’t imagine that any law is in place, simply because it would need to indicate who the source code would be released to. I think that perhaps something akin to “public domain” could be established, but I think the biggest problem with such a policy would be that the companies would have to have a copy of their source code to open to the public. So, for instance, if we were able to get Facebook’s source code 25 years from now, would they even have a copy to give? Would “Facebook” even exist? That’s definitely the challenge.
While corporate archives have their own purposes and audiences, their existence could certainly be used to further an open source cause, should archivists and companies be willing to part with their code (and indeed, I would think that twenty years from now, a website like Facebook would be willing to release some of its original code).
The idea of software libraries is especially interesting because when we were learning HTML for Information Infrastructures, more than one person told me to find the thing I wanted on W3Schools and just copy the code, amending it as needed, instead of trying to figure it out from scratch. With, as pointed out, the growing need to make a living as a programmer and the increased watchfulness, I understand that intellectual property lawsuits about code are in the works and that this cavalier attitude will no longer be possible.
But what precisely is being trademarked? The string of code – surely this is an opportunity ripe for jokes about monkeys at keyboards? The final screen – but which one? The experience – but whose?
I found the copyright registration document issued by the U.S. Copyright Office in 2012 (http://copyright.gov/circs/circ61.pdf). It says: “A ‘computer program’ is a set of statements or instructions to be used directly or indirectly in a computer in order to bring about a certain result. Copyright protection extends to all the copyrightable expression embodied in the computer program. Copyright protection is not available for ideas, program logic, algorithms, systems, methods, concepts, or layouts.” The logic reminds me of the FRBR approach…
Also, concerning the trademarking practice, an article in “MacTech: The Journal of Apple Technology” has this to say: “A trademark is either a word, phrase, symbol or design, or combination of words, phrases, symbols or designs, which identifies and distinguishes the source of the goods or services of one party from those of others” (http://www.mactech.com/articles/mactech/Vol.12/12.10/TrademarkIssues/index.html). One of the stated benefits of acquiring a trademark is avoiding consumer confusion between similar goods or services. What might interest us, perhaps, is that registering a trademark (as far as I understand) means entering a public record (http://www.uspto.gov/trademarks-application-process/faqs-personal-information-trademark-records). While asking for full disclosure of source code would most likely still involve some form of legal consultation, trademark registration would let us know whom to contact; the benefit of such information was covered by Kaltman et al.’s “A Unified Approach to Preserving Cultural Software Objects and Their Development Histories.”
Written code is copyrightable, as we’ve already seen. Trademark would not apply here; it covers a short phrase at most. So technically the code has been submitted to the Copyright Office, as required to establish the copyright date if any legal action is to follow. You can copyright something without sending it to the Copyright Office, but you won’t have their official backing in court if you don’t send it to them.