Software Preservation and the Special Problem of Unix?

The Library of Congress’s Preservation.exe meeting was last week… two days of fascinating presentations on the challenges of preserving historical software, digital media, and computing environments. I followed the #presoft hashtag (the meeting had no website, so Twitter was it), and saw in the whole program not a single mention of Unix.

Is it, I wondered, because Unix does not need to be preserved–at least not in the same way as old MS-DOS applications, or old Amiga OS disks, or a favourite video game for the Apple II?

Unix, unlike these systems, is still running. Its history is immanent. Unix rose to dominance in computer research back in the 1970s, weathered a decade or two of the emergence of the computer industry as we know it today, and went through a renaissance with the advent of the Internet, the Web, and the free software movement. Today, Unix and Unix-derived systems are everywhere. They run most of the Internet. Unix is in your MacBook. It’s in your iPhone. And your Android. And your BlackBerry. It’s probably in your car, and your microwave oven too. Furthermore, Unix provides the key architectural ideas of modern computing: the very ideas of files, streams, and network interfaces were all either invented or perfected in Unix. Even Microsoft, which positioned itself as the anti-Unix for decades, is hugely influenced by its architecture.

In contemporary Linux and Mac OS X systems, you can directly see fragments and traces of Unix that go back to the early 1970s. In the source code alone, strictly speaking, there are fragments of BSD Unix that go back to the 1980s. And, because of the existence of open interoperability standards, even where the code itself has changed (due to license issues), the way the software works and presents itself (and its documentation) has remained a continuous flow since the 1980s, with roots reaching back to the early 1970s. Unix is not so much like a text as it is like a discourse.
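That continuity can be seen directly at the command line. The pipeline below is a sketch in the spirit of Doug McIlroy’s famous word-counting one-liner; every tool in it (tr, sort, uniq) existed in 1970s Unix and, as codified by POSIX, still behaves the same way on a modern Linux or macOS system.

```shell
# Count word frequencies in a text stream -- a classic Unix pipeline
# built entirely from tools that date back to 1970s Unix.
printf 'to be or not to be\n' |
  tr ' ' '\n' |   # split the stream into one word per line
  sort |          # order the words so duplicates are adjacent
  uniq -c |       # collapse duplicates, prefixing each with a count
  sort -rn        # list the most frequent words first
```

The point is not the result but the idiom: small tools composed over text streams, an interface stable enough that a script written forty years ago can still run unmodified today.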

Well, consider this:

Unix, by contrast, is not so much a product as it is a painstakingly compiled oral history of the hacker subculture. It is our Gilgamesh epic.

The rejoinder is by Neal Stephenson, not at the #presoft meeting, but rather in his 1999 essay, “In the Beginning was the Command Line.”

Stephenson went on:

What made old epics like Gilgamesh so powerful and so long-lived was that they were living bodies of narrative that many people knew by heart, and told over and over again–making their own personal embellishments whenever it struck their fancy. The bad embellishments were shouted down, the good ones picked up by others, polished, improved, and, over time, incorporated into the story. Likewise, Unix is known, loved, and understood by so many hackers that it can be re-created from scratch whenever someone needs it. This is very difficult to understand for people who are accustomed to thinking of OSes as things that absolutely have to be bought.

Unix is self-archiving. It is not an object or an artifact so much as an ongoing discourse, a culture. It preserves itself by being continually re-implemented and re-created. The important parts of Unix are not the lines of code–certainly not, since licensing issues prevent that from being a practical reality–but the system of interconnected ideas and the universe of tools and systems built with it.

Contrast this with the kind of software preservation that #presoft addressed. If archiving, like the writing of history, only makes sense when the objects of study are in some sense done, or complete, or finished, then the study of software history is already compromised. It brings to mind the old story of the man who searched for his keys under the lamppost because the light was better there. It is methodologically much simpler to preserve—and study—the pieces that present themselves neatly as artifacts, by being packaged and sold, shrink-wrapped.

But I would argue that the mainstream of living software history—not exclusively the Unix tradition by any means, though Unix is a good example of this—does not lend itself to traditional archival approaches, because not only has it evolved over a long period of time, it continues to do so, into the future. It has no boundaries, no edges, and therefore eludes being captured, defined, preserved.

There’s a certain irony in contemporary projects to build emulation environments that will run old software. Clearly being able to run and interact with old software is vastly better than merely collecting the bits (or the floppy disks). But in creating the emulation environment—in all likelihood running on Linux or some other Unix variant, and connected to the Internet (because what good would it be if it weren’t)—you gain a vastly better infrastructure than the software had originally. In the emulator, you give that old software life… and a network connection, the lack of which is, to be blunt, why it died in the first place.

So I fear that the preservation projects end up like a Cabinet of Curiosities, filled with interesting things that aren’t necessarily well connected to one another or anything else. Does this invalidate or fatally question the software preservation project? No, it doesn’t. But it does, I think, put it in brackets… it shows that this particular history is not connected, except indirectly, to the larger currents of time. Because there is no network. Bruno Latour wrote, “747s do not fly; airlines fly.” It’s the same critique.

I don’t want to dump on the software preservationists. I appreciate and share their impulse. But I do think that there’s a methodological problem in preserving only the easily preservable, in treating software “titles” much in the same way we treat books and the documentary record: as so many discrete items, whose provenance is more or less unproblematic. While all software artifacts have history and historicity, Unix is on some level nothing but its historicity. I would go further and suggest that in the age of the network, Unix is the model, not the exception. Software history needs ethnography more than it does archival preservation.