The original proposal for the World Wide Web, written by Tim Berners-Lee in 1989, is an important piece of internet history. It also can’t be opened in current versions of Microsoft Word or Apple’s Pages.
John Graham-Cumming, a British software engineer and writer, attempted to open the Word document containing the proposal. Modern versions of Microsoft Word and Apple’s Pages both utterly failed to open the file, as he outlined in a blog post. The open-source word processor LibreOffice worked, albeit with messy formatting. Graham-Cumming ultimately found a PDF exported by CERN in 1998, which was the only way he was able to see the document as it existed in 1989.
It’s worrying that such an important piece of history, in such a common file format, could be almost completely lost to the passage of time and software updates. Anyone with a collection of old digital documents, photos, and videos might wonder whether the same thing will happen to their files. It turns out that’s the sort of question digital archivists deal with all the time, so I reached out to one.
“Twenty years, in the digital realm, is ancient,” says Lance Stuchell, director of digital preservation services at the University of Michigan. His team is frequently tasked with recovering digital files from old computers and storage mediums. “We have a lab that can deal with old media—floppy drives, CDs, older computers. We can get that off of those types of media and move it into our preservation system while ensuring we don’t mess it up while we’re doing it.”
But getting the files off the drive is just the first step: Then you have to open them, and leave them in a state that will be openable for decades to come. It’s a job that’s given Stuchell a reason to think about strategies for keeping documents around as long as possible. I asked him what those of us who aren’t professional archivists should do to ensure our files last decades.
Use Open Formats
The Word document I mentioned before could no longer be opened by Microsoft Word because the software has changed over time. This is part of the challenge of archiving digital files.
“With physical stuff, the less you look at it the longer it lasts,” Stuchell says. “Digital stuff, we’re constantly fighting with obsoleteness. As the file moves through time, it’s losing information.”
Updates to software like Microsoft Word mean that files that opened fine in the ’80s don’t open in the 2020s. Part of the problem: Microsoft alone controls the file format, and Microsoft alone knows exactly how it works. For this reason, Stuchell says he encourages people to export files in an open file format, especially files they want to keep accessible for the long term.
For documents he recommends PDF/A, an open standard built on top of Adobe’s PDF format that embeds everything the file needs in order to be displayed, including the fonts used in the document. Microsoft Office, LibreOffice, and Adobe Acrobat all support exporting to PDF/A, so it’s relatively easy to make such a file. Stuchell recommends converting any document you want to keep into that format.
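If you have a large folder of old documents, it can help to take stock of what is in it before you start exporting. The Python sketch below is one rough way to do that, not something Stuchell described: it reads the first few bytes of each file and lists everything that is not already a PDF. The function names `sniff_format` and `export_candidates` are made up for this illustration, and the check is deliberately crude. It recognizes only three common signatures, and it cannot tell an ordinary PDF from a PDF/A.

```python
from pathlib import Path

# Leading bytes ("magic numbers") of a few common file containers.
# This short list is an assumption made for illustration, not a full registry.
SIGNATURES = [
    (b"%PDF-", "pdf"),
    (b"\xd0\xcf\x11\xe0\xa1\xb1\x1a\xe1", "ole2"),  # legacy Office binaries such as .doc
    (b"PK\x03\x04", "zip"),  # .docx, .odt, and ordinary ZIP archives
]


def sniff_format(path):
    """Guess a file's container format from its first bytes."""
    with open(path, "rb") as handle:
        head = handle.read(8)
    for magic, label in SIGNATURES:
        if head.startswith(magic):
            return label
    return "unknown"


def export_candidates(folder):
    """Return the files under `folder` that are not already PDFs, sorted by path."""
    return sorted(
        p for p in Path(folder).rglob("*")
        if p.is_file() and sniff_format(p) != "pdf"
    )
```

Calling `export_candidates("old-documents")` on a folder of your own (the folder name here is a placeholder) returns the files you would still need to open and export by hand, using the PDF/A option in whichever program can still read them.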