Friday 27 August 2010

Homes for old software

The more hybrid archives we work with, the more obvious it becomes that we need access to repositories of older software (or 'abandonware'). For older formats you often find that not only is the creating software obsolete, but any migration tool you can dig up is pretty out-of-date too. Recently I used oldversion.com to source older versions of CompuServe and Eudora to transform an old CompuServe account to mbox format with CS2Eudora. The oldversion site is really valuable and we could use more like it, and more in it. The trouble is, collecting and publishing proprietary 'abandonware' seems to be a bit of a grey area.
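
After a conversion like the CompuServe-to-mbox one above, it is worth checking that the result can be read by something other than the tool that produced it. The following is only a minimal sketch using Python's standard `mailbox` module; the function name `summarise_mbox` is hypothetical, and it is not part of CS2Eudora or of the workflow described here.

```python
import mailbox
from collections import Counter


def summarise_mbox(path):
    """Return (message_count, Counter of From headers) for an mbox file.

    'path' is whatever file the migration tool produced; nothing here
    depends on how that file was created.
    """
    # create=False makes a missing file an error instead of silently
    # creating an empty mailbox.
    box = mailbox.mbox(path, create=False)
    try:
        senders = Counter(msg.get("From", "(unknown)") for msg in box)
        return sum(senders.values()), senders
    finally:
        box.close()
```

If the message count roughly matches what the old client showed, and the senders look plausible, that is a reasonable first sign the migration worked, though it says nothing about attachments or character encodings.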

In 2003, the Internet Archive obtained some exemptions from the Digital Millennium Copyright Act (DMCA) that have allowed them to archive software, but this has to be done privately, with the software only being made available after copyright expiry. Not much help now, but promising for the long term. The best thing that could happen (from an archivist's point of view) is for individuals and companies to formally rescind their interests in older software and put it in the public domain. Ideally they would put an expiry date into the initial licence, before the software becomes abandonware.

I'm curious to hear about other good abandonware sites, especially ones that include 'productivity software' (our focus is here rather than gaming!). The Macintosh Garden is a good one, and Apple themselves also provide access to some older software, like ClarisWorks. What else is out there that we should know about?

Tuesday 17 August 2010

Balisage 2010 The Markup Conference

Balisage 2010: The Markup Conference was preceded by the International Symposium on XML for the Long Haul: Issues in the Long-term Preservation of XML, which opened with:

A brief history of markup of social science data: from punched cards to “the life cycle” approach covering the “25-year process of historical evolution leading to DDI, the Data Documentation Initiative, which unites several levels of metadata in one emerging standard.”

Sustainability of linguistic resources revisited looked at some of the difficulties facing language resources over the long-term.

Report from the field: PubMed Central, an XML-based archive of life science journal articles provided insight into the processes deployed to give public access to the full text of more than two million articles.

Portico: A case study in the use of XML for the long-term preservation of digital artifacts discussed some practices that can help assure the semantic stability of digital assets.

The Sustainability of the Scholarly Edition in a Digital World explored the need for “tools to make XML encoding easier, to encourage collaboration, to exploit social media, and to separate transcriptions of texts from the editorial scholarship applied to them”.

A formal approach to XML semantics: implications for archive standards examined whether “The application of Montague semantics to markup languages may make it possible to distinguish vocabularies that can last from those which will not last”.

Metadata for long term preservation of product data discussed the “valuable lessons to be learned from the library metadata and packaging standards and how they relate to product metadata”.

The day concluded with Beyond eighteen wheels: Considerations in archiving documents represented using the Extensible Markup Language (XML) which contemplated “strategies for extending the useful life of archived documents”.

Sessions in the main conference covered topics such as:

gXML, a new approach to cultivating XML trees in Java which proposed “A single unified Java-based API, gXML, can provide a programming platform for all tree models for which a “bridge” has been developed. gXML exploits the Handle/Body design pattern and supports the XQuery Data Model (XDM)”.

Java integration of XQuery — an information unit oriented approach explored “a novel pattern of cooperation between XQuery and Java developer”. A new API, XQJPLUS, makes it possible to let XQuery build “information units” collected into “information trays”.

XML pipeline processing in the browser discussed how a JavaScript-based implementation of XProc would offer comprehensive client-side portability for XML pipelines specified in XProc.

Where XForms meets the glass: Bridging between data and interaction design explored using XForms, which offers a model-view framework for XML, within the conventions of existing Ajax frameworks such as Dojo, as a way to bridge two differing development approaches: data-centric versus starting from the user interface.

A packaging system for EXPath demonstrated how to adapt conventional ideas of packaging to work well in the EXPath environment. “EXPath provides a framework for collaborative community-based development of extensions to XPath and XPath-based technologies (including XSLT and XQuery)”.

A streaming XSLT processor, in which Michael Kay (editor of the XSLT 2.1 specification) showed how he has been implementing streaming features in his Saxon XSLT processor.

Processing arbitrarily large XML using a persistent DOM covered moving the DOM out of memory and into persistent storage, offering another processing option for large documents. It relies on an efficient binary representation of the XML document, with a supporting Java API.

Scripting documents with XQuery: virtual documents in TNTBase presented a virtual-document facility integrated into TNTBase, an XML database with support for versioning. The virtual documents can be edited, and changes to elements in the underlying XML repository are propagated automatically back to the database.

XQuery design patterns illustrated the benefits that might result from applying meta design patterns to XQuery.

Monday 9 August 2010

Any old tapes - a true story - part 1

My neighbour is a self-employed architect. He has worked digitally for at least ten years and now most of his work is done on either his old (but still perfectly serviceable) ThinkPad or a shiny new desktop PC. He works with a couple of different CAD packages along with some tax software and MS Office, all on Windows XP.

Recently, knowing what I do for a living, he asked if I could help with a problem he was having retrieving files from an external hard drive and, being easily persuaded by the promise of food and wine, I agreed to try to help (with all the usual caveats about probably not knowing anything about it all!).

We got the disk drive working quickly (this is often the way when solving other people's computer issues. Sit with them and they'll solve it themselves!) and so he asked me about his backups too - which should have been happening regularly to another external drive, but were not. I checked out the drive and found an old directory with a very uninformative name that contained some data files and a few manifests that didn't make much sense. I've forgotten the name already, but he told me this was the name of the backup software. A search showed that this software was not on the PC. The new PC had recently been built on the basis of the old one by outsourced IT support. They'd done a good job restoring the software, etc. but this one backup program (a commercial one) was missing.

The consequences were two-fold:

1) No backup was running.
2) The data files (about 1.4GB worth) and manifests were, without the software, entirely unreadable.

My neighbour thought the backup software was probably about somewhere, so he'd ask the IT support to install and configure it. In the meantime I fired up MS Windows Backup (the first time I've ever used it - it seems OK) and ran a one-off backup of his work, just to be on the safe side. Adding the job to the Windows scheduler required a password, so I suggested he ask his support about that too (one thing you must never do is undo or override the work of the real support person!).

After it completed, he astutely asked where the files had gone, and so I showed him, on the external drive and was dismayed to find that Windows Backup had also dumped all the files into a 1.4GB (proprietary?) container. I wondered if we'd ever have to extract files from Windows Backup files and made a mental note to keep a copy of the software (bundled with XP) in the cupboard just in case! Worse, it was then impossible to reassure him that the files were there without a crash course in Windows Restore. Still, I remember MS Backup and Restore being a pain way back to MS-DOS! :-)
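
With hindsight, an open container makes that kind of reassurance much easier. As a minimal sketch only (this is not what was run on the day, and the function names `backup_to_tar` and `list_backup` are hypothetical), Python's standard `tarfile` and `hashlib` modules can write a backup that any tar tool can list, along with SHA-256 checksums for later verification:

```python
import hashlib
import tarfile
from pathlib import Path


def backup_to_tar(source_dir, archive_path):
    """Write every file under source_dir into a gzip-compressed tar.

    Returns a manifest mapping each relative path to its SHA-256 digest.
    Assumes archive_path lies outside source_dir, so the archive does not
    try to include itself. Files are read whole into memory to hash them,
    which is fine for a sketch but not for very large files.
    """
    source = Path(source_dir)
    manifest = {}
    with tarfile.open(str(archive_path), "w:gz") as tar:
        for path in sorted(source.rglob("*")):
            if path.is_file():
                rel = path.relative_to(source).as_posix()
                manifest[rel] = hashlib.sha256(path.read_bytes()).hexdigest()
                tar.add(str(path), arcname=rel)
    return manifest


def list_backup(archive_path):
    """Return the sorted names of the regular files stored in the archive."""
    with tarfile.open(str(archive_path), "r:gz") as tar:
        return sorted(m.name for m in tar.getmembers() if m.isfile())
```

Calling `list_backup` on the resulting file answers the "where have my files gone?" question directly, and the same listing is available from any standard tar utility, with no restore wizard required.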

As we finished our wine and talked about these things, he seemed to suddenly remember my job, jumped up and rummaged in a cupboard. He pulled out an old tape cartridge:


Once his main backup media, but, like the files on the external drive, no longer usable. This time both the hardware and the software were long gone. He didn't seem worried - the files had probably been migrated off his old machine to the new one at some point - but still he wondered what was on it and said "I don't suppose it is readable now, is it?". He hadn't meant it as a challenge, but I couldn't resist! I convinced him to let me take the tape with me and try to recover his data - all in the name of digital archaeology, of course!

My next post will be my first adventures in the land of the Travans...