Monday, 30 January 2012

Digital Preservation: What I Wish I Knew Before I Started

Tuesday 24th January, 2012

Last week I attended a student conference, hosted by the Digital Preservation Coalition, on what digital preservation professionals wished they had known before they started. The event covered many of the challenges faced by those involved in digital preservation, and the skills required to deal with them.

The similarities between traditional archiving and digital preservation were highlighted at the beginning of the afternoon, when Sarah Higgins translated terms from the OAIS model into more traditional ‘archive speak’. Dave Thompson also emphasized this connection, arguing that digital data “is just a new kind of paper”, and that trained archivists already have 85-90% of the skills needed for digital preservation.

Digital preservation was shown to be a human rather than a technical challenge. Adrian Brown argued that much of the preservation process (the "boring stuff") can be automated. Dave Thompson stated that many of the technical issues of digital preservation, such as migration, have been solved, and that the challenge we now face is to retain the context and significance of the data. The point made throughout the afternoon was that you don’t need to be a computer expert in order to carry out effective digital preservation.

The urgency of intervention was another key lesson of the afternoon. As William Kilbride put it: digital preservation won’t do itself, won’t go away, and we shouldn't wait for perfection before we begin to act. Access to data in the future is not guaranteed without input now, and digital data is particularly intolerant of gaps in preservation. Andrew Fetherstone added to this argument, noting that doing something is (usually) better than doing nothing, and that even if you are not in a position to carry out the whole preservation process, it is better to follow the guidelines as far as you can than to wait and create a backlog.

The scale of digital preservation was another point illustrated throughout the afternoon. William Kilbride suggested that the days of manual processing are over, due to the sheer amount of digital data being created (estimated to reach 35ZB by 2020!). He argued that the ability to process this data is more important to the future of digital preservation than the risks of obsolescence. The impossibility of preserving all of this data was illustrated by Helen Hockx-Yu, who offered the statistic that the UK Web Archive and National Archives Web Archive combined have archived less than 1% of UK websites. Adrian Brown also pointed out that as we move towards dynamic, individualised content on the web, we must decide exactly what the information is that we are trying to preserve. During the Q&A session, it was argued that the scale of digital data means that we have to accept that we can’t preserve everything, that not everything needs to be preserved, and that there will be data loss.

The importance of collaboration was another theme repeated by many speakers. Collaboration between institutions on a local, national and even international level was encouraged: by sharing solutions to problems and implementing common standards, we can make the task of digital preservation easier.

This is only a selection of the points covered in a very engaging afternoon of discussion. Overall, the event showed that, despite the scale of the task, digital preservation needn't be a frightening prospect, as archivists already have many of the necessary skills.

If you are interested in finding out more, the DPC have uploaded the slides used during the event, which was also live-tweeted using the hashtag #dpc_wiwik.