9th – 11th February 2010, London

By Matthew Herring

I recently attended the first day of this three-day course. It was a very informative day, which introduced the field of digital preservation and the Planets project. Planets (Preservation and Long-term Access through NETworked Services) is a suite of services, software and methodologies for the long-term preservation of digital assets. It addresses what it terms ‘logical preservation’, as opposed to bit-stream preservation. Bit-stream preservation concerns the obsolescence of storage media and the corruption of digital files as those media degrade, whereas logical preservation is about ensuring that digital files can still be read and interpreted in the future. To quote Ross King, who delivered the keynote address: “Logical preservation addresses the problem of accessing bitstreams, whose interpretation may depend on obsolete operating systems, applications, or formats”.

A major focus of the project is, therefore, on migrating content from one file format to another, just as you would migrate the bit-streams of data from one physical carrier to another, to avoid obsolescence. A number of important questions arise from this. How do you decide when you need to do this? What formats should you convert to? How do you know that the transformed file hasn’t lost features which are important to its future interpretation or use? How do you know which features are important to preserve? What software should you use to carry out the migration? What about automating the process to handle large volumes of files? What about recording what you’ve done? Planets provides a series of tools to help answer those questions.

A second approach to the problem is emulation, where, instead of converting the file format, obsolete hardware and operating systems are simulated to provide an environment in which your files can be used in their original form. Planets also provides support for this approach.
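To make the difference between bit-stream and logical preservation a little more concrete, here is a toy sketch in Python. It is my own illustration and not part of Planets: the function names, the property names and the sample values are all invented for the example. It treats re-encoding a piece of text from Latin-1 to UTF-8 as a stand-in for a format migration, shows that a checksum (a bit-stream check) changes even though the content is intact, and then compares a hand-written set of "significant properties" before and after a hypothetical document migration.

```python
import hashlib


def sha256_of(data: bytes) -> str:
    """Bit-stream check: a checksum of the raw bytes."""
    return hashlib.sha256(data).hexdigest()


def changed_properties(original: dict, migrated: dict, significant: set) -> dict:
    """Logical check: report significant properties whose values differ
    (or have disappeared) after migration, as (before, after) pairs."""
    return {
        name: (original.get(name), migrated.get(name))
        for name in significant
        if original.get(name) != migrated.get(name)
    }


# Toy "migration": the same text stored in an older and a newer encoding.
text = "café"
original_bytes = text.encode("latin-1")
migrated_bytes = text.encode("utf-8")

# The bit-streams (and so the checksums) differ...
print(sha256_of(original_bytes) == sha256_of(migrated_bytes))  # False
# ...but the content a reader sees is unchanged.
print(original_bytes.decode("latin-1") == migrated_bytes.decode("utf-8"))  # True

# Hypothetical property sets, as a characterisation tool might report them.
original_props = {"format": "DOC", "page_count": 12, "embedded_fonts": True}
migrated_props = {"format": "PDF/A", "page_count": 12, "embedded_fonts": False}

# Which properties matter is a planning decision; "format" is expected to change.
significant = {"page_count", "embedded_fonts"}

changes = changed_properties(original_props, migrated_props, significant)
print(changes)  # {'embedded_fonts': (True, False)}
```

In this sketch the checksum is only useful for confirming that a stored copy has not been corrupted; deciding whether a migration was acceptable depends on which properties you chose to treat as significant, which is exactly the kind of question listed above.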

The Planets suite of tools includes:

  • Plato – web-based preservation planning software tool.
  • XCDL and XCEL – two XML languages for, respectively, describing the properties of digital objects and extracting properties from objects
  • Planets Core Registry, containing descriptions of file formats, preservation software products and information about their suitability to preserve particular content
  • Emulation tools – accessed via GRATE (Global Remote Access to Emulation Services)
  • SIARD – open-source format for relational databases, with a suite of tools for converting to and from SIARD
  • Testbed – web-based application that provides an environment for testing possible preservation actions
  • Corpora – a collection of annotated data in the Testbed, on which tests can be run
  • Planets Framework – web-based service which unites other Planets tools into a customisable suite

An in-depth description of the Planets services and tools can be found here.

Personally, I found the day a useful introduction to the concepts and issues surrounding logical preservation. The Planets project is certainly very interesting and the tools it provides look very useful. However, I gained most from the wider discussions around the area of preservation. One of the wider issues which emerged from the various talks and discussions was that of risk versus opportunity. Ross King (Austrian Institute of Technology) and Clive Billenness (Programme Manager for the Planets Project) talked about how difficult it is to get the senior management of organisations to back preservation plans unless you couch them in the language of risk. Parts of their talks were thus given over to describing the Planets approach through the methodology of risk management. Some of the delegates felt uneasy with this on a philosophical level – shouldn’t preservation be seen as an opportunity to pass information down to future generations, not merely as a way of avoiding potential future (and financial) loss? Clive Billenness agreed that, speaking personally, this was not the approach that he would favour, but said that experience had shown it to be the only way of getting senior managers to take the area seriously. William Kilbride (Executive Director of the Digital Preservation Coalition and not a member of the Planets team) used his slot to present a contrasting motive for preserving content, under the rubric of “We don’t do it for the digits!” He stressed the social and cultural usefulness of preservation and reminded us that the point was to preserve the experience of whatever resource we are preserving, not necessarily the exact bit-stream (which would be altered by format migration).

Another question which emerged was that of the sheer volume of digital material which is currently being produced, and will be produced in the foreseeable future. Various figures were shown to demonstrate the mind-boggling amount of data out there. William Kilbride used an analogy from his work as an archaeologist to make the point that we don’t necessarily want to preserve absolutely everything: he had refused to store several tons of excavated pottery fragments because the cost of storing and conserving them outweighed their value. Similarly, we need some way of sifting what is worth saving. This is difficult, as it implicitly involves making predictions about what future generations will find interesting about our time – most of what archaeologists recover now is stuff which previous generations would not have thought to preserve for us.

Finally, a questioner in one of the Q and A sessions raised the issue of the human intelligibility of our data in the future. In other words, some types of data (the questioner was thinking of data from the aerospace industry) need specialist knowledge in order to understand them. How do we preserve that knowledge alongside the data and the means to view the data?