|David Giaretta with his book.|
Inevitably, things do get technical in the course of the book; after all, if we did not have technical problems we would not have a digital preservation problem. But the not-too-technical reader is always warned in good time that a section or chapter can perhaps be skipped. Yet the essence of Giaretta's theory is worth noting for everybody. In his view, migration and emulation, our best-known preservation strategies, are perhaps good enough for simple objects (PDFs, TIFFs, JPEGs), but are inadequate for many of the complex objects found among research data, Giaretta's main focus (hence the title Advanced Digital Preservation). No one will doubt that scientific research often generates objects that are very difficult to preserve: they are complex, dynamic, often non-renderable, and so forth.
If you do not preserve research data, this book is still important for you, because other sectors (cultural heritage, archives) that started out with simple objects will increasingly be faced with more complex varieties, as content producers are discovering the extra possibilities and putting them to good use.
|The 'Droste' effect|
To tackle the problems of more complex objects, Giaretta and the CASPAR project team developed a theory around the Representation Information Network. Simply put: a (or rather: any) data object is nothing but ones and zeros; it must be accompanied by representation information in the metadata that gives you what you need to 'independently interpret, understand and use' (in OAIS language) the data object. The data object can be a single file or multiple files, and the representation information can be anything from a scribbled handwritten note to a complex machine-readable formal description (pp. 17 ff.). In Giaretta's more accessible advocacy language: you have something that is unfamiliar (ones and zeros) and the representation information gives you what you need to make it familiar. However, representation information is not a straightforward thing: it is more like a set of Russian babushka dolls (in Dutch we would refer to the 'Droste effect', after the cocoa nurse who serves from a cocoa tin that bears her own image, in which she serves from a cocoa tin that ...). A Word document cannot be understood with Microsoft Office software alone; you will also need the operating system, and the programming language, and so forth. You will need every dictionary, every definition, every standard, every specification that is used somewhere along the line - until you connect with the knowledge base of your designated community, that is, with what your designated community has at its disposal in terms of software, hardware and the knowledge to work with them.
Over time, as technology evolves, the 'unfamiliarity' of a digital object will increase and the amount of representation information needed to connect with your designated community will increase with it. Our job is to manage that process and make sure there is always enough representation information to connect with our users. Preferably in an automated way, because there is no way we can do this manually (unless of course we have a truly endless flow of money ...).
Giaretta and his CASPAR team argue that this is the only method that will work for all digital objects, no matter how simple or complicated. The trick will of course be to build that automated process that will keep our digital objects "fresh".
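To make the idea a little more concrete, here is a minimal sketch in Python of how one might walk such a network. It is my own illustration, not code from the book or from CASPAR: the dependency map `REP_INFO`, its entries and the function name `missing_rep_info` are all hypothetical. The point it tries to show is that the chain of representation information only stops where it meets the knowledge base of the designated community.

```python
# Hypothetical network: each item lists the representation information
# needed to interpret it. All names are illustrative assumptions.
REP_INFO = {
    "report.doc": ["Word file format"],
    "Word file format": ["word processor"],
    "word processor": ["operating system"],
    "operating system": ["hardware architecture"],
    "hardware architecture": [],
}


def missing_rep_info(obj, knowledge_base, network):
    """List the representation information that must be supplied with `obj`
    so that a community with the given knowledge base can use it."""
    needed = []
    seen = set()
    stack = [obj]
    while stack:
        item = stack.pop()
        for dep in network.get(item, []):
            # The chain stops at anything the community already has.
            if dep in seen or dep in knowledge_base:
                continue
            seen.add(dep)
            needed.append(dep)
            stack.append(dep)
    return needed


# A community that still runs the operating system needs two extra items.
today = missing_rep_info("report.doc", {"operating system"}, REP_INFO)

# A later community that has lost that knowledge needs the whole chain.
later = missing_rep_info("report.doc", set(), REP_INFO)
```

In this toy example `today` holds two items and `later` holds four: as the knowledge base shrinks, the list of representation information to be kept alongside the object grows, which is the process that would have to be automated.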
More research is needed to turn this theory into something practical. Meanwhile there is this book to enjoy and learn from, including excursions into non-technical territory: repository audits, preservation chains, business models, stakeholder analysis, and more. Giaretta's fluid style of writing, the many cross-references, summaries, and warning signs have enabled me to delve deeper into the technical level than I thought possible. And I am still learning.
What I would like to see next, however, is more interaction between what Giaretta is developing and what the Open Planets Foundation led by Bram van der Werf (and the related SCAPE project) is working on. What would be really great to have for the community is their joint views on what works and what does not - and in which circumstances, and the direction R&D should take. How about it, gentlemen?
David Giaretta [et al.], Advanced Digital Preservation (Springer, 2011, ISBN 978-3-642-16808-6, €99.95).
This book looks like it will be the next big textbook for digital preservation classes ... will have to give it a read-through. I loved the photo of this man in your other post where he is standing in front of his slide. It's almost psychedelic.
I have heard some digital archivists saying: "We need more heterogeneity in terms of solutions."
I guess that the solution/idea described in his book is a step in the right direction.