The second day of the Goportis Digital Preservation Summit was all about ingest, or 'receiving' content in your repository or archive. In his keynote, Seamus Ross, formerly of the digital preservation taskforce at Glasgow and now at the University of Toronto, was quick to dispel any illusions that 'receiving' is an easy thing to do.
Interoperability troubles – from OpenOffice to PowerPoint – causing stress before Ross’s presentation (Nina Stoffers, left; Ross, right).
Ross’s presentation was a complete Ingest 101 course, so I will let his slides tell most of the story.
Ingest is about “receiving” content from producers:
Ideally, we would want to create a workflow that is consistent, error-free, well-documented, and in accordance with our organization’s policies:
Preferably, you know who the producers of your content are and you start negotiating with them so that they deliver the best possible quality. However, keep in mind that whatever makes our lives easier is likely to make the producers’ lives more difficult. That is where the bargaining begins. Ideally, you get this:
But in practice, this is what you get most of the time:
Seamus Ross: ‘Most of the work we do during ingest is about fixing all these errors, is about compensating for the communication failures between producers and archives.’ So, what do we do?
How do we do all this? Ross: “You are a craftsman. You must accept that your tools are blunt.” Present tools for identification and validation are far from perfect. They still require a lot of manual work and the people who work with them must be very knowledgeable. Also, “You may be sure that producers will deliver error-laden stuff, no matter how well you train them.”
Ross stressed that policies are an essential part of the equation:
But even policies cannot guarantee smooth sailing:
Having said that, Ross did have a list of useful reference material for the audience, including an instructive case study at http://artefactual.com/wiki/index.php. Check out his slides when the complete set becomes available via the event website. He also mentioned the useful NDIIPP tools and services directory at http://www.digitalpreservation.gov/partners/resources/tools/index.html and the Cairo Tools Survey. But remember Ross’s warning that working with these tools requires quite a bit of prior knowledge.
During the Q&A, Adam Farquhar of the British Library offered his more optimistic view of the state of digital preservation (see yesterday’s post). Ross’s reply: ‘But that concerns only a narrow range of object types.’ Databases, for instance, are still a very real problem to deal with.
More good stuff from this densely packed conference in the next few days. About whether OAIS is still helpful, about tools, about file format registries. And about thinking before you act, the New Zealand version.