Seeing double

I’ve been making good progress with processing and ingesting some of our born-digital collections – in particular the records produced by the University. The most difficult thing about this work has been ensuring that we receive the files in the first place! I’ve chosen to make a start on this material because in the main it is predictable (usually Word documents or PDFs of various sorts) and we understand the context of it and in some cases get some additional metadata. We’re very lucky here at Warwick University because we’re well resourced in terms of having a Records Management Team (yes that’s right – there’s more than one person doing it!) plus me and two archives assistants who are able to spend some time on processing and cataloguing. And yes – quite a bit of time is spent on sending emails saying “where are these committee minutes” or “can you send them without password protection” and so on. There is no denying that the capture part is labour intensive before you’ve even started on the digital processing.

There’s a lot of fine tuning to be done in digital preservation and it can be very time consuming

I’ve developed a workflow document for the team here to follow so that the processing is consistent, although I am also constantly reviewing and revising our workflows. Digital preservation is not something which can be “achieved”; it’s an ongoing process, from fixity checking through to revising workflows and normalising files for preservation and access. You will literally NEVER be done. But don’t let me put you off…

Workflow for initial processing of committee minutes

For these regularly deposited and (relatively!) unproblematic files we have adopted a low-level processing workflow. The selection, appraisal and quite often the arrangement have already been done by the creators so we focus on cataloguing and ingesting the files into the preservation system. A file list (not really a manifest) is created using a dir /b command, with the output redirected to a text file, and used to list the files in the catalogue. This means any one of us can quickly and easily create this type of document. At present I have generally not been including a file manifest as part of any submission documentation – mainly because I’m trying to streamline the process and I would have to add it in manually. Also the file list is captured in the catalogue metadata. I’m not too worried about where the information is captured as long as it’s captured somewhere.
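For anyone not working at a Windows command prompt, the same kind of bare file list can be sketched in Python. This is only an illustration, not part of our workflow document: the function names and the output file are made up for the example, and unlike a plain directory listing it deliberately leaves out sub-folders.

```python
from pathlib import Path


def list_files(folder):
    """Return the names of the files directly inside `folder`, sorted.

    Sub-folders are skipped (an assumption made for this sketch).
    """
    return sorted(p.name for p in Path(folder).iterdir() if p.is_file())


def write_file_list(folder, output_path):
    """Write one file name per line to `output_path` and return the names."""
    names = list_files(folder)
    Path(output_path).write_text("\n".join(names) + "\n", encoding="utf-8")
    return names
```

The resulting text file can then be pasted into the catalogue record in the same way as the command-line output.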

However with some of the legacy files (ie the ones which have been lurking around on the Shared Drive for a year or six) I have more often been needing something a bit more involved. This is in part because the legacy material includes duplicates, surrogates and other versions so at this point I am more likely to be making some appraisal decisions or otherwise documenting what I have. For these collections I have been making file manifests, usually using DROID. The process of identifying duplicate files (deduplication) is a key part of management and appraisal decisions. Running a DROID report over the files gives you some great metadata to get started with – it identifies the file types, and gives each file a checksum. With the report in csv format you can sort by file type and checksum which gives you instant results for the number of each file type and also allows you to see where there are duplicate checksums (which denote duplicate files). This is fine where you are dealing with 10 or 15 files but does not scale up – when I ran it across the 1,000 or so files I was dealing with I just couldn’t see where the duplicates were that easily.
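To show the idea of spotting duplicates by checksum rather than by eye, here is a minimal Python sketch which groups the rows of a DROID csv export by checksum. It is an illustration only and not the route I ended up taking. The column names (MD5_HASH, FILE_PATH, FORMAT_NAME) are assumptions about how the export is configured, so check them against the header row of your own report. Rows with no checksum (such as folders) are skipped when looking for duplicates.

```python
import csv
from collections import Counter, defaultdict


def read_droid_report(csv_path):
    """Load a DROID csv export as a list of dictionaries, one per row."""
    with open(csv_path, newline="", encoding="utf-8") as handle:
        return list(csv.DictReader(handle))


def find_duplicates(rows, hash_column="MD5_HASH", path_column="FILE_PATH"):
    """Map each checksum that occurs more than once to the paths sharing it.

    The default column names are assumptions; pass your own if they differ.
    """
    groups = defaultdict(list)
    for row in rows:
        checksum = (row.get(hash_column) or "").strip()
        if not checksum:
            continue  # no checksum recorded (eg a folder), so nothing to compare
        groups[checksum].append(row[path_column])
    return {h: paths for h, paths in groups.items() if len(paths) > 1}


def count_formats(rows, format_column="FORMAT_NAME"):
    """Count rows per format name; rows with no format name (including
    folders) are counted under 'unidentified'."""
    return Counter((row.get(format_column) or "unidentified") for row in rows)
```

Calling find_duplicates(read_droid_report("report.csv")) (where report.csv is a placeholder name) returns only the checksums shared by more than one file, which is much easier to read than a sorted spreadsheet of a thousand rows.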

DROID report csv but so many files – ugh!

However thankfully help was at hand courtesy of David Underdown (from the UK National Archives) and the csv validator, which I hadn’t previously come across. Even better, there is a user-friendly blog post to accompany it, with which (and with rather a lot of help from David) I created a csv schema which not only reported on duplicates (as errors) in a csv file but also (with an extra tweak) weeded out any null reports where DROID found a container (eg a folder) for which it did not create a checksum.

Report of schema indicating where the duplicate (and therefore error) files occur.

If you want to have a go with this (assuming you’ve got DROID up and running) you can download the CSV validator here and then upload your DROID csv report and a copy of the deduplication schema (copy and paste it from here into a text file and save it somewhere). Hit the validate button and instant deduplication.

Having tried these things out largely whilst “chatting” over Twitter there also followed some great accompanying discussion including a great tip from Angela Beking of Library and Archives Canada who pointed out that you can set filters on DROID profile outputs (I shall be having a go with using this functionality too).

Other people came up with some alternative tools to try (eg TreeSize or HashMyFiles). There are literally hundreds of tools out there for performing all sorts of tasks – you can find some described at COPTR (Community Owned digital Preservation Tool Registry) – and I would encourage everyone to contribute to COPTR if they find a tool they like that’s useful for a particular aspect of their workflow. Free tools in particular are great for people working with small budgets (and who isn’t doing that?).

Always worth spending time trying to find the right tool to suit your needs.

This all started out with trying to find a way to weed out duplicate files and to do a bit less seeing double but ended up being a conversation and a piece of collaborative work which has certainly helped me see more clearly. My next step is to try and integrate the report outputs of this into my workflows. I hope some of the sharing of this work is helpful to other people too.

Gerald Aylmer Seminar 2019: Digital and the Archive

A sunny day at the National Archives

It’s been a regular-ish part of my calendar for a few years now to attend the Gerald Aylmer Seminar which has been held annually since 2002. This year’s theme was Digital and the Archive – so of great relevance to my current work and anyone engaged with digital preservation. It’s a great opportunity for historians and archivists to get together and share their work and experiences – something that ought to happen rather more often than it does.

In fact I’ve been very interested in how we might start to present our legacy born digital holdings to our users and potential users – what do researchers want from these kinds of sources? Do they yet know themselves? Jane Winters, who was one of the Keynote speakers, has been asking these very questions and it was great to hear her outlining some of the challenges of getting researchers to interact with born digital material and pointing out some of the difficulties which still remain in capturing born digital resources and making them available.

John Sheridan completed the tripartite keynote, begun with Alice Prochaska looking back on a glittering career in digital librarianship and scholarship, by delivering a “call to arms” to archivists to develop the necessary archival practice to meet the challenges of capturing today’s digital sources (not just preserving yesterday’s). He suggested we do not (yet) have the right levels of advocacy to achieve this. I do agree with this although I am not convinced it is an entirely new problem. I also recognise the tensions inherent in constantly managing legacy collections whilst also keeping up with the material which is being produced right now.

The next session focussed on the “hybrid” nature of the archive with Jen Mitcham talking about her work on the Marks and Gran archive which she has blogged about here. This came at a good time for me as I am currently taking my first steps in digital forensic work which I will be blogging about very soon. Something which I really took away from Jen’s talk was her reminder of the “user experience” of working with legacy born digital files (in her case with word processing packages from the 1980s) where the whole design and probably use of the software package was to produce a physical document. This is an important factor to bear in mind when considering (as I am doing) how to represent some digital objects from the archives. The theme of user interaction was further taken up by the following speaker, Professor James Newman from Bath Spa University, who had recorded a presentation on his work capturing the user experience in video games (specifically Super Mario Brothers – you can read all about it here!).

In the afternoon we heard about the fascinating work which has been undertaken by Ruth Ahnert on Tudor State networks which opened up huge possibilities using metadata derived from calendars and catalogues, as well as stressing the importance of linked open data in reconstructing networks of these kinds. Again her work is available here to read.

Rachel Foss from the British Library gave a fascinating insight into their “enhanced curation” work where they gather a huge amount of supporting information about the people whose papers they take – groundbreaking and innovative stuff which I am sure there is much to be learnt from, even if we can’t all get a trip to the south of France to record the ambient sounds of the valley where John Berger lived…

It is interesting that the British Library do ask authors questions about their writing practices in terms of engagement with digital technologies, something which gives key insights to understanding the digital collections. It would be interesting to see how this information is represented in the metadata made available to the researcher.

Adrian Glew from the Tate introduced a huge community engagement project in which the Tate was involved, the outputs from which have been shared.

The final presentation was from Naomi Wells talking about working with London’s Latin American Community on documenting their experiences. There were some very interesting findings in relation to attitudes towards digital and physical heritage – websites and other digital resources were seen as inherently ephemeral as opposed to physical objects. It was difficult to get the same level of engagement for the digital legacy.

The day ended with a panel “provocation” led by Margot Finn from the Royal Historical Society with Kelly Foster (blue badge guide and wikimedian), Jo Pugh (National Archives) and Jo Fox (Institute of Historical Research) all contributing to a thought provoking discussion. Foster drew out the power of open data and of licensing both to give appropriate credit to voices which are often obscured from the narrative and ended on a call to “open up” data and metadata. It’s something I’m going to take away with me and start acting on!

Memory Makers 2018

Beautiful Amsterdam

I was extremely lucky to be at the Amsterdam Museum for both the Memory Makers: Digital Skills and How to Get Them conference and also the Digital Preservation Awards 2018, where excellent practice across the sector was recognised and rewarded.

I missed the ePADD workshop in the morning but I did get to meet Josh Schneider later – which was great – so I need to make sure I follow up on my email preservation work so I can bother him with more questions in the future.  That’s one of the really great things about conferences – you get to meet people whose work you have followed and admired – this helps create connections, establishes areas of interest and builds communities. 

The conference was kicked off with an inspiring but impossible to summarise keynote from Eppo van Nissen tot Sevenaer, Director of the Netherlands Institute for Sound and Vision (Beeld en Geluid) in which he encouraged us all to be archivist activists and quoted William Ernest Henley’s poem “Invictus”. It fired us all up for a conference which explored how digital preservation knowledge is taught, acquired and disseminated.

The first session focussed on teaching “Digital Preservation” (there was quite a bit of discussion about what constituted this and how it was best described in terms of curricula). Eef Masson of the University of Amsterdam who teaches on a Masters Programme on Preservation and Presentation of the Moving Image discussed how the disciplines of film and media studies intersected with and led to collaboration with the traditional archives programmes – to everyone’s benefit. Sarah Higgins from University of Aberystwyth talked frankly about the difficulties of engaging students from humanities backgrounds with digital skills. Many (although by no means all) people choosing a career in archives do so because they like “old things” – this struck a chord with me as I am a medievalist by training and I have learned to “love the bits”. How did I get there and how can I take others there with me? It seems there is a need to engage and inspire people with our less tangible digital heritage. Later that evening on receiving her DPC Fellowship award Barbara Sierman said:

One of my big takeaways from the conference was how to engage people with digital preservation and encourage people to get as excited about it as I am! After Sarah Higgins, Simon Tanner rounded off the session talking us through King’s College’s Digital Asset and Media Management MA which boomed in numbers once they added the word “Media” into the title. The list of MA student dissertation topics sounded absolutely fascinating and very varied. Tanner explained that they don’t teach Digital Preservation as a module but rather it is woven into the fabric of the degree.

Sharon McMeekin of the Digital Preservation Coalition began the second session of the afternoon by talking through the survey of what kind of training members said they wanted (which might not necessarily be the same as what they ought to be focussing on…).  She encouraged sharing best and worst (!) practice and emphasised that Digital Preservation is a career of continuous learning – something to be aware of when employing someone in that role.  Next was Maureen Pennock of the British Library who illustrated an enviable internal advocacy strategy. She explained:

The final speaker of the day – Chantal Keijsper of the Utrecht Archives – described the “Twenty First Century” skills and competencies needed to realise our digital ambitions.

The evening was taken up with the Digital Preservation Awards 2018 which you can read about here. They were all worthy winners and there were many extremely unlucky losers. Almost all of my nominees won their category – I’m saying nothing beyond re-iterating my love for ePADD – they were very worthy winners in their category!

Jen Mitcham of the DPC and me at the awards ceremony

Day two of the conference was a chance for some of the Award finalists to showcase their work. First up was Amber Cushing from University College Dublin discussing the research done to try and build the digital information management course at the institution. In a targeted questionnaire aimed at those who had responsibility for digital curation there was a surprising lack of awareness of what digital preservation/curation was and a confusion between digital preservation and digitisation. Next up was Rosemary Lynch who was part of the Universities of Liverpool and Ibadan (Nigeria) project to review their Digital Curation curriculum. Both institutions learnt a lot from the process, which enabled them to make changes to their student offer. With support from the International Council on Archives this project has helped make standards and other resources available in countries where this can be difficult. Next was Frans Neggers from the Dutch Digital Heritage Network (Netwerk Digitaal Erfgoed) talking about the Leren Preserveren course launched in October 2017 enabling Dutch students to learn practical digital preservation skills. They have had excellent feedback from the course:

“I expected that I would learn about digital preservation, but I learned a lot about my own organization, too”

Student on the Leren Preserveren project

and Neggers added that another benefit was raising the profile of the Dutch Digital Heritage Network – often this course was how people got to find out about the organisation.  The final speaker in this session was Dorothy Waugh from Emory University, one of a group of archivists who have developed the Archivist’s Guide to Kryoflux.  I can testify that this is an invaluable piece of work for anyone planning to (in my case) or actually using a Kryoflux device (designed to read obsolete digital media carriers).  The Kryoflux was developed by audio visual specialists and does not come with archivist-friendly instructions:

In the final session we heard some great examples of training and advocacy. Jasper Snoeren from the Netherlands Institute of Sound and Vision (Beeld en Geluid) talked about their “Knowledge Cafes” where they invite staff to share a drink and learn about curation and preservation. He discussed how to turn a sector into a community: run very focussed training programmes and keep people engaged in between. Puck Huitsing from the Dutch Network of War Collections (Netwerk Oorlogsbronnen) followed and had a great deal of useful advice which would constitute a blog post in itself although my favourite quote was probably:

Rounding off an extremely useful and successful event, Valerie Jones from the UK National Archives presented the Archives Unlocked Strategic Vision for the sector, tempering this by saying:

If you’re going to innovate, just do it. Don’t write reports. Just go.

Valerie Jones, UK National Archives

I learnt a great deal at this conference and as usual I have added more to my “to do” list, especially around tackling internal advocacy and I can’t wait to start putting this into practice.

Stroopwaffels and coffee kept me going!

World Digital Preservation Day 2018


When people ask me what I spend my time doing I generally say “digital stuff” or (when I’m being more frivolous) “staring at spreadsheets”. Sometimes I explain a bit further and say – “I try and ensure that old files and digital content will still be usable long into the future” and actually most people think about this and say – gosh – how do you solve that problem?

Well the answer is, of course, that it isn’t something you “solve” any more than you would “solve” the problem of looking after archives, manuscripts or physical artefacts.

So on World Digital Preservation Day, when we are all thinking about Digital Preservation, what are the challenges which a Digital Archivist faces and what approaches might she consider taking?

The creators

I actually spend a good deal more of my time than you might imagine talking to people, emailing them and creating guidance and advice. If you are lucky – as I am – you have people in your organisation who are working with you who can spread the word about the importance of records and data management (and how this hugely enables the preservation process) and try and get the message out to the people who are creating the digital stuff in the first place. This part of the work is extremely time consuming; capturing the right data or documents in the best formats with the appropriate metadata and contextual information can involve a great deal of to-ing and fro-ing. It’s really useful to be able to find out how people work and manage their files. There is no sense writing guidelines for people to follow if they don’t reflect their own real life situation. It just means that people won’t follow them.

Having said that – especially for World Digital Preservation Day – I have updated and promoted some general guidelines for our potential and actual depositors which you can see here. Have a look and tell me what you think! I’m going to be using it to promote to internal and external depositors in the coming year.

The knotty problems

Actually not necessarily one of my knotty problems…

One of the biggest challenges of Digital Preservation – without doubt – is capturing the “stuff”* in the first place. This sounds so obvious but is worth repeating: you can’t save what you don’t have.

I’ve been working with our Records Management team looking at the records which the University produces – minutes of the Council and Senate and other committees; files which are vital for the running of the institution and need to be retained for both legal and historical reasons.

The files contain sensitive information (for business and personal data reasons) so we needed a secure method of transfer which ruled out email (too easy to mis-send – in fact this is the largest cause of inadvertent data breaches). Similarly using a carrier like a USB drive or a hard drive was potentially insecure and also very clunky, adding an extra step to the deposit process. The harder it is to transfer the data the less likely people are to do it, as any systems designer knows… We turned to our institution’s own file sharing platform which satisfied our information security team and was already familiar to the people creating the files. Perfect – I thought – especially as we get a notification email once a file has been deposited. However – and there’s always a however – it appears that uploading the files to the file share changes the last modified date. Even worse – it appears that it sometimes changes the last modified date. And sometimes it doesn’t. So what I thought was going to be a straightforward solution turned out to be more of a knotty problem. Added into the mix is also the question of “how much does it matter if the last modified date is not correct”? We all know they are very unreliable and in this context I anticipate that most of the material will be deposited very soon after creation, so the dates will not be wildly inaccurate. Something for me to work on…
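One way to find out how often the date really does change would be to record each file’s checksum and last modified date before it is uploaded and compare them with the copy that comes out the other side. The Python sketch below is only an illustration of that idea and not something we have put in place; the function names are made up for the example and it assumes both copies of the file are reachable as ordinary paths.

```python
import hashlib
import os
from datetime import datetime, timezone


def snapshot(path):
    """Record the SHA-256 checksum and last modified date (UTC) of a file."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        # Read in chunks so large files do not have to fit in memory.
        for chunk in iter(lambda: handle.read(65536), b""):
            digest.update(chunk)
    modified = datetime.fromtimestamp(os.path.getmtime(path), tz=timezone.utc)
    return {"sha256": digest.hexdigest(), "last_modified": modified.isoformat()}


def compare(before, after):
    """Report whether the content and the last modified date survived transfer."""
    return {
        "content_unchanged": before["sha256"] == after["sha256"],
        "date_unchanged": before["last_modified"] == after["last_modified"],
    }
```

If the content is unchanged but the date is not, the original date from the first snapshot could at least be kept alongside the file as metadata.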

Back to the future

Much of the work I do is working with internal depositors, IT services, Records Management and increasingly with external depositors. However we still have plenty of work to do stitching together the work I do around digital curation and the traditional archives elements of the service.  We have made a start examining our cataloguing practices for how fit they are for the future. There has been a lot of good work done already (for example by the University of California) on guidelines for cataloguing born digital and hybrid collections but it’s still very much a developing field.  The impact and uptake of Records in Context is still unknown and we are not starting from scratch so the approach we are taking is to build on the cataloguing practices which have already been developed over many years here.  At present we are only ingesting small quantities of digital material so we can craft our cataloguing practices to meet this need, but there is going to have to be some radical rethinking to meet large-scale deposits which will start to come soon enough.  This will affect how we use our collections management tools and how we present the catalogue to the public.  I can see some data modelling work taking place to help map relationships between versions and iterations (let me add that to the to do list below).  All this is very much based in traditional archival principles and is going to involve getting everyone in the team onboard.

The community

The Digital Preservation Community is relatively small and disparate but I could not do my job without you (yes that’s all of you out there with an interest in digital preservation!). Whether it is sparking ideas, offering advice and guidance, sharing best and worst practice (Digital Preservationists Anonymous – yes it’s a thing) or just being a sounding board the contribution of countless people out there in helping me is a fantastic thing. And a big shout out to the Digital Preservation Coalition whose invaluable work in enabling and bringing people together does a huge amount to support the community worldwide. I hope I do my best by sharing my joys and frustrations via Tweets and my blog.

So much to do

I’ve got an ever increasing list of things I want to do which includes:

  • set up and test ePADD. You can read about my false start here but I’ve now got more RAM so there’s nothing holding me back apart from finding the time
  • undertake a survey of the datasets in our institutional repository WRAP to see what file formats we are dealing with. I’ll be interested to see how the results compare to what I found when I was at Lancaster University.
  • get a forensic workstation up and running. It’s on order but I need it in front of me to be any use!
  • revisit my digital asset register – I created this in my first couple of months but I need to return to it to look more carefully and match it to our catalogue and also to some storage management work that I’d like to undertake


So today is World Digital Preservation Day and I will be in Amsterdam celebrating the achievements of many in the community at the Digital Preservation Awards Ceremony.  There are some brilliant nominations for some of the great work done in the last year or so and you can watch it streamed live here.  Meanwhile I need to get on with my to do list and maybe start on a project worthy of a future nomination?

*stuff: technical term for the stuff made of bits and bytes… If you come up with a better term please let me know….

Archivematica UK User Group Meeting November 2018

Modern Records Centre, University of Warwick (Image: Modern Records Centre)

Yesterday it was a privilege for us to host the autumn Archivematica User group meeting here at the University of Warwick – the 9th User Group meeting since its inception in 2015. I wasn’t at that first meeting but I have been to all of the rest and they are a great opportunity for people who are interested in, experimenting with or using Archivematica in full production mode to get together and discuss their experiences.

I used host’s privilege to kick off proceedings by giving a brief introduction to where we are at the University of Warwick which I illustrated like this:

Ain’t no mountain high enough (Image: Pixabay)

It does sometimes feel like we have a mountain to climb. We have various issues with our installation of Archivematica to sort out and then when we’ve got those sorted it’s on to the really tough stuff! We know that the future of digital archive processes is going to be about dealing with large quantities of material so we need to work on automating as many of our processes as possible. A good place to start on this is automating the capture of descriptive metadata and also as many of the ingest processes as possible. There are so many questions and I hope to be able to report on our progress at future meetings.

Next up we heard from Jenny Mitcham of the University of York at her final Archivematica UK User group meeting before she moves on to pastures new at the Digital Preservation Coalition. Jenny was reporting back on her work on some old WordStar files which form part of the Marks and Gran archive. She has already blogged about her adventures with these files and she came to the meeting to report on her most recent work using the manual normalization function in Archivematica. Jenny emphasised that this work is incredibly time consuming and requires lots of experimentation and QA. The work involved testing migrating the files to different formats – PDF, TXT and DOCX. By comparing the migrated results to an original version of WordStar which Jenny has running on an old machine in the corner of her office she could see that each normalised format captured some of the properties of the original but none of them captured them all. There was a further complication in that some files had the same name (but with different extensions) which Archivematica does not like. On top of all of this PREMIS metadata has to be added manually to record the event – this gives not entirely satisfactory results in terms of the information that it represents (or doesn’t represent). The whole normalisation process is long and complex and is summarised with a short and not entirely decipherable PREMIS entry. Jenny’s main takeaway is that Archivematica struggled in situations where the normalisation path was unclear. Any of the three normalised files she produced could be an AIP or a DIP but Archivematica does not allow for more than one of each.

Following these presentations the group had a short discussion on the appraisal tab feature in Archivematica. We had previously asked people to test it out and report back on whether they thought it was a feature they were likely to use or not. We had a relatively small number of people saying they had tried using it and possibly this reflects difficulty of use and/or the fact that the feature was designed for use specifically with ArchivesSpace (and therefore doesn’t necessarily integrate with AtoM or other systems). There also followed some discussion of how much appraisal people were likely to do in Archivematica (as opposed to before ingesting into the system). There was some feeling that it might be more useful if it did integrate with AtoM but this of course would require development work. Food for thought…

There was also a short discussion on “how much” IT support we felt an institution might expect to need to support running an instance of Archivematica. Admittedly this is a bit of a “how long is a piece of string?” question but there were some valuable contributions around how advocacy was needed to engage IT support colleagues which might lead to more of a feeling of ownership and help develop enthusiasm and experience (they go hand in hand). There was also discussion of costings and creating a business case (the Digital Preservation Coalition got a name check here).

String: how long is it? (Image: Pixabay)

After lunch we heard from Hrafn Malmquist from the University of Edinburgh who was updating us on his work automating their Archivematica workflow. We heard at the last meeting about the beginnings of this piece of work creating an integration between Archivematica and DSpace and ArchivesSpace. I was extremely impressed by the way in which the SIP is processed and produces two AIPs, one of which goes through to a dark archive and the other to DSpace. The DIP which is produced is then also pushed to DSpace which then creates a link to ArchivesSpace. So far just getting all the storage and access locations working smoothly is impressive enough but Hrafn says there is more to do – for example the DIP file structure is flat where it should be more hierarchical.

Matthew Addis suggested this is how we felt about dabbling with the FPR… (Image: Jeff Eaton: CC BY-SA 2.0)


Next up was Matthew Addis talking about his “journey into the FPR”. For many Archivematica Users (at least for those of us who discussed this at the Winter 2017 meeting in Aberystwyth), the Format Policy Registry is a thing to be approached with extreme caution. Archivematica offers the user the option of customising the normalisation pathways although as we saw with Jenny’s presentation approaches to normalisation are extremely complicated and often require a decision making process based on “least worst” options. Matthew’s normalisation work was around Office documents and emails. One example was creating a normalisation path for Powerpoint files to PDF(A) where the process is lossy as animations, fonts, comments and all sorts of other content are lost. Normalising to an Open Document Format might be preferable but this format is not widely supported making the files relatively inaccessible. Analysing files for significant properties is extremely complicated and time consuming and in the end not easy to quantify; how do you measure which particular property has “more” significance than another if you are trying to compare processes? Another challenge was that Archivematica only supports one input format, one tool and one output format and sometimes more than one format and more than one tool might be involved in normalising a file. It was good to be reminded just how complex office documents are and how they cause no end of a headache for anyone planning for future resilience.

Our final presentation for the day was from Alan Morrison from the University of Strathclyde. He took us through their Research Data Management workflows using Archivematica. They share an instance of Archivematica with their Archives and Special Collections but there is little overlap between the two services. At present Archivematica is used just to create AIPs which are then stored in the local network storage. DIPs are not created because Strathclyde use the front end of their institutional Research Information System (the database which manages all the research outputs) to make the datasets discoverable. Alan recognised that there are ongoing issues, not least poor interoperability between systems and too many manual actions which lead to human error. But there was much to look forward to as well, such as a possible development of dashboard monitoring to aid management of the AIPs and the development of a plug-in to integrate with an ePrints repository. He also mentioned a possible Scottish Archivematica Hub (given there are a number of Scottish institutions using Archivematica). We’ll definitely be looking forward to hearing more about this in the future.

To wrap up the day we were delighted to hear from Kelly Stewart of Artefactual Systems, making an early start in Vancouver to give us an update on Archivematica developments at their end. We’re looking forward to the release of Version 8.0 which is imminent and excited to hear about a possible Archivematicamp UK/Europe – are there any interested hosts out there?

Overall I really enjoyed the day – there was a lot to think about and I gave myself a couple of pieces of homework which I must get on with sooner or later.

If you are interested in Archivematica and would like to join the group, or just attend a future meeting to be able to chat to fellow users, then do get in touch with me: rachel_dot_macgregor_at_warwick_dot_ac_dot_uk

It takes a while to mature


I wrote in my last post about how I was looking for more resource so I can make progress on various outstanding preservation tasks. This is not a speedy process so in the meantime I am looking at ways to help in the search for more resource and also the ways in which I should be deploying the resources I do have. It seems like a good time to write a roadmap which will hopefully articulate the vision of where we are headed, identify concrete objectives and priorities, and help others understand the work we are trying to do.

First of all I would like to undertake some sort of audit of where we are as an organisation. I have long been an advocate of the NDSA Levels of Digital Preservation and if you have met me you have probably heard me banging on about them. I even have them pinned up next to my desk (I stole this idea from Jen Mitcham) alongside my favourite xkcd cartoon.

My desk

These are a great starting point but I’m looking for something a bit more in-depth. This is where I’ve turned to Maturity Modelling, which is a method of assessing where an organisation is at in different areas and scoring it to help define where improvements could be made and highlight areas which need the most attention. To help me undertake this assessment I looked at the suggestions in the Digital Preservation Coalition’s Digital Preservation Handbook and also turned to Twitter, not least because that’s a place where many of those who have developed these models are to be found.



The Digital Preservation Capability Maturity Model referred to above is definitely one I am interested in and can be found here. The Assessing Organisational Readiness toolkit proved harder to track down (as the Twitter conversation suggested there was a link rot issue) but I managed to get hold of a pdf version with another call out to Twitter (it would be great if there was some way of hosting it somewhere…). The AOR toolkit is also very useful; it is based on the 2009 Jisc AIDA toolkit (also hard to find) and the CARDIO Research Data Assessment. This is helpful as Warwick’s Research Data team have been developing their own roadmap using CARDIO and we are obviously keen to develop our services in a joined up and collaborative way. The third suggestion which I’m going to look closely at is the Kenney and McGovern “Five Stages of Digital Preservation”, which was not hard to track down and has its own DOI, giving at least some guarantee that link rot will be less likely.

I’ve started going through these models and each has different things to offer which are more or less useful to my particular situation. Every institution has its own priorities and ways of working and there is no one approach to digital preservation which will be applicable across the board. The roadmap I want to develop will hopefully help in the following areas:

  • establishing my digital preservation priorities
  • working out how to develop and move forward with preservation activities
  • highlighting areas for collaboration within the organisation
  • raising the profile of digital preservation work within the organisation
  • helping make the case for additional resources based on an analysis of our current position

Using my assessment tools I can then identify my stakeholders and work towards a better understanding of where we are as an organisation and how we move forward.

So for now it’s back to my beloved spreadsheets and time to do some scoring!