Yesterday it was a privilege for us to host the autumn Archivematica User Group meeting here at the University of Warwick – the 9th User Group meeting since the group's inception in 2015. I wasn’t at the first meeting but I have been to all of the rest and they are a great opportunity for people who are interested in, experimenting with or using Archivematica in full production mode to get together and discuss their experiences.
I used host’s privilege to kick off proceedings with a brief introduction to where we are at the University of Warwick, which I illustrated like this:
It does sometimes feel like we have a mountain to climb. We have various issues with our installation of Archivematica to sort out, and once we’ve got those sorted it’s on to the really tough stuff! We know that the future of digital archive processes is going to be about dealing with large quantities of material, so we need to work on automating as many of our processes as possible. A good place to start is automating the capture of descriptive metadata and as many of the ingest processes as possible. There are so many questions and I hope to be able to report on our progress at future meetings.
Next up we heard from Jenny Mitcham of the University of York at her final Archivematica UK User Group meeting before she moves on to pastures new at the Digital Preservation Coalition. Jenny was reporting back on her work on some old WordStar files which form part of the Marks and Gran archive. She has already blogged about her adventures with these files and she came to the meeting to report on her most recent work using the manual normalisation function in Archivematica. Jenny emphasised that this work is incredibly time consuming and requires lots of experimentation and QA. The work involved testing the migration of the files to different formats – PDF, TXT and DOCX. By comparing the migrated results to an original version of WordStar, which Jenny has running on an old machine in the corner of her office, she could see that each normalised format captured some of the properties of the original but none of them captured them all. There was a further complication in that some files had the same name (but with different extensions), which Archivematica does not like. On top of all of this, PREMIS metadata has to be added manually to record the event – this gives not entirely satisfactory results in terms of the information that it represents (or doesn’t represent). The whole normalisation process is long and complex, yet it is summarised with a short and not entirely decipherable PREMIS entry. Jenny’s main takeaway is that Archivematica struggled in situations where the normalisation path was unclear. Any of the three normalised files she produced could be an AIP or a DIP but Archivematica does not allow for more than one of each.
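As an aside, a name clash of this kind is easy to spot before ingest. What follows is only a minimal Python sketch of my own, not anything Jenny presented and not part of Archivematica: the function name `find_stem_clashes` is hypothetical, and it assumes that "the same name" means the same file name once the final extension is removed, compared without regard to case and within a single folder.

```python
from collections import defaultdict
from pathlib import Path


def find_stem_clashes(root):
    """Map (folder, lower-cased stem) to the file names in that folder sharing the stem.

    Hypothetical helper: only groups containing more than one file are returned.
    """
    groups = defaultdict(list)
    for path in Path(root).rglob("*"):
        if path.is_file():
            # path.stem is the file name without its final extension.
            groups[(str(path.parent), path.stem.lower())].append(path.name)
    # Keep only the stems that occur more than once in the same folder.
    return {key: sorted(names) for key, names in groups.items() if len(names) > 1}
```

Running it over a transfer folder before ingest would return a dictionary of the clashing groups, which could then be renamed or split into separate transfers by hand.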
Following these presentations the group had a short discussion on the appraisal tab feature in Archivematica. We had previously asked people to test it out and report back on whether they thought it was a feature they were likely to use. Relatively few people said they had tried it, which possibly reflects difficulty of use and/or the fact that the feature was designed specifically for use with ArchivesSpace (and therefore doesn’t necessarily integrate with AtoM or other systems). There also followed some discussion of how much appraisal people were likely to do in Archivematica (as opposed to before ingesting into the system). There was some feeling that it might be more useful if it did integrate with AtoM but this of course would require development work. Food for thought…
There was also a short discussion on “how much” IT support we felt an institution might expect to need to run an instance of Archivematica. Admittedly this is a bit of a “how long is a piece of string?” question but there were some valuable contributions on the advocacy needed to engage IT support colleagues, which might lead to more of a feeling of ownership and help develop enthusiasm and experience (the two go hand in hand). There was also discussion of costings and creating a business case (the Digital Preservation Coalition got a name check here).
After lunch we heard from Hrafn Malmquist from the University of Edinburgh who was updating us on his work automating their Archivematica workflow. We heard at the last meeting about the beginnings of this piece of work, which creates an integration between Archivematica, DSpace and ArchivesSpace. I was extremely impressed by the way in which the SIP is processed and produces two AIPs, one of which goes through to a dark archive and the other to DSpace. The DIP which is produced is then also pushed to DSpace, which then creates a link to ArchivesSpace. Just getting all the storage and access locations working smoothly is impressive enough, but Hrafn says there is more to do – for example the DIP file structure is flat where it should be more hierarchical.
Next up was Matthew Addis talking about his “journey into the FPR”. For many Archivematica users (at least for those of us who discussed this at the Winter 2017 meeting in Aberystwyth), the Format Policy Registry is a thing to be approached with extreme caution. Archivematica offers the user the option of customising the normalisation pathways although, as we saw with Jenny’s presentation, approaches to normalisation are extremely complicated and often require a decision making process based on “least worst” options. Matthew’s normalisation work was around Office documents and emails. One example was creating a normalisation path for PowerPoint files to PDF(A), where the process is lossy as animations, fonts, comments and all sorts of other content are lost. Normalising to an Open Document Format might be preferable but this format is not widely supported, making the files relatively inaccessible. Analysing files for significant properties is extremely complicated and time consuming and in the end not easy to quantify: how do you measure which particular property has “more” significance than another if you are trying to compare processes? Another challenge was that Archivematica only supports one input format, one tool and one output format, whereas sometimes more than one format and more than one tool might be involved in normalising a file. It was good to be reminded just how complex office documents are and how they cause no end of a headache for anyone planning for future resilience.
Our final presentation for the day was from Alan Morrison from the University of Strathclyde. He took us through their Research Data Management workflows using Archivematica. They share an instance of Archivematica with their Archives and Special Collections but there is little overlap between the two services. At present Archivematica is used just to create AIPs which are then stored in the local network storage. DIPs are not created because Strathclyde use the front end of their institutional Research Information System (the database which manages all the research outputs) to make the datasets discoverable. Alan recognised that there are ongoing issues, not least poor interoperability between systems and too many manual actions which lead to human error. But there was much to look forward to as well, such as a possible development of dashboard monitoring to aid management of the AIPs and the development of a plug-in to integrate with an EPrints repository. He also mentioned a possible Scottish Archivematica Hub (given there are a number of Scottish institutions using Archivematica). We’ll definitely be looking forward to hearing more about this in the future.
To wrap up the day we were delighted to hear from Kelly Stewart of Artefactual Systems, who made an early start in Vancouver to give us an update on Archivematica developments at their end. We’re looking forward to the release of version 1.8, which is imminent, and excited to hear about a possible Archivematicamp UK/Europe – are there any interested hosts out there?
Overall I really enjoyed the day – there was a lot to think about and I gave myself a couple of pieces of homework which I must get on with sooner or later.
If you are interested in Archivematica and would like to join the group, or just attend a future meeting to chat to fellow users, then do get in touch with me: rachel_dot_macgregor_at_warwick_dot_ac_dot_uk