Friday 8 June 2012

Sprucing up the TikaFileIdentifier

As it's International Archives Day tomorrow, I thought it would be nice to quickly share some news of a project we are working on, which should help us (and others!) to carry out digital preservation work a little bit more efficiently.

Following the SPRUCE mashup I attended in April, we are very pleased to be one of the organizations granted a SPRUCE Project funding award, which will allow us to 'spruce' up the TikaFileIdentifier tool. (Paul has written more about these funding awards on the OPF site.)

TikaFileIdentifier is the tool that was developed at the mashup to address a problem several of us were having: extracting metadata from batches of files, in our case within ISO images. Due to the nature of the mashup event, the tool is still a bit rough around the edges, and this funding will allow us to improve it. We aim to create a user interface, simplify the install process, and improve performance. If resources allow, we also hope to scope some further functionality improvements.

This is really great news: with the improvements this funding allows us to make, the TikaFileIdentifier will provide us with better metadata for our digital files, and do so more efficiently than our current system of manually checking each file in a disk image. Hopefully the simpler user interface and other improvements mean that other repositories will want to make use of it as well; I certainly think it will be very useful!