Friday 12 February 2010

music, planets and secret messages

My head is all a-buzz this morning with thoughts and ideas. There are several reasons for this, the first being the news that "Google has deleted at least six popular music blogs". I don't want to argue the rights or wrongs of illegal file sharing; that is not why my head is buzzing. Rather, it is that the struggle between the music industry and the pirates, all riding the waves of the Web, seems to me a struggle that will be of great importance to future scholars trying to work out what happened in the 21st Century to produce the business models of the 25th. Google's actions here could amount to someone burning Luca Pacioli's Summa de Arithmetica.

The second reason is we're just back from "Digital Preservation - The Planets Way", a three-day workshop in London put together by the lovely and talented people on the Planets Project. Just looking at the programme you'll see what a treat it was, and there are some great things that have come out of the project - including the Plato preservation planning tool, which provides a guided (albeit manual) workflow to build preservation plans for digital objects; and the Testbed, which is a Web-based "Lab" in which you can test preservation actions and record and evaluate the results prior to running those actions locally. Both provide a useful audit trail and hopefully protect a beleaguered digital preservationist from trouble should they find, some time in the future, that they've been using the equivalent of acid paper - because they can show the actions they took were the best available to them at the time.

Most interesting for me (as a developer) was the Interoperability Framework, which seems to promise simple integrated access to preservation tools and so my head is buzzing with the thoughts of how we might use it. Mind you, I've not delved deeper yet, but I'll let you know what we find out!

Finally, on the train home, I bought a copy of Linux Magazine, mostly because the cover announced an article on steganography. Earlier that day we'd been trying out image migrations in the Planets Testbed and I didn't recall seeing any of the characterization tools pointing out (or having a space for) detection of secret messages, although such tools exist (the article suggested, however, that they might not work or might lead to false positives). Wikipedia lists some further paths to follow and this morning I found a report from a conference in 2004 that mentions steganalysis and archives. I guess our forensics machine might have tools to find this sort of thing too.

All of which left the final buzz in my head - I didn't know what migration might do to such messages. Say you embed a poem inside a BMP file, migrate that BMP to TIFF and then back to BMP: is the message still there? If ever there was a time to try out the Testbed, this would be it! So I'm off to see if the group logins still work! :-)

Addendum:

The login still worked, so I conducted four experiments.

Firstly, I converted this BMP (blogspot has migrated it to a JPG on upload, though it still has its original filename!) to a TIFF. None of the "compare" tools noticed anything odd about the BMP, in spite of the secret poem contained therein, and steghide would no longer work on the result as it doesn't support TIFFs.

Secondly, I converted the TIFF back to a BMP (using the same migration tool - GIMP in this instance). Perhaps if you know more about the innards of image formats it will come as no surprise, but I was surprised to discover that migrating back to BMP restored the original hidden message. (Interestingly, doing the same thing on the command line with ImageMagick, but via JPG - BMP->JPG, then JPG->BMP - returned a BMP of identical size to the source, but the hidden message was lost.)

Finally I converted the BMP to a JPG (twice - the first migration had some dubious default settings for quality of the resulting JPG) and tried steghide on the results. Unsurprisingly the message was again lost. This is nothing new - you have to decide which characteristics of an object you want to keep and which are OK to lose on migration, and a steganographic message is probably just one of those characteristics - albeit a rare one!
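For anyone who wants to poke at why a lossless round trip keeps a hidden message while a lossy one wipes it, here is a minimal Python sketch. It is a toy model under loud assumptions: it hides the message in the least significant bit of each pixel byte, which is not how steghide actually embeds data; the "lossless" migration is just a row reordering that gets undone; and the "lossy" migration is a crude rounding of pixel values, not real JPEG compression. All the function names are made up for illustration.

```python
# Toy model only: plain least-significant-bit (LSB) embedding on raw pixel
# bytes. This is NOT steghide's algorithm, and lossy_round_trip() is NOT JPEG.


def embed(pixels: bytes, message: bytes) -> bytes:
    """Hide the message bits in the lowest bit of successive pixel bytes."""
    bits = [(byte >> shift) & 1 for byte in message for shift in range(7, -1, -1)]
    if len(bits) > len(pixels):
        raise ValueError("message too long for this cover image")
    out = bytearray(pixels)
    for i, bit in enumerate(bits):
        out[i] = (out[i] & 0xFE) | bit  # overwrite only the lowest bit
    return bytes(out)


def extract(pixels: bytes, length: int) -> bytes:
    """Read `length` message bytes back out of the lowest bits."""
    out = bytearray()
    for i in range(length):
        value = 0
        for pixel in pixels[i * 8:(i + 1) * 8]:
            value = (value << 1) | (pixel & 1)
        out.append(value)
    return bytes(out)


def lossless_round_trip(pixels: bytes, row_len: int) -> bytes:
    """Stand-in for BMP -> TIFF -> BMP: rows are reordered and then restored,
    but no pixel value is ever changed."""
    rows = [pixels[i:i + row_len] for i in range(0, len(pixels), row_len)]
    other_layout = b"".join(reversed(rows))  # hypothetical second container
    rows_back = [other_layout[i:i + row_len]
                 for i in range(0, len(other_layout), row_len)]
    return b"".join(reversed(rows_back))


def lossy_round_trip(pixels: bytes, step: int = 4) -> bytes:
    """Stand-in for a lossy codec: every value is rounded down to a multiple
    of `step`, so the low bits (and anything hidden in them) are discarded."""
    return bytes((p // step) * step for p in pixels)


# A fake 16x16 greyscale "image" and a short message.
cover = bytes((i * 37 + 11) % 256 for i in range(16 * 16))
message = b"a secret poem"
stego = embed(cover, message)

print(extract(lossless_round_trip(stego, 16), len(message)))
print(extract(lossy_round_trip(stego), len(message)))
```

In this toy, the first print gives the message back and the second gives only zero bytes, even though the lossy result is exactly the same length as the input. That matches what I saw with the ImageMagick round trip: identical file size tells you nothing about whether the pixel values survived.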

This was just a brief moment spent with the Testbed and I rushed through the tests really, but it was pretty interesting. True, I could've probably conducted these experiments faster on the command line using any number of image manipulation tools, but then I would not have a record of my work. Using the Testbed forced me to document the experiment rather than just bash out a few commands and leap to a conclusion, and any results I have should, in theory, make their way to the "community".

Finally, if you want to know what the poem is, the passphrase is "boat" and the original image is available here. (You might want steghide too).

3 comments:

Unknown said...

Sounds like you might want to keep an eye on http://planets-suite.sourceforge.net/ over the next six months or so. Also, if you drop the Testbed Helpdesk a line, we should be able to sort you out with a proper login.

Seth said...

It makes perfect sense that the BMP/TIFF round-trip would work and the BMP/JPEG one does not. The BMP->TIFF (and back) conversion works on a pixel-by-pixel basis, as does steghide. So long as there are no issues with conflicting pixel bit-depths or color-spaces, you should be able to get exact transformations back and forth. However, the JPEG conversion process breaks the BMP into pixel blocks before performing one of three compression methods, two of which are lossy (see http://blogs.msdn.com/devdev/archive/2006/04/12/575384.aspx). With those two lossy methods there is no way to get each pixel back exactly as it was, which is a requirement for preserving the steghide message. The lossless JPEG method, which is typically not the default option, will likely retain the message through the round-trip, but I would test that hypothesis before being confident.

pixelatedpete said...

Lovelycode - don't worry! I'll be checking out the sourceforge site quite often! :-)

Thanks Seth for the useful explanation! ImageMagick's own documentation recommends against using lossless JPEG conversion... Interesting!