Perfect Steganography

Steganography is the practice of concealing a file, message, image, or video within another file, message, image, or video. https://en.wikipedia.org/wiki/Steganography

But perfect steganography?  What would that look like?  If we have a carrier (in which the message is hidden) and a payload (the message hidden in the carrier) then we would want the following characteristics:

  • Undetectable:  Examination of the carrier should give no clue as to whether there is a payload hidden in it.
  • Unextractable: Even if the carrier is known to contain a payload it should be impossible to extract the payload from it.
  • Unremarkable: The carrier should be of a common type, so that the presence of the file does not indicate a payload.

I have given much thought to this because I think these goals could be approached very closely using jpg format as a carrier.  Instantly we satisfy the third requirement – jpgs are everywhere.  Incredible numbers are transferred across the internet every day, and even more reside on phones, tablets and personal computers.  The presence of a jpg on a device would not indicate that there is a payload in the jpg.

The second point can be addressed by encrypting the payload with a ‘standard’ encryption routine such as AES-128 before it is hidden.

At present, there is no known practical attack that would allow someone without knowledge of the key to read data encrypted by AES when correctly implemented. https://en.wikipedia.org/wiki/Advanced_Encryption_Standard

So what about the first point?  How can we hide a message in a jpg file so that no-one can tell that it is there?  Well, that is possible because of the very nature of jpgs.

An image on your computer is usually stored as an array of pixels, with three bytes for each pixel (red, green and blue).  This is very easy for computers to manipulate, but takes up a large amount of space.  For example a 20 mega-pixel image would require 60 megabytes.  This is far too big to send across the internet, or to store on your device if you have a large number.  So jpg format was invented to reduce the size of the file that is stored.
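The arithmetic behind that figure can be checked with a few lines of Python.  This is only a sketch: the function name raw_image_bytes is made up for illustration, and it assumes exactly three bytes per pixel with no padding or metadata.

```python
def raw_image_bytes(megapixels, bytes_per_pixel=3):
    """Uncompressed size in bytes, assuming one byte each for red, green and blue."""
    pixels = int(megapixels * 1_000_000)
    return pixels * bytes_per_pixel


size = raw_image_bytes(20)
print(f"{size / 1_000_000:.0f} megabytes")
```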

Some image formats (e.g. png) are lossless – the data is compressed but nothing is lost.  Unfortunately, for photographs this does not reduce the size enough.  So jpg reduces the size by throwing data away.  The more data we throw away the smaller the file – the trick is to throw away the least useful data.  It would be no good to keep the top left corner of your photo and throw away the rest – chances are that the bit you like is somewhere near the middle.  No, data is thrown away by reducing the quality of the image all across the image.

You can imagine a knob that you could turn: the more you turn it the more detail is lost, but the smaller the jpg file gets.  You keep turning the knob until you find a compromise that you are happy with and save the resulting jpg.  If you have two versions of the same image, one with a lot of detail and one with a little detail, no-one would assume that there was a message in one and not the other.  It just looks like one image with two different compression settings.
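A toy version of that knob can be written in Python.  It is a simplification: jpg applies this kind of rounding to frequency coefficients rather than to raw sample values, and the names quantize and max_error and the sample numbers are invented for illustration.  The point is only that a larger step leaves fewer distinct values to store, at the cost of a larger error.

```python
def quantize(values, step):
    """Round each value to the nearest multiple of `step` (this is the lossy part)."""
    return [step * round(v / step) for v in values]


def max_error(original, lossy):
    """Largest absolute difference between the original and the lossy values."""
    return max(abs(a - b) for a, b in zip(original, lossy))


samples = [52, 55, 61, 66, 70, 61, 64, 73]  # made-up brightness values

for step in (1, 4, 16):  # turning the knob further
    lossy = quantize(samples, step)
    print(f"step={step:2d}  distinct values={len(set(lossy))}  "
          f"max error={max_error(samples, lossy)}")
```

Fewer distinct values compress better, which is why turning the knob shrinks the file.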

Interestingly, the quality varies across an image.  If you have an expanse of blue sky, the subtle details of shading in it are not really picked up by the human eye.  So the compression algorithm throws away more data.  If you have lots of edges – a traction engine maybe – then the algorithm throws away less data.  So the amount of data thrown away varies across the image.  And here we have the medium for our message.  If we can create a pattern in the amount of data that is thrown away – a little more here, a little less there – we can add a signal to the image.

We need to do this in a subtle manner, or our image will have wildly varying quality which will be visible to the human eye, and quite likely to a computer scanner.  But if we do this carefully I believe that we can achieve the first of our bullet points.
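To make the idea of a pattern in the discarded data concrete, here is a deliberately crude Python sketch.  It is not a description of how SmugglerMac works, and it makes no claim to be undetectable.  It assumes each image block is just a list of integer coefficients, and it hides one bit per block in whether the number of non-zero coefficients is odd or even, discarding the weakest remaining coefficient when the parity needs to change.  The names embed_bits and extract_bits, and the block values, are hypothetical.

```python
def embed_bits(blocks, bits):
    """Return copies of `blocks` whose non-zero counts carry `bits` as parity."""
    if len(bits) > len(blocks):
        raise ValueError("not enough blocks for the payload")
    out = [list(block) for block in blocks]
    for block, bit in zip(out, bits):
        nonzero = [i for i, c in enumerate(block) if c != 0]
        if len(nonzero) % 2 != bit:
            if not nonzero:
                raise ValueError("block has no detail left to discard")
            # Throw away a little more here: zero the weakest coefficient.
            weakest = min(nonzero, key=lambda i: abs(block[i]))
            block[weakest] = 0
    return out


def extract_bits(blocks, count):
    """Read one bit per block: the parity of its non-zero coefficient count."""
    return [sum(1 for c in block if c != 0) % 2 for block in blocks[:count]]


# Made-up coefficient blocks standing in for compressed image data.
blocks = [[12, -3, 0, 1], [40, 7, -2, 0], [5, 0, 0, 0], [9, 4, 1, 1]]
payload = [1, 0, 1, 0]  # in practice these would be bits of the encrypted payload

carrier = embed_bits(blocks, payload)
print(extract_bits(carrier, len(payload)) == payload)
```

Only blocks whose parity disagrees with the payload bit are altered, and then by a single small coefficient, which is the sense in which the change is "a little more here, a little less there".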

I have an example of an application that does just this: SmugglerMac.  It is written for the macOS platform (for reasons that will become obvious in later posts), but could easily be adapted for other platforms.  I hope to write a series of posts explaining its implementation.
