This week's study group on steganography detection schemes was given by Panos Andriotis and Georgios Oikonomou. While in the physical world it is fairly easy to hide messages beneath or within other, less compromising messages (the ancient Greeks were fairly creative in finding such hiding places), in the digital world it is not quite that simple, as you can't just put one layer of bits on top of another. One common trick is to manipulate image data, for example in JPEG files. JPEG encodes colours in YCbCr format, where Y represents the luma component, Cb is the chroma difference for blue and Cr is the chroma difference for red. The human eye is better at detecting small differences in the luma component than in the two chroma components, so steganography libraries such as JPHS, OutGuess and VSL use these components to embed data into images.
At SPIE'07, Fu, Shi and Su presented in their paper "A Generalized Benford's Law for JPEG Coefficients and Its Applications in Image Forensics" [FSS07] a method to detect steganography in black-and-white JPEG images based on Benford's law. For many sets of numbers gathered in real life, for example the heights of buildings, Benford's law gives the probability of each possible value of the first digit. For many of those number sets, the first digit is far more likely to be a 1 than a 2, for example. This can be used as a statistical check for applications as different as accounting fraud detection, election manipulation detection and, thanks to Fu, Shi and Su, steganography detection as well.
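To make the first-digit claim concrete, here is a minimal Python sketch of Benford's law, which assigns probability log10(1 + 1/d) to a leading digit d between 1 and 9. The function names are my own, and the powers of two are just a convenient stand-in data set that is known to follow the law closely; none of this is taken from [FSS07].

```python
import math
from collections import Counter


def benford_probability(d):
    """Probability that the first significant digit is d (1..9) under Benford's law."""
    return math.log10(1 + 1 / d)


def first_digit(x):
    """First significant decimal digit of a non-zero number."""
    if x == 0:
        raise ValueError("zero has no first significant digit")
    x = abs(x)
    if isinstance(x, int):
        # Exact for arbitrarily large integers.
        return int(str(x)[0])
    # Floats: scale into the interval [1, 10) and truncate.
    while x >= 10:
        x /= 10
    while x < 1:
        x *= 10
    return int(x)


def first_digit_distribution(values):
    """Observed relative frequency of each first digit 1..9, ignoring zeros."""
    digits = [first_digit(v) for v in values if v != 0]
    counts = Counter(digits)
    return {d: counts.get(d, 0) / len(digits) for d in range(1, 10)}


if __name__ == "__main__":
    observed = first_digit_distribution([2 ** k for k in range(1, 1001)])
    for d in range(1, 10):
        print(d, round(benford_probability(d), 4), round(observed[d], 4))
```

Running it prints the predicted and observed frequencies side by side; a data set that has been tampered with would typically show a visibly different second column.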
But first a little more on the JPEG compression algorithm: after the original RGB data is (losslessly) translated into YCbCr data, a discrete cosine transform (DCT, also lossless) is applied to each 8-by-8 pixel block, resulting in DCT coefficients. The DCT coefficients are then quantized (this is lossy, i.e. irreversible) before a further lossless Huffman encoding is applied. In [FSS07] it was shown that Benford's law also applies to the DCT coefficients of normal black-and-white pictures, while pictures that had steganography applied to them follow different probability distributions. For the quantized DCT coefficients, however, a generalized version of Benford's law was needed; suitable parameters for the generalized law, depending on the quality factor of the quantization, are given in [FSS07].
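The pipeline described above can be sketched in a few lines of plain Python. This is only an illustration under stated assumptions: the quantization uses a single uniform step instead of JPEG's 8-by-8 quantization table, the helper names are hypothetical, and the generalized law is written in the parametric form p(x) = N log10(1 + 1/(s + x^q)) as I recall it from [FSS07]. The fitted values of N, q and s for each quality factor are in the paper and are not reproduced here.

```python
import math

BLOCK = 8  # JPEG works on 8-by-8 pixel blocks


def dct_2d(block):
    """Orthonormal 2-D DCT-II of an 8x8 block (list of 8 rows of 8 numbers)."""
    def scale(k):
        return math.sqrt(1 / BLOCK) if k == 0 else math.sqrt(2 / BLOCK)

    out = [[0.0] * BLOCK for _ in range(BLOCK)]
    for u in range(BLOCK):
        for v in range(BLOCK):
            total = 0.0
            for x in range(BLOCK):
                for y in range(BLOCK):
                    total += (block[x][y]
                              * math.cos((2 * x + 1) * u * math.pi / (2 * BLOCK))
                              * math.cos((2 * y + 1) * v * math.pi / (2 * BLOCK)))
            out[u][v] = scale(u) * scale(v) * total
    return out


def quantize(coeffs, step):
    """Lossy step: divide by the step size and round to an integer.

    Assumption: one uniform step for all coefficients; real JPEG uses a
    table of 64 steps derived from the quality factor.
    """
    return [[round(c / step) for c in row] for row in coeffs]


def leading_digits(quantized):
    """First digits of all non-zero quantized coefficients, row by row."""
    return [int(str(abs(c))[0]) for row in quantized for c in row if c != 0]


def generalized_benford(x, n, q, s):
    """Generalized law p(x) = n * log10(1 + 1 / (s + x**q)) for x in 1..9.

    With n = 1, q = 1, s = 0 this reduces to the ordinary Benford's law.
    """
    return n * math.log10(1 + 1 / (s + x ** q))
```

A detector would collect `leading_digits` over all blocks of an image and compare the resulting histogram with `generalized_benford` evaluated at the parameters for that image's quality factor.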
Panos, Georgios and Theo were now able to show in a recently accepted paper of theirs that this also holds for colour JPEGs, and they developed tools that can very efficiently detect JPEG files potentially containing steganography, with high accuracy (i.e. a low rate of false negatives) and at high throughput. This is useful to limit the number of files that more precise but considerably slower machine-learning-based tests have to analyse. (OutGuess, VSL and JPHS were used to apply steganography to the images and the success rates vary; JPHS fared better than the other two.)
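The idea of using a cheap statistical check as a pre-filter can be illustrated as follows. This is not the authors' tool: the chi-square-style distance, the threshold value and the function names are all assumptions of mine, chosen only to show how a fast first-digit test could decide which files get passed on to a slower classifier.

```python
import math


def benford_reference():
    """Expected first-digit frequencies for digits 1..9 under Benford's law."""
    return [math.log10(1 + 1 / d) for d in range(1, 10)]


def chi_square_distance(observed, expected):
    """Chi-square-style divergence between two 9-bin frequency vectors."""
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))


def triage(files, expected, threshold):
    """Return the names of files whose first-digit histogram deviates too much.

    files: dict mapping a file name to its 9 observed first-digit frequencies.
    threshold: hypothetical cut-off; it would have to be tuned so that very
    few stego images slip through (few false negatives).
    """
    return [name for name, observed in files.items()
            if chi_square_distance(observed, expected) > threshold]


if __name__ == "__main__":
    reference = benford_reference()
    candidates = {
        "matches_reference.jpg": reference,   # made-up example data
        "flat_histogram.jpg": [1 / 9] * 9,    # made-up example data
    }
    print(triage(candidates, reference, threshold=0.05))
```

Only the files returned by `triage` would then be handed to the slower machine-learning-based test.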
P.S. This was actually the study group of January 29th but I must have saved the blog entry instead of publishing it. Sorry for the delay...