November 1998/Steganography: Hiding Information in Plain Sight

Features

Steganography: Hiding Information in Plain Sight

Dwayne Phillips

Sometimes the best form of encryption is to avoid the challenge to would-be decryptors.

Steganography is the art of hiding information. It includes techniques to hide an image, a text file, or even an executable program inside a "cover" image without distorting the cover image. In this article I discuss the basic ideas of steganography and show how to hide text within an image via watermarking; I also show how to hide one image within another, and present the source code needed to implement these techniques. Extensions to these ideas are also available for those interested in augmenting the code shown. Further information on steganography is available in [1] and related web sites.

Hidden Writing

The word steganography comes from the Greek and literally means "hidden writing." People have used steganography through the centuries to hide messages. The messages are hidden in plain sight in that they are visible to people who know where to look.

Consider the sentence "Where really interesting technical exchanges can overcome dull entertainment." This sentence represents a very primitive form of steganography: the first letter of each word spells the message "write code." This message is not well hidden. Better hiding methods use the second or third letter of each word, or the first letter of the first word, the second letter of the second word, and so on.
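The first-letter scheme above can be sketched in a few lines of C. This is only an illustration; the function name first_letters and its buffer-based interface are assumptions made for this example and are not part of the article's source code.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>

/* Copy the first letter of every word in text into out, lower-cased.
   A word is a run of non-space characters. At most cap - 1 letters
   are stored, followed by a terminating NUL. Returns the letter count.
   (Hypothetical helper, written for this illustration only.) */
size_t first_letters(const char *text, char *out, size_t cap)
{
    size_t n = 0;
    int in_word = 0;

    assert(text != NULL && out != NULL && cap > 0);
    for (; *text != '\0'; ++text) {
        unsigned char c = (unsigned char)*text;
        if (isspace(c)) {
            in_word = 0;                 /* the next non-space starts a word */
        } else if (!in_word) {
            in_word = 1;
            if (isalpha(c) && n + 1 < cap)
                out[n++] = (char)tolower(c);
        }
    }
    out[n] = '\0';
    return n;
}
```

Applied to the example sentence, the routine collects the nine letters of "write code" (without the space, since word boundaries are not part of the hidden message).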

Steganography and cryptography are closely related. Cryptography scrambles a message, and the result looks scrambled. The "write code" message in the above example could be scrambled to be "xsjuf dpef" (replace each letter with the letter that follows it in the alphabet). The scrambled appearance sometimes attracts prying eyes; some people see unscrambling such a message as a challenge. Steganography instead hides a message in a cover message. The result looks like something innocent, so prying eyes often dismiss it. Lawyers and libertarians debate whether steganography is close enough to cryptography to regulate its use. To date, steganography remains unregulated.
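The letter-shifting example can be written as a short routine. This is a minimal sketch that assumes an ASCII-compatible character set and lower-case input; the name shift_letters is invented for this illustration.

```c
#include <assert.h>
#include <stddef.h>

/* Replace each lower-case letter with the letter shift places later in
   the alphabet, wrapping from z back to a. Other characters (spaces,
   punctuation) are left alone. Works in place.
   Assumes the letters a..z are contiguous, as they are in ASCII. */
void shift_letters(char *text, int shift)
{
    assert(text != NULL && shift >= 0 && shift < 26);
    for (; *text != '\0'; ++text) {
        if (*text >= 'a' && *text <= 'z')
            *text = (char)('a' + (*text - 'a' + shift) % 26);
    }
}
```

A shift of 1 turns "write code" into "xsjuf dpef", and a further shift of 25 restores the original text.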

Steganography Via Watermarking

A watermark adds information to a document or image by placing a logo or seal in plain sight. The watermark protects the owner's rights by showing ownership. TV broadcasters commonly practice watermarking by placing their logo in a corner of the broadcast picture.

A watermark can also be hidden in an image. Hiding the watermark does not change the appearance of the image. This protects the owner's rights, without disturbing the image.

Images 1 through 4 show an example of hiding a watermark. Image 1 shows a boy and image 2 shows a watermark. The watermark is white words on a black background. It is possible to use more complex watermarks, but white on black simplifies the program.

Image 3 is the result of laying the watermark on top of the boy image. A value of 20 was added to each pixel of the boy image where the watermark image was white. This example did not hide the watermark.

Image 4 shows the result of hiding the watermark on the boy image. A value of 2 was added to each pixel of the boy image where the watermark image was white. This small increase is not visible to the casual observer.

It is simple to recover the watermark by subtracting the original boy image (image 1) from image 4.
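The add-and-subtract idea can be sketched as follows. This is not the code from Figure 1: it assumes each image is a flat buffer of eight-bit gray-scale pixels, and the names hide_watermark and recover_watermark are invented for this example. The clamp at 255 is also an assumption, since the text does not say how pixels near white are handled.

```c
#include <assert.h>
#include <stddef.h>

/* Add factor to every image pixel whose watermark pixel is non-zero.
   Results are clamped at 255 (an assumption of this sketch) so that an
   eight-bit pixel cannot wrap around to a dark value. */
void hide_watermark(unsigned char *image, const unsigned char *mark,
                    size_t n, int factor)
{
    assert(image != NULL && mark != NULL && factor >= 0);
    for (size_t i = 0; i < n; ++i) {
        if (mark[i] != 0) {
            int v = image[i] + factor;
            image[i] = (unsigned char)(v > 255 ? 255 : v);
        }
    }
}

/* Recover the watermark as the pixel-by-pixel difference between the
   watermarked image and the original image. */
void recover_watermark(const unsigned char *marked,
                       const unsigned char *original,
                       unsigned char *out, size_t n)
{
    assert(marked != NULL && original != NULL && out != NULL);
    for (size_t i = 0; i < n; ++i)
        out[i] = (unsigned char)(marked[i] - original[i]);
}
```

With a factor of 2, the recovered image holds 2 wherever the watermark was white and 0 elsewhere. Pixels that were already at 254 or 255 lose part of the mark to the clamp, so scaling the difference image before viewing it may still leave small gaps in very bright regions.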

Figure 1 shows the source code that hides a watermark in an image and recovers it. All the source code presented here works with TIFF files [2]. The first part of Figure 1 is the program that creates the hidden image. After interpreting the command line, the code reads the headers of the two TIFF image files, ensures the images are the same size, and allocates two arrays to hold the images. The hiding operation adds a factor to the image when the watermark image is non-zero. The last few lines of code write the result to a file and free the memory allocated for the image arrays.

The source code for the routines read_tiff_header, bread_tiff_image, bwrite_tiff_image, and free_image_array, as well as the image header structures and constants, is part of my C Image Processing System (CIPS [2]). These routines are not shown here, but are available on the CUJ ftp site in this month's download.

The second part of Figure 1 shows the source code to subtract the original image from the watermarked image. It is very similar to the previous source code.

Hiding Images in Images

Steganography enables hiding one image within another image. The message image is the image to be hidden; the cover image is the image that will contain the hidden image. The hiding process alters the cover image, but the alterations are too slight to see. The hiding process permits recovery of the message image at a later date, such that the recovered message image matches the original exactly.

Perfect recovery is possible because images, as a rule, contain an excess of information. For example, common eight-bit gray-scale images can contain 256 shades of gray, but people can distinguish only about 40 shades of gray. The extra gray shades are useless. A similar situation holds for color images. Images that use 24 bits per pixel can contain up to 16 million unique colors — too many to be useful, as far as the human eye is concerned.

Since eight-bit gray-scale images have more bits than needed, steganography uses the unneeded bits to hide the message image. The hiding process stores the bits from the message image in the least significant bits of the cover image. No one can see the difference in the altered cover image, because no one can tell the difference between gray levels 212 and 213.

Figure 2 shows how three pixels from a message image hide in a cover image. The first part of the figure shows the three pixels from the message image. The second part of the figure shows three rows, of eight pixels each, from the cover image. The last part of the figure shows the same three rows after hiding the three message image pixels within them. The least significant bits of the cover image are holding the message image pixels.

The pixel 99 (decimal) from the message image contains bits 0110 0011. Hiding the 0 (the first bit) requires clearing the least significant bit of the first pixel of the cover image. The 90 pixel of the cover image remains 90 because its least significant bit is already a 0.

The next two bits of the message image pixel are 1, so the algorithm sets the least significant bit of the cover image's next two pixels to 1. Thus, the cover image's 82 pixel (52 hex) becomes an 83, and its 88 pixel (58 hex) becomes an 89.

This process continues until every pixel in the cover image has its least significant bit cleared or set, depending on the bit values of a pixel in the message image.
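The walk-through above can be expressed as a small routine that hides one message pixel in eight cover pixels, most significant bit first. It is a sketch, not the hide_pixels routine of Figure 3, and the name hide_byte is an assumption made for this example.

```c
#include <assert.h>
#include <stddef.h>

/* Hide the eight bits of message, most significant bit first, in the
   least significant bits of eight consecutive cover pixels. */
void hide_byte(unsigned char message, unsigned char cover[8])
{
    assert(cover != NULL);
    for (int bit = 0; bit < 8; ++bit) {
        unsigned char value = (unsigned char)((message >> (7 - bit)) & 1);
        /* Clear the low bit of the cover pixel, then copy in the message bit. */
        cover[bit] = (unsigned char)((cover[bit] & 0xFE) | value);
    }
}
```

Hiding 99 (0110 0011) in a row that begins 90, 82, 88 leaves the 90 unchanged and turns the 82 and 88 into 83 and 89, as described above. No cover pixel ever changes by more than one gray level.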

This process has an inherent eight-to-one limitation: since each pixel in the message image requires eight different pixels in the cover image for representation, the cover image must be eight times wider than the message image.

Images 5 through 8 illustrate the process of hiding a message image in a cover image. Image 5 is the message image and image 6 is the original cover image. Image 7 is the cover image that results after hiding the message image within it. Images 6 and 7 are indistinguishable by visual inspection. The difference becomes apparent only when examining the pixel values, as in Figure 2. Many of the pixel values of Image 7 differ by one from those in Image 6.

Image 8 shows the message image after uncovering it from Image 7. Images 5 and 8 are exactly alike. The hiding and uncovering process did not alter the message image.

Figures 3 and 4 show the source code that produced images 5 through 8. The listings show only the subroutines that do the work. The main calling routine is not shown.

Figure 3 shows the subroutines hide_image and hide_pixels. The hide_image routine reads the message and cover images, calls the hide_pixels routine, and writes the result to the cover image file. The h_counter loop runs through the width of the message image. These routines assume that the main calling routine (not shown) has already checked that the cover image is eight times wider than the message image.

The hide_pixels routine does most of the work in the hiding operation. It determines the value of every bit in every pixel in the message image. It then sets or clears the least significant bit of every pixel in the cover image accordingly. hide_pixels uses two mask arrays to determine and alter bits. The loop over i covers all the rows of the message and cover images. On each row, the loop over j examines each of the eight bits in the message image's pixel. The code then sets or clears the least significant bit of the corresponding pixel of the cover image.

The if(lsb) code is necessary because some TIFF images place the least-significant bit first while others place it last. (It's the old bit order, a.k.a. endian, issue. The difference is easily seen in Intel and Motorola microprocessors.) Depending on the bit order, hide_pixels uses either mask1 or mask2 to set or clear bits.
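The two-mask idea can be sketched for a single image row. The code below is an illustration and not the hide_pixels routine from Figure 3: the table contents, the names MSB_FIRST, LSB_FIRST, and hide_row, and the flat row buffers are all assumptions made for this example.

```c
#include <assert.h>
#include <stddef.h>

/* Two mask tables, one per bit order (contents assumed for this sketch). */
static const unsigned char MSB_FIRST[8] =
    {0x80, 0x40, 0x20, 0x10, 0x08, 0x04, 0x02, 0x01};
static const unsigned char LSB_FIRST[8] =
    {0x01, 0x02, 0x04, 0x08, 0x10, 0x20, 0x40, 0x80};

/* Hide one row of message pixels in one row of cover pixels.
   Each message pixel uses eight consecutive cover pixels, so the
   cover row must be at least eight times wider than the message row. */
void hide_row(const unsigned char *message, size_t msg_width,
              unsigned char *cover, size_t cover_width, int lsb_first)
{
    const unsigned char *mask = lsb_first ? LSB_FIRST : MSB_FIRST;

    assert(message != NULL && cover != NULL);
    assert(cover_width / 8 >= msg_width);
    for (size_t i = 0; i < msg_width; ++i) {
        for (int j = 0; j < 8; ++j) {
            unsigned char *p = &cover[8 * i + j];
            if (message[i] & mask[j])
                *p |= 0x01;      /* message bit is 1: set the low bit */
            else
                *p &= 0xFE;      /* message bit is 0: clear the low bit */
        }
    }
}
```

Whichever order is used for hiding must also be used for uncovering; otherwise each recovered pixel comes back with its bits reversed.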

Figure 4 shows the subroutines uncover_image and uncover_pixels. These routines reverse the hiding process, so they are similar to the routines in Figure 3. The uncover_image routine reads the cover image, calls uncover_pixels for every pixel in the image, and writes the recovered message image to disk.

The uncover_pixels routine does most of the work in Figure 4. uncover_pixels must determine if the least significant bit of each pixel in the cover image is 1 or 0. It then uses these bits to build up the eight bits in every pixel in the message image. The loop over i runs through every row in the images. The loop over j looks at eight pixels in the cover image. If a pixel is odd, its least significant bit is 1, so the corresponding bit in the message image must be set using the mask1 bit mask. Clearing bits is not necessary because the new_message variable was set to 0x00 prior to the loop over j.
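The recovery step for a single message pixel can be sketched as shown below. This is not the uncover_pixels routine of Figure 4; the name uncover_byte is invented here, and the sketch assumes the bits were hidden most significant bit first.

```c
#include <assert.h>
#include <stddef.h>

/* Rebuild one message pixel from the least significant bits of eight
   consecutive cover pixels (most significant bit first). */
unsigned char uncover_byte(const unsigned char cover[8])
{
    unsigned char value = 0x00;   /* start clear, so only 1 bits need setting */

    assert(cover != NULL);
    for (int j = 0; j < 8; ++j) {
        if (cover[j] & 0x01)      /* an odd pixel carries a 1 bit */
            value |= (unsigned char)(0x80 >> j);
    }
    return value;
}
```

Because the value starts at 0x00, even cover pixels need no action, which mirrors the remark above about clearing bits being unnecessary.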

Extending the Steganography Technique

There are several ways to extend the concepts presented here, such as increasing the storage efficiency, or hiding executable programs and text files in images. The most obvious limitation to the image hiding technique shown earlier is that the cover image must be eight times wider than the message image. This means using a narrow message image (Image 5) and a wide cover image (Image 6).

You can reduce this ratio to three-to-one by combining two changes. First, instead of using only the least significant bit of each cover pixel, use the two least significant bits. A cover pixel may then change by as much as three gray levels, for example from 128 to 131, when hiding the message image. People still cannot see that. Second, reduce the message image from eight-bit pixels to six-bit pixels. The message image will then comprise 64 shades of gray instead of 256. People can see only about 40 shades of gray, so 64 is plenty. Each six-bit message pixel is distributed as three two-bit fragments across three cover pixels, hence the three-to-one ratio. Implementing this scheme would require changes in the routines shown in Figures 3 and 4.
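One possible shape for the three-to-one scheme is sketched below. The names hide_six_bits and uncover_six_bits are hypothetical, and dropping the two low bits of the message pixel is just one way to get from 256 shades to 64.

```c
#include <assert.h>
#include <stddef.h>

/* Hide the top six bits of message in the two least significant bits
   of three consecutive cover pixels. */
void hide_six_bits(unsigned char message, unsigned char cover[3])
{
    unsigned char six = (unsigned char)(message >> 2);  /* 256 shades -> 64 */

    assert(cover != NULL);
    for (int k = 0; k < 3; ++k) {
        unsigned char pair = (unsigned char)((six >> (4 - 2 * k)) & 0x03);
        cover[k] = (unsigned char)((cover[k] & 0xFC) | pair);
    }
}

/* Rebuild the six-bit value and scale it back to the eight-bit range. */
unsigned char uncover_six_bits(const unsigned char cover[3])
{
    unsigned char six = 0;

    assert(cover != NULL);
    for (int k = 0; k < 3; ++k)
        six = (unsigned char)((six << 2) | (cover[k] & 0x03));
    return (unsigned char)(six << 2);
}
```

Note the trade-off in this sketch: a message pixel of 99 comes back as 96, so recovery is no longer bit-for-bit exact once the message is reduced to six bits.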

Steganography also enables hiding executable programs inside images. In the previous discussion, the message image was a series of eight-bit values. An executable program is also a series of eight-bit values. The least significant bits of the cover image can just as easily hold the bits of an executable program. The cover image must contain eight times more pixels than the executable length in bytes (four times more pixels if you use the two least significant bits as explained earlier). Uncovering the executable program from the cover image is just like uncovering the message image.

By the same logic, the cover image can also hide a text file, if a text file is considered a series of eight-bit bytes. Again, the cover image must contain eight times more pixels (or four times) than the text message. This use of steganography allows you to hide a message in an image, send the image to a friend (ftp or web site), and have them read it. The whole world can see the image without reading the message or even suspecting a message exists.
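The same least-significant-bit scheme applies to any byte sequence, whether text or an executable. The sketch below treats the cover image as a flat buffer of eight-bit pixels and checks the eight-to-one capacity rule before hiding anything. The names hide_bytes and uncover_bytes are assumptions made for this example and do not come from the CIPS code.

```c
#include <assert.h>
#include <stddef.h>

/* Hide len bytes of data in the least significant bits of pixels,
   most significant bit first. Returns 0 on success, or -1 if the
   cover holds fewer than eight pixels per byte. */
int hide_bytes(const unsigned char *data, size_t len,
               unsigned char *pixels, size_t npixels)
{
    assert(data != NULL && pixels != NULL);
    if (len > npixels / 8)
        return -1;               /* cover image too small */
    for (size_t i = 0; i < len; ++i) {
        for (int j = 0; j < 8; ++j) {
            unsigned char bit = (unsigned char)((data[i] >> (7 - j)) & 1);
            unsigned char *p = &pixels[8 * i + j];
            *p = (unsigned char)((*p & 0xFE) | bit);
        }
    }
    return 0;
}

/* Recover len bytes from the least significant bits of pixels. */
void uncover_bytes(const unsigned char *pixels, unsigned char *data, size_t len)
{
    assert(pixels != NULL && data != NULL);
    for (size_t i = 0; i < len; ++i) {
        unsigned char value = 0;
        for (int j = 0; j < 8; ++j)
            value = (unsigned char)((value << 1) | (pixels[8 * i + j] & 1));
        data[i] = value;
    }
}
```

The receiver must know how many bytes to uncover. Hiding the terminating NUL along with a text message, or hiding a length field first, are two simple ways to convey that.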

Summary

Steganography is a technique used to hide information within images. Using steganography, watermarks and copyrights can be placed on an image to protect the rights of its owner without altering the appearance of the image. Almost like magic, images, executable programs, and text messages can hide in images. The cover image does not appear to be altered. People look at the cover image and never suspect something is hidden. Your information is hidden "in plain sight."

Notes and References

[1] Neil F. Johnson. "Exploring Steganography: Seeing the Unseen," Computer, February 1998, pp. 26-34. http://patriot.net/ johnson/Steganography.

[2] This source code has been designed to work with my C Image Processing System, which is a gray-scale image processing system based on TIFF files. A text file describing the CIPS series of CUJ articles and my book, Image Processing in C (R&D Books, 1994, ISBN 0-13-104548-2), is available in this month's download section. The full source code to the CIPS system is also available there.

Dwayne Phillips works as a software, systems, and computer engineer with the United States Government. He has a Ph.D. in Electrical and Computer Engineering from Louisiana State University. His interests include computer vision, artificial intelligence, software engineering, and programming languages.