Suppose you have an image file on your computer, and that file is taking up lots of disk space. So what would you like to do, to free up some disk space? You'd like to compress the file. That is, you want to change how you're representing that image in such a way so that a) the image doesn't look any different when using the new representation, and b) the new representation takes up less disk space.
Basically, there are two types of compression, lossless and lossy. When we say that a compression algorithm is "lossless," that means we can undo the algorithm and completely recover all of the original file. No information whatsoever is lost. GIF (Graphics Interchange Format) and PNG (Portable Network Graphics) are examples of lossless file formats. A compression algorithm is "lossy" means that we cannot completely recover the original file after it's been compressed. Some information has been lost. The trick with lossy compression is that if you are going to lose information, you should try to lose information that you wouldn't be able to perceive as missing. JPEG (Joint Photographic Experts Group) is an example of a lossy file format.
For example, suppose the pixel located at x=100, y = 82 in the King has gray-scale value 208. Then p(100, 82) = 208. Furthermore, suppose the intensity of the pixel directly above it is 190, i.e. p(100, 83) = 190. Then in the image cross-correlogram array, I will increment by 1 the entry located at (208, 190).
When I do this for every pixel in the King's image, what the image cross-correlogram will tell me will be how many times (or the frequency) a pixel of intensity A occurs directly below a pixel of intensity B in the King. For example, suppose I'm interested in how many times a pixel of intensity 108 was located directly below a pixel of intensity 132 in the King. Well, I'd look at the number located at x=108, y=132 in the image cross-correlogram. If that number was, say 731, then a pixel of intensity 108 was located directly below a pixel of intensity 132 exactly 731 times.
Here's the image cross-correlogram for the King:
Ok, we're almost at the point when we can actually compress the King.
By examining the image cross-correlogram, I think we can agree that adjacent pixels tend to have the same gray-scale values. Another way of saying this would be to state that the difference in gray-scale values between a pixel and its neighbor will probably be small. Remember, the main diagonal in the cross-correlogram tells me that there's lots of spatial redundancy in the original image. I want to use this redundancy to compress.
Before we get to the picture ... A pixel will have an intensity between 0 and 255. Therefore, the largest difference there can be between the intensity of a pixel and its upstairs neighbor will be 255 - 0 = 255. Also, the smallest difference there can be is 0 - 255 = -255. Note that I'm not talking about the absolute value of the differences. If I were, I wouldn't know if the upper pixel, (x,y+1) were brighter or darker than the pixel at (x,y). I.e. I wouldn't know if p(x,y+1) = p(x,y) + diff or p(x,y+1)= p(x+y) - diff. So I need to know if the difference is greater or less than 0.
Now to represent a number between 0 and 255, I need 8 bits. That
means, to represent numbers between -255 and 255, I need 9 bits (not
16 bits). So it looks as though I've made matters worse by storing the
differences. I need more bits now (storing differences) than before (just
storing plain ol' pixel intensities). Right? Maybe not, it turns out!
Let's look at the histogram of the differences p(x,y+1)-p(x,y):
In case things are still a little cloudy, let me do a short example that illustrates the "saving the bits" phenomenon. Suppose we have the following 4-by-4 array of pixel intensities:
| 3 | 200 | 210 | 198 | 179 |
| 2 | 169 | 160 | 197 | 190 |
| 1 | 160 | 145 | 190 | 180 |
| 0 | 205 | 150 | 195 | 150 |
| y/x | 0 | 1 | 2 | 3 |
As it stands now, since the above 16 intensities are all between 0 and 255, to record all 16 intensities I need a grand total of 16 pixels x 8 bits/pixel = 128 bits. (For this example, I'm ignoring the bits needed to record the locations, i.e. where does intensity 197 occur?) Let me make a table of the differences of pixel intensities:
| x | y | Pixel Intensity |
p(x,y+1)-p(x,y) |
| 0 | 0 | ||
| 0 | 2 | ||
| 1 | 0 | ||
| 1 | 2 | ||
| 2 | 0 | ||
| 2 | 2 | ||
| 3 | 0 | ||
| 3 | 2 |
To record the intensities of the eight pixels in the third column (i.e. half the total number of pixels in the 4-by-4 array), we need a total 8 pixels x 8 bits/pixel = 64 bits. (Since these intensities are all greater than 127 and less than 256, I need 8 bits for each.) Now I need to store the differences (in order to recover the other eight pixels). How many bits do I need for these?
Of the eight differences, six of them are numbers between -31 and 31. Therefore, I need at most 6 bits to record each. For the other two differences, -45 and 50, I need 7 bits. So how many bits total do I need?
Admittedly, the above example isn't very dramatic, but it does illustrate the point that you can use spatial redundancy to compress an image.
Now remember that all this was based on the spatial redundancy of the original image. If there wasn't much redundancy, i.e. if the white dots in the cross-correlogram were not clustered along the diagonal but rather spread out, then this compression scheme wouldn't work very well.