Suppose you have an image file on your computer, and that file is
taking up lots of disk space. What would you like to do to free
up some of that space? You'd like to **compress** the file. That is,
you want to change how you're representing that image in such a way
that a) the image doesn't look any different when using the new
representation, and b) the new representation takes up less disk space.

Basically, there are two types of compression, **lossless** and
**lossy**. When we say that a compression algorithm is "lossless,"
that means we can undo the algorithm and **completely** recover
**all** of the original file. No information whatsoever is lost.
GIF (Graphics Interchange Format) and PNG (Portable Network Graphics)
are examples of lossless file formats. A compression algorithm is
"lossy" if we cannot completely recover the original file
after it's been compressed. Some information has been lost. The trick
with lossy compression is that if you are going to lose information,
you should try to lose information that you wouldn't be able to
*perceive* as missing. JPEG (Joint Photographic Experts Group)
is an example of a lossy file format.
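As a quick illustration of the lossless idea, Python's standard `zlib` module round-trips bytes exactly. (The data below is just a made-up redundant byte string, not an actual image.)

```python
import zlib

# A lossless codec must satisfy decompress(compress(x)) == x exactly.
data = b"AAAAABBBBBCCCCC" * 100   # highly redundant bytes compress well
packed = zlib.compress(data)
restored = zlib.decompress(packed)

assert restored == data           # every byte recovered: nothing lost
print(len(data), len(packed))     # the packed form is much smaller
```

PNG uses exactly this kind of lossless machinery under the hood, which is why it can reproduce every pixel of the original.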

Now, there are 256 shades of gray in the King, ranging from 0 (pure black) to 255 (pure white). Therefore, the image cross-correlogram will have 256 rows and 256 columns (if the image had only 8 shades of gray, the image cross-correlogram would have only 8 rows and 8 columns), and it will start off by having all 0s in its entries.

For each pixel in the King's image, I'm going to record that pixel's intensity and the intensity of the pixel right above it. That is, in the image cross-correlogram, I will measure the intensity p(x,y) of a pixel at (x,y) (in the original image) on the horizontal axis of the image cross-correlogram, and the intensity p(x,y+1) of its upstairs neighbor on the vertical axis.

For example, suppose the pixel located at x=100, y = 82 in the King has gray-scale value 208. Then p(100, 82) = 208. Furthermore, suppose the intensity of the pixel directly above it is 190, i.e. p(100, 83) = 190. Then in the image cross-correlogram array, I will increment by 1 the entry located at (208, 190).
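Here's a sketch of building such an image cross-correlogram in plain Python. The random `img` array is just a stand-in for the King's actual pixel data, and I'm following the text's convention that y+1 is the pixel "above."

```python
import random

# Stand-in image: img[y][x] = intensity p(x, y), values 0..255.
random.seed(0)
H, W = 64, 64
img = [[random.randrange(256) for _ in range(W)] for _ in range(H)]

# The cross-correlogram: a 256-by-256 array of counts, all starting at 0.
cc = [[0] * 256 for _ in range(256)]

for y in range(H - 1):            # the top row has no upstairs neighbor
    for x in range(W):
        a = img[y][x]             # p(x, y): horizontal axis
        b = img[y + 1][x]         # p(x, y+1): vertical axis
        cc[b][a] += 1             # count this (below, above) intensity pair

# cc[B][A] now records how many times a pixel of intensity A sits
# directly below a pixel of intensity B.
total = sum(sum(row) for row in cc)
```

Each of the (H-1)·W pixel pairs contributes exactly one count, so `total` comes out to (H-1)·W.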

When I do this for every pixel in the King's image, the image
cross-correlogram will tell me **how many times** (that is, with what
**frequency**) a pixel of
intensity **A** occurs directly below a pixel of intensity **B**
in the King.
For example, suppose I'm interested in how many times a pixel of
intensity 108 was located directly below a pixel of intensity 132 in
the King.
Well, I'd look at the number located at x=108, y=132 in the image
cross-correlogram. If that number were, say, 731, then a pixel of
intensity 108 was located directly below a pixel of intensity 132
exactly 731 times.

Here's the image cross-correlogram for the King:

Notice how things seem to be clustered along the main diagonal. What does this imply about the original image? Remember, the horizontal axis in the image cross-correlogram corresponds to the intensity of a pixel p(x,y), and the vertical axis corresponds to the intensity of a pixel directly above it, p(x,y+1). The clustering tells me that pixels of a given intensity tend to occur near pixels of roughly the same intensity.

I.e. pick a pixel at random in the original image of the King. Suppose it has gray-scale value 104. The clustering along the diagonal tells me that chances are good that the pixel directly above the one I chose at random has a gray-scale value that's close to 104.

Ok, we're almost at the point when we can actually compress the King.

By examining the image cross-correlogram, I think we can agree that adjacent pixels tend to have nearly the same gray-scale values. Another way of saying this would be that the difference in gray-scale values between a pixel and its neighbor will probably be small. Remember, the clustering along the main diagonal in the cross-correlogram tells me that there's lots of spatial redundancy in the original image. I want to use this redundancy to compress.

Before we get to the picture ... A pixel will have an intensity
between 0 and 255. Therefore, the largest difference there can be between the
intensity of a pixel and its upstairs neighbor will be 255 - 0 = 255. Also, the
smallest difference there can be is 0 - 255 = -255. Note that I'm **not**
talking about the absolute value of the differences. If I were, I wouldn't
know if the upper pixel, (x,y+1), were brighter or darker than the pixel
at (x,y). I.e. I wouldn't know whether p(x,y+1) = p(x,y) + diff or p(x,y+1) =
p(x,y) - diff. So I need to know whether the difference is greater or less
than 0.

Now to represent a number between 0 and 255, I need 8 bits. That
means, to represent numbers between -255 and 255, I need 9 bits (**not**
16 bits). So it looks as though I've made matters worse by storing the
differences. I need more bits now (storing differences) than before (just
storing plain ol' pixel intensities). Right? Maybe not, it turns out!
Let's look at the histogram of the differences p(x,y+1)-p(x,y):

It looks like most of the differences fall between, say, -31 and 31. Ok, how many bits do we need to represent a number between -31 and 31? Since you need 5 bits to represent a number between 0 and 31, you need 6 bits (5 for the magnitude plus 1 for the sign) to represent a number between -31 and 31. And 6 bits is fewer than 8 bits, and definitely fewer than 9 bits. That saving is exactly what our compression scheme will use.
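Here's a small helper that makes the bit arithmetic concrete. It's a sketch using a simple sign-plus-magnitude count: one sign bit, plus however many bits the magnitude needs.

```python
# Bits needed for a signed difference d, sign-plus-magnitude style:
# n magnitude bits cover 0 .. 2**n - 1, plus 1 bit for the sign.
def bits_needed(d):
    d = abs(int(d))
    n = 1                       # at least 1 bit for the magnitude
    while (1 << n) - 1 < d:     # grow until 2**n - 1 >= |d|
        n += 1
    return n + 1                # +1 for the sign bit

print(bits_needed(31))   # -> 6: anything in -31..31 fits in 6 bits
print(bits_needed(50))   # -> 7
print(bits_needed(255))  # -> 9: the worst-case difference
```

So a difference that stays in the -31..31 range costs 6 bits instead of the 9 bits the worst case would demand.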

In case things are still a little cloudy, let me do a short example that illustrates the "saving the bits" phenomenon. Suppose we have the following 4-by-4 array of pixel intensities:

| y \ x | 0   | 1   | 2   | 3   |
|-------|-----|-----|-----|-----|
| **3** | 200 | 210 | 198 | 179 |
| **2** | 169 | 160 | 197 | 190 |
| **1** | 160 | 145 | 190 | 180 |
| **0** | 205 | 150 | 195 | 150 |

The numbers along the edges of the table refer to the x (horizontal) and y (vertical) coordinates of the intensities.

As it stands now, since the above 16 intensities are all between 0 and 255, to record all 16 intensities I need a grand total of 16 pixels x 8 bits/pixel = 128 bits. (For this example, I'm ignoring the bits needed to record the locations, i.e. where does intensity 197 occur?) Let me make a table of the differences of pixel intensities:

| x | y | Pixel Intensity p(x,y) | p(x,y+1) - p(x,y) |
|---|---|------------------------|-------------------|
| 0 | 0 | 205                    | -45               |
| 0 | 2 | 169                    | 31                |
| 1 | 0 | 150                    | -5                |
| 1 | 2 | 160                    | 50                |
| 2 | 0 | 195                    | -5                |
| 2 | 2 | 197                    | 1                 |
| 3 | 0 | 150                    | 30                |
| 3 | 2 | 190                    | -11               |

To record the intensities of the eight pixels in the third column (i.e. half the total number of pixels in the 4-by-4 array), we need a total of 8 pixels x 8 bits/pixel = 64 bits. (Since these intensities are all greater than 127 and less than 256, I need 8 bits for each.) Now I need to store the differences (in order to recover the other eight pixels). How many bits do I need for these?

Of the eight differences, six of them are numbers between -31 and 31. Therefore, I need at most 6 bits to record each. For the other two differences, -45 and 50, I need 7 bits. So how many bits total do I need? 6 differences x 6 bits + 2 differences x 7 bits = 36 + 14 = 50 bits.

Therefore, by using this "difference" representation of this array, the total number of bits I need in order to recover every pixel intensity is 64 + 36 + 14 = 114 bits.

So I need fewer bits, 114, in the "difference" representation than I do in the original representation (128 bits). And so I have my lossless compression! And for a few of the differences, e.g. 1 and -5, I could squeeze a little more savings by using fewer (how many?) bits to record those.
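To double-check both the arithmetic and the losslessness, here's a short sketch in Python; the `grid` literal just transcribes the 4-by-4 array from above.

```python
# The 4-by-4 array from the example: grid[(x, y)] = pixel intensity.
grid = {
    (0, 3): 200, (1, 3): 210, (2, 3): 198, (3, 3): 179,
    (0, 2): 169, (1, 2): 160, (2, 2): 197, (3, 2): 190,
    (0, 1): 160, (1, 1): 145, (2, 1): 190, (3, 1): 180,
    (0, 0): 205, (1, 0): 150, (2, 0): 195, (3, 0): 150,
}

# Store rows y = 0 and y = 2 outright, plus the upward differences.
stored = {(x, y): grid[(x, y)] for y in (0, 2) for x in range(4)}
diffs = {(x, y): grid[(x, y + 1)] - grid[(x, y)] for (x, y) in stored}

# Bit count: 8 bits per stored pixel; 6 bits per difference in -31..31,
# 7 bits for each of the two larger ones (-45 and 50).
bits = 8 * len(stored) + sum(6 if -31 <= d <= 31 else 7
                             for d in diffs.values())
print(bits)  # -> 114, versus 16 * 8 = 128 for the plain representation

# Lossless check: rebuild rows y = 1 and y = 3 from the stored data.
rebuilt = dict(stored)
for (x, y), d in diffs.items():
    rebuilt[(x, y + 1)] = stored[(x, y)] + d
assert rebuilt == grid  # every intensity recovered exactly
```

The final assertion is the lossless guarantee in miniature: the stored rows plus the differences reconstruct all 16 intensities, no information lost.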

Admittedly, the above example isn't very dramatic, but it does illustrate the point that you can use spatial redundancy to compress an image.

Now remember that all this was based on the **spatial redundancy**
of the original image. If there wasn't much redundancy, i.e. if the white
dots in the cross-correlogram were not clustered along the diagonal but
rather spread out, then this compression scheme wouldn't work very well.