Classifying Handwritten Digits

Peter de Boursac

Digits

1) Introduction

In this project I plan to solve the problem of classifying handwritten digits. This has many practical applications, such as automatic cheque depositing and credit card receipt scanning. The input will be a scanned digital image of a single digit, and the output of the classifier will be a label identifying the digit in the image. Because the classifier is trained on images whose correct labels are already known, this is a form of supervised learning.
(Image source: http://nn.cs.utexas.edu/computationalmaps/figures/figures-jpg/14.4.jpg)

2) Method

The literature offers many suitable methods for the handwritten digit classification problem. More than 70 results from various methods have been reported on the data set I plan to use. Methods include, but are not limited to:

Papers using these methodologies are all linked to from http://yann.lecun.com/exdb/mnist/

In addition, different preprocessing methods lead to different test error rates, so each classifier algorithm can be combined with many preprocessing variants.
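To make the point about preprocessing concrete, here is a minimal sketch in which the same classifier is evaluated with and without a preprocessing step. Everything in it is an assumption made for illustration: the classifier is a plain 1-nearest-neighbour rule, the preprocessing is a simple binarization with an arbitrary threshold of 100, and the four-pixel "images" are toy data, not MNIST. It does not represent the method that will be chosen for the project.

```python
def identity(image):
    """No preprocessing: use the raw grey-level pixels (0-255)."""
    return list(image)


def binarize(image, threshold=100):
    """Hypothetical preprocessing step: map each pixel to 0 or 1."""
    return [1 if pixel >= threshold else 0 for pixel in image]


def squared_distance(a, b):
    """Squared Euclidean distance between two equal-length pixel lists."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def predict_1nn(train_images, train_labels, image):
    """Return the label of the training image closest to `image`."""
    nearest = min(range(len(train_images)),
                  key=lambda i: squared_distance(train_images[i], image))
    return train_labels[nearest]


def error_rate(preprocess, train_images, train_labels,
               test_images, test_labels):
    """Fraction of test images misclassified after applying `preprocess`."""
    train = [preprocess(image) for image in train_images]
    errors = 0
    for image, label in zip(test_images, test_labels):
        if predict_1nn(train, train_labels, preprocess(image)) != label:
            errors += 1
    return errors / len(test_labels)


# Toy data (not MNIST): class 0 lights the first three pixels,
# class 1 lights the last pixel. The first test image is a faint 0.
train_images = [[255, 255, 255, 0], [0, 0, 0, 110]]
train_labels = [0, 1]
test_images = [[110, 110, 110, 0], [0, 0, 0, 255]]
test_labels = [0, 1]

raw_error = error_rate(identity, train_images, train_labels,
                       test_images, test_labels)
binarized_error = error_rate(binarize, train_images, train_labels,
                             test_images, test_labels)
print(raw_error, binarized_error)  # 0.5 0.0 on this toy data
```

On the toy data the faint 0 is closer in raw pixel distance to the dim class-1 example, so the raw classifier gets it wrong, while binarization removes the brightness difference. Real preprocessing choices would need to be compared the same way on the actual test set.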

3) Data sets

The MNIST database of scanned digits contains 60,000 training examples and 10,000 test examples. This should give me adequate data to build a robust classifier, and enough test data to have high confidence in my measured accuracy. Additionally, many papers have used this data set, so my results can be compared fairly to the literature. This data is publicly available for download at http://www.mathworks.com/matlabcentral/fileexchange/27675-read-digits-and-labels-from-mnist-database
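As a rough check on the claim that 10,000 test examples give high confidence, the sketch below computes an approximate 95% interval for a measured test error rate. It assumes the test examples are independent and uses the normal approximation to the binomial distribution; the 5% error rate is a hypothetical value chosen for illustration, not a result.

```python
import math


def error_rate_interval(error_rate, n_test, z=1.96):
    """Normal-approximation interval for a test error rate.

    error_rate: measured fraction of misclassified test examples
    n_test:     number of test examples
    z:          1.96 corresponds to roughly 95% confidence
    """
    std_err = math.sqrt(error_rate * (1.0 - error_rate) / n_test)
    low = max(0.0, error_rate - z * std_err)
    high = min(1.0, error_rate + z * std_err)
    return low, high


# Hypothetical 5% error rate measured on the 10,000 MNIST test examples.
low, high = error_rate_interval(0.05, 10_000)
print(f"{low:.4f} to {high:.4f}")  # 0.0457 to 0.0543
```

Under these assumptions a measured error of 5% would be uncertain by only about 0.4 percentage points in either direction, which is small enough to compare methods whose error rates differ by a percent or more.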
 

4) Milestone

By the milestone I plan to have evaluated the various methods and settled on one to implement that is sufficiently accurate, but also within the scope of this project. In addition, I plan to have a rudimentary implementation of the algorithm, though some debugging may still be necessary. Fortunately, the project is very scalable, since I could expand to non-preprocessed images, or even to a string of digits, which relates to the cheque scanning application. Thus, if by the milestone I find that I have accomplished my initial goal, I can set a new target and continue my work on the same classifier.