Braille Image Recognition Milestone Report

Introduction

Having a strong desire to work on a project that could genuinely contribute to people's lives, I came up with this idea, and I was disappointed to find how little has been done in Braille recognition. I have read papers describing hardware solutions, and I have encountered mobile-phone applications in which the user manually selects a symbol to have it translated. A machine-learning approach seems natural, since it offers the chance of fast, high-confidence recognition, and I have made some steps toward my final goal. It is hard to keep up with my proposal schedule, but I feel that I am on track, and I will try to do as much as possible, as well as possible, by the final poster presentation.

Progress

Let me first apologize for not describing my problem appropriately in the proposal. Having only a blurred view of it, I could not imagine what the system would do or how it would work, so I oversimplified. The plan has since changed: the goal is now to recognize Braille symbols in images, with the possibility of building a Braille-to-text system. Briefly, here is what I have done:

- collected a dataset;
- extracted features from all of the symbols in the images and divided the data into training and test sets;
- tested predictions with the kNN algorithm;
- started implementing a multiclass SVM.

I will now elaborate on these accomplishments.

Dataset

I ran into problems collecting the dataset. Searching the internet gave me nothing beyond pictures of the Braille alphabet. I then considered Braille books, but changed my mind quickly, because I need to label my data and did not want to waste time on manual translation. Fortunately, I found a solution, thanks in large part to Dartmouth College's assistance to disabled people: I took pictures of Braille tablets around Dartmouth (mostly in Sudikoff).
It was not easy to find textual tablets, because most carry numerical data such as room numbers; still, I was able to find text messages. I took all of the pictures with my Nokia N95 phone camera, shooting the same tablet multiple times with different settings, angles, and quality. The tablets themselves varied widely: Sudikoff, for example, has dark textured tablets with small prominent dots of the same material, while many other tablets are transparent and glossy. That made feature extraction considerably more challenging.

I collected 110 pictures in total, each containing about 10 symbols on average. I could not use every symbol: since I wanted purely alphabetical data, I eliminated symbols that stand for two or three letters. I had to label each symbol manually, most of the time by matching it against the alphabet, and in doing so I found that a few tablets at Dartmouth contain mistakes. I must also admit that I was not able to find the q, k, and z symbols, so I drew them manually.

I settled on my features after analyzing the structure of the data. A Braille symbol can be represented as a 3-by-2 table in which each cell either holds a dot or is empty. So I decided to extract symbols from the pictures, binarize them, and then compute the amount of black in each cell, normalized by cell size. Professor Torresani suggested manually cropping symbols and computing the image gradient to pick out high-intensity values; while I see a lot of potential in that method, I was not able to adapt it properly to my needs (or perhaps my code was at fault). His suggestion of blurring the image before processing it, however, worked like magic. I then used MATLAB's binarization routine, and by adjusting its parameters I was able to get good results on most of the pictures. Small, useful adjustments and modifications improved the results further.
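The cell-based feature extraction described above can be sketched as follows. This is a minimal illustration in Python/NumPy rather than the MATLAB code actually used in the project; the fixed threshold and the assumption that dot pixels are darker than the background are mine, not taken from the report.

```python
import numpy as np

def braille_features(symbol, threshold=0.5):
    """Given a cropped grayscale Braille symbol (values in [0, 1],
    ideally blurred beforehand), binarize it and return the fraction
    of black (dot) pixels in each cell of the 3x2 Braille grid,
    normalized by cell area."""
    binary = symbol < threshold          # True where a pixel is dark
    rows, cols = binary.shape
    features = []
    for i in range(3):                   # 3 rows of cells
        for j in range(2):               # 2 columns of cells
            cell = binary[i * rows // 3:(i + 1) * rows // 3,
                          j * cols // 2:(j + 1) * cols // 2]
            features.append(cell.mean()) # black fraction, in [0, 1]
    return np.array(features)            # [cell11, cell12, ..., cell32]
```

A symbol with a single dark dot in the top-left cell would yield a feature vector close to [1, 0, 0, 0, 0, 0]; in real images the values are fractional, since dots rarely fill their cells.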
Though I understand this approach may be very simple and not especially sophisticated, I was able to extract features successfully, and I now have 1200 samples. I took 390 samples for the training set (15 for each of the 26 letters) and left the rest for testing. Each sample has 6 features, each representing the amount of black in one cell (cell11, cell12, cell21, cell22, cell31, cell32). The feature values depend heavily on the visibility of a dot: even though binarization gave good results, the images do not always show large circles where the dots are, so the data is quite diverse and reflects real-life conditions.

Algorithms

Collecting, processing, and labeling the data took most of my time, and only now have I started implementing my main classification method, a multiclass SVM with the one-versus-one strategy, so I cannot present results for it yet. I still needed to test my data, so I used the kNN algorithm, trying both plain Euclidean distance and weighted Euclidean distance. The latter showed better results, with a 12 percent error rate at best with k = 5. I have also been reading about the Mahalanobis distance metric; I tried to implement it, but was not successful.

Future work

My goal is to get the best prediction results, so I have high hopes for SVM: with my feature size it can be very flexible, and I will run tests to find the best settings for it. I also plan to implement a multiclass Naïve Bayes classifier, and I want to combine the results of all three algorithms to get the most probable answer. I also want to create an interface for loading a picture and recognizing the text in it. I aim to create an application with good prediction results on Braille images, and though I am interested in searching for the best algorithms and techniques, given the time limit I have opted to concentrate on building the system.
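The kNN baseline from the Algorithms section can be sketched as below. This is an illustrative Python/NumPy version, not the project's code; the weight vector is a free parameter (the report does not say which weights were used), and with uniform weights the classifier reduces to plain Euclidean kNN.

```python
import numpy as np

def knn_predict(X_train, y_train, x, k=5, weights=None):
    """Classify one 6-feature sample by weighted-Euclidean kNN:
    scale each feature dimension by its weight when measuring
    distance, then take a majority vote among the k nearest
    training samples."""
    if weights is None:
        weights = np.ones(X_train.shape[1])   # plain Euclidean kNN
    diffs = (X_train - x) * np.sqrt(weights)  # per-feature weighting
    dists = np.sqrt((diffs ** 2).sum(axis=1))
    nearest = np.argsort(dists)[:k]           # indices of k closest
    labels, counts = np.unique(y_train[nearest], return_counts=True)
    return labels[counts.argmax()]            # majority vote
```

With k = 5 and suitable weights, this is the configuration that reached the 12 percent error rate reported above.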
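The one-versus-one strategy mentioned for the multiclass SVM can also be sketched independently of the binary learner. In the sketch below the pairwise-voting scheme is the point; the nearest-centroid trainer is only a hypothetical stand-in for the binary SVMs that are still being implemented, and all function names are mine.

```python
import numpy as np
from itertools import combinations

def train_one_vs_one(X, y, train_binary):
    """Train one binary classifier per pair of classes; for the full
    26-letter alphabet that is C(26, 2) = 325 classifiers.
    `train_binary` must return a function mapping a sample to one of
    the two class labels it was trained on."""
    models = {}
    for a, b in combinations(sorted(set(y)), 2):
        mask = (y == a) | (y == b)
        models[(a, b)] = train_binary(X[mask], y[mask])
    return models

def predict_one_vs_one(models, x):
    """Each pairwise classifier casts one vote; the label with the
    most votes wins."""
    votes = {}
    for clf in models.values():
        label = clf(x)
        votes[label] = votes.get(label, 0) + 1
    return max(votes, key=votes.get)

def centroid_trainer(X, y):
    """Stand-in for a binary SVM: classify by the nearest class
    centroid (NOT the project's actual method)."""
    labels = sorted(set(y))
    cents = {c: X[y == c].mean(axis=0) for c in labels}
    return lambda x: min(labels, key=lambda c: np.linalg.norm(x - cents[c]))
```

Swapping `centroid_trainer` for a real binary SVM trainer leaves the voting machinery unchanged, which is what makes one-versus-one convenient for the 26-class letter problem.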