Simple Geometric Shape Sketch Recognition
PING LIN
The Problem
Free-hand sketch recognition has been studied for a long time [1-4]. Because the objects to be recognized are so diverse, however, no single method has emerged as the standard approach, and the commercially available tools are not yet mature enough to attract wide attention. Meanwhile, as pen-based digital devices such as PDAs and Tablet PCs become increasingly popular, the need for sketch recognition keeps rising.
In this project, we would like to identify suitable machine-learning techniques for recognizing, in real time, free-hand sketches of simple 2D geometric shapes on a pen-based device such as a Tablet PC. The intention is to confine the recognition domain to simple geometric shapes (circle, rectangle, triangle, arrow, etc.) so that the problem stays manageable within the time constraints, while keeping the system as general-purpose as possible, since it is desirable that it can later be extended to different symbol sets and larger alphabets. In the end, a user should be able to draw simple schematic graphs composed of these shapes and have the system recognize and beautify them.
General Framework
The input to the system is a timestamped signal in which each sample is a pair of (x, y) screen coordinates. In this project, however, we are not going to exploit the temporal structure. First, an HMM or a dynamic Bayesian network is much more complicated to implement and is probably overkill for simple shape recognition. More importantly, anything that relies on temporal information constrains the order in which strokes are drawn, which is undesirable for geometric shape recognition. So, in this project, the input is treated merely as a 2D image.
The recognition system consists of the following main stages:
Preprocessing:
Filter the input signal (for instance, threshold filtering for connectivity), normalize it for size and slant, and compensate for obvious deficiencies in the signal (for instance, by refining endpoints), so that the processed signal is cleaner, normalized, and better suited to the later stages.
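As an illustration of the size-normalization step only, the following is a minimal sketch in Python. The function name `normalize_size` and the choice of a unit square as the target are assumptions made for this example, not part of the proposal; slant correction and filtering are not shown.

```python
from typing import List, Tuple

Point = Tuple[float, float]


def normalize_size(points: List[Point]) -> Tuple[List[Point], Point, float]:
    """Translate and uniformly scale a sketch into the unit square.

    Returns the normalized points plus the original top-left corner and
    scale, so a beautified shape can later be drawn back in place.
    (Hypothetical helper, for illustration only.)
    """
    xs = [x for x, _ in points]
    ys = [y for _, y in points]
    min_x, min_y = min(xs), min(ys)
    # Use the longer side so the aspect ratio is preserved.
    scale = max(max(xs) - min_x, max(ys) - min_y)
    if scale == 0:  # degenerate sketch: a single dot
        scale = 1.0
    normalized = [((x - min_x) / scale, (y - min_y) / scale) for x, y in points]
    return normalized, (min_x, min_y), scale


# A roughly rectangular sketch, 200 pixels wide and 100 pixels tall.
stroke = [(100.0, 200.0), (300.0, 200.0), (300.0, 300.0), (100.0, 300.0)]
unit_points, origin, scale = normalize_size(stroke)
```

Returning the origin and scale alongside the normalized points keeps the location and size information available for the output stage.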
Low-level processing:
Extract feature information from the normalized 2D image so that an abstract feature representation can be fed to the machine-learning method at the high-level recognition stage.
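The proposal does not yet fix a feature representation. As one simple possibility, the sketch below turns normalized points into a flat occupancy-grid vector. The function name `rasterize`, the grid size, and the assumption that the points already lie in the unit square are all illustrative choices.

```python
from typing import List, Tuple

Point = Tuple[float, float]


def rasterize(points: List[Point], grid: int = 8) -> List[int]:
    """Mark every grid cell that contains a sample; return a flat 0/1 vector.

    Assumes the points are already normalized to the unit square.
    Only the sampled points are marked, not the segments between them.
    """
    cells = [0] * (grid * grid)
    for x, y in points:
        # Clamp so that a coordinate of exactly 1.0 falls in the last cell.
        col = min(int(x * grid), grid - 1)
        row = min(int(y * grid), grid - 1)
        cells[row * grid + col] = 1
    return cells


# Three samples along the diagonal of a 4 x 4 grid.
features = rasterize([(0.0, 0.0), (0.5, 0.5), (1.0, 1.0)], grid=4)
```

A fixed-length vector of this kind can be passed directly to most learning methods, although richer features may well be needed in practice.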
If this were a more general sketch recognition system, this stage would probably have additional tasks, such as segmenting the signal into strokes or finding corners, and then classifying each stroke to obtain an abstract representation of the strokes in a suitable language.
That is not needed here because, in this project, the inputs are naturally separated by the pause the user leaves between two objects to be recognized. The user thus segments the input, and during the pause the system needs to recognize the shape and output a beautified one. This is exactly what "real-time" means for this recognizer.
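The pause-based segmentation described above could be sketched as follows. The sample layout `(t, x, y)`, the function name `split_on_pause`, and the one-second threshold are assumptions for illustration; the timestamps are used only to find the pauses, not as features.

```python
from typing import List, Tuple

Sample = Tuple[float, float, float]  # (time in seconds, x, y)


def split_on_pause(samples: List[Sample], pause: float = 1.0) -> List[List[Sample]]:
    """Group time-ordered samples into objects separated by long pauses."""
    objects: List[List[Sample]] = []
    for sample in samples:
        # Start a new object on the first sample or after a long gap.
        if not objects or sample[0] - objects[-1][-1][0] > pause:
            objects.append([])
        objects[-1].append(sample)
    return objects


samples = [
    (0.0, 10.0, 10.0), (0.1, 12.0, 11.0), (0.2, 15.0, 13.0),  # first object
    (2.0, 80.0, 80.0), (2.1, 82.0, 81.0),                     # after a pause
]
objects = split_on_pause(samples, pause=1.0)
```

In an interactive system the same test would run against the current time, so that recognition starts as soon as the pause threshold has elapsed.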
High-level recognition:
Once the abstract representation of the input signal is available, machine-learning techniques can be applied. The result is the recognized shape, and we need to output the beautified shape to the screen according to the location and size information obtained in the preprocessing stage.
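To make the output step concrete, here is a minimal sketch of beautification for two of the target shapes. The function name `beautify`, its parameters, and the decision to fit the ideal shape to the sketch's bounding box are assumptions for this example; other shapes such as triangles and arrows would need orientation information that is not modeled here.

```python
import math
from typing import List, Tuple

Point = Tuple[float, float]


def beautify(label: str, origin: Point, width: float, height: float,
             segments: int = 32) -> List[Point]:
    """Return the closed outline of an idealized shape filling the given box."""
    x0, y0 = origin
    if label == "rectangle":
        return [(x0, y0), (x0 + width, y0), (x0 + width, y0 + height),
                (x0, y0 + height), (x0, y0)]
    if label == "circle":
        # Inscribe the circle in the box; use the shorter side as diameter.
        r = min(width, height) / 2
        cx, cy = x0 + width / 2, y0 + height / 2
        return [(cx + r * math.cos(2 * math.pi * i / segments),
                 cy + r * math.sin(2 * math.pi * i / segments))
                for i in range(segments + 1)]
    raise ValueError(f"no template for shape {label!r}")


outline = beautify("rectangle", origin=(100.0, 200.0), width=200.0, height=100.0)
```

The returned outline is a polyline in screen coordinates, so it can be drawn with whatever line-drawing call the target device provides.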
Possible Methods
Many high-level learning methods have been proposed in the literature. Some of them are specific to geometric shape recognition; others are general-purpose and suit both geometric shapes and other symbols, including characters, letters, math symbols, and diagram symbols.
The challenge is a trade-off: we would like a method that is not too specific, so that the system can be extended easily, but too general a method may be too complex to finish in time and too computationally expensive for real-time recognition, which is the current goal of this project.
The plan is to start with the more general methods. The current candidate is manifold learning or, more specifically, kernel Isomap [5]. If this method turns out to be unrealistic for a real-time implementation of the current task, we will fall back on simpler methods such as SVM [4,6].
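To give a feel for the cost of manifold learning, the sketch below implements only the graph-distance step of classical Isomap: build a k-nearest-neighbour graph over the feature vectors and take shortest-path lengths as approximate manifold distances. It is not the kernel Isomap variant of [5], and the embedding step is omitted. The function name `geodesic_distances` and the value of `k` are assumptions for illustration.

```python
import math
from typing import List, Sequence


def geodesic_distances(vectors: Sequence[Sequence[float]], k: int = 2) -> List[List[float]]:
    """Approximate manifold distances via shortest paths in a k-NN graph."""
    n = len(vectors)
    euclid = [[math.dist(a, b) for b in vectors] for a in vectors]
    # Keep only edges to each point's k nearest neighbours (symmetrized).
    dist = [[0.0 if i == j else math.inf for j in range(n)] for i in range(n)]
    for i in range(n):
        neighbours = sorted(range(n), key=lambda j: euclid[i][j])[1:k + 1]
        for j in neighbours:
            dist[i][j] = dist[j][i] = euclid[i][j]
    # Floyd-Warshall: relax every pair through every intermediate point.
    for m in range(n):
        for i in range(n):
            for j in range(n):
                if dist[i][m] + dist[m][j] < dist[i][j]:
                    dist[i][j] = dist[i][m] + dist[m][j]
    return dist


# Three points forming an L: the path around the corner is longer than
# the straight-line distance between the two ends.
corner = [(0.0, 0.0), (1.0, 0.0), (1.0, 1.0)]
d = geodesic_distances(corner, k=1)
```

The all-pairs shortest-path step is cubic in the number of training samples, which is one reason a real-time implementation may turn out to be unrealistic for larger training sets.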
Data Sets
Very few sketch recognition datasets are available, and the ones that are available [7] are too complicated for this project. The training data will therefore be generated by hand, since no extensive training is anticipated.
By Milestone
I will have finished the preprocessing and low-level feature extraction, tried different high-level learning algorithms, and decided which learning method to use and polish for the final version.
References
[1] Randall Davis. Sketch Understanding in Design: Overview of Work at the MIT AI Lab. In Sketch Understanding, Papers from the 2002 AAAI Spring Symposium, pp. 24-31. Stanford, California, March 25-27, 2002.
[2] Hammond, T., Eoff, B., Paulson, B., Wolin, A., Dahmen, K., Johnston, J., and Rajan, P. Free-Sketch Recognition: Putting the CHI in Sketching. 26th Annual SIGCHI Conference on Human Factors in Computing Systems (CHI 2008) Works In Progress, Florence, Italy, April 2008, pp. 3027-3032.
[3] Tevfik Metin Sezgin. Sketch Interpretation Using Multiscale Stochastic Models of Temporal Patterns. Ph.D. thesis, Massachusetts Institute of Technology, May 2006.
[4] Michael Oltmans. Envisioning Sketch Recognition: A Local Feature Based Approach to Recognizing Informal Sketches. Ph.D. thesis, Massachusetts Institute of Technology, Cambridge, MA, May 2007.
[5] Choi, H., and Hammond, T. Sketch Recognition based on Manifold Learning. 23rd Annual AAAI Conference on Artificial Intelligence: Student Abstracts, Chicago, Illinois, July 2008, pp. 1786-1787.
[6] Heloise Hwawen Hse and A. Richard Newton. Sketched Symbol Recognition Using Zernike Moments. In ICPR (1), pp. 367-370, 2004. doi: 10.1109/ICPR.2004.1334128. URL http://csdl.computer.org/comp/proceedings/icpr/2004/2128/01/212810367abs.htm.
[7] ETCHA Sketches. http://rationale.csail.mit.edu/ETCHASketches/.