Dartmouth College Computer Science
Technical Report series
In the past few years there has been a tremendous growth in the usage of digital images.
Users can now access millions of photos, which creates a need for methods that can efficiently and effectively search for the visual information of interest.
In this thesis, we propose methods for learning image representations that compactly describe a large collection of images, enabling accurate image recognition with linear classification models, which are efficient to both train and test.
The entries of our descriptors are the output of a set of basis classifiers evaluated on the image, which capture the presence or absence of a set of high-level visual concepts.
We propose two different techniques to automatically discover the visual concepts and learn the basis classifiers from a given labeled dataset of pictures, producing descriptors that are highly discriminative of the original categories of the dataset.
We empirically show that these descriptors are able to encode new, unseen pictures and produce state-of-the-art results in conjunction with cheap linear classifiers.
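As a rough illustration of the descriptor described above, the following sketch treats each basis classifier as a linear scorer on a low-level feature vector and records a 1 when its score is positive and a 0 otherwise. The linear form of the basis classifiers, the zero threshold, and the names `basis_descriptor` and `linear_score` are assumptions made for illustration; they are not taken from the thesis.

```python
from typing import List, Sequence


def dot(u: Sequence[float], v: Sequence[float]) -> float:
    """Inner product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))


def basis_descriptor(
    features: Sequence[float],
    basis_weights: Sequence[Sequence[float]],
    basis_biases: Sequence[float],
) -> List[int]:
    """One entry per basis classifier: 1 if the (assumed linear) classifier
    fires on the image features, 0 otherwise."""
    return [
        1 if dot(w, features) + b > 0 else 0
        for w, b in zip(basis_weights, basis_biases)
    ]


def linear_score(
    descriptor: Sequence[float],
    class_weights: Sequence[float],
    class_bias: float,
) -> float:
    """Score of a cheap linear classifier applied to the compact descriptor."""
    return dot(class_weights, descriptor) + class_bias


# Toy example with hypothetical numbers: 2-D features, 3 basis classifiers.
features = [1.0, -2.0]
basis_weights = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
basis_biases = [0.0, 0.0, 2.0]
descriptor = basis_descriptor(features, basis_weights, basis_biases)
score = linear_score(descriptor, [0.5, -1.0, 2.0], -1.0)
```

The point of the sketch is only that the descriptor has one entry per basis classifier, so a new category costs a single inner product over a short vector at test time.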
We describe several strategies to aggregate the outputs of basis classifiers evaluated on multiple subwindows of the image in order to handle cases when the photo contains multiple objects and large amounts of clutter.
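The abstract does not name the aggregation strategies. As a hedged sketch, max pooling and average pooling over subwindows are two common choices, assumed here purely for illustration. Each row of `window_outputs` holds the real-valued basis-classifier outputs for one subwindow, and the function names are hypothetical.

```python
from typing import List, Sequence


def max_pool(window_outputs: Sequence[Sequence[float]]) -> List[float]:
    """Per basis classifier, keep the strongest response over all subwindows."""
    if not window_outputs:
        raise ValueError("at least one subwindow is required")
    return [max(column) for column in zip(*window_outputs)]


def mean_pool(window_outputs: Sequence[Sequence[float]]) -> List[float]:
    """Per basis classifier, average the responses over all subwindows."""
    if not window_outputs:
        raise ValueError("at least one subwindow is required")
    n = len(window_outputs)
    return [sum(column) / n for column in zip(*window_outputs)]


# Toy example with hypothetical numbers: 2 subwindows, 3 basis classifiers.
windows = [
    [0.25, -1.0, 0.5],
    [0.75, -0.5, -0.5],
]
pooled_max = max_pool(windows)
pooled_mean = mean_pool(windows)
```

Max pooling lets a concept count as present if any subwindow responds strongly, which suits cluttered photos; averaging instead summarizes the image as a whole.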
We extend this framework to the task of object detection, where the goal is to spatially localize an object within an image. We use the outputs of a collection of detectors trained in an offline stage as features for new detection problems, showing results competitive with the current state of the art.
Since the need for rich manual annotations of an image dataset is a key limitation of modern methods in object localization and detection, in this thesis we also propose a method to automatically generate training data for an object detector in a weakly supervised fashion, yielding considerable savings in human annotation effort.
We show that our automatically generated regions can be used to train object detectors with recognition results remarkably close to those obtained by training on manually annotated bounding boxes.
Ph.D. Dissertation. Advisor: Lorenzo Torresani.
Bibliographic citation for this report:
Alessandro Bergamo, "Methods for efficient object categorization, detection, scene recognition, and image search." Dartmouth Computer Science Technical Report TR2014-764, August 2014.
To receive a paper copy of a report by mail, send your address and the TR number to reports AT cs.dartmouth.edu
Copyright notice: The documents contained in this server are included by the contributing authors as a means to ensure timely dissemination of scholarly and technical work on a non-commercial basis. Copyright and all rights therein are maintained by the authors or by other copyright holders, notwithstanding that they have offered their works here electronically. It is understood that all persons copying this information will adhere to the terms and constraints invoked by each author's copyright. These works may not be reposted without the explicit permission of the copyright holder.
Technical reports collection maintained by David Kotz.