Mobile Vision
My role at Google Inc. in the Mobile Vision team focuses on people perception.
Face recognition approaches have traditionally focused on direct comparisons between aligned images, e.g., using pixel values or local image features. Such comparisons become prohibitively difficult when comparing faces across extreme differences in pose, illumination, and expression. We propose a novel data-driven method based on the insight that comparing images of faces is most meaningful when they are in comparable imaging conditions. To this end, we describe an image of a face by an ordered list of identities from a Library. The order of the list is determined by the similarity of the Library images to the probe image. The lists act as a signature for each face image: similarity between face images is determined via the similarity of the signatures.
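The signature idea can be illustrated with a minimal sketch. Everything specific here is an assumption made for illustration, not the published method: faces are represented by small hand-made feature vectors, a Library identity is scored by its best cosine similarity to the probe, and two signatures are compared with a normalized Spearman footrule. The names `signature` and `signature_similarity` are hypothetical.

```python
import math


def cosine(u, v):
    """Cosine similarity between two feature vectors (0.0 if either is zero)."""
    dot = sum(a * b for a, b in zip(u, v))
    nu = math.sqrt(sum(a * a for a in u))
    nv = math.sqrt(sum(b * b for b in v))
    return dot / (nu * nv) if nu and nv else 0.0


def signature(probe, library):
    """Order Library identities from most to least similar to the probe.

    Assumption: an identity's score is the best cosine similarity over
    its Library images; ties are broken alphabetically.
    """
    scores = {
        identity: max(cosine(probe, image) for image in images)
        for identity, images in library.items()
    }
    return sorted(scores, key=lambda identity: (-scores[identity], identity))


def signature_similarity(sig_a, sig_b):
    """Similarity in [0, 1] between two orderings of the same identities.

    Assumption: one minus the Spearman footrule distance, normalized by
    its maximum value floor(n^2 / 2); 1.0 means identical orderings.
    """
    position_b = {identity: rank for rank, identity in enumerate(sig_b)}
    n = len(sig_a)
    footrule = sum(abs(rank - position_b[identity])
                   for rank, identity in enumerate(sig_a))
    max_footrule = (n * n) // 2
    return 1.0 - footrule / max_footrule if max_footrule else 1.0


# Toy Library: identity -> feature vectors of that person's images.
library = {
    "alice": [[1.0, 0.0], [0.9, 0.1]],
    "bob": [[0.0, 1.0]],
    "carol": [[0.7, 0.7]],
}

sig_1 = signature([1.0, 0.1], library)
sig_2 = signature([0.9, 0.2], library)
sig_3 = signature([0.1, 1.0], library)
print(sig_1, signature_similarity(sig_1, sig_2), signature_similarity(sig_1, sig_3))
```

In this toy setup the two probes never need to be compared directly: each is described only by how it ranks the Library identities, and the ranking comparison stands in for the face comparison.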
Face recognition is an important problem in computer vision with applications ranging from surveillance to robotics to the organization of personal image collections. In this work we focus on unconstrained face recognition in videos. The setup consists of a set of gallery videos (or images) for each of the identities. During testing, an incoming probe video (or multiple images) is compared to the gallery to determine the most likely match. This area has not received much attention in the past. We first investigate the simplified version of temporally unrelated multiple images, where we explore the representational power of subspace-based distance functions. We create new, powerful combinations of low-level features and subspace measures.
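As a rough illustration of what a subspace-based distance between two image sets can look like, the sketch below uses principal angles between linear subspaces. This is one common choice and an assumption here; the text does not say which subspace measures or low-level features were used. Each image set is a matrix with one feature vector per row, and the function names and the `dim` parameter are hypothetical.

```python
import numpy as np


def orthonormal_basis(features, dim):
    """Orthonormal basis (as columns) for the top-`dim` subspace of an image set.

    `features` has one feature vector per row. Assumption: a linear
    subspace through the origin, with no mean-centering.
    """
    matrix = np.asarray(features, dtype=float).T
    left_vectors, _, _ = np.linalg.svd(matrix, full_matrices=False)
    return left_vectors[:, :dim]


def principal_angles(basis_a, basis_b):
    """Principal angles (radians) between two subspaces given orthonormal bases."""
    cosines = np.linalg.svd(basis_a.T @ basis_b, compute_uv=False)
    return np.arccos(np.clip(cosines, -1.0, 1.0))


def subspace_distance(set_a, set_b, dim=2):
    """Projection-style distance: sqrt of the sum of squared sines of the angles."""
    angles = principal_angles(orthonormal_basis(set_a, dim),
                              orthonormal_basis(set_b, dim))
    return float(np.sqrt(np.sum(np.sin(angles) ** 2)))


# Toy image sets in a 4-dimensional feature space.
gallery = [[1, 0, 0, 0], [0, 2, 0, 0], [1, 1, 0, 0]]   # spans the first two axes
probe_same = [[2, 1, 0, 0], [1, -1, 0, 0]]             # same span, different vectors
probe_other = [[0, 0, 1, 0], [0, 0, 0, 3]]             # orthogonal span

print(subspace_distance(gallery, probe_same))
print(subspace_distance(gallery, probe_other))
```

A probe set would then be matched to the gallery identity with the smallest distance. The appeal of this family of measures is that the distance depends only on the span of each set, not on which particular frames were observed.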
Visipedia is a joint project between Pietro Perona's Vision Group at Caltech and Serge Belongie's Vision Group at UCSD. Visipedia, short for "Visual Encyclopedia," is an augmented version of Wikipedia, where pictures are first-class citizens alongside text. Goals of Visipedia include creation of hyperlinked, interactive images embedded in Wikipedia articles, scalable representations of visual knowledge, large-scale machine vision datasets, and visual search capabilities. Toward achieving these goals, Visipedia advocates interaction and collaboration between machine vision and human users and experts.
Publication: Visual Recognition with Humans in the Loop (ECCV)
The goal is to provide a web-based vase retrieval system that allows users to upload new vase images and classifies the shape of each vase. Additionally, a list of close matches in terms of vase shape is returned.
Publication: CLAROS - bringing classical art to a global public
The goal is to automatically retrieve large numbers of images from the web for specified object classes, with high accuracy and without any user interaction.
The downloaded images including annotation and metadata are available here.
Publication: Harvesting Image Databases from the Web (TPAMI)