2012 Internship Projects

2012 Summer Internship group photo

In the summer of 2012, CBI hosted interns from:

Dos Pueblos High School in Santa Barbara
California State University, San Bernardino
École polytechnique de l'université de Nantes in Nantes, France

Projects

Botanicam system

Mentor

Dmitry Fedorov

Student Interns

Mike Korcha

Abstract

The Botanicam system is designed for plant image identification backed by the Bisque database. Botanicam's workflow allows a user to upload an image of a plant to the server via the web interface or a mobile application and receive back the plant's information, such as genus, species, Wikipedia entry, etc. The plant identification is performed on the server by first computing various image features and then using a trained model to classify the input image. We are using a local dataset of bushes from the Coal Oil Point Reserve that contains 11 classes, and we are adding a new publicly available dataset from CLEF 2011 that consists of several thousand images of leaves, trees, and bushes. Our project consists of improving classification speed and accuracy, automating the model training process, and accommodating new datasets and data types.
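
The abstract does not name the specific features or classifier; a minimal sketch of the train-then-classify pipeline, assuming simple color-histogram features and a linear SVM from scikit-learn (Botanicam's actual features and model may differ), might look like this:

```python
# Minimal sketch of a train-then-classify pipeline for plant images.
# Assumptions (not specified in the abstract): color-histogram features
# and a linear SVM via scikit-learn.
import numpy as np
from sklearn.svm import LinearSVC

def color_histogram(image, bins=8):
    """Concatenated per-channel histograms as a simple image feature."""
    feats = [np.histogram(image[..., c], bins=bins, range=(0, 255),
                          density=True)[0] for c in range(3)]
    return np.concatenate(feats)

def train(images, labels):
    X = np.stack([color_histogram(im) for im in images])
    model = LinearSVC()
    model.fit(X, labels)
    return model

def classify(model, image):
    return model.predict(color_histogram(image)[None, :])[0]

# Usage with random stand-in data (RGB images as uint8 arrays):
rng = np.random.default_rng(0)
train_images = [rng.integers(0, 256, (64, 64, 3), dtype=np.uint8)
                for _ in range(22)]
train_labels = [i % 11 for i in range(22)]  # 11 classes, as in the COPR set
model = train(train_images, train_labels)
print(classify(model, train_images[0]))
```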

Probabilistic spatial object representation in databases

Mentor

James Schaffer

Student Interns

Cristobal Guerrero

Abstract

 

Raster-to-vector conversions have traditionally been used to speed up spatial queries, but there has been no work on the case where objects are modeled with spatial uncertainty. Our project is to design and evaluate methods to convert 2D uncertain spatial representations of image objects from raster to vector format. This project will be done in Java, JavaScript, SQL, and GIS relational databases, but the final chosen 2D representations/algorithms will be implemented in the BISQUE system. The main issue with converting a raster image segmentation to a vector data structure is minimizing the error in the represented region. Our uncertainty model will build on the work by Erlend Tøssebro and Mads Nygård. Using their methods and models as a base, we will fit uncertain vector representations to the original raster data. This conversion will allow us to visualize and query the uncertain extent and center of a cell more effectively than with the naive raster representation. The results from this project will be used to design and evaluate generalized models that can capture and query spatial/morphological uncertainty in three or more dimensions and to assess their impact on traditional biological analysis.
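
The abstract leaves the conversion algorithm open; as one minimal sketch, assuming scikit-image and treating the uncertain segmentation as a per-pixel membership-probability raster, iso-probability contours give nested vector polygons for different confidence levels (the 0.5 and 0.9 levels are illustrative, not values from the project):

```python
# Sketch: vectorize an uncertain raster segmentation by extracting
# iso-probability contours with scikit-image.
import numpy as np
from skimage.measure import find_contours

# Synthetic "probability of membership" raster: a fuzzy disk.
yy, xx = np.mgrid[0:100, 0:100]
prob = np.exp(-((xx - 50) ** 2 + (yy - 50) ** 2) / (2 * 15.0 ** 2))

# Each contour is an (N, 2) polyline of (row, col) vertices -- the
# vector representation of one confidence level of the region.
outer = find_contours(prob, level=0.5)  # plausible extent
core = find_contours(prob, level=0.9)   # high-confidence core

for name, polys in [("outer", outer), ("core", core)]:
    for poly in polys:
        print(name, "polygon with", len(poly), "vertices")
```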

Predicting Visual Attention Under Varying Camera Focus

Mentor

Karthikeyan S.

Student Interns

Taylor Sanchez

Abstract

A saliency map is a prediction of the regions in a photograph (or any visual scene) that capture the visual attention of the viewer. Until recently, most of these predictions were bottom-up approaches using low-level features, which can be reliably computed from images and include bright colors, hard edges, and strong contrast. Relatively new algorithms make use of high-level semantic information, such as face, text, person, and other object detections, to predict visual attention. Some of the recent state-of-the-art advances come from Tilke Judd's work at MIT. Apart from high-level semantics, we observe that camera focus plays a significant role in directing visual attention. Our work targets understanding and quantifying the role of camera focus in visual saliency. With the recently available Lytro camera we can take a snapshot of the complete light field of a scene, which essentially contains multiple images, each with a different focused region. We will have users view all of the images while we track their eye movements and fixations. We then compare the resulting visual attention maps with our predicted saliency map, a pixelwise map learned using a support vector machine. Finally, we will discern the role of focus in directing the user's attention apart from other semantics. This technique could also be applied to create future autofocus algorithms once object detectors are built into commercial cameras.
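
The abstract does not list the per-pixel features used; here is a minimal sketch of the pixelwise SVM step, assuming two illustrative features (local contrast and a Laplacian-based focus measure) and random stand-ins for the eye-tracking fixation labels:

```python
# Sketch of pixelwise saliency prediction with an SVM. The features
# and the fixation labels here are illustrative stand-ins.
import numpy as np
from scipy.ndimage import laplace, uniform_filter
from sklearn.svm import SVC

def pixel_features(gray):
    """Stack per-pixel features into an (H*W, 2) matrix."""
    contrast = uniform_filter(gray ** 2, 9) - uniform_filter(gray, 9) ** 2
    focus = uniform_filter(laplace(gray) ** 2, 9)  # sharpness proxy
    return np.stack([contrast.ravel(), focus.ravel()], axis=1)

# Synthetic stand-ins: one grayscale image and a binary fixation map.
rng = np.random.default_rng(1)
gray = rng.random((48, 48))
fixated = (rng.random((48, 48)) > 0.9).ravel()  # fake eye-tracking labels

X = pixel_features(gray)
svm = SVC(probability=True).fit(X, fixated)

# Predicted saliency map: per-pixel probability of fixation.
saliency = svm.predict_proba(X)[:, 1].reshape(gray.shape)
print(saliency.shape, saliency.min(), saliency.max())
```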

Computer Vision and Robot Control

Mentor

Carlos Torres

Student Interns

Daniel Richman
Brenna Hensley

Abstract

The Microsoft Kinect is a small, mountable device with both a standard (RGB) camera and an infrared sensor that produces a point cloud. The goal of our project is to implement computer vision algorithms that use both types of image data to detect and track various objects. Ultimately we will track objects (e.g., obstacles and game tokens) in real time to autonomously control an iRobot Create, a small and inexpensive robot intended for educational purposes. A second goal is to incorporate gesture recognition using skeletal tracking so that human users may control the robot.
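
As a concrete illustration of the depth-based obstacle detection step, here is a minimal sketch, assuming the Kinect depth frame arrives as a 2D array of distances in meters; the region of interest, threshold, and printed commands are illustrative stand-ins for the actual Create control code:

```python
# Sketch of depth-based obstacle detection for driving an iRobot Create.
import numpy as np

def obstacle_ahead(depth, min_dist=0.6):
    """True if anything in the central region is closer than min_dist."""
    h, w = depth.shape
    center = depth[h // 3: 2 * h // 3, w // 3: 2 * w // 3]
    valid = center[center > 0]          # Kinect reports 0 for no reading
    return valid.size > 0 and valid.min() < min_dist

# Fake depth frame: a wall at 2 m with a box 0.4 m away in the middle.
depth = np.full((480, 640), 2.0)
depth[200:280, 280:360] = 0.4

if obstacle_ahead(depth):
    print("turn")   # would send a spin command to the Create here
else:
    print("drive")  # would send a forward-drive command
```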

Time Series Analysis and Classification

Mentor

Nazli Dereli

Student Interns

Regie Felix
Sophie Darcy

Abstract

This summer we aim to gain a better understanding of time series analysis and classification. A time series is a sequence of data points taken at consistent time intervals. Using the statistical software R, we will cover topics such as decomposition, classification, transformations, model fitting, and forecasting, as well as machine learning techniques such as decision trees and clustering. We will apply these techniques to a variety of data sets to determine significant trends and predict future observations.
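
The project itself works in R; as an illustrative equivalent, here is a minimal sketch of decomposition and forecasting in Python with statsmodels, on synthetic monthly data:

```python
# Sketch of classical time series decomposition and a simple forecast.
import numpy as np
import pandas as pd
from statsmodels.tsa.seasonal import seasonal_decompose
from statsmodels.tsa.holtwinters import ExponentialSmoothing

# Synthetic series: trend + yearly seasonality + noise.
idx = pd.date_range("2008-01-01", periods=48, freq="MS")
rng = np.random.default_rng(2)
y = pd.Series(0.5 * np.arange(48)
              + 5 * np.sin(2 * np.pi * np.arange(48) / 12)
              + rng.normal(0, 0.5, 48), index=idx)

# Decompose into trend, seasonal, and residual components.
parts = seasonal_decompose(y, model="additive", period=12)
print(parts.trend.dropna().head())

# Fit a Holt-Winters model and forecast the next 12 months.
fit = ExponentialSmoothing(y, trend="add", seasonal="add",
                           seasonal_periods=12).fit()
print(fit.forecast(12).head())
```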

Improving Part Detection Algorithms using Functional MRI

Mentor

Carter De Leo

Student Interns

Doriane Peresse

Abstract

The literature shows that humans can detect people in images better than machines can. After breaking person detection into a four-step algorithm, we hypothesize that comparing several combinations of humans and/or machines performing these steps will show that detection is most effective when humans do the feature extraction.

Based on this analysis, we are trying to find out whether the human brain reacts differently when it sees human bodies (or human body parts) than when it sees other kinds of images (objects, blur, etc.). Using functional MRI, we record the brain activity of a subject while they view the different types of images.

The next step is to extract features from the functional MRI recordings so as to create our own detection model and, hopefully, obtain better results than existing detection algorithms.
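
For the feature-extraction and detection-model step, a minimal decoding sketch, assuming trial-wise voxel activation vectors and binary stimulus labels (both random stand-ins here) fed to a linear SVM:

```python
# Sketch of a decoding step: classify "body part" vs. "other" stimuli
# from fMRI voxel activations. The data here are random stand-ins.
import numpy as np
from sklearn.model_selection import cross_val_score
from sklearn.svm import LinearSVC

rng = np.random.default_rng(3)
n_trials, n_voxels = 80, 500
X = rng.normal(size=(n_trials, n_voxels))  # one activation vector per trial
y = rng.integers(0, 2, n_trials)           # 1 = body image, 0 = other

# Cross-validated accuracy; ~0.5 on random data, higher if the brain
# responds differently to body images.
scores = cross_val_score(LinearSVC(), X, y, cv=5)
print("accuracy: %.2f +/- %.2f" % (scores.mean(), scores.std()))
```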

Instance Search on a Large Scale Data Set of Videos

Mentor

Niloufer Pourian

Student Interns

Michael Shabasin

Abstract

An important need in many situations involving video collections (archive video search, personal video organization, surveillance, law enforcement, protection of brand/logo use) is to find more video segments of a certain specific person, object, or place, given a visual example. We are developing a system that, given a collection of test clips and a collection of queries that each delimit a person, object, or place entity in some example video, locates for each query the clips most likely to contain a recognizable instance of the entity. This algorithm should be invariant to changes in illumination, viewpoint, and scale. We are investigating a system that works on a large-scale database of 70,000 video clips taken from different cameras, with 21 query topics.
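
The matching machinery is not specified in the abstract; here is one minimal sketch of scale- and rotation-tolerant instance matching, assuming OpenCV's ORB local features with a brute-force Hamming matcher, and using the match count as an illustrative per-frame score:

```python
# Sketch of instance matching between a query image and a video frame
# using ORB local features with OpenCV.
import cv2
import numpy as np

orb = cv2.ORB_create(nfeatures=500)
bf = cv2.BFMatcher(cv2.NORM_HAMMING, crossCheck=True)

def match_score(query_gray, frame_gray, max_dist=40):
    """Number of good ORB matches between query and frame."""
    _, q_desc = orb.detectAndCompute(query_gray, None)
    _, f_desc = orb.detectAndCompute(frame_gray, None)
    if q_desc is None or f_desc is None:
        return 0
    matches = bf.match(q_desc, f_desc)
    return sum(1 for m in matches if m.distance < max_dist)

# Stand-in grayscale images; real use would sample frames from each
# test clip and rank clips by their best frame score.
rng = np.random.default_rng(4)
query = rng.integers(0, 256, (120, 120), dtype=np.uint8)
frame = rng.integers(0, 256, (240, 320), dtype=np.uint8)
print("score:", match_score(query, frame))
```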