2013 Internship Projects

In the summer of 2013, CBI hosted interns from:

Dos Pueblos High School in Santa Barbara
California State University, San Bernardino
Brown University
University of California, Santa Barbara

Projects

Confocal Image Analysis

Mentor

Renuka Shenoy

Student Interns

Brandon Ringlstetter
California State University, San Bernardino

Abstract

The purpose of this project is to analyze the spatial statistics of cells in confocal images of rabbit retina. These images allow examination of different cell types at various optical slices in a section of rabbit retina. We seek insight into the presence and location of these cell types in the ganglion cell layer and the inner nuclear layer of the retina, using images of sections stained with various macromolecule and micromolecule markers. First, we segment the images using a combination of the mean-shift algorithm and morphological processing. The segmented cells are then classified based on cell signatures from the markers. Depending on the markers used, classification is done either by examining the intensity of specific markers in the segmented cell or by clustering. Statistics are collected from all the classified cells, and this information makes it possible to analyze spatial patterns that occur among the cells in the images. We approximate our segmented, classified data as a marked point process. We use Ripley's K function to examine the magnitude of clustering at various separations, both between cells of a given type and between cells of different types. Further, we can inspect K function curves to determine which combination of markers is optimal for subsequent analysis.
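To illustrate the K function step, the following minimal Python sketch estimates Ripley's K for a marked point pattern, either within one cell type or between two types. It is not the project's implementation: it omits edge correction, and the function name `ripley_k`, its parameters, and the toy coordinates are assumptions made for this example. For a completely random (Poisson) pattern, K(r) is approximately πr², so larger values suggest clustering at separation r.

```python
import math


def ripley_k(points_a, points_b, radius, area, same_type=False):
    """Naive Ripley's K estimate without edge correction.

    points_a, points_b: lists of (x, y) cell centroids (hypothetical inputs).
    radius: separation r at which K is evaluated.
    area: area of the observation window.
    same_type: set True when points_a and points_b are the same list,
               so that a cell is not counted as its own neighbor.
    """
    n_a, n_b = len(points_a), len(points_b)
    # Number of ordered pairs that could possibly be counted.
    pairs = n_a * (n_b - 1) if same_type else n_a * n_b
    if pairs <= 0:
        raise ValueError("not enough points to form a pair")

    count = 0
    for i, (xa, ya) in enumerate(points_a):
        for j, (xb, yb) in enumerate(points_b):
            if same_type and i == j:
                continue  # skip the point itself
            if math.hypot(xa - xb, ya - yb) <= radius:
                count += 1
    return area * count / pairs


# Toy data: four cells of one type at the corners of a unit square,
# and one cell of a second type at the center.
type_1 = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0), (1.0, 1.0)]
type_2 = [(0.5, 0.5)]

k_within = ripley_k(type_1, type_1, radius=1.0, area=1.0, same_type=True)
k_between = ripley_k(type_1, type_2, radius=1.0, area=1.0)
```

Evaluating the function over a range of radii gives the K curve for one marker combination; curves for different combinations can then be compared.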

TAQOS: A Tweet Analysis Query Ontology System for Topic-Specific Social Media Investigation

Mentor

Petko Bogdanov

Student Interns

Daniel Richman
Dos Pueblos High School

Abstract

The spread of information on social media reflects large-scale trends, such as influenza pandemics and the Arab Spring. Prior research at UCSB has shown that topic-specific networks on Twitter (subnetworks involved in categories like business or sports) exhibit distinct behaviors. Analyzing the characteristics of a given topic-specific network can, for instance, yield more accurate predictions of how a new piece of information will spread. For each Twitter user, we can compute a genotype, a numeric representation of interest in each of several topics. To aid in genotyping users, we develop a system to automatically categorize each of their tweets into one of five topics: Arts & Culture, Business, Politics, Science & Technology, or Sports. Using the tweet text, we query a database of Wikipedia articles. Based on the results, our system scores the tweet’s connection with each topic. We conducted a large-scale online survey to obtain ground truth information for 6,000 tweets. We then evaluated our system by comparing its results with this ground truth [insert numbers here]. The system can next be applied to deeper analysis of topic-specific networks. As one possible future direction, the classifier’s accuracy can be improved by incorporating additional data, such as text from articles linked from the tweets.
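The scoring step can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the TAQOS implementation: the Wikipedia query is replaced by a hand-written list of hypothetical search hits, each already labelled with a topic and a relevance value, and the function names `score_topics` and `best_topic` are made up for this example.

```python
TOPICS = (
    "Arts & Culture",
    "Business",
    "Politics",
    "Science & Technology",
    "Sports",
)


def score_topics(search_hits):
    """Sum the relevance of hits per topic and normalize to fractions.

    search_hits: list of (topic, relevance) pairs, standing in for the
    results of querying an article database with the tweet text.
    """
    totals = {topic: 0.0 for topic in TOPICS}
    for topic, relevance in search_hits:
        if topic in totals:
            totals[topic] += relevance
    norm = sum(totals.values())
    if norm == 0:
        return totals  # no evidence for any topic
    return {topic: value / norm for topic, value in totals.items()}


def best_topic(search_hits):
    """Return the highest-scoring topic, or None if nothing matched."""
    scores = score_topics(search_hits)
    if all(value == 0 for value in scores.values()):
        return None
    return max(scores, key=scores.get)


# Hypothetical hits for a tweet about a playoff game and ticket prices.
hits = [("Sports", 3.0), ("Business", 1.0)]
scores = score_topics(hits)
label = best_topic(hits)
```

Averaging such per-tweet scores over a user's tweets is one way a per-user genotype could be formed from the classifier output.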

Embedded 3D Systems For Human Action Recognition

Mentor

Carlos Torres

Student Interns

Isaac Flores
University of California, Santa Barbara

Abstract

Advances in embedded technology have made sensor networks easy to deploy, improve, and use across many applications. In particular, cameras can be used intelligently and efficiently by performing application-specific computations on each image and then sending only the interpreted information. The focus of this project is to perform human action recognition within such a sensor network using OMAP/ARM technology and existing frameworks. Kinect cameras will be used along with a BeagleBoard xM microcomputer to track human joints, with the goal of creating a model of how a particular human action can be represented and recognized in real time. Four basic human actions (standing, bending over, walking, and sitting) will be used initially to create such a model. The human joints will be tracked using PrimeSense NiTE algorithms and the OpenNI libraries, which retrieve data from the Kinect. An embedded network provides a low-cost, non-intrusive method for action recognition, which is especially applicable to an ICU, where patient monitoring and real-time feedback are very important.

Accurate GPS Image Location Based On Natural Features

Mentor

Dmitry Fedorov

Student Interns

Ryan Kashi
University of California, Santa Barbara

Abstract

By recognizing particular parts of an image, one can often determine the exact location where a photo was taken. This can be used to geotag images that carry no GPS information, and to get directions based on an image rather than an address. The user provides an image as input, and the feature service of UCSB Bisque (an environment for handling and analysing images) is then used to find the scale-invariant feature transform (SIFT) features of the image. Using a nearest neighbor search built on an algorithm designed to search n-dimensional spaces quickly, a number of these features are matched to a database of features collected from 3,016 public geotagged images taken in a 22 km by 24 km region around Santa Barbara. The GPS coordinates with the highest number of matches are then weighted, and a location for the photo is produced along with a certainty measure.
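The matching and voting steps can be sketched as follows. This is a simplified illustration, not the project's code: a brute-force search stands in for the fast n-dimensional nearest neighbor algorithm, the descriptors are 2-D toy vectors rather than 128-dimensional SIFT descriptors, and the weighting and certainty formulas (a vote-weighted average of the top locations, and the winning location's share of all votes) are assumptions chosen for the example.

```python
import math
from collections import Counter


def nearest(descriptor, database):
    """Index of the database entry closest to descriptor (brute force)."""
    return min(
        range(len(database)),
        key=lambda i: math.dist(descriptor, database[i][0]),
    )


def estimate_location(query_descriptors, database, top_k=1):
    """Vote for geotags via nearest-neighbor feature matches.

    database: list of (descriptor, (lat, lon)) pairs taken from
    geotagged images (hypothetical layout).
    Returns ((lat, lon), certainty).
    """
    if not query_descriptors:
        raise ValueError("query image has no features")

    votes = Counter()
    for descriptor in query_descriptors:
        _, coords = database[nearest(descriptor, database)]
        votes[coords] += 1

    # Weight the top_k most-matched coordinates by their vote counts.
    top = votes.most_common(top_k)
    weight = sum(n for _, n in top)
    lat = sum(coords[0] * n for coords, n in top) / weight
    lon = sum(coords[1] * n for coords, n in top) / weight

    # Certainty: share of query features that matched the top location.
    certainty = top[0][1] / len(query_descriptors)
    return (lat, lon), certainty


# Toy database with two geotagged features and a three-feature query.
database = [
    ((0.0, 0.0), (34.0, -119.0)),
    ((10.0, 10.0), (35.0, -120.0)),
]
query = [(0.0, 1.0), (1.0, 0.0), (9.0, 9.0)]
location, certainty = estimate_location(query, database)
```

With a database of real descriptors, the brute-force `nearest` function would be the first thing to replace, since its cost grows linearly with the number of stored features for every query feature.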

Modeling Epistatic Interactions Using Machine Learning Techniques

Mentor

Petko Bogdanov, Nick Beck

Student Interns

Sky Adams
Brown University
Regie Felix
California State University, San Bernardino

Abstract

Epistasis involves interactions within the genome that contribute to the phenotype of an organism. The main goal is to accurately identify regions of the genome that cause genetic disorders, such as Alzheimer’s disease and autism. Two factors in particular are involved in these interactions: single nucleotide polymorphism (SNP) genotypes and gene expression levels. We compared the ability of five machine learning methods to find the subset of SNPs that significantly correlates with the phenotype. Our results demonstrate that when using small synthetic data sets, four out of the five methods found the causal SNPs. However, when using more realistic data sets, the methods either became infeasible due to long computation time or yielded inaccurate results due to the large number of uncorrelated SNPs in the data. We are currently developing more refined methods that can find a model in which a set of SNPs and gene expression levels in real data sets correlate with the phenotype. This is a complex problem because there is an extremely large number of SNPs and genes that potentially affect diseases. By creating a more efficient and accurate method, we hope to better predict the genetic causes of various diseases.
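A small synthetic example shows why interacting SNPs are hard to find one at a time. The sketch below is not one of the five methods compared in the project; it is an exhaustive subset search over toy binary genotypes, with hypothetical function names, in which the phenotype depends only on the combination of the first two SNPs. Each of those SNPs alone predicts the phenotype no better than chance, while the pair predicts it perfectly. The number of candidate pairs grows quadratically with the number of SNPs, which hints at why exhaustive search becomes infeasible on realistic data.

```python
from collections import Counter, defaultdict
from itertools import combinations, product


def subset_accuracy(samples, phenotypes, snp_indices):
    """Accuracy of a majority-vote lookup table on the chosen SNPs."""
    table = defaultdict(Counter)
    for genotype, label in zip(samples, phenotypes):
        key = tuple(genotype[i] for i in snp_indices)
        table[key][label] += 1
    # For each genotype combination, predict its most common phenotype.
    correct = sum(c.most_common(1)[0][1] for c in table.values())
    return correct / len(samples)


def best_subset(samples, phenotypes, size):
    """Exhaustively search all SNP subsets of the given size."""
    n_snps = len(samples[0])
    return max(
        combinations(range(n_snps), size),
        key=lambda idx: subset_accuracy(samples, phenotypes, idx),
    )


# Synthetic data: every combination of 4 binary SNPs, with a phenotype
# that is the XOR of SNP 0 and SNP 1 (SNPs 2 and 3 are irrelevant).
samples = list(product((0, 1), repeat=4))
phenotypes = [s[0] ^ s[1] for s in samples]

single_scores = [subset_accuracy(samples, phenotypes, (i,)) for i in range(4)]
causal_pair = best_subset(samples, phenotypes, size=2)
```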

Interactive Cell Segmentation Tool

Mentor

Dmitry Fedorov, Diana Delibaltov

Student Interns

Ryan Williams
California State University, San Bernardino
Bryan Johnson
California State University, San Bernardino

Abstract

The purpose of this project is to produce “...a tool for cell analysis in 3-D confocal microscopy membrane volumes.” The tool uses the seeded watershed technique to perform the segmentation and predicts uncertain areas so that the user can more easily identify areas that need manual correction. The volumes are of ascidian embryos, chosen for their close relation to human embryos and their simple layout. We are extending this tool to work in a web-based format accessible through the BISQUE system. The BISQUE system has many existing analysis modules that will make the transition to a web-based tool smooth and standardized. The tool's capabilities will include adding seeds to both existing and new labels (a label is a single segmented nucleus), merging labels, and saving data in a lineage for later comparison across multiple 3-D volumes. This project uses both the ExtJS and EaselJS libraries to support these capabilities and improve the performance of the JavaScript in a browser. The tool is important for manual correction of the watershed output to provide better nuclei detection and segmentation.