Matthew Theisen

Data science and beyond

I'm Matt Theisen, a data scientist in the Greater Los Angeles area. I'm interested in data science, conceptual modeling, and analytics. Here are some of my projects.

Featured Projects

Using NLP to Find The Magic Words for Resumes: Machine learning applied to resume text.

Los Angeles Neighborhood Ranker: (Beta) An interactive web app that ranks neighborhoods in Los Angeles by user-selected features.

Data science/statistics

Identifying Customers Using Logistic Regression With glmnet in R: Using logistic regression to select which customers to market to.

Recursive Clustering Algorithm for Word Cloud Quiz: Implementation of a recursive/hierarchical clustering algorithm that enforces roughly equal cluster sizes.

Traversing The Cancer Genome Atlas: Ongoing project in which I use Python to parse, collect, and analyze data from The Cancer Genome Atlas.


The limits of data: An article about how data can and cannot be used.

How I Got Into Data Science: The story of how I went from computational biology researcher to data scientist.

Video production/visual storytelling

Video production: Contest-winning videos I produced or co-produced for video contests at UCLA.