Research
Current Projects
Modeling differential abundance in the presence of unknown detection effects
Much of my current research concerns differential abundance analysis, a common type of analysis in microbiome science that studies how the abundances of biological categories (for example, taxa or genes) vary across covariate levels. This is challenging because the technology used to gather abundance data from microbial samples introduces unknown detection effects. In Clausen et al., we discuss the types of differential abundance estimands that can reasonably be studied from the high-throughput sequencing data common in microbiome applications, and introduce a method to estimate and test one such estimand. I have developed a fast approximation to the testing procedure from this method, for settings that require more computationally scalable inference. I have also developed a model for gene abundances observed from high-throughput sequencing data and shown the difficulties of directly applying methods developed for taxon abundances to gene abundance data.
Past Projects
Visualization of phylogenetic trees
Microbial evolution is often studied by performing analyses at the level of the microbial genome. However, different genes in a single genome can be subject to different evolutionary pressures, which can result in distinct gene-level evolutionary histories. We address the challenge of studying a set of gene-level histories with an interactive visualization method for comparing a set of phylogenetic trees. We use a local linear approximation of phylogenetic tree space to visualize estimated gene-level phylogenies as points in a low-dimensional Euclidean space. This can be useful for identifying genes that have evolved differently from the other genes in a genome and for comparing summary genome-level trees estimated with different gene sets.
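As a rough illustration of what it means to place trees as points in a Euclidean space, the sketch below uses a deliberately simpler stand-in technique: each tree is encoded as a vector of branch lengths indexed by its splits, and the vectors are projected with PCA. It is not the local linear approximation of tree space described above. The function name `embed_trees`, the dictionary encoding of trees, and the toy gene trees are all hypothetical and chosen only for illustration.

```python
import numpy as np


def embed_trees(trees, n_components=2):
    """Project trees to low-dimensional points via PCA on split branch lengths.

    `trees` is a list of dicts, each mapping a split (a frozenset of the taxon
    names on one side of an internal edge) to that edge's branch length.
    This encoding is an assumption made for the sketch.
    """
    # Use the union of all splits as coordinates; an absent split has length 0.
    splits = sorted({s for t in trees for s in t}, key=lambda s: sorted(s))
    X = np.array([[t.get(s, 0.0) for s in splits] for t in trees])
    # Center the branch-length vectors, then keep the leading principal
    # components obtained from the singular value decomposition.
    Xc = X - X.mean(axis=0)
    U, S, _ = np.linalg.svd(Xc, full_matrices=False)
    return U[:, :n_components] * S[:n_components]


# Toy gene trees on taxa A-D (hypothetical data, one internal edge each).
ab = frozenset({"A", "B"})
ac = frozenset({"A", "C"})
gene_trees = [
    {ab: 1.0},
    {ab: 1.1},
    {ab: 0.9},
    {ac: 1.0},  # a gene whose topology disagrees with the others
]
coords = embed_trees(gene_trees)
```

In this toy input, the three trees that share a split land close together and the discordant tree lands far from them, which is the kind of pattern such a plot is meant to surface.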
Statistical modeling of wage heterogeneity
I have worked with Tyler McCormick to develop a Bayesian hierarchical model of wage heterogeneity. This project uses longitudinal survey data to study variation in wages in low-resource settings.