Quantifying Microbiome Dynamics
Understanding how bacteria in natural systems (such as lakes or the human gut) respond to their environment is a real challenge. Environments like these hold a huge diversity of species and an even greater diversity of DNA. Add a time component and this becomes a true big data problem. Answering questions such as "How do cyanobacteria in lakes respond to environmental fluctuations?" or "How does eating a Western diet for a year change your gut microbiome?" calls for a tool that can process massive quantities of DNA sequence data from a variety of sources reproducibly and efficiently.
Using a time series of microbial metagenomes (DNA sequence samples representing all bacterial species at a sample site), I developed a computational workflow to quantify changes in the gene content of microbial communities over time. All programs are available and ready to use after cloning a GitHub repository, and the scripts make it easy to edit specific variables (e.g., the preferred annotation scheme or sequence alignment parameters) to customize the workflow's output.
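To make "changes in gene content over time" concrete, here is a minimal Python sketch of one way such a change could be measured. It is not the workflow's actual code: the function names, the pseudocount, the sample labels, and the example counts are all assumptions made for illustration. It converts per-gene read counts at each time point to relative abundances, then reports the log2 fold change between consecutive time points.

```python
import math


def relative_abundance(counts):
    """Scale raw per-gene counts so they sum to 1 within one sample."""
    total = sum(counts.values())
    if total == 0:
        raise ValueError("sample has no counts")
    return {gene: n / total for gene, n in counts.items()}


def log2_fold_changes(series, pseudocount=1e-6):
    """Log2 fold change in relative abundance between consecutive samples.

    `series` is a list of (label, counts) pairs ordered by time.
    The pseudocount (an assumed value) avoids dividing by zero when a
    gene is absent from one of the two samples.
    """
    genes = sorted({gene for _, counts in series for gene in counts})
    abundances = [(label, relative_abundance(counts)) for label, counts in series]
    changes = {}
    for (label_a, a), (label_b, b) in zip(abundances, abundances[1:]):
        changes[(label_a, label_b)] = {
            gene: math.log2(
                (b.get(gene, 0.0) + pseudocount) / (a.get(gene, 0.0) + pseudocount)
            )
            for gene in genes
        }
    return changes


# Hypothetical counts for two genes at three time points (illustrative only).
example_series = [
    ("t0", {"nifH": 10, "psbA": 90}),
    ("t1", {"nifH": 20, "psbA": 80}),
    ("t2", {"nifH": 5, "psbA": 95}),
]

for interval, per_gene in log2_fold_changes(example_series).items():
    print(interval, per_gene)
```

A positive value means the gene makes up a larger share of the community's reads at the later time point, and a negative value means a smaller share. Real metagenomic comparisons would typically also normalize for gene length and sequencing depth, which this sketch leaves out.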
View my M.S. defense presentation, my thesis, or my GitHub repo for the project.
Here is a poster I presented at the Department of Energy Joint Genome Institute's Genomics of Energy and the Environment conference. It describes the background that motivated this research.