We’ve long known that the core numerical algorithm in dadi is in principle amenable to acceleration through GPU computing. We have now implemented GPU computing in dadi, and the results are spectacular. We see speedups of over 100x when comparing GPUs and CPUs on the same systems, and enabling GPU computing requires only a single user command. The new feature is described in our bioRxiv preprint, and it is available in dadi 2.1.0.
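For readers who want to try it, here is a minimal sketch of what that single command looks like in a script. It assumes the toggle is `dadi.cuda_enabled(True)`, which is our reading of the dadi interface, so check the manual for your installed version. The helper name `enable_dadi_gpu` and the fallback logic are purely illustrative, and the call can only succeed on a machine with a CUDA-capable GPU and dadi's optional GPU dependencies installed.

```python
def enable_dadi_gpu():
    """Try to switch dadi to GPU execution; return False if that is not possible."""
    try:
        import dadi  # assumes dadi >= 2.1.0 is installed
    except ImportError:
        return False
    try:
        # The single user-facing command; everything else here is illustrative.
        result = dadi.cuda_enabled(True)
    except Exception:
        # No usable GPU or missing GPU libraries: stay on the CPU.
        return False
    # Assumption: a False return value signals that GPU support failed to load.
    return result is not False


if __name__ == "__main__":
    if enable_dadi_gpu():
        print("dadi will run on the GPU")
    else:
        print("GPU support unavailable; dadi will run on the CPU")
```

The rest of an analysis script should not need to change, since the toggle only affects where the numerical work runs.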
Postdoc Paul Blischak just submitted our latest paper with Mike Barker. Paul developed an approach based on convolutional neural networks to infer models of hybrid speciation from phylogenomic data. His approach is among the first to use linkage information along the genome for such inference, and simulations show that it is powerful. Nicely, the CNN is simple enough to train on a laptop, so we hope Paul’s approach will be broadly useful.
We welcome a new PhD student to the group. Linh Tran has a B.A. in Chemistry, but she’s now interested in biology and data science. In the Gutenkunst group, her PhD research will likely focus on applying machine learning approaches to genomic data to infer demographic history and natural selection.
Today was spectacular for past and present members of the group!
First, former PhD student Aaron Ragsdale accepted a position as Assistant Professor in Integrative Biology at the University of Wisconsin – Madison. He’ll be starting in Fall 2021, after spending time at UGA-Langebio.
Lastly, current undergraduate and new alumna Megan Irby submitted her Honors Thesis, entitled “The Joint Distribution of Fitness Effects of Wild Tomatoes and a Brief Introduction to Linkage in DFE Inference”. After graduation, Megan will be joining the Peace Corps and then plans to attend medical school.
Cells within a tumor are genetically diverse, and low-frequency mutations can have important implications for understanding treatment resistance and tumor evolution. But detecting such mutations is difficult. We developed BATCAVE as an improved algorithm for detecting mutations within tumors by modeling a key aspect of tumor biology. Namely, each individual tumor has its own profile of mutation types that it tends to generate.
In our paper just published in NAR Genomics and Bioinformatics, we show that our algorithm improves mutation identification and calibration, in both real and simulated data. Moreover, the algorithm is general: it can be added to other variant callers and extended to incorporate further genomic context.
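To give a feel for the idea of a tumor-specific mutation profile, here is a toy sketch. It is not the BATCAVE implementation. It assumes a simple six-type classification of substitutions, a pseudocount-smoothed profile estimated from confident calls, and a genome-wide prior that is rescaled by that profile. All names and default values (`estimate_profile`, `posterior_mutation_prob`, `base_prior`, `pseudocount`) are hypothetical.

```python
from collections import Counter

# Illustrative classification; a real profile could use richer genomic context.
MUTATION_TYPES = ["C>A", "C>G", "C>T", "T>A", "T>C", "T>G"]


def estimate_profile(high_conf_types, pseudocount=1.0):
    """Smoothed fraction of each mutation type among confident calls in one tumor."""
    counts = Counter(high_conf_types)
    total = len(high_conf_types) + pseudocount * len(MUTATION_TYPES)
    return {t: (counts[t] + pseudocount) / total for t in MUTATION_TYPES}


def posterior_mutation_prob(likelihood_ratio, mut_type, profile, base_prior=1e-6):
    """Posterior probability that a candidate site is a real mutation.

    likelihood_ratio is P(reads | mutation) / P(reads | no mutation),
    as might be supplied by an existing variant caller.
    """
    # Rescale the genome-wide prior by how enriched this mutation type is
    # in this tumor relative to a flat profile.
    enrichment = profile[mut_type] * len(MUTATION_TYPES)
    prior = min(base_prior * enrichment, 0.5)
    prior_odds = prior / (1.0 - prior)
    post_odds = prior_odds * likelihood_ratio
    return post_odds / (1.0 + post_odds)


if __name__ == "__main__":
    # A made-up tumor dominated by C>T mutations.
    confident = ["C>T"] * 80 + ["C>A"] * 15 + ["T>C"] * 5
    profile = estimate_profile(confident)
    # Same read evidence, different mutation types.
    print(posterior_mutation_prob(1e6, "C>T", profile))
    print(posterior_mutation_prob(1e6, "T>G", profile))
```

With identical read evidence, the candidate whose type is common in this tumor gets the higher posterior, which is the intuition behind using each tumor's own profile.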
This project was driven entirely by former PhD student Brian Mannakee. He developed an interest in cancer bioinformatics, trained himself in the tools, and conceived the algorithm. It was my privilege to help guide him through the project!