The School of Informatics, Computing, and Engineering Center for Bioinformatics Research Talk
Speaker: Brad Solomon, The Kingsford Group, Carnegie Mellon University
Where: Walnut Room, Indiana Memorial Union
When: Thursday, April 26, 2018, 3:00 pm
Topic: Computational Methods for Database Sequence Search
Abstract: This past decade has seen near-exponential growth in the amount of publicly available RNA short-read sequencing data. In aggregate, this collection could be a great resource for understanding genetic variation and gene function. However, the vast majority of the data is stored as raw, unassembled reads, which require significant computational work to assemble or analyze. In this talk I will demonstrate the need for methods that can search raw reads for sequences of interest. I will also present two methods that solve this sequence search problem using novel data structures and approximate search strategies. I use these methods to index the entirety of The Cancer Genome Atlas (TCGA) and present preliminary results suggesting that this index can identify both known and novel relationships between key transcripts and biological phenotypes.
Biography: Brad Solomon is a Ph.D. candidate in the Computational Biology Department at Carnegie Mellon University, working with Carl Kingsford. His research is aimed at making large collections of biological data more accessible and useful. In particular, he is interested in the development of novel algorithms and data structures for the efficient storage, search, and analysis of raw sequencing reads. His past and current research has been supported by the US National Institutes of Health, and he is the recipient of a Richard King Mellon Foundation Presidential Fellowship in the Life Sciences. He received his B.A. in Integrated Sciences, Biology, and Computer Science from Northwestern University.