My research interests are broadly in algorithm design and analysis, and I take inspiration from biological problems. Many times this leads not only to an interesting computer science result, but also to a useful biological tool (see Software).
I am currently a Lane Fellow in the Computational Biology Department at Carnegie Mellon University working with Carl Kingsford.
I was previously a PhD student in the Computer Science Department at the University of Arizona working with John Kececioglu, and before that a student in the CS Department at the University of Central Florida working with Shaojie Zhang.
In the past my work has focused mainly on multiple sequence alignment problems. Most recently I worked on improving the accuracy of protein multiple sequence alignments. Multiple sequence alignment is a fundamental step in bioinformatics, but the problem is NP-complete. Because of the importance of the result and the complexity of the problem, many algorithms exist to find high-quality alignments in practice. Each of these algorithms has a large number of tunable parameters that can greatly affect the quality of the computed alignment. Most users rely on the default parameter choices, which produce the best alignments on average but poor alignments for some inputs. We developed a process called parameter advising, which selects a parameter choice that produces a high-quality alignment for the input. To accomplish this, candidate alignments are produced using each of the parameter choices in an advising set; the accuracy of each candidate alignment is then estimated using an advising estimator; and the candidate alignment with the highest estimated accuracy is selected for the user. To estimate alignment accuracy we developed Facet (Feature-based accuracy estimator), which is a linear combination of efficiently computable feature functions. We have found that learning an optimal advisor (selecting both the estimator coefficients and the set of parameter choices) is NP-complete. We expanded this result to show that finding the estimator coefficients or the advising set independently is also NP-complete. In practice, we have methods to find close-to-optimal advisors. We are working on ways to improve the accuracy of these parameter advisors.
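The advising loop described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the Facet implementation: the aligner (`toy_aligner`), the two feature functions, the coefficients, and the parameter name `side` are all hypothetical stand-ins chosen so the example runs on its own.

```python
from typing import Callable, Dict, List, Sequence, Tuple

Alignment = List[str]                      # rows of equal length, '-' marks a gap
Feature = Callable[[Alignment], float]     # maps an alignment to a value in [0, 1]
Params = Dict[str, str]


def estimate_accuracy(alignment: Alignment,
                      features: Sequence[Feature],
                      coefficients: Sequence[float]) -> float:
    """Linear combination of feature values (the general form of the estimator)."""
    return sum(c * f(alignment) for c, f in zip(coefficients, features))


def advise(sequences: Sequence[str],
           aligner: Callable[[Sequence[str], Params], Alignment],
           advising_set: Sequence[Params],
           features: Sequence[Feature],
           coefficients: Sequence[float]) -> Tuple[float, Params, Alignment]:
    """Align once per parameter choice and keep the candidate with the highest estimate."""
    best = None
    for params in advising_set:
        candidate = aligner(sequences, params)
        score = estimate_accuracy(candidate, features, coefficients)
        if best is None or score > best[0]:
            best = (score, params, candidate)
    if best is None:
        raise ValueError("advising set must not be empty")
    return best


# --- Hypothetical stand-ins, for illustration only ---

def toy_aligner(sequences: Sequence[str], params: Params) -> Alignment:
    """Pads shorter sequences with gaps on the side named by params['side']."""
    width = max(len(s) for s in sequences)
    if params["side"] == "left":
        return [s.rjust(width, "-") for s in sequences]
    return [s.ljust(width, "-") for s in sequences]


def identical_column_fraction(alignment: Alignment) -> float:
    """Fraction of columns that are gap-free and contain a single residue."""
    columns = list(zip(*alignment))
    same = sum(1 for col in columns if "-" not in col and len(set(col)) == 1)
    return same / len(columns)


def gap_free_column_fraction(alignment: Alignment) -> float:
    """Fraction of columns that contain no gap character."""
    columns = list(zip(*alignment))
    return sum(1 for col in columns if "-" not in col) / len(columns)


if __name__ == "__main__":
    score, params, alignment = advise(
        ["ACGT", "CGT"],
        toy_aligner,
        [{"side": "left"}, {"side": "right"}],
        [identical_column_fraction, gap_free_column_fraction],
        [0.5, 0.5],
    )
    print(params, alignment, score)
```

The sketch takes the coefficients and the advising set as given. Choosing them well is the learning problem described above, and it is not attempted here.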
I have also worked on reducing the memory consumption of secondary-structure-conscious RNA multiple sequence alignment (see PMFastR) and on high-throughput phylogeny filtering (see SiClE).
Eleven years ago this summer I participated in a Research Experience for Undergraduates (REU) program in Computer Vision at UCF. This summer marks the 30th time they have conducted the program (securing 30 years of funding from the NSF is an impressive feat). To celebrate, they are having a one-day event at UCF on 18 July. The REU in 2006 was my first experience with research, and while my interests didn’t lie in computer vision, it was my first step on my current career path. I am excited to go back to UCF to participate in this celebration.
Event page: here
The journal version of my WABI paper has been accepted for publication in the journal Algorithms for Molecular Biology and will appear there soon. This extended version includes many of the algorithmic details that were left out of the original WABI publication, as well as an expanded results section. A pre-publication version of the submission is available on my Publications page.
My paper titled “Boosting alignment accuracy by adaptive local realignment” (with John Kececioglu) was accepted to the 21st International Conference on Research in Computational Molecular Biology (RECOMB) 2017 in Hong Kong in May. The paper is available here.
The votes are in, and I have been elected the ISCB Student Council Representative to the ISCB Board of Directors. This role is the liaison between the executive team of the Student Council and the board of directors of the ISCB (our parent organization). I will officially take the post in January, but I am already looking forward to further strengthening the cooperation between the ISCB and the Student Council. Thank you to all who voted!
The paper describing a software package I created with Jennifer H. Wisecaver (now at Vanderbilt University) while we were both PhD students at the University of Arizona has been accepted for publication in the journal PeerJ. Even as a preprint on arXiv it was cited multiple times (including by papers in Nature Communications and PNAS).
The application SiClE (for Sister Clade Extractor) was created to perform high-throughput phylogenetic analysis. Given a tree and a search term, it first determines whether the search term is monophyletic in the tree, then identifies the two sister clades. It has been used successfully as an initial filtering step to investigate horizontal gene transfer at the high-throughput scale. The program is open source and freely available under a Creative Commons License at http://eebweb.arizona.edu/sicle/.
Read the paper at https://peerj.com/articles/2359/.
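To make the monophyly test mentioned above concrete, here is a minimal Python sketch. It is not SiClE's code and makes several simplifying assumptions of its own: the tree is rooted and written as nested tuples, leaves are plain strings, and a leaf "matches" the search term when the term is a substring of its label. It checks monophyly only and does not attempt the sister-clade step.

```python
from typing import List, Set, Tuple, Union

# Assumed representation: a leaf is a label string, an internal node is a tuple of children.
Tree = Union[str, Tuple["Tree", ...]]


def leaves(tree: Tree) -> List[str]:
    """Returns the leaf labels under a node, left to right."""
    if isinstance(tree, str):
        return [tree]
    labels: List[str] = []
    for child in tree:
        labels.extend(leaves(child))
    return labels


def is_monophyletic(tree: Tree, term: str) -> bool:
    """True if the leaves matching `term` are exactly the leaf set of some subtree."""
    target: Set[str] = {label for label in leaves(tree) if term in label}
    if not target:
        return False  # nothing matched the search term

    def search(node: Tree) -> bool:
        if set(leaves(node)) == target:
            return True
        if isinstance(node, str):
            return False
        # Only descend into children that still contain every matching leaf.
        return any(search(child) for child in node
                   if target <= set(leaves(child)))

    return search(tree)


if __name__ == "__main__":
    example = ((("plant_A", "plant_B"), "fungus_A"), ("fungus_B", "animal_A"))
    print(is_monophyletic(example, "plant"))   # the two plant leaves form a clade
    print(is_monophyletic(example, "fungus"))  # the fungus leaves are split across the tree
```

Recomputing leaf sets at every node keeps the sketch short, at the cost of efficiency on large trees. A real tool would also have to deal with unrooted trees and tree file formats, which are outside the scope of this example.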