2nd Greater Chicago Area System Research Workshop
RNA interference (RNAi) has potential therapeutic use against HIV-1 by targeting highly functional mRNA sequences that contribute to the virulence of the virus. Empirical work has shown that, within cell lines, all of the HIV-1 genes are affected by RNAi-induced gene silencing. While promising, this treatment requires RNAi sequences to be highly specific. HIV, however, mutates rapidly, leading to the evolution of viral escape mutants. In fact, such strains are under strong selection to include mutations within the targeted region, evading the RNAi therapy and thus increasing the virus's fitness in the host. Taking a phylogenetic approach, we have examined more than 4,000 HIV-1 strains obtained from NCBI's database for each of the HIV genes, identifying conserved regions at each hypothetical and operational taxonomic unit within the tree. By integrating the wealth of information available from each genome's record, we are able to observe how conserved regions vary with respect to their geographic distribution throughout the world. This was made possible by a new software tool, designed so that similar analyses can be conducted for any species or gene of interest, not just HIV-1. In addition to the phylogenetic signal that we can recognize from the HIV-1 genomes examined, we can also identify how selection varies across the genome. Taking this evolutionary approach, we have detected regions ideal for targeting by RNAi treatment.
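As a rough illustration of what "identifying conserved regions" can mean for one set of aligned sequences (for example, the strains grouped under a single node of the tree), the following Python sketch reports runs of alignment columns in which every sequence carries the same nucleotide. It is not the poster's implementation: the function name `conserved_regions`, the `min_len` parameter, and its default of 19 are assumptions chosen only for illustration, and the abstract does not state how strictly conservation was defined.

```python
from typing import List, Tuple


def conserved_regions(aligned: List[str], min_len: int = 19) -> List[Tuple[int, int]]:
    """Return half-open (start, end) column ranges where all sequences agree.

    `aligned` holds sequences of equal length taken from a multiple alignment.
    `min_len` is a hypothetical threshold; 19 is only an illustrative default.
    Columns containing a gap character ("-") are never counted as conserved.
    """
    if not aligned:
        return []
    width = len(aligned[0])
    if any(len(seq) != width for seq in aligned):
        raise ValueError("sequences must be aligned to the same length")

    regions: List[Tuple[int, int]] = []
    start = None  # first column of the conserved run currently being extended
    for col in range(width):
        column = {seq[col] for seq in aligned}
        same = len(column) == 1 and "-" not in column
        if same and start is None:
            start = col
        elif not same and start is not None:
            if col - start >= min_len:
                regions.append((start, col))
            start = None
    # Close a run that extends to the last column of the alignment.
    if start is not None and width - start >= min_len:
        regions.append((start, width))
    return regions
```

Calling `conserved_regions(["ACGTACGT", "ACGTTCGT", "ACGTACGA"], min_len=3)` returns `[(0, 4)]`, because only the first four columns are identical across all three toy sequences. Requiring perfect identity is the strictest possible criterion; a real analysis over thousands of strains would more plausibly tolerate a small fraction of mismatches per column.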
The software system mentioned above provides access to the National Center for Biotechnology Information's (NCBI) GenBank in multiple ways: it converts GenBank data to the FASTA format for analysis using desktop tools, and it exposes the data in the form of a RESTful web service. We have implemented this system using a polyglot approach involving multiple languages (Python and Scala), libraries (Flask and BioJavaX), and persistence mechanisms (text files and MongoDB NoSQL databases).
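The GenBank-to-FASTA conversion step can be sketched in a few lines of standard-library Python. This is a minimal illustration under stated assumptions, not the system described above: the helper names `parse_genbank` and `to_fasta` are hypothetical, the parser reads only the `DEFINITION`, `ACCESSION`, and `ORIGIN` sections of a flat-file record, and it ignores features and all other annotations. An established parser such as Biopython's `Bio.SeqIO` would be a more robust choice in practice.

```python
import re
from typing import Iterable, Iterator, Tuple

Record = Tuple[str, str, str]  # (accession, definition, sequence)


def parse_genbank(text: str) -> Iterator[Record]:
    """Yield (accession, definition, sequence) for each GenBank flat-file record."""
    # Records are terminated by a line that contains only "//".
    for raw in re.split(r"^//\s*$", text, flags=re.MULTILINE):
        header, sep, origin = raw.partition("\nORIGIN")
        if not sep:
            continue  # no sequence section, e.g. trailing whitespace
        accession = ""
        definition_parts = []
        in_definition = False
        for line in header.splitlines():
            if line.startswith("ACCESSION"):
                fields = line.split()
                accession = fields[1] if len(fields) > 1 else ""
                in_definition = False
            elif line.startswith("DEFINITION"):
                definition_parts.append(line[len("DEFINITION"):].strip())
                in_definition = True
            elif in_definition and line.startswith(" "):
                # Indented continuation line of a multi-line DEFINITION.
                definition_parts.append(line.strip())
            else:
                in_definition = False
        # Skip the rest of the ORIGIN line, then keep letters only
        # (this drops the position numbers and the spacing between blocks).
        body = origin.partition("\n")[2]
        sequence = re.sub(r"[^A-Za-z]", "", body).upper()
        yield accession, " ".join(definition_parts), sequence


def to_fasta(records: Iterable[Record], width: int = 60) -> str:
    """Format records as FASTA text, wrapping sequence lines at `width` characters."""
    lines = []
    for accession, definition, sequence in records:
        lines.append(f">{accession} {definition}".rstrip())
        lines.extend(sequence[i:i + width] for i in range(0, len(sequence), width))
    return "".join(line + "\n" for line in lines)
```

A typical use would be `to_fasta(parse_genbank(open(path).read()))`, where `path` points to a GenBank flat file downloaded from NCBI. Keeping parsing and formatting as separate functions means the same parsed records could also be serialized for a web service or stored in a database rather than written out as FASTA.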
S. Reisman, C. Putonti, G. K. Thiruvathukal, and K. Läufer. A Polyglot Approach to Bioinformatics Data Integration: Phylogenetic Analysis of HIV-1: Research Poster. 2nd Greater Chicago Area System Research Workshop (GCASR), May 3, 2013, Evanston, IL, USA.
Creative Commons License
This work is licensed under a Creative Commons Attribution-Noncommercial-No Derivative Works 3.0 License.
© 2008 IEEE. Personal use of this material is permitted. Permission from IEEE must be obtained for all other uses, including reprinting/republishing this material for advertising or promotional purposes, creating new collective works for resale or redistribution to servers or lists, or reuse of any copyrighted components of this work in other works.