Title: A change-point model for identifying 3'UTR switching by next-generation RNA sequencing.
Authors: Wang, Wei; Wei, Zhi; Li, Hongzhe
Published In: Bioinformatics (2014 Aug 1)
Abstract: Next-generation RNA sequencing offers an opportunity to investigate the transcriptome at an unprecedented scale. Recent studies have revealed widespread alternative polyadenylation (polyA) in eukaryotes, leading to various mRNA isoforms that differ in their 3' untranslated regions (3'UTR), through which the stability, localization and translation of mRNA can be regulated. However, very few, if any, methods and tools are available for directly analyzing this special alternative RNA processing event. Conventional methods rely on annotation of polyA sites; yet such knowledge remains incomplete, and identification of polyA sites is still challenging. The goal of this article is to develop methods for detecting 3'UTR switching without any prior knowledge of polyA annotations. We propose a change-point model based on a likelihood ratio test for detecting 3'UTR switching. We develop a directional testing procedure for identifying dramatic shortening or lengthening events in 3'UTR, while controlling the mixed directional false discovery rate at a nominal level. To our knowledge, this is the first approach to analyze 3'UTR switching directly without relying on any polyA annotations. Simulation studies and applications to two real datasets reveal that our proposed method is powerful, accurate and feasible for the analysis of next-generation RNA sequencing data. The proposed method will fill a void among alternative RNA processing analysis tools for transcriptome studies. It can help to obtain additional insights from RNA sequencing data by understanding gene regulation mechanisms through the analysis of 3'UTR switching. The software is implemented in Java and can be freely downloaded from http://firstname.lastname@example.org or email@example.com. Supplementary data are available at Bioinformatics online.
PubMed ID: 24728858
MeSH Terms: 3' Untranslated Regions/genetics*; Base Sequence; Cell Line, Tumor; Computational Biology/methods*; Gene Expression Regulation; High-Throughput Nucleotide Sequencing*; Humans; Likelihood Functions; Models, Statistical*; Polyadenylation; RNA Isoforms/genetics; Sequence Analysis, RNA*; Transcriptome
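
Illustrative sketch (not the authors' implementation): the abstract describes a change-point model based on a likelihood ratio test, with a directional call of shortening or lengthening. The Python fragment below shows one simple way such a scan could look, assuming per-base read coverage along a single 3'UTR in two samples. Every candidate position splits the 3'UTR into an upstream and a downstream segment, the four read totals form a 2x2 table, and the likelihood ratio (G) statistic for independence is maximized over positions. The function names, the 2x2 formulation, and the `min_flank` parameter are assumptions made for illustration; the mixed directional false discovery rate control mentioned in the abstract is not implemented here.

```python
import math


def g_statistic(a_up, a_down, b_up, b_down):
    """Likelihood ratio (G) statistic for independence in a 2x2 count table."""
    table = [[a_up, a_down], [b_up, b_down]]
    total = a_up + a_down + b_up + b_down
    if total == 0:
        return 0.0
    row = [a_up + a_down, b_up + b_down]
    col = [a_up + b_up, a_down + b_down]
    g = 0.0
    for i in range(2):
        for j in range(2):
            observed = table[i][j]
            expected = row[i] * col[j] / total
            # Cells with zero observed count contribute nothing to G.
            if observed > 0 and expected > 0:
                g += observed * math.log(observed / expected)
    return 2.0 * g


def scan_change_point(cov_a, cov_b, min_flank=1):
    """Return (position, statistic, direction) for the best candidate split.

    cov_a, cov_b: per-base coverage of the same 3'UTR in samples A and B
    (hypothetical inputs). A split at position t puts bases [0, t) upstream
    and bases [t, n) downstream. Direction describes sample B relative to A.
    """
    if len(cov_a) != len(cov_b):
        raise ValueError("coverage vectors must have the same length")
    n = len(cov_a)
    total_a, total_b = sum(cov_a), sum(cov_b)
    best_pos, best_stat, best_dir = None, 0.0, "none"
    up_a = sum(cov_a[:min_flank])
    up_b = sum(cov_b[:min_flank])
    for t in range(min_flank, n - min_flank + 1):
        down_a, down_b = total_a - up_a, total_b - up_b
        stat = g_statistic(up_a, down_a, up_b, down_b)
        if stat > best_stat and total_a > 0 and total_b > 0:
            # Less downstream signal in B than in A suggests a shorter 3'UTR in B.
            shorter = down_b / total_b < down_a / total_a
            best_pos, best_stat = t, stat
            best_dir = "shortening" if shorter else "lengthening"
        if t < n:
            # Move base t from the downstream to the upstream segment.
            up_a += cov_a[t]
            up_b += cov_b[t]
    return best_pos, best_stat, best_dir


if __name__ == "__main__":
    # Toy coverage: sample B loses most reads over the distal half of the 3'UTR.
    sample_a = [10] * 10
    sample_b = [10] * 5 + [1] * 5
    print(scan_change_point(sample_a, sample_b))
```

In this toy input the scan selects the position where coverage in sample B drops and labels it as shortening. Because the statistic is maximized over many candidate positions, its null distribution is not a plain chi-square with one degree of freedom, so a real analysis would need a properly calibrated test together with the multiple-testing procedure the abstract refers to.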