CMU Algorithm Finds Gene Defects
Computational biologists at Carnegie Mellon University (CMU) have devised an algorithm to rapidly sort through mountains of gene expression data to find unexpected phenomena that might merit further study.
One of the key features that sets the algorithm apart is that it re-examines its output, looking for mistakes it has made and then correcting them.
The research was carried out by Carl Kingsford, a professor in CMU’s Computational Biology Department, and Cong Ma, a Ph.D. student in computational biology. It is the first attempt to automate the search for these anomalies in gene expression inferred by RNA sequencing.
CMU Algorithm Finds Gene Defects: Using Computational Biology in Genetics
Using this system, the researchers have already detected 88 anomalies: unexpectedly high or low levels of expression in regions within genes.
Though an organism’s genetic makeup is static, the activity level, or expression, of genes varies significantly over time. Gene expression analysis has thus become a vital tool for biological research, as well as for diagnosing and monitoring cancers.
Anomalies can be important clues for researchers, but until now, finding them has been a painstaking, manual process, sometimes called “sequence gazing.”
Finding one anomaly might require examining 200,000 transcript sequences, which are sequences of RNA that encode information from the gene’s DNA. Dr. Kingsford said that most researchers therefore focus on the regions of genes they think are important, largely ignoring the vast majority of potential anomalies.
Ma noted that identifying anomalies is often not clear cut. Some RNA-seq “reads,” for instance, are common to multiple genes and transcripts and sometimes get mapped to the wrong one. If that occurs, a genetic region might appear more or less active than expected.
CMU Algorithm Finds Gene Defects: An Algorithm That Corrects Itself
The algorithm therefore re-examines any anomalies it detects and checks whether they disappear when the RNA-seq reads are redistributed between the genes.
Correcting anomalies through this re-examination can reduce the number of falsely predicted instances of differential expression.
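To make the re-examination idea concrete, here is a toy Python sketch. It is not the CMU researchers' method. It assumes two regions, A and B, that share some ambiguous reads, an expected read count for each region, and a simple tolerance rule for calling a region anomalous. The function names, the tolerance value, and all the numbers are hypothetical, chosen only for illustration.

```python
def is_anomalous(observed, expected, tolerance=0.5):
    """Flag a region whose observed read count differs from the expected
    count by more than `tolerance` times the expected count.
    (Hypothetical rule, chosen only for this illustration.)"""
    return abs(observed - expected) > tolerance * expected


def reexamine(unique_a, unique_b, shared, shared_to_a, exp_a, exp_b,
              tolerance=0.5):
    """Check whether an apparent anomaly survives read redistribution.

    unique_a, unique_b: reads that map only to region A or only to region B.
    shared: reads that map equally well to both regions.
    shared_to_a: how many shared reads are currently assigned to A.
    exp_a, exp_b: expected read counts for each region.
    """
    obs_a = unique_a + shared_to_a
    obs_b = unique_b + (shared - shared_to_a)
    if not (is_anomalous(obs_a, exp_a, tolerance)
            or is_anomalous(obs_b, exp_b, tolerance)):
        return "no anomaly"

    # Try every way of splitting the shared reads between A and B.
    for k in range(shared + 1):
        a_ok = not is_anomalous(unique_a + k, exp_a, tolerance)
        b_ok = not is_anomalous(unique_b + (shared - k), exp_b, tolerance)
        if a_ok and b_ok:
            # Some assignment of the ambiguous reads explains the data,
            # so the anomaly was likely a mapping artifact.
            return "resolved by reassignment"

    # No redistribution removes the anomaly, so it may merit further study.
    return "unresolved"


# All 100 shared reads were assigned to A, making A look overactive.
print(reexamine(10, 90, 100, 100, exp_a=60, exp_b=140))  # resolved by reassignment
# A is far above expectation no matter where the 10 shared reads go.
print(reexamine(200, 90, 10, 5, exp_a=60, exp_b=100))    # unresolved
```

In the first call the apparent anomaly disappears once the shared reads are split differently, which mirrors the mis-mapping case Ma described. In the second, no redistribution helps, so the anomaly remains a candidate for further study.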