Issues impacting genetic network reverse engineering algorithm validation using small networks
- Authors: Nguyen, Vinh , Chetty, Madhu , Coppel, Ross , Wangikar, Pramod
- Date: 2012
- Type: Text , Journal article
- Relation: BBA - Proteins and Proteomics Vol. 1824, no. 12 (2012), p. 1434-1441
- Full Text: false
- Reviewed:
- Description: Genetic network reverse engineering has been an area of intensive research within the systems biology community during the last decade. With many techniques currently available, validating them and choosing the best one for a given problem is a complex task. Current practice is to validate an approach on in-silico synthetic data sets and, wherever possible, on real data sets with known ground truth. In this study, we highlight a major issue: validating reverse engineering algorithms on small benchmark networks very often yields networks that are not statistically better than a randomly picked network. Another important issue is that, with short time series, a small variation in the pre-processing procedure can yield large differences in the inferred networks. To demonstrate these issues, we use as our case study the IRMA in-vivo synthetic yeast network recently published in Cell. Using Fisher's exact test, we show that many results reported in the literature on reverse-engineering this network are not significantly better than random. The discussion is then extended to other networks commonly used for validation in the literature. The results emphasize that studies carried out using small genetic networks are likely to be trivial, making it imperative that larger real networks be used for validation and benchmarking. If smaller networks are considered, the results should be interpreted carefully to avoid overconfidence. This article is part of a Special Issue entitled: Computational Methods for Protein Interaction and Structural Prediction.
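The significance check described in this abstract can be illustrated with a minimal sketch of a one-sided Fisher's exact test on edge overlap. This is not the authors' code: the function name `fisher_exact_greater` and all edge counts are hypothetical, chosen only to show how a plausible-looking result on a five-gene network can fail to beat a random guess.

```python
from math import comb

def fisher_exact_greater(n_possible, n_true, n_predicted, n_correct):
    """One-sided Fisher's exact test on edge overlap.

    Returns P(overlap >= n_correct) when n_predicted edges are drawn at
    random from n_possible candidate edges, n_true of which are real.
    """
    total = comb(n_possible, n_predicted)
    upper = min(n_true, n_predicted)
    tail = sum(
        comb(n_true, k) * comb(n_possible - n_true, n_predicted - k)
        for k in range(n_correct, upper + 1)
    )
    return tail / total

# Hypothetical counts (not taken from the paper): a 5-gene network has
# 5 * 4 = 20 possible directed edges when self-loops are excluded.
n_genes = 5
n_possible = n_genes * (n_genes - 1)

# Assume 6 true edges, 6 predicted edges, 3 of them correct.
p_value = fisher_exact_greater(n_possible, n_true=6, n_predicted=6, n_correct=3)
print(f"p = {p_value:.4f}")
```

With these assumed counts, half of the predicted edges are correct, yet the p-value is roughly 0.23, so the result would not be significant at the usual 0.05 level.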
A Markov-blanket-based model for gene regulatory network inference
- Authors: Ram, Ramesh , Chetty, Madhu
- Date: 2011
- Type: Text , Journal article
- Relation: Transactions on Computational Biology and Bioinformatics Vol. 8, no. 2 (2011), p.
- Full Text: false
- Reviewed:
- Description: An efficient two-step Markov blanket method for modeling and inferring complex regulatory networks from large-scale microarray data sets is presented. The inferred gene regulatory network (GRN) is based on time series gene expression data that capture the underlying gene interactions. To construct a highly accurate GRN, the proposed method performs: 1) discovery of a gene's Markov blanket (MB), 2) formulation of a flexible measure to determine the network's quality, 3) efficient searching with the aid of a guided genetic algorithm, and 4) pruning to obtain a minimal set of correct interactions. Investigations are carried out using both synthetic and yeast cell cycle gene expression data sets. The realistic synthetic data sets validate the robustness of the method under varying topology, sample size, time delay, noise, vertex in-degree, and the presence of hidden nodes. It is shown that the proposed approach has excellent inferential capabilities and high accuracy even in the presence of noise. The gene network inferred from yeast cell cycle data is investigated for its biological relevance using well-known interactions, sequence analysis, motif patterns, and GO data. Further, novel interactions are predicted for the unknown genes of the network, and their influence on other genes is also discussed.
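For readers unfamiliar with the term, the Markov blanket of a node in a directed acyclic graph consists of its parents, its children, and the other parents of its children. The sketch below only illustrates that definition on a known graph; it does not reproduce the paper's method, which discovers the blanket from expression data. The function name `markov_blanket` and the toy network are hypothetical.

```python
def markov_blanket(parents, node):
    """Return the Markov blanket of `node` in a DAG.

    `parents` maps every node to the set of its parent nodes.
    The blanket is: parents + children + other parents of the children.
    """
    children = {c for c, ps in parents.items() if node in ps}
    spouses = {p for c in children for p in parents[c]}
    return (set(parents.get(node, ())) | children | spouses) - {node}

# Hypothetical toy regulatory network (gene -> set of its regulators).
toy_network = {
    "g1": set(),
    "g2": {"g1"},
    "g3": {"g1", "g4"},
    "g4": set(),
    "g5": {"g3"},
}

blanket_g1 = markov_blanket(toy_network, "g1")
blanket_g3 = markov_blanket(toy_network, "g3")
print(sorted(blanket_g1), sorted(blanket_g3))
```

Here `g4` belongs to the blanket of `g1` only because both regulate `g3`, which is the kind of indirect relationship a parent-only search would miss.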
GlobalMIT: learning globally optimal dynamic Bayesian network with the mutual information test criterion
- Authors: Nguyen, Vinh , Chetty, Madhu , Coppel, Ross , Wangikar, Pramod
- Date: 2011
- Type: Text , Journal article
- Relation: Bioinformatics Vol. 27, no. 19 (2011), p. 2765-2766
- Full Text: false
- Reviewed:
Twin removal in genetic algorithms for protein structure prediction using low-resolution model
- Authors: Hoque, Md Tamjidul , Chetty, Madhu , Lewis, Andrew , Sattar, Abdul
- Date: 2011
- Type: Text , Journal article
- Relation: IEEE/ACM Transactions on Computational Biology and Bioinformatics Vol. 8, no. 1 (2011), p. 234-245
- Full Text: false
- Reviewed:
- Description: This paper presents the impact of twins, and measures for removing them, in the population of a genetic algorithm (GA) applied to conformational searching. It is conclusively shown that a twin removal strategy for a GA provides considerably enhanced performance when investigating solutions to complex ab initio protein structure prediction (PSP) problems in a low-resolution model. Without twin removal, GA crossover and mutation operations can become ineffectual as generations lose their ability to produce significant differences, which can lead to the solution stalling. The paper relaxes the definition of chromosomal twins in the removal strategy to encompass not only identical but also highly correlated chromosomes within the GA population, with empirical results consistently exhibiting significant improvements in solving PSP problems.
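The relaxed notion of twins described above can be sketched as a filter that drops any chromosome too similar to one already kept. This is an illustration under stated assumptions, not the paper's procedure: the numeric move encoding, the use of Pearson correlation as the similarity measure, the 0.9 threshold, and the names `pearson` and `remove_twins` are all hypothetical.

```python
from math import sqrt

def pearson(a, b):
    """Pearson correlation of two equal-length numeric sequences."""
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    if var_a == 0 or var_b == 0:
        # Correlation is undefined for constant sequences; treat them as
        # twins only when they are identical.
        return 1.0 if list(a) == list(b) else 0.0
    return cov / sqrt(var_a * var_b)

def remove_twins(population, threshold=0.9):
    """Keep a chromosome only if it is below `threshold` correlation
    with every chromosome already kept (threshold is an assumption)."""
    kept = []
    for chrom in population:
        if all(pearson(chrom, other) < threshold for other in kept):
            kept.append(chrom)
    return kept

# Hypothetical population: each chromosome is a sequence of lattice moves
# encoded as integers 0-3.
population = [
    [0, 1, 2, 3, 0, 1],
    [0, 1, 2, 3, 0, 1],  # identical twin of the first chromosome
    [0, 1, 2, 3, 0, 2],  # differs in one move: highly correlated twin
    [3, 0, 1, 0, 2, 3],  # dissimilar chromosome
]
survivors = remove_twins(population, threshold=0.9)
print(survivors)
```

Raising the threshold toward 1.0 recovers the strict definition in which only identical chromosomes count as twins, so the one-move variant would then survive.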
Extended HP model for protein structure prediction
- Authors: Hoque, Md Tamjidul , Chetty, Madhu , Sattar, Abdul
- Date: 2009
- Type: Text , Journal article
- Relation: Computational Biology and Bioinformatics Vol. Jan-Feb 2011, no. (2009), p. 234-245
- Full Text: false
- Reviewed: