New gene selection algorithm using hyperboxes to improve performance of classifiers
- Authors: Bagirov, Adil , Mardaneh, Karim
- Date: 2020
- Type: Text , Journal article
- Relation: International Journal of Bioinformatics Research and Applications Vol. 16, no. 3 (2020), p. 269-289
- Full Text: false
- Reviewed:
- Description: DNA microarray technology makes it possible to measure the expression levels of thousands of genes in a single experiment, which in turn makes it possible to apply classification techniques to classify tumours. However, the large number of genes and relatively small number of tumours in gene expression datasets may diminish, in some cases significantly, the accuracy of many classifiers. Therefore, efficient gene selection algorithms are required to identify the most informative genes or groups of genes and so improve the performance of classifiers. In this paper, a new gene selection algorithm is developed using marginal hyperboxes of genes or groups of genes for each tumour type. Informative genes are defined using overlaps between hyperboxes. The results on six gene expression datasets demonstrate that the proposed algorithm is able to considerably reduce the number of genes and significantly improve the performance of classifiers. © 2020 Inderscience Enterprises Ltd.
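The abstract above does not give the exact definition of a marginal hyperbox or of the overlap measure, so the following is only a minimal sketch under stated assumptions: for a single gene, the hyperbox of a tumour type is taken to be the [min, max] interval of that gene's expression values in that class, and a gene is treated as more informative when the class intervals overlap less. The function names `class_intervals`, `overlap_score` and `rank_genes` are hypothetical and are not taken from the paper.

```python
from itertools import combinations


def class_intervals(values, labels):
    """Return the [min, max] interval of one gene's values for each class."""
    boxes = {}
    for v, y in zip(values, labels):
        lo, hi = boxes.get(y, (v, v))
        boxes[y] = (min(lo, v), max(hi, v))
    return boxes


def overlap_score(values, labels):
    """Sum of pairwise overlap lengths between class intervals,
    divided by the gene's overall range (0.0 means fully separated).
    This scoring rule is an assumption, not the paper's definition."""
    span = max(values) - min(values)
    if span == 0:
        return 1.0  # a constant gene cannot separate any classes
    boxes = class_intervals(values, labels)
    total = 0.0
    for (lo_a, hi_a), (lo_b, hi_b) in combinations(boxes.values(), 2):
        total += max(0.0, min(hi_a, hi_b) - max(lo_a, lo_b))
    return total / span


def rank_genes(expression, labels):
    """Sort gene names by increasing overlap score.
    expression maps a gene name to a list with one value per sample."""
    return sorted(expression, key=lambda g: overlap_score(expression[g], labels))


labels = ["A", "A", "B", "B"]
expression = {"g2": [1, 5, 2, 6], "g1": [1, 2, 5, 6]}
ranking = rank_genes(expression, labels)  # least-overlapping gene first
```

A classifier could then be trained on only the first few genes of `ranking`; how many to keep is a separate tuning decision that this sketch does not address.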
A Markov-blanket-based model for gene regulatory network inference
- Authors: Ram, Ramesh , Chetty, Madhu
- Date: 2011
- Type: Text , Journal article
- Relation: Transactions on Computational Biology and Bioinformatics Vol. 8, no. 2 (2011), p.
- Full Text: false
- Reviewed:
- Description: An efficient two-step Markov blanket method for modeling and inferring complex regulatory networks from large-scale microarray data sets is presented. The inferred gene regulatory network (GRN) is based on time series gene expression data that capture the underlying gene interactions. To construct a highly accurate GRN, the proposed method performs: 1) discovery of a gene's Markov Blanket (MB), 2) formulation of a flexible measure to determine the network's quality, 3) efficient searching with the aid of a guided genetic algorithm, and 4) pruning to obtain a minimal set of correct interactions. Investigations are carried out using both synthetic and yeast cell cycle gene expression data sets. The realistic synthetic data sets validate the robustness of the method by varying topology, sample size, time delay, noise, vertex in-degree, and the presence of hidden nodes. It is shown that the proposed approach has excellent inferential capabilities and high accuracy even in the presence of noise. The gene network inferred from yeast cell cycle data is investigated for its biological relevance using well-known interactions, sequence analysis, motif patterns, and GO data. Further, novel interactions are predicted for the unknown genes of the network and their influence on other genes is also discussed.
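For readers unfamiliar with the term, the Markov blanket of a node in a directed acyclic graph consists of its parents, its children, and the other parents of its children. The sketch below only illustrates that definition on a graph that is already known; it does not reproduce the paper's procedure, which discovers the blanket from expression data. The function name `markov_blanket` and the toy network are assumptions made for illustration.

```python
def markov_blanket(parents, node):
    """Markov blanket of `node` in a DAG.
    `parents` maps each gene to the set of genes that regulate it."""
    children = {g for g, ps in parents.items() if node in ps}
    # Co-parents: every other regulator of the node's children.
    co_parents = {p for c in children for p in parents[c]}
    return (set(parents.get(node, ())) | children | co_parents) - {node}


# Toy network (hypothetical): A and B both regulate C, and C regulates D.
toy_network = {"A": set(), "B": set(), "C": {"A", "B"}, "D": {"C"}}
blanket_of_a = markov_blanket(toy_network, "A")
blanket_of_c = markov_blanket(toy_network, "C")
```

In the toy network, B enters the blanket of A only because both regulate C, which is why a blanket can contain genes with no direct edge to the target.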
GlobalMIT: learning globally optimal dynamic Bayesian network with the mutual information test criterion
- Authors: Nguyen, Vinh , Chetty, Madhu , Coppel, Ross , Wangikar, Pramod
- Date: 2011
- Type: Text , Journal article
- Relation: Bioinformatics Vol. 27, no. 19 (2011), p. 2765-2766
- Full Text: false
- Reviewed:
Twin removal in genetic algorithms for protein structure prediction using low-resolution model
- Authors: Hoque, Md Tamjidul , Chetty, Madhu , Lewis, Andrew , Sattar, Abdul
- Date: 2011
- Type: Text , Journal article
- Relation: IEEE/ACM Transactions on Computational Biology and Bioinformatics Vol. 8, no. 1 (2011), p. 234-245
- Full Text: false
- Reviewed:
- Description: This paper presents the impact of twins, and measures for removing them from the population of a genetic algorithm (GA), when the GA is applied to effective conformational searching. It is conclusively shown that a twin removal strategy for a GA provides considerably enhanced performance when investigating solutions to complex ab initio protein structure prediction (PSP) problems in a low-resolution model. Without twin removal, GA crossover and mutation operations can become ineffectual as generations lose their ability to produce significant differences, which can lead to the solution stalling. The paper relaxes the definition of chromosomal twins in the removal strategy to encompass not only identical but also highly correlated chromosomes within the GA population, with empirical results consistently exhibiting significant improvements in solving PSP problems.
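The twin removal idea described above can be sketched as a filtering pass over the population. This is a minimal illustration, not the authors' implementation: the abstract does not state how correlation between chromosomes is measured, so the fraction of matching positions is used here as a stand-in, and the names `similarity` and `remove_twins` and the `threshold` parameter are hypothetical.

```python
def similarity(a, b):
    """Fraction of positions at which two equal-length chromosomes agree."""
    if len(a) != len(b):
        raise ValueError("chromosomes must have equal length")
    if not a:
        return 1.0
    return sum(x == y for x, y in zip(a, b)) / len(a)


def remove_twins(population, threshold=1.0):
    """Keep a chromosome only if its similarity to every chromosome
    already kept is below `threshold`.
    threshold=1.0 removes exact duplicates only; lower values also
    remove near-duplicates (the relaxed notion of a twin)."""
    kept = []
    for chrom in population:
        if all(similarity(chrom, other) < threshold for other in kept):
            kept.append(chrom)
    return kept


# Chromosomes encoded as move strings (an assumed encoding).
population = ["LLRR", "LLRR", "LLRL", "RRLL"]
exact_only = remove_twins(population)        # drops the repeated "LLRR"
relaxed = remove_twins(population, 0.75)     # also drops "LLRL"
```

In a full GA the removed slots would normally be refilled, for example with fresh random chromosomes, so that the population size stays constant; that step is omitted here.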
Studies on the structural stability of rabbit prion probed by molecular dynamics simulations of its wild-type and mutants
- Authors: Zhang, Jiapu
- Date: 2010
- Type: Text , Journal article
- Relation: Journal of Theoretical Biology Vol. 264, no. (2010), p. 119-122
- Full Text: false
- Reviewed:
- Description: Prion diseases are invariably fatal and highly infectious neurodegenerative diseases that affect humans and animals. Rabbits are the only mammalian species reported to be resistant to infection from prion diseases isolated from other species (Vorberg et al., 2003). Fortunately, the NMR structure of rabbit prion (124-228) (PDB entry 2FJ3), the NMR structure of rabbit prion protein mutation S173N (PDB entry 2JOH) and the NMR structure of rabbit prion protein mutation I214V (PDB entry 2JOM) were released recently. This paper studies these NMR structures by molecular dynamics simulations. Simulation results confirm the structural stability of wild-type rabbit prion, and show that the salt bridge between D177 and R163 greatly contributes to the structural stability of rabbit prion. Crown Copyright Published by Elsevier.
Extended HP model for protein structure prediction
- Authors: Hoque, Md Tamjidul , Chetty, Madhu , Sattar, Abdul
- Date: 2009
- Type: Text , Journal article
- Relation: Computational Biology and Bioinformatics Vol. Jan-Feb 2011, no. (2009), p. 234-245
- Full Text: false
- Reviewed:
Predicting protein-protein interfaces as clusters of optimal docking area points
- Authors: Arafat, Yasir , Kamruzzaman, Joarder , Karmakar, Gour , Fernandez-Recio, Juan
- Date: 2009
- Type: Text , Journal article
- Relation: International Journal of Data Mining and Bioinformatics Vol. 3, no. 1 (2009), p. 55-67
- Full Text: false
- Reviewed:
- Description: The desolvation property is used here to predict protein-protein binding sites, exploiting the fact that lower-valued 'optimal docking area' (ODA) points (Fernandez-Recio et al., 2005) form clusters at the interface. The proposed method involves two steps: clustering the ODA points and representing the ODA points by average ODA values. On 51 nonredundant proteins, results show that the success rate improved considerably. Considering only significant ODA, the previous ODA method obtained a success rate of 65%, with an overall success rate of 39%. The proposed method improved the overall success rate to 61%. Further, comparable results were found for X-ray and NMR structures.
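The two steps named above (clustering the ODA points, then summarising them by average ODA value) can be illustrated roughly as follows. The abstract does not specify the clustering algorithm or any distance cutoff, so this sketch assumes single-linkage clustering with a fixed cutoff and picks the cluster with the lowest mean ODA as the predicted interface patch. The function names, the `cutoff` parameter and the sample coordinates are all hypothetical.

```python
from math import dist


def cluster_points(points, cutoff):
    """Single-linkage clustering of surface points.
    `points` is a list of (xyz, oda) pairs; two points are linked when
    their distance is at most `cutoff`. Returns lists of point indices."""
    unvisited = set(range(len(points)))
    clusters = []
    while unvisited:
        seed = unvisited.pop()
        stack, members = [seed], [seed]
        while stack:
            i = stack.pop()
            near = [j for j in unvisited
                    if dist(points[i][0], points[j][0]) <= cutoff]
            for j in near:
                unvisited.remove(j)
            stack.extend(near)
            members.extend(near)
        clusters.append(sorted(members))
    return clusters


def mean_oda(points, cluster):
    """Average ODA value over the points in one cluster."""
    return sum(points[i][1] for i in cluster) / len(cluster)


def best_cluster(points, cutoff):
    """Cluster with the lowest (most favourable) mean ODA value."""
    return min(cluster_points(points, cutoff),
               key=lambda c: mean_oda(points, c))


# Made-up coordinates and ODA values, for illustration only.
sample = [((0, 0, 0), -10.0), ((1, 0, 0), -12.0),
          ((10, 0, 0), -2.0), ((11, 0, 0), -15.0)]
predicted_patch = best_cluster(sample, cutoff=2.0)
```

Averaging keeps a single strongly negative point, such as the last one in `sample`, from dominating the prediction when its neighbours are unfavourable.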