A robust ensemble regression model for reconstructing genetic networks
- Gamage, Hasini, Chetty, Madhu, Lim, Suryani, Hallinan, Jennifer, Nguyen, H.
- Authors: Gamage, Hasini , Chetty, Madhu , Lim, Suryani , Hallinan, Jennifer , Nguyen, H.
- Date: 2023
- Type: Text , Conference paper
- Relation: 2023 International Joint Conference on Neural Networks, IJCNN 2023 Vol. 2023-June
- Full Text: false
- Reviewed:
- Description: Genetic networks contain important information about biological processes, including regulatory relationships and gene-gene interactions. Numerous methods using high-dimensional gene expression data have been developed to capture these interactions. These gene expression data, generated using high-throughput technologies, are prone to noise. However, most existing network inference methods are unable to cope with noisy data, making genetic network reconstruction challenging. In this paper, we propose a novel ensemble regression model combining quantile regression and cross-validated Ridge regression, RidgeCV, to infer interactions from noisy gene expression data. The application of quantile regression to GRN inference is novel, and its design makes it appropriate for noisy data. RidgeCV also addresses other important issues, such as data overfitting and multicollinearity. First, each regression method is independently applied to gene expression data, and the output of these methods, in the form of ranked gene lists, is aggregated using a novel gene score-based method that considers the gene rank and model importance. The model importance score is evaluated based on an adjusted coefficient of determination. This method implicitly includes majority voting by averaging each gene score value across all models. The proposed model was tested on the DREAM4 datasets and publicly available small-scale real-world network datasets. Experiments with noisy datasets showed that the proposed ensemble model is more accurate and efficient than other state-of-the-art methods. © 2023 IEEE.
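The aggregation step in the description above can be sketched in Python. This is an illustration only, not the authors' implementation: the scoring rule (model weight times a normalized rank score, averaged over models), the function names, and the example rankings and R² values are all assumptions. Only the adjusted R² formula is the standard one.

```python
def adjusted_r2(r2, n_samples, n_features):
    """Standard adjusted coefficient of determination."""
    return 1.0 - (1.0 - r2) * (n_samples - 1) / (n_samples - n_features - 1)


def aggregate_rankings(ranked_lists, importances):
    """Average importance-weighted rank scores for each gene across models.

    ranked_lists: one list of gene names per model, best candidate first.
    importances:  one weight per model (here, an adjusted R^2 value).
    The scoring rule is an assumption made for illustration.
    """
    genes = sorted({g for ranking in ranked_lists for g in ranking})
    totals = {g: 0.0 for g in genes}
    for ranking, weight in zip(ranked_lists, importances):
        size = len(ranking)
        for position, gene in enumerate(ranking):
            # Top rank scores 1.0, last rank scores 1/size; absent genes add 0.
            totals[gene] += weight * (size - position) / size
    n_models = len(ranked_lists)
    scores = {g: totals[g] / n_models for g in genes}
    # Highest score first; ties broken alphabetically for determinism.
    return sorted(scores.items(), key=lambda item: (-item[1], item[0]))


# Hypothetical rankings of candidate regulators for one target gene.
quantile_ranking = ["g2", "g1", "g3"]
ridge_ranking = ["g1", "g2", "g4"]
# Hypothetical fit quality: R^2 of 0.80 and 0.60 on 21 samples, 3 regressors.
weights = [adjusted_r2(0.80, 21, 3), adjusted_r2(0.60, 21, 3)]
consensus = aggregate_rankings([quantile_ranking, ridge_ranking], weights)
```

Because every gene's score is divided by the number of models, a gene ranked by only one model is penalized relative to one ranked by both, which is one way to read the "implicit majority voting" mentioned in the abstract.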
An efficient boolean modelling approach for genetic network inference
- Gamage, Hasini, Chetty, Madhu, Shatte, Adrian, Hallinan, Jennifer
- Authors: Gamage, Hasini , Chetty, Madhu , Shatte, Adrian , Hallinan, Jennifer
- Date: 2021
- Type: Text , Conference paper
- Relation: 2021 IEEE Conference on Computational Intelligence in Bioinformatics and Computational Biology, CIBCB 2021, Virtual, Online, 13-15 October 2021
- Full Text:
- Reviewed:
- Description: The inference of Gene Regulatory Networks (GRNs) from time series gene expression data is an effective approach for unveiling important underlying gene-gene relationships and dynamics. While various computational models exist for accurate inference of GRNs, many are computationally inefficient and do not focus on simultaneous inference of both network topology and dynamics. In this paper, we introduce a simple, Boolean network model-based solution for efficient inference of GRNs. First, the microarray expression data are discretized using the average gene expression value as a threshold. This step permits an experimental approach to defining the maximum indegree of a network. Next, regulatory genes, including the self-regulations for each target gene, are inferred using an estimated multivariate mutual information-based Min-Redundancy Max-Relevance Criterion, and the inference is further refined by a swapping operation. Subsequently, we introduce a new method, combining Boolean network regulation modelling and the Pearson correlation coefficient, to identify the interaction types (inhibition or activation) of the regulatory genes. This method is utilized for the efficient determination of the optimal regulatory rule, consisting of AND, OR, and NOT operators, by defining the accurate application of the NOT operation in conjunction and disjunction Boolean functions. The proposed approach is evaluated using two real gene expression datasets for an Escherichia coli gene regulatory network and a fission yeast cell cycle network. Although the Structural Accuracy is approximately the same as that of existing methods (MIBNI, REVEAL, Best-Fit, BIBN, and CST), the proposed method outperforms all these methods with respect to efficiency and Dynamic Accuracy. © 2021 IEEE.
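Two steps from this description, mean-threshold discretization and correlation-based labelling of interaction types, can be sketched as follows. This is a minimal illustration under stated assumptions, not the paper's procedure: the strict "greater than the mean" rule, the one-step time lag between regulator and target, the function names, and the example series are all hypothetical.

```python
from math import sqrt


def discretize(series):
    """Binarize a time series: 1 where the value exceeds the series mean, else 0."""
    threshold = sum(series) / len(series)
    return [1 if value > threshold else 0 for value in series]


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences.

    Assumes neither sequence is constant (non-zero variance).
    """
    mean_x, mean_y = sum(x) / len(x), sum(y) / len(y)
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / sqrt(var_x * var_y)


def interaction_type(regulator, target):
    """Label a regulator by the sign of its correlation with the target.

    Assumption: the regulator at time t is compared with the target at t + 1.
    """
    r = pearson(regulator[:-1], target[1:])
    return "activation" if r >= 0 else "inhibition"


# Hypothetical expression time series (six time points each).
activator = discretize([0.1, 0.9, 0.8, 0.2, 0.1, 0.7])
repressor = discretize([0.9, 0.1, 0.2, 0.8, 0.9, 0.3])
target = discretize([0.5, 0.2, 0.9, 0.8, 0.3, 0.2])

activator_label = interaction_type(activator, target)
repressor_label = interaction_type(repressor, target)
```

In a Boolean rule, a regulator labelled as an inhibitor would enter the conjunction or disjunction under a NOT operator, which is how a correlation sign can narrow the search over candidate rules.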
Ensemble regression modelling for genetic network inference
- Gamage, Hasini, Chetty, Madhu, Shatte, Adrian, Hallinan, Jennifer
- Authors: Gamage, Hasini , Chetty, Madhu , Shatte, Adrian , Hallinan, Jennifer
- Date: 2022
- Type: Text , Conference paper
- Relation: 2022 IEEE Conference on Computational Intelligence in Bioinformatics and Computational Biology, CIBCB 2022, Ottawa, Canada, 15-17 August 2022
- Full Text: false
- Reviewed:
- Description: An accurate reconstruction of Gene Regulatory Networks (GRNs) from time series gene expression data is crucial for discovering complex biological interactions. Among the many approaches for inferring GRNs, several produce a high number of false positive interactions and are unstable, requiring fine tuning of many of their parameters. In this paper, we consider the GRN inference problem as a regression problem, and propose a simple ensemble regression-based feature selection model which combines cross-validated Lasso and cross-validated Ridge algorithms for reconstructing GRNs. Owing to its design, the proposed ensemble model is able to eliminate overfitting, multicollinearity issues, and irrelevant genes within one computational approach. While observing the type of gene-gene regulatory interactions, the regression model also identifies the direction of these interactions. A new coefficient of determination (R²)-based approach identifies the best model to fit the data among LassoCV and RidgeCV, and evaluates the model importance in terms of the gene-wise maximum in-degree, which decides the maximum number of regulatory genes, including self-regulations, that can be selected from a given method. Then, an evaluated gene score-based majority voting technique aggregates the selected gene lists from each method. In our experiments, the performance of the proposed ensemble approach was evaluated using gene expression datasets from three small-scale real gene networks. Our proposed model outperformed other state-of-the-art methods, producing high true positives, reducing false positives, and obtaining high Structural Accuracy, while maintaining model stability and efficiency. © 2022 IEEE.
Filter feature selection based boolean modelling for genetic network inference
- Gamage, Hasini, Chetty, Madhu, Shatte, Adrian, Hallinan, Jennifer
- Authors: Gamage, Hasini , Chetty, Madhu , Shatte, Adrian , Hallinan, Jennifer
- Date: 2022
- Type: Text , Journal article
- Relation: BioSystems Vol. 221, no. (2022), p.
- Full Text:
- Reviewed:
- Description: The reconstruction of Gene Regulatory Networks (GRNs) from time series gene expression data is highly relevant for the discovery of complex biological interactions and dynamics. Various computational strategies have been developed for this task, but most approaches have low computational efficiency and are not able to cope with high-dimensional, low sample-number, gene expression data. In this paper, we introduce a novel combined filter feature selection approach for efficient and accurate inference of GRNs. A Boolean framework for network modelling is used to demonstrate the efficacy of the proposed approach. Using discretized microarray expression data, the genes most relevant to each target gene are first filtered using ReliefF, an instance-based feature ranking method that is here applied for the first time to GRN inference. Then, further gene selection from the filtered-gene list is done using a mutual information-based min-redundancy max-relevance criterion by eliminating irrelevant genes. This combined method is executed on resampled datasets to finalize the optimal set of regulatory genes. Building upon our previous research, a Pearson correlation coefficient-based Boolean modelling approach is utilized for the efficient identification of the optimal regulatory rules associated with selected regulatory genes. The proposed approach was evaluated using gene expression datasets from small-scale and medium-scale real gene networks, and was observed to be more effective than Linear Discriminant Analysis, performed better than the individual feature selection methods, and obtained improved Structural Accuracy with a higher number of true positives than other state-of-the-art methods, while outperforming these methods with respect to Dynamic Accuracy and efficiency. © 2022 Elsevier B.V.
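The mutual information-based min-redundancy max-relevance (mRMR) criterion mentioned in this description can be illustrated with a small greedy selector over discretized data. This is a generic sketch of the mRMR idea, not the paper's combined ReliefF and mRMR pipeline: the "relevance minus mean redundancy" scoring form, the function names, and the toy gene profiles are assumptions.

```python
from collections import Counter
from math import log2


def mutual_information(x, y):
    """Mutual information (in bits) between two discrete sequences."""
    n = len(x)
    px, py, pxy = Counter(x), Counter(y), Counter(zip(x, y))
    # Sum of p(a, b) * log2(p(a, b) / (p(a) * p(b))) over observed pairs.
    return sum(
        (c / n) * log2(c * n / (px[a] * py[b]))
        for (a, b), c in pxy.items()
    )


def mrmr(candidates, target, k):
    """Greedily pick up to k genes maximizing relevance minus mean redundancy."""
    relevance = {g: mutual_information(v, target) for g, v in candidates.items()}
    selected = []
    while len(selected) < min(k, len(candidates)):
        def score(gene):
            if not selected:
                return relevance[gene]
            redundancy = sum(
                mutual_information(candidates[gene], candidates[s])
                for s in selected
            ) / len(selected)
            return relevance[gene] - redundancy

        remaining = [g for g in candidates if g not in selected]
        selected.append(max(remaining, key=score))
    return selected


# Hypothetical Boolean profiles: the target is g1 OR g3, and g2 duplicates g1.
genes = {
    "g1": [0, 0, 0, 0, 1, 1, 1, 1],
    "g2": [0, 0, 0, 0, 1, 1, 1, 1],
    "g3": [0, 0, 1, 1, 0, 0, 1, 1],
}
target_profile = [0, 0, 1, 1, 1, 1, 1, 1]
regulators = mrmr(genes, target_profile, k=2)
```

On this toy input the selector keeps g1 and g3 and skips g2: all three genes are about equally relevant to the target, but g2 is fully redundant with g1.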
MICFuzzy : a maximal information content based fuzzy approach for reconstructing genetic networks
- Gamage, Hasini, Chetty, Madhu, Lim, Suryani, Hallinan, Jennifer
- Authors: Gamage, Hasini , Chetty, Madhu , Lim, Suryani , Hallinan, Jennifer
- Date: 2023
- Type: Text , Journal article
- Relation: PLoS ONE Vol. 18, no. 7 July (2023), p.
- Full Text:
- Reviewed:
- Description: In systems biology, the accurate reconstruction of Gene Regulatory Networks (GRNs) is crucial since these networks can facilitate the solving of complex biological problems. Amongst the plethora of methods available for GRN reconstruction, information theory and fuzzy concepts-based methods have abiding popularity. However, most of these methods are not only complex, incurring a high computational burden, but they may also produce a high number of false positives, leading to inaccurate inferred networks. In this paper, we propose a novel hybrid fuzzy GRN inference model called MICFuzzy which involves the aggregation of the effects of Maximal Information Coefficient (MIC). This model has an information theory-based pre-processing stage, the output of which is applied as an input to the novel fuzzy model. In this preprocessing stage, the MIC component filters relevant genes for each target gene to significantly reduce the computational burden of the fuzzy model when selecting the regulatory genes from these filtered gene lists. The novel fuzzy model uses the regulatory effect of the identified activator-repressor gene pairs to determine target gene expression levels. This approach facilitates accurate network inference by generating a high number of true regulatory interactions while significantly reducing false regulatory predictions. The performance of MICFuzzy was evaluated using DREAM3 and DREAM4 challenge data, and the SOS real gene expression dataset. MICFuzzy outperformed the other state-of-the-art methods in terms of F-score, Matthews Correlation Coefficient, Structural Accuracy, and SS_mean, and outperformed most of them in terms of efficiency. MICFuzzy also had improved efficiency compared with the classical fuzzy model since the design of MICFuzzy leads to a reduction in combinatorial computation. Copyright: © 2023 Nakulugamuwa Gamage et al. 
This is an open access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.