An efficient Boolean modelling approach for genetic network inference
- Gamage, Hasini, Chetty, Madhu, Shatte, Adrian, Hallinan, Jennifer
- Authors: Gamage, Hasini , Chetty, Madhu , Shatte, Adrian , Hallinan, Jennifer
- Date: 2021
- Type: Text , Conference paper
- Relation: 2021 IEEE Conference on Computational Intelligence in Bioinformatics and Computational Biology, CIBCB 2021, Virtual, Online, 13-15 October 2021
- Description: The inference of Gene Regulatory Networks (GRNs) from time series gene expression data is an effective approach for unveiling important underlying gene-gene relationships and dynamics. While various computational models exist for accurate inference of GRNs, many are computationally inefficient and do not focus on simultaneous inference of both network topology and dynamics. In this paper, we introduce a simple, Boolean network model-based solution for efficient inference of GRNs. First, the microarray expression data are discretized using the average gene expression value as a threshold. This step permits an experimental approach to defining the maximum indegree of a network. Next, the regulatory genes for each target gene, including self-regulations, are inferred using an estimated multivariate mutual information-based Min-Redundancy Max-Relevance criterion, and the selection is further refined by a swapping operation. Subsequently, we introduce a new method that combines Boolean network regulation modelling with the Pearson correlation coefficient to identify the interaction types (inhibition or activation) of the regulatory genes. This method is used for the efficient determination of the optimal regulatory rule, consisting of AND, OR, and NOT operators, by defining where the NOT operation is applied in conjunctive and disjunctive Boolean functions. The proposed approach is evaluated using two real gene expression datasets, for an Escherichia coli gene regulatory network and a fission yeast cell cycle network. Although the Structural Accuracy is approximately the same as that of existing methods (MIBNI, REVEAL, Best-Fit, BIBN, and CST), the proposed method outperforms all these methods with respect to efficiency and Dynamic Accuracy. © 2021 IEEE.
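The two simplest steps in this abstract, mean-threshold discretization and Pearson-based labelling of interaction types, can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function names are hypothetical, and the one-step time lag between regulator and target is an assumption made here for illustration only.

```python
from statistics import mean

def discretize(series):
    """Binarize one gene's expression series: 1 if above its own mean, else 0."""
    threshold = mean(series)
    return [1 if value > threshold else 0 for value in series]

def pearson(x, y):
    """Pearson correlation coefficient; returns 0.0 if either series is constant."""
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    var_x = sum((a - mx) ** 2 for a in x)
    var_y = sum((b - my) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        return 0.0
    return cov / (var_x * var_y) ** 0.5

def interaction_type(regulator, target):
    """Label a regulator by the sign of its correlation with the target.

    Assumption (not stated in the abstract): the regulator at time t is
    compared with the target at time t + 1.
    """
    r = pearson(regulator[:-1], target[1:])
    return "activation" if r >= 0 else "inhibition"
```

Under these assumptions, a regulator labelled "inhibition" would enter the Boolean rule under a NOT operator, while one labelled "activation" would enter it unchanged.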
A robust ensemble regression model for reconstructing genetic networks
- Gamage, Hasini, Chetty, Madhu, Lim, Suryani, Hallinan, Jennifer, Nguyen, Huy
- Authors: Gamage, Hasini , Chetty, Madhu , Lim, Suryani , Hallinan, Jennifer , Nguyen, Huy
- Date: 2023
- Type: Text , Conference paper
- Relation: 2023 International Joint Conference on Neural Networks, IJCNN 2023 Vol. 2023-June
- Full Text: false
- Description: Genetic networks contain important information about biological processes, including regulatory relationships and gene-gene interactions. Numerous methods using high-dimensional gene expression data have been developed to capture these interactions. These gene expression data, generated using high-throughput technologies, are prone to noise. However, most existing network inference methods are unable to cope with noisy data, making genetic network reconstruction challenging. In this paper, we propose a novel ensemble regression model combining quantile regression and cross-validated Ridge regression, RidgeCV, to infer interactions from noisy gene expression data. The application of quantile regression to GRN inference is novel, and its design makes it appropriate for noisy data. RidgeCV also addresses other important issues, such as data overfitting and multicollinearity. First, each regression method is independently applied to the gene expression data. The outputs of these methods, in the form of ranked gene lists, are then aggregated using a novel gene score-based method that considers the gene rank and model importance. The model importance score is evaluated based on an adjusted coefficient of determination. This method implicitly includes majority voting by averaging each gene score value across all models. The proposed model was tested on the DREAM4 datasets and publicly available small-scale real-world network datasets. Experiments with noisy datasets showed that the proposed ensemble model is more accurate and efficient than other state-of-the-art methods. © 2023 IEEE.
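One way to see why quantile regression suits noisy data is to compare its loss function with squared error. The sketch below, in plain Python with hypothetical function names, shows the standard pinball (quantile) loss, which grows linearly with the size of a residual, next to the squared loss, which grows quadratically. It illustrates the general technique only and says nothing about how the paper fits its models.

```python
def pinball_loss(y_true, y_pred, tau=0.5):
    """Mean pinball (quantile) loss for a quantile level tau in (0, 1).

    Under-predictions are weighted by tau, over-predictions by (1 - tau).
    """
    total = 0.0
    for y, p in zip(y_true, y_pred):
        diff = y - p
        total += tau * diff if diff >= 0 else (tau - 1) * diff
    return total / len(y_true)

def squared_loss(y_true, y_pred):
    """Mean squared error, shown for comparison."""
    return sum((y - p) ** 2 for y, p in zip(y_true, y_pred)) / len(y_true)
```

For a single noisy observation that misses the prediction by 10 units, the pinball loss at tau = 0.5 is 5 while the squared loss is 100, so one outlying measurement pulls a squared-error fit much harder.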
Ensemble Approaches for Robust Reconstruction of Gene Regulatory Networks
- Authors: Gamage, Hasini
- Date: 2024
- Type: Text , Thesis , PhD
- Description: Gene regulatory networks (GRNs) are intricate control systems governing gene expression dynamics, playing a pivotal role in biological processes. The ongoing development of high-throughput microarray and sequencing technologies has greatly facilitated the acquisition of gene expression data, promoting an extensive body of research focused on unravelling the intricacies of GRNs. This endeavour involves deciphering how genes regulate each other, vital for understanding the molecular functions of cells and diseases, and for designing targeted therapies. However, GRN reconstruction is a formidable task due to the high dimensionality, limited sample size, and the presence of noise in gene expression data. Various reverse engineering approaches have been developed to grapple with these challenges. Each method exhibits certain method-specific issues. A recent trend in the field is the emergence of ensemble methods designed to yield robust reconstructions. This thesis presents a comprehensive exploration of GRN inference through the development and application of ensemble methods, offering both theoretical insights and practical tools. This research commenced with a thorough examination of the challenges involved in the reconstruction of GRNs through the utilization of existing individual inference methods, along with an assessment of their inherent limitations. While there is a plethora of GRN modelling approaches available, we focussed on three distinct modelling approaches: Boolean, regression, and information theory-based fuzzy methods. The selection of these methods was underpinned by established categorizations found in the existing literature and substantial empirical evidence of their remarkable performance in GRN inference. Each of these methods is conceptually different, offering diverse vantage points on the inference challenge. 
One of the primary objectives of this study was to apply ensemble techniques to each selected individual modelling approach, increasing the diversity within each method to further improve its performance. The first modelling approach involved a Boolean network. The initial version employed a simple Boolean network and yielded near-optimal results, while the second version enhanced accuracy by incorporating feature selection methods. The second modelling approach treated GRN inference as a regression problem and involved two variations. The first variation combined cross-validated Lasso (LassoCV) and cross-validated Ridge (RidgeCV), while the second variation addressed noisy gene expression data by combining quantile regression (QR) and RidgeCV for gene selection. The third modelling approach was a novel hybrid fuzzy method known as MICFuzzy, a combination of an information theory-based pre-processing stage and a subsequent fuzzy method, which brought substantial performance improvements. Our experiments showed that each of these individual methods produced performance improvements over state-of-the-art methods with regard to both accuracy and efficiency, by increasing the robustness of GRN reconstruction. In conjunction with the enhanced performance of each of these modelling approaches, our ultimate objective was the development of a novel ensemble framework for robust GRN inference, obtained by combining the outcomes of the different modelling techniques. Thus, we developed a novel ensemble framework called GRAMP, a Gene Ranking And Model Prioritisation framework, to aggregate the inferred networks produced by the aforementioned modelling approaches. This framework addresses the need for a reliable approach to ensemble modelling in GRN inference, aggregating the outcomes of diverse modelling approaches. GRAMP includes a novel network aggregation method based on gene scores.
Gene scores are evaluated based on the performance of each inference method in a specific problem context and both local and global gene ranking. Experimental results using both simulated and real-world gene expression datasets confirmed the superior performance of this ensemble framework in inferring gene regulatory networks. To further enhance the practical application of these ensemble approaches we introduced a user-friendly desktop application that implements the GRAMP framework, allowing researchers to integrate multiple inference methods and datasets from various problem contexts. This tool fills a critical gap in the availability of an interactive software tool for ensemble model building. This application is freely accessible to the research community. This thesis demonstrates how the research offers a systematic exploration of diverse modelling techniques and the success of our ensemble approaches for GRN inference in the form of our published work. These contributions collectively pave the way for producing robust GRN inference in systems biology, with broad-reaching implications for biology and medicine.
- Description: Doctor of Philosophy
Ensemble regression modelling for genetic network inference
- Gamage, Hasini, Chetty, Madhu, Shatte, Adrian, Hallinan, Jennifer
- Authors: Gamage, Hasini , Chetty, Madhu , Shatte, Adrian , Hallinan, Jennifer
- Date: 2022
- Type: Text , Conference paper
- Relation: 2022 IEEE Conference on Computational Intelligence in Bioinformatics and Computational Biology, CIBCB 2022, Ottawa, Canada, 15-17 August 2022
- Description: An accurate reconstruction of Gene Regulatory Networks (GRNs) from time series gene expression data is crucial for discovering complex biological interactions. Among the many approaches for inferring GRNs, several produce a high number of false positive interactions and are unstable, requiring fine tuning of many of their parameters. In this paper, we consider the GRN inference problem as a regression problem, and propose a simple ensemble regression-based feature selection model, a combination of cross-validated Lasso (LassoCV) and cross-validated Ridge (RidgeCV) algorithms, for reconstructing GRNs. The proposed ensemble model is able to eliminate overfitting, multicollinearity issues, and irrelevant genes within one computational approach. In addition to the type of each gene-gene regulatory interaction, the regression model identifies the direction of these interactions. A new coefficient of determination (R²)-based approach identifies which of LassoCV and RidgeCV best fits the data, and evaluates the model importance in terms of the gene-wise maximum in-degree, which sets the maximum number of regulatory genes, including self-regulations, that can be selected from a given method. Then, a gene score-based majority voting technique aggregates the selected gene lists from each method. In our experiments, the performance of the proposed ensemble approach was evaluated using gene expression datasets from three small-scale real gene networks. Our proposed model outperformed other state-of-the-art methods, producing a high number of true positives, reducing false positives, and obtaining high Structural Accuracy, while maintaining model stability and efficiency. © 2022 IEEE.
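The R²-based model comparison and the score-based voting described in this abstract can be sketched roughly as follows. The abstract does not give the scoring formula, so `aggregate_gene_lists` is a hypothetical scheme in which each model contributes its weight (for example its R²) to every gene it selects; only `r_squared` is the standard definition.

```python
def r_squared(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_y = sum(y_true) / len(y_true)
    ss_res = sum((y - p) ** 2 for y, p in zip(y_true, y_pred))
    ss_tot = sum((y - mean_y) ** 2 for y in y_true)
    if ss_tot == 0:
        return 0.0  # constant target: R² is undefined, treated as 0 here
    return 1.0 - ss_res / ss_tot

def aggregate_gene_lists(selected, weights):
    """Hypothetical weighted vote: sum each model's weight over the genes it
    selected, then rank genes by total score (ties broken by gene name)."""
    scores = {}
    for model, genes in selected.items():
        for gene in genes:
            scores[gene] = scores.get(gene, 0.0) + weights[model]
    return sorted(scores, key=lambda g: (-scores[g], g))
```

With `selected = {"lasso": ["g1", "g2"], "ridge": ["g2", "g3"]}` and weights of 0.8 and 0.6, gene `g2` is ranked first because both models selected it, which is the majority-voting effect the abstract refers to.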
GRAMP: a gene ranking and model prioritisation framework for building consensus genetic networks
- Gamage, Hasini, Chetty, Madhu, Lim, Suryani, Hallinan, Jennifer
- Authors: Gamage, Hasini , Chetty, Madhu , Lim, Suryani , Hallinan, Jennifer
- Date: 2024
- Type: Text , Journal article
- Relation: Knowledge-Based Systems Vol. 302 (2024)
- Description: Despite significant recent advancements in computational and statistical methods, different methods have specific strengths and weaknesses in the accurate reconstruction of gene regulatory networks (GRNs), making it difficult to determine the best method for each specific problem. To overcome these challenges, ensemble approaches, which combine the strengths of individual inference methods, are valuable. However, existing ensemble methods for GRN inference lack a sophisticated network aggregation method and generally rely solely on ranking approaches. These ensemble methods have no reliable mechanism for identifying the inference methods that perform best on a given problem. They therefore tend to aggregate weak methods, diminishing the overall accuracy of the approach. Thus, developing a reliable mechanism to identify the most effective methods for specific problems and prioritize them in consensus network building is important. This paper presents a novel ensemble approach for reconstructing GRNs by integrating previously developed diverse GRN inference approaches. A novel network aggregation method called GRAMP (Gene Ranking And Model Prioritisation) was developed, taking into consideration both local and global gene ranking and the performance of different inference approaches on a specific network. The proposed ensemble approach demonstrated performance superior to that of other state-of-the-art methods, as evidenced by results from simulated datasets and a real-world gene expression dataset. © 2024 The Author(s)
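The general idea of prioritising stronger methods when building a consensus network can be illustrated with a weighted Borda count. This is a generic rank-aggregation technique used here as a stand-in; it is not the GRAMP scoring formula, which the abstract does not specify, and it does not model the local versus global ranking distinction. The function name, the method labels, and the priority values are all hypothetical.

```python
def weighted_borda(rankings, priorities):
    """Aggregate ranked edge lists from several inference methods.

    rankings:   {method_name: [edge, ...]} with the most confident edge first
    priorities: {method_name: weight}, e.g. a per-problem performance score
    Returns (edge, score) pairs sorted from highest to lowest consensus score.
    """
    scores = {}
    for method, ranked_edges in rankings.items():
        n = len(ranked_edges)
        for position, edge in enumerate(ranked_edges):
            # The top-ranked edge earns n points and the last earns 1,
            # scaled by how much this method is trusted.
            points = priorities[method] * (n - position)
            scores[edge] = scores.get(edge, 0.0) + points
    return sorted(scores.items(), key=lambda item: (-item[1], item[0]))
```

If a "regression" method with priority 2.0 ranks edge B→C first while a "boolean" method with priority 1.0 ranks A→B first, B→C wins the consensus ranking, which shows how a higher-priority method can outvote a weaker one.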
MICFuzzy: a maximal information content based fuzzy approach for reconstructing genetic networks
- Gamage, Hasini, Chetty, Madhu, Lim, Suryani, Hallinan, Jennifer
- Authors: Gamage, Hasini , Chetty, Madhu , Lim, Suryani , Hallinan, Jennifer
- Date: 2023
- Type: Text , Journal article
- Relation: PLoS ONE Vol. 18, no. 7 (July 2023)
- Description: In systems biology, the accurate reconstruction of Gene Regulatory Networks (GRNs) is crucial since these networks can facilitate the solving of complex biological problems. Amongst the plethora of methods available for GRN reconstruction, information theory and fuzzy concepts-based methods have abiding popularity. However, most of these methods are not only complex, incurring a high computational burden, but may also produce a high number of false positives, leading to inaccurate inferred networks. In this paper, we propose a novel hybrid fuzzy GRN inference model called MICFuzzy, which involves the aggregation of the effects of the Maximal Information Coefficient (MIC). This model has an information theory-based pre-processing stage, the output of which is applied as an input to the novel fuzzy model. In this pre-processing stage, the MIC component filters relevant genes for each target gene to significantly reduce the computational burden of the fuzzy model when selecting the regulatory genes from these filtered gene lists. The novel fuzzy model uses the regulatory effect of the identified activator-repressor gene pairs to determine target gene expression levels. This approach facilitates accurate network inference by generating a high number of true regulatory interactions while significantly reducing false regulatory predictions. The performance of MICFuzzy was evaluated using DREAM3 and DREAM4 challenge data, and the SOS real gene expression dataset. MICFuzzy outperformed the other state-of-the-art methods in terms of F-score, Matthews Correlation Coefficient, Structural Accuracy, and SS_mean, and outperformed most of them in terms of efficiency. MICFuzzy also had improved efficiency compared with the classical fuzzy model since the design of MICFuzzy leads to a reduction in combinatorial computation. Copyright: © 2023 Nakulugamuwa Gamage et al.
This is an open access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.
Filter feature selection based Boolean modelling for genetic network inference
- Gamage, Hasini, Chetty, Madhu, Shatte, Adrian, Hallinan, Jennifer
- Authors: Gamage, Hasini , Chetty, Madhu , Shatte, Adrian , Hallinan, Jennifer
- Date: 2022
- Type: Text , Journal article
- Relation: BioSystems Vol. 221 (2022)
- Description: The reconstruction of Gene Regulatory Networks (GRNs) from time series gene expression data is highly relevant for the discovery of complex biological interactions and dynamics. Various computational strategies have been developed for this task, but most approaches have low computational efficiency and are not able to cope with high-dimensional, low-sample-size gene expression data. In this paper, we introduce a novel combined filter feature selection approach for efficient and accurate inference of GRNs. A Boolean framework for network modelling is used to demonstrate the efficacy of the proposed approach. Using discretized microarray expression data, the genes most relevant to each target gene are first filtered using ReliefF, an instance-based feature ranking method that is here applied for the first time to GRN inference. Then, further gene selection from the filtered gene list is done using a mutual information-based min-redundancy max-relevance criterion, which eliminates irrelevant genes. This combined method is executed on resampled datasets to finalize the optimal set of regulatory genes. Building upon our previous research, a Pearson correlation coefficient-based Boolean modelling approach is utilized for the efficient identification of the optimal regulatory rules associated with the selected regulatory genes. The proposed approach was evaluated using gene expression datasets from small-scale and medium-scale real gene networks. It was observed to be more effective than Linear Discriminant Analysis and performed better than the individual feature selection methods. It also obtained improved Structural Accuracy with a higher number of true positives than other state-of-the-art methods, while outperforming these methods with respect to Dynamic Accuracy and efficiency. © 2022 Elsevier B.V.
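The min-redundancy max-relevance step can be sketched as a greedy selection over discretized expression data. The version below is a textbook form of the criterion (relevance to the target minus mean redundancy with the genes already chosen) written in plain Python. It omits the ReliefF pre-filter and the resampling described in the abstract, and the function names and the parameter `k` are assumptions for illustration.

```python
from collections import Counter
from math import log2

def mutual_information(x, y):
    """Mutual information in bits between two equal-length discrete sequences."""
    n = len(x)
    px, py, pxy = Counter(x), Counter(y), Counter(zip(x, y))
    return sum(
        (c / n) * log2((c / n) / ((px[a] / n) * (py[b] / n)))
        for (a, b), c in pxy.items()
    )

def mrmr(candidates, target, k):
    """Greedily pick up to k genes maximizing relevance minus mean redundancy.

    candidates: {gene_name: discretized series}, target: discretized series.
    """
    selected = []
    remaining = dict(candidates)
    while remaining and len(selected) < k:
        def score(name):
            relevance = mutual_information(remaining[name], target)
            if not selected:
                return relevance
            redundancy = sum(
                mutual_information(remaining[name], candidates[s]) for s in selected
            ) / len(selected)
            return relevance - redundancy
        # Sorting the names makes tie-breaking deterministic.
        best = max(sorted(remaining), key=score)
        selected.append(best)
        del remaining[best]
    return selected
```

In a toy case where the target is the AND of genes `g1` and `g3`, and `g2` is an exact copy of `g1`, the selection with `k = 2` returns `g1` and `g3`: the copy is just as relevant as `g1` but fully redundant with it, so it is passed over.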