Data discretization for dynamic Bayesian network based modeling of genetic networks
- Authors: Nguyen, Vinh , Chetty, Madhu , Coppel, Ross , Wangikar, Pramod
- Date: 2012
- Type: Text , Conference paper
- Relation: Neural Information Processing 19th International Conference p. 298-306
- Full Text: false
- Reviewed:
- Description: Dynamic Bayesian networks (DBN) are widely applied in systems biology for modeling various biological networks, including gene regulatory networks and metabolic networks. The application of DBN models often requires data discretization. Although various discretization techniques exist, there is currently no consensus on which approach is most suitable. Popular discretization strategies within the bioinformatics community, such as interval and quantile discretization, are likely not optimal. In this paper, we propose a novel approach for data discretization for mutual information based learning of DBN. In this approach, the data are discretized so that the mutual information between parent and child nodes is maximized, subject to a suitable penalty on the complexity of the discretization. A dynamic programming approach is used to find the optimal discretization threshold for each individual variable. Our approach iteratively learns both the network and the discretization scheme until a locally optimal solution is reached. Tests on real genetic networks confirm the effectiveness of the proposed method.
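As a rough illustration of the two ingredients named in this abstract, the Python sketch below scores quantile discretizations at a few resolutions by the empirical mutual information between a parent series at time t and a child series at time t+1, minus a complexity penalty. The function names, the toy data and the form of the penalty are assumptions made for illustration; this is not the authors' dynamic programming method.

```python
import math
from collections import Counter

def quantile_discretize(values, levels):
    """Map each value to a bin 0..levels-1 holding roughly equal counts.
    Ties are broken by position, which is adequate for a sketch."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    bins = [0] * len(values)
    for rank, i in enumerate(order):
        bins[i] = rank * levels // len(values)
    return bins

def mutual_information(xs, ys):
    """Empirical mutual information (in nats) between two discrete sequences."""
    n = len(xs)
    px, py, pxy = Counter(xs), Counter(ys), Counter(zip(xs, ys))
    return sum(c / n * math.log(c * n / (px[x] * py[y]))
               for (x, y), c in pxy.items())

def penalized_score(parent, child, levels, penalty=0.05):
    """MI between parent at t and child at t+1, minus a hypothetical penalty
    that grows with the number of joint states."""
    p = quantile_discretize(parent, levels)[:-1]
    c = quantile_discretize(child, levels)[1:]
    return mutual_information(p, c) - penalty * (levels - 1) ** 2

# Toy time series: the child copies the parent with a one-step delay.
parent = [0.1, 0.9, 0.2, 0.8, 0.15, 0.95, 0.05, 0.85, 0.3, 0.7]
child = [0.0] + parent[:-1]
scores = {k: penalized_score(parent, child, k) for k in (2, 3, 4)}
best_levels = max(scores, key=scores.get)
```

Here `best_levels` is simply the resolution with the highest penalized score on the toy data; a real application would search over thresholds per variable rather than over a shared number of quantile bins.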
Rhythmic and sustained oscillations in metabolism and gene expression of Cyanothece sp. ATCC 51142 under constant light
- Authors: Gaudana, Sandeep , Krishnakumar, S. , Alagesan, Swathi , Digmurti, Madhuri , Viswanathan, Ganesh , Chetty, Madhu , Wangikar, Pramod
- Date: 2013
- Type: Text , Journal article
- Relation: Frontiers in Microbiology Vol. 4, Article 374 (2013), p. 1-11
- Full Text:
- Reviewed:
- Description: Cyanobacteria, a group of photosynthetic prokaryotes, oscillate between day and night time metabolisms with concomitant oscillations in gene expression in response to light/dark cycles (LD). The oscillations in gene expression have been shown to persist in constant light (LL) with a free running period of 24 h in the model cyanobacterium Synechococcus elongatus PCC 7942. However, equivalent oscillations in metabolism have not been reported under LL in this non-nitrogen-fixing cyanobacterium. Here we focus on Cyanothece sp. ATCC 51142, a unicellular, nitrogen-fixing cyanobacterium known to temporally separate the processes of oxygenic photosynthesis and oxygen-sensitive nitrogen fixation. In a recent report, metabolism of Cyanothece 51142 has been shown to oscillate between photosynthetic and respiratory phases under LL with free running periods that are temperature dependent but significantly shorter than the circadian period. Further, the oscillations shift to a circadian pattern at moderate cell densities that are concomitant with slower growth rates. Here we take this understanding forward and demonstrate that the ultradian rhythm under LL is sustained at much higher cell densities when cells are grown under turbulent regimes that simulate the flashing light effect. Our results suggest that the ultradian rhythm in metabolism may be needed to support the higher carbon and nitrogen requirements of rapidly growing cells under LL. With a comprehensive real-time PCR-based gene expression analysis, we account for key regulatory interactions and demonstrate the interplay between clock genes and the genes of key metabolic pathways. Further, we observe that several genes that peak at dusk in Synechococcus peak at dawn in Cyanothece and vice versa. The circadian rhythm of this organism appears to be more robust, with peaking of genes in anticipation of the ensuing photosynthetic and respiratory metabolic phases.
GlobalMIT: learning globally optimal dynamic Bayesian network with the mutual information test criterion
- Authors: Nguyen, Vinh , Chetty, Madhu , Coppel, Ross , Wangikar, Pramod
- Date: 2011
- Type: Text , Journal article
- Relation: Bioinformatics Vol. 27, no. 19 (2011), p. 2765-2766
- Full Text: false
- Reviewed:
Local and global algorithms for learning dynamic Bayesian networks
- Authors: Nguyen, Vinh , Chetty, Madhu , Coppel, Ross , Wangikar, Pramod
- Date: 2012
- Type: Text , Conference paper
- Relation: The 12th IEEE International Conference on Data Mining (ICDM 2012) p. 685-694
- Full Text: false
- Reviewed:
- Description: Learning optimal Bayesian networks (BN) from data is NP-hard in general. Nevertheless, certain BN classes with additional topological constraints, such as the dynamic BN (DBN) models, widely applied in specific fields such as systems biology, can be efficiently learned in polynomial time. Such algorithms have been developed for the Bayesian-Dirichlet (BD), Minimum Description Length (MDL), and Mutual Information Test (MIT) scoring metrics. The BD-based algorithm admits a large polynomial bound, hence it is impractical for even modestly sized networks. The MDL- and MIT-based algorithms admit much smaller bounds, but require a very restrictive assumption that all variables have the same cardinality, thus significantly limiting their applicability. In this paper, we first propose an improvement to the MDL- and MIT-based algorithms, dropping the equicardinality constraint, thus significantly enhancing their generality. We also explore local Markov blanket based algorithms for constructing BN in the context of DBN, and show an interesting result: under the faithfulness assumption, the mutual information test based local Markov blanket algorithms yield the same network as learned by the global optimization MIT-based algorithm. Experimental validation on small and large scale genetic networks demonstrates the effectiveness of our proposed approaches.
Gene regulatory network modeling via global optimization of high-order dynamic Bayesian network
- Authors: Nguyen, Vinh , Chetty, Madhu , Coppel, Ross , Wangikar, Pramod
- Date: 2012
- Type: Text , Journal article
- Relation: BMC Bioinformatics Vol. 13, no. 131 (2012), p. 1-16
- Full Text:
- Reviewed:
- Description: Background: Dynamic Bayesian network (DBN) is among the mainstream approaches for modeling various biological networks, including the gene regulatory network (GRN). Most current methods for learning DBN employ either local search such as hill-climbing, or a meta stochastic global optimization framework such as genetic algorithm or simulated annealing, which are only able to locate sub-optimal solutions. Further, current DBN applications have essentially been limited to small sized networks. Results: To overcome the above difficulties, we introduce here a deterministic global optimization based DBN approach for reverse engineering genetic networks from time course gene expression data. For such DBN models that consist only of inter time slice arcs, we show that there exists a polynomial time algorithm for learning the globally optimal network structure. The proposed approach, named GlobalMIT+, employs the recently proposed information theoretic scoring metric named mutual information test (MIT). GlobalMIT+ is able to learn high-order time delayed genetic interactions, which are common to most biological systems. Evaluation of the approach using both synthetic and real data sets, including a 733 cyanobacterial gene expression data set, shows significantly improved performance over other techniques. Conclusions: Our studies demonstrate that deterministic global optimization approaches can infer large scale genetic networks.
A model of the circadian clock in the cyanobacterium Cyanothece sp. ATCC 51142
- Authors: Nguyen, Vinh , Chetty, Madhu , Coppel, Ross , Gaudana, Sandeep , Wangikar, Pramod
- Date: 2013
- Type: Text , Journal article
- Relation: BMC Bioinformatics Vol. 14, Supplement 2 (2013), p. s14-1-s14-9
- Full Text:
- Reviewed:
- Description: Background: The overconsumption of fossil fuels has led to growing concerns over climate change and global warming. Increasing research activity has been directed towards alternative viable biofuel sources. Of several different biofuel platforms, cyanobacteria possess great potential because of their ability to accumulate biomass tens of times faster than traditional oilseed crops. The cyanobacterium Cyanothece sp. ATCC 51142 has recently attracted considerable research interest as a model organism for such research. Cyanothece can efficiently perform both photosynthesis and nitrogen fixation within the same cell, and has recently been shown to produce biohydrogen, a byproduct of nitrogen fixation, at rates several-fold higher than previously described hydrogen-producing photosynthetic microbes. Since the key enzyme for nitrogen fixation is very sensitive to oxygen produced by photosynthesis, Cyanothece employs a sophisticated temporal separation scheme, where nitrogen fixation occurs at night and photosynthesis during the day. At the core of this temporal separation scheme is a robust clocking mechanism, which so far has not been thoroughly studied. Understanding how this circadian clock interacts with and harmonizes global transcription of key cellular processes is one of the keys to realizing the inherent potential of this organism. Results: In this paper, we employ several state of the art bioinformatics techniques for studying the core circadian clock in Cyanothece sp. ATCC 51142, and its interactions with other key cellular processes. We employ comparative genomics techniques to map the circadian clock genes and genetic interactions from another cyanobacterial species, namely Synechococcus elongatus PCC 7942, whose circadian clock has been much more thoroughly investigated. Using time series gene expression data for Cyanothece, we employ gene regulatory network reconstruction techniques to learn this network de novo, and compare the reconstructed network against the interactions currently reported in the literature. Next, we build a computational model of the interactions between the core clock and other cellular processes, and show how this model can predict the behaviour of the system under changing environmental conditions. The constructed models significantly advance our understanding of the Cyanothece circadian clock functional mechanisms.
Diurnal rhythm of a unicellular diazotrophic cyanobacterium under mixotrophic conditions and elevated carbon dioxide
- Authors: Gaudana, Sandeep , Alagesan, Swathi , Chetty, Madhu , Wangikar, Pramod
- Date: 2013
- Type: Text , Journal article
- Relation: Photosynthesis Research Vol. 118, no. 1-2 (2013), p. 51-57
- Full Text: false
- Reviewed:
- Description: Mixotrophic cultivation of cyanobacteria in wastewaters with flue gas sparging has the potential to simultaneously sequester carbon from gaseous and aqueous streams and convert it to biomass and biofuels. Therefore, it was of interest to study the effect of mixotrophy and elevated CO2 on metabolism, morphology and rhythm of gene expression under diurnal cycles. We chose the diazotrophic unicellular cyanobacterium Cyanothece sp. ATCC 51142 as a model, which is a known hydrogen producer with a robust circadian rhythm. Cyanothece 51142 grows faster with nitrate and/or an additional carbon source in the growth medium and at 3 % CO2. Intracellular glycogen contents undergo diurnal oscillations with greater accumulation under mixotrophy. While glycogen is exhausted by midnight under autotrophic conditions, significant amounts remain unutilized under mixotrophy, accompanied by a prolonged upregulation of the nifH gene. This possibly supports nitrogen fixation for longer periods, thereby leading to better growth. To gain insights into the influence of mixotrophy and elevated CO2 on circadian rhythm, transcription of the core clock genes kaiA, kaiB1 and kaiC1, the input pathway gene cikA, the output pathway gene rpaA and representatives of key metabolic pathways was analyzed. Clock genes’ transcripts were lower under mixotrophy, suggesting a dampening effect exerted by an external carbon source such as glycerol. Nevertheless, the genes of the clock and important metabolic pathways show diurnal oscillations in expression under mixotrophic and autotrophic growth at ambient and elevated CO2, respectively. Taken together, the results indicate segregation of light and dark associated reactions even under mixotrophy and provide important insights for further applications.
Improving gene regulatory network inference using network topology information
- Authors: Nair, Ajay , Chetty, Madhu , Wangikar, Pramod
- Date: 2015
- Type: Text , Journal article
- Relation: Molecular BioSystems Vol. 11, no. 9 (2015), p. 2449-2463
- Full Text: false
- Reviewed:
- Description: Inferring the gene regulatory network (GRN) structure from data is an important problem in computational biology. However, it is a computationally complex problem and approximate methods such as heuristic search techniques, restriction of the maximum-number-of-parents (maxP) for a gene, or an optimal search under special conditions are required. The limitations of a heuristic search are well known but literature on the detailed analysis of the widely used maxP technique is lacking. The optimal search methods require large computational time. We report the theoretical analysis and experimental results of the strengths and limitations of the maxP technique. Further, using an optimal search method, we combine the strengths of the maxP technique and the known GRN topology to propose two novel algorithms. These algorithms are implemented in a Bayesian network framework and tested on biological, realistic, and in silico networks of different sizes and topologies. They overcome the limitations of the maxP technique and show superior computational speed when compared to the current optimal search algorithms.
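To give a feel for why the maxP restriction mentioned in this abstract reduces the search so sharply, the Python sketch below counts the candidate parent sets for a single gene with and without the cap. The function name, the 50-gene network size and the choice of maxP = 3 are assumptions made for illustration; the count assumes parents are drawn from the other genes only.

```python
from math import comb

def candidate_parent_sets(n_genes, max_parents):
    """Number of parent sets of size <= max_parents drawn from the other genes."""
    others = n_genes - 1
    return sum(comb(others, k) for k in range(min(max_parents, others) + 1))

capped = candidate_parent_sets(50, 3)     # maxP = 3
uncapped = candidate_parent_sets(50, 49)  # no restriction: every subset of 49 genes
```

With these hypothetical numbers, the cap leaves 19650 candidate sets per gene against 2**49 without it, which is the trade-off between tractability and the risk of excluding true high in-degree regulators.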
Significance of non-edge priors in gene regulatory network reconstruction
- Authors: Nair, Ajay , Chetty, Madhu , Wangikar, Pramod
- Date: 2014
- Type: Text , Conference paper
- Relation: 21st International Conference, ICONIP 2014 Kuching, Malaysia, November 3–6, 2014; published in Neural Information Processing (Lecture Notes in Computer Science) Vol. 8834, p. 446-453
- Full Text: false
- Reviewed:
- Description: It is well known that incorporating prior knowledge improves gene regulatory network reconstruction from data. Two types of prior knowledge can be given for gene regulatory network inference: known interactions (edge priors) and known absence of interactions (non-edge priors). However, previous studies have focused mainly on edge priors. This paper shows that edge priors give only limited improvement. Moreover, non-edge priors are crucial for better overall performance, and their effect dominates that of edge priors at larger data samples. The studies are carried out on two real networks and a computationally tractable synthetic network, using a Bayesian network framework. Further, a method to obtain large numbers of non-edge priors for real gene regulatory networks is presented.
Dynamic Bayesian network modeling of cyanobacterial biological processes via gene clustering
- Authors: Nguyen, Vinh , Chetty, Madhu , Coppel, Ross , Wangikar, Pramod
- Date: 2011
- Type: Text , Conference paper
- Relation: 18th International Conference on Neural Information Processing, ICONIP 2011; Shanghai; China; 13th-17th November 2011; published in (Lecture Notes in Computer Science series) Vol. 7062 (1), p. 97-106
- Full Text: false
- Reviewed:
- Description: Cyanobacteria are photosynthetic organisms that are credited with both the creation and replenishment of the oxygen-rich atmosphere, and are also responsible for more than half of the primary production on earth. Despite their crucial evolutionary and environmental roles, the study of these organisms has lagged behind that of other model organisms. This paper presents preliminary results from our ongoing research to unravel the biological interactions occurring within cyanobacteria. We develop an analysis framework that leverages recently developed bioinformatics and machine learning tools, such as genome-wide sequence matching based annotation, gene ontology analysis, cluster analysis and dynamic Bayesian networks. Together, these tools allow us to overcome the lack of knowledge of less well-studied organisms, and reveal interesting relationships among their biological processes. Experiments on the Cyanothece bacterium demonstrate the practicability and usefulness of our approach.
Frequency decomposition based gene clustering
- Authors: Rahman, Md Abdur , Chetty, Madhu , Bulach, Dieter , Wangikar, Pramod
- Date: 2015
- Type: Text , Conference paper
- Relation: 22nd International Conference on Neural Information Processing, ICONIP 2015; Istanbul, Turkey; 9th-12th November 2015; Vol. 9490, p. 170-181
- Full Text: false
- Reviewed:
- Description: Gene expression data have been commonly used to understand the underlying mechanisms of known biological processes. Although microarray gene expression profiles usually appear aperiodic, their periodic components can be readily obtained with proper signal processing techniques. Thus, if the expression profiles of interconnected (regulatory and regulated) genes are decomposed, at least one common frequency component will appear in these genes. Exploiting this novel concept, we propose a frequency decomposition approach for gene clustering to better understand the gene interconnection topology. This method, based on the Hilbert-Huang transform (HHT), enables us to segregate every periodic component of the gene expression profiles. Next, a multilevel clustering is performed based on these frequency components. Unlike existing clustering algorithms, the proposed method incorporates meaningful knowledge of the gene interaction topology. The information related to underlying gene interactions is vital and can prove useful in many existing evolutionary optimisation algorithms for genetic network reconstruction. We validate the entire approach by applying it to a 15-gene synthetic network.
Coupling of cellular processes and their coordinated oscillations under continuous light in Cyanothece sp. ATCC 51142, a diazotrophic unicellular cyanobacterium
- Authors: Krishnakumar, Sujatha , Gaudana, Sandeep , Nguyen, Vinh , Viswanathan, Ganesh , Chetty, Madhu , Wangikar, Pramod
- Date: 2015
- Type: Text , Journal article
- Relation: PLoS ONE Vol. 10, no. 5 (2015), p. 1-23
- Full Text:
- Reviewed:
- Description: Unicellular diazotrophic cyanobacteria such as Cyanothece sp. ATCC 51142 (henceforth Cyanothece) temporally separate the oxygen-sensitive nitrogen fixation from oxygen-evolving photosynthesis not only under diurnal cycles (LD) but also in continuous light (LL). However, recent reports demonstrate that the oscillations in LL occur with a shorter cycle time of ∼11 h. We find that, indeed, the majority of the genes oscillate in LL with this cycle time. Genes that are upregulated at a particular time of day under the diurnal cycle are also upregulated at an equivalent metabolic phase under LL, suggesting tight coupling of various cellular events with each other and with the cell's metabolic status. A number of metabolic processes are upregulated in a coordinated fashion during the respiratory phase under LL, including glycogen degradation, glycolysis, the oxidative pentose phosphate pathway, and the tricarboxylic acid cycle. These precede nitrogen fixation, apparently to ensure the sufficient energy and anoxic environment needed for the nitrogenase enzyme. The photosynthetic phase sees upregulation of photosystem II, carbonate transport, the carbon concentrating mechanism, RuBisCO, glycogen synthesis and light harvesting antenna pigment biosynthesis. In Synechococcus elongatus PCC 7942, a non-nitrogen-fixing cyanobacterium, expression of a relatively smaller fraction of genes oscillates under LL, with the major periodicity being 24 h. In contrast, the entire cellular machinery of Cyanothece orchestrates coordinated oscillation in anticipation of the ensuing metabolic phase in both LD and LL. These results may have important implications for understanding the timing of various cellular events and for engineering cyanobacteria for biofuel production.
Issues impacting genetic network reverse engineering algorithm validation using small networks
- Authors: Nguyen, Vinh , Chetty, Madhu , Coppel, Ross , Wangikar, Pramod
- Date: 2012
- Type: Text , Journal article
- Relation: BBA - Proteins and Proteomics Vol. 1824, no. 12 (2012), p. 1434-1441
- Full Text: false
- Reviewed:
- Description: Genetic network reverse engineering has been an area of intensive research within the systems biology community during the last decade. With many techniques currently available, validating them and choosing the best one for a given problem is a complex task. Current practice has been to validate an approach on in-silico synthetic data sets and, wherever possible, on real data sets with known ground truth. In this study, we highlight a major issue: validating reverse engineering algorithms on small benchmark networks very often yields networks that are not statistically better than a randomly picked network. Another important issue highlighted is that, with short time series, a small variation in the pre-processing procedure might yield large differences in the inferred networks. To demonstrate these issues, we have selected as our case study the IRMA in-vivo synthetic yeast network recently published in Cell. Using Fisher's exact test, we show that many results reported in the literature on reverse-engineering this network are not significantly better than random. The discussion is further extended to some other networks commonly used for validation purposes in the literature. The results presented in this study emphasize that studies carried out using small genetic networks are likely to be trivial, making it imperative that larger real networks be used for validating and benchmarking purposes. If smaller networks are considered, then the results should be interpreted carefully to avoid overconfidence. This article is part of a Special Issue entitled: Computational Methods for Protein Interaction and Structural Prediction.
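The kind of significance check this abstract describes can be sketched as a one-sided Fisher's exact test on the overlap between predicted and true edges, computed here directly from the hypergeometric tail. The function name and all counts below are hypothetical and chosen only for illustration; they are not the IRMA results reported in the paper.

```python
from math import comb

def fisher_one_sided(true_pos, predicted, true_edges, possible_edges):
    """P(overlap >= true_pos) when `predicted` edges are drawn at random
    from `possible_edges` candidates, of which `true_edges` are real."""
    total = comb(possible_edges, predicted)
    upper = min(predicted, true_edges)
    tail = sum(
        comb(true_edges, k) * comb(possible_edges - true_edges, predicted - k)
        for k in range(true_pos, upper + 1)
    )
    return tail / total

# Hypothetical 5-gene network: 20 possible directed edges (no self-loops),
# 8 true edges, and an algorithm that predicts 6 edges with 4 correct.
p_value = fisher_one_sided(true_pos=4, predicted=6, true_edges=8, possible_edges=20)
```

For these made-up counts the p-value is 5320/38760, about 0.14, so recovering four of eight true edges would not be significantly better than random at the 0.05 level. This illustrates how little statistical power a very small benchmark network offers.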
Influence of mixotrophic growth on rhythmic oscillations in expression of metabolic pathways in diazotrophic cyanobacterium Cyanothece sp. ATCC 51142
- Authors: Krishnakumar, Sujatha , Gaudana, Sandeep , Digmurti, Madhuri , Viswanathan, Ganesh , Chetty, Madhu , Wangikar, Pramod
- Date: 2015
- Type: Text , Journal article
- Relation: Bioresource Technology Vol. 188 (2015), p. 145-152
- Full Text: false
- Reviewed:
- Description: This study investigates the influence of mixotrophy on physiology and metabolism by analysis of global gene expression in the unicellular diazotrophic cyanobacterium Cyanothece sp. ATCC 51142 (henceforth Cyanothece 51142). It was found that Cyanothece 51142 continues to oscillate between photosynthesis and respiration in continuous light under mixotrophy with a cycle time of ∼13 h. Mixotrophy is marked by an extended respiratory phase compared with photoautotrophy. It can be argued that glycerol provides supplementary energy for nitrogen fixation, which is derived primarily from the glycogen reserves during photoautotrophy. The genes of the NDH complex, cytochrome c oxidase and ATP synthase are significantly overexpressed in mixotrophy during the day compared to autotrophy, with synchronous expression of the bidirectional hydrogenase genes, possibly to maintain redox balance. However, the nitrogenase complex remains exclusive to nighttime metabolism, concomitantly with the uptake hydrogenase. This study sheds light on interrelations between metabolic pathways, with implications for the design of hydrogen producer strains.