Predicting the secondary structure of a protein from its primary sequence alone has been approached from both classification and generative modeling perspectives. The most prominent classification methods use neural networks, which map a local window of residues in the sequence to the structural state of the central residue in the window and thus capture local interactions effectively. However, these methods fail to capture interactions among distant residues. Generative models based on Bayesian segmentation capture sequence-structure relationships using generalized hidden Markov models with explicit state duration. They capture non-local interactions through a joint sequence-structure probability distribution defined over structural segments. In this paper, we investigate a combined architecture, with Bayesian segmentation at the first stage and a neural network at the second stage, that captures both local and non-local correlations in order to increase single-sequence prediction accuracy. The combined architecture is further enhanced using neural network optimization and ensemble techniques.
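
As an illustration of the window-based mapping used by the classification methods, the following minimal sketch builds, for each residue, the input vector that a neural network would map to the structural state of the central residue. It is not the encoding used in this work: the function names, the padding symbol, the half-width of 2, and the one-hot encoding over the 20 standard amino acids are all assumptions made for the example.

```python
# Hypothetical sliding-window encoding for a window-based classifier.
# All names and parameter values below are illustrative assumptions.

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # the 20 standard residues
PAD = "-"                             # assumed symbol for positions past the termini
ALPHABET = AMINO_ACIDS + PAD


def one_hot(residue):
    """Return a one-hot vector for a residue (raises ValueError if unknown)."""
    vec = [0] * len(ALPHABET)
    vec[ALPHABET.index(residue)] = 1
    return vec


def window_features(sequence, half_width=2):
    """Return one feature row per residue, built from its local window.

    Each row concatenates the one-hot vectors of the 2 * half_width + 1
    residues centred on that position; the ends are padded with PAD.
    """
    padded = PAD * half_width + sequence + PAD * half_width
    features = []
    for i in range(len(sequence)):
        window = padded[i : i + 2 * half_width + 1]
        row = []
        for residue in window:
            row.extend(one_hot(residue))
        features.append(row)
    return features


features = window_features("MKVLA", half_width=2)
print(len(features), len(features[0]))  # prints "5 105"
```

Each row depends only on residues inside the window, which is why such a classifier on its own cannot see interactions between residues farther apart than the window width.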