Depth sequence coding with hierarchical partitioning and spatial-domain quantization
- Authors: Shahriyar, Shampa , Murshed, Manzur , Ali, Mortuza , Paul, Manoranjan
- Date: 2020
- Type: Text , Journal article
- Relation: IEEE Transactions on Circuits and Systems for Video Technology Vol. 30, no. 3 (2020), p. 835-849
- Full Text:
- Reviewed:
- Description: Depth coding in 3D-HEVC deforms object shapes due to block-level edge approximation and lacks efficient techniques for exploiting the statistical redundancy caused by the frame-level clustering tendency in depth data to achieve higher coding gain at near-lossless quality. This paper presents a standalone mono-view depth sequence coder, which preserves edges implicitly by limiting quantization to the spatial domain and exploits the frame-level clustering tendency efficiently with a novel binary tree-based decomposition (BTBD) technique. The BTBD can exploit the statistical redundancy in frame-level syntax, motion components, and residuals efficiently with fewer block-level prediction/coding modes and simpler context modeling for context-adaptive arithmetic coding. Compared with the depth coder in 3D-HEVC, the proposed coder achieves significantly lower bitrate in the lossless to near-lossless quality range for mono-view coding. It also renders synthetic views of superior quality from the corresponding texture frames and depth maps compressed at the same bitrate.
A novel depth edge prioritization based coding technique to boost-UP HEVC performance
- Authors: Podder, Pallab , Paul, Manoranjan , Murshed, Manzur
- Date: 2016
- Type: Text , Conference paper
- Relation: 2016 IEEE International Conference on Multimedia & Expo Workshops (ICMEW)
- Full Text: false
- Reviewed:
- Description: In addition to texture, multiview video uses depth coding to reconstruct 3D video and free viewpoint video. Relying on texture-depth correlations, a number of methods in the literature reuse the texture motion vectors for the corresponding depth coding, reducing encoding time by avoiding the costly motion estimation process. However, the texture similarity metric is not always equivalent to the corresponding depth similarity metric, especially at edges. Since these approaches cannot explicitly detect and encode acute edge motions of depth objects, they cannot match or improve on the rate-distortion (RD) performance of the High Efficiency Video Coding (HEVC) reference test model (HM). For more accurate motion detection and modeling, the proposed technique introduces an extra Pattern Mode comprising a group of pattern templates (GPTs) whose rectangular and non-rectangular object shapes and edges differ from the existing HEVC block partitioning modes. Moreover, the proposed Pattern Mode encodes only the motion areas and skips the background areas. The experimental results show that the proposed technique saves 30% of encoding time and improves the Bjøntegaard Delta peak signal-to-noise ratio (BD-PSNR) by 0.1 dB on average compared to the HM.
Foreground motion and spatial saliency-based efficient HEVC Video Coding
- Authors: Podder, Pallab , Paul, Manoranjan , Murshed, Manzur
- Date: 2015
- Type: Text , Conference paper
- Relation: 2015 International Conference on Image and Vision Computing New Zealand (IVCNZ)
- Full Text: false
- Reviewed:
- Description: High Efficiency Video Coding (HEVC) cannot provide real-time operation on electronic devices with limited processing and battery power, as its encoding time complexity is several times that of its predecessor. Numerous researchers have addressed this limitation by reducing the number of motion estimation (ME) modes, based on analysis of homogeneity, residuals, and statistical correlation among different modes. Although these approaches save some encoding time, they cannot match the rate-distortion (RD) performance of the HEVC encoder, as they depend solely on the existing Lagrangian cost function (LCF) within the HEVC framework. To overcome this limitation, in this paper we capture the visually attentive foreground motion and salient region (FMSR), to which the human visual system is sensitive in quality assessment. The FMSR features, captured by visual attention and dynamic background modeling, are adaptively synthesized to determine a subset of candidate modes. This preprocessing phase is independent of the LCF. Since the proposed technique avoids exhaustive exploration of all modes by using simple criteria, it reduces encoding time by 27% on average. With efficient selection of appropriate FMSR-based block partitioning modes, it also improves the peak signal-to-noise ratio (PSNR) by up to 1.0 dB.
Efficient pattern index coding using syndrome coding and side information
- Authors: Paul, Manoranjan , Murshed, Manzur
- Date: 2012
- Type: Text , Journal article
- Relation: International Journal of Engineering and Industries Vol. 3, no. 3 (2012), p. 1-12
- Full Text: false
- Reviewed:
- Description: Pattern-based video coding focusing on moving regions has already established its superiority over the H.264 at very low bit rate. Up to a certain limit, the larger the number of pattern templates, the better the approximation to the moving regions. However, beyond that limit no coding gain is observed due to the need of an excessive number of pattern identification bits. Recently, distributed video coding schemes have used syndrome coding to predict the original information in the decoder using side information. In this paper a pattern identification scheme is proposed which predicts the pattern from the syndrome codes and side information in the decoder so that the actual pattern identification code is not needed. The experimental results confirm that the new scheme improves the rate-distortion performance compared to the existing pattern-based video coding and compared with the H.264 standard. The proposed new scheme will also present opportunities for further syndrome coding application.
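As a rough illustration of the general syndrome coding idea mentioned in the description above (not the scheme proposed in the paper), the following Python sketch assumes a Hamming(7,4) code, a hypothetical 7-bit word standing in for a pattern identification code, and side information at the decoder that differs from that word in at most one bit. Under those assumptions the encoder sends only the 3-bit syndrome, and the decoder recovers the full word from the syndrome and its side information. All names and values here are illustrative.

```python
# Parity-check matrix of the Hamming(7,4) code.
# Column j (1-based) is the 3-bit binary representation of j.
H = [
    [0, 0, 0, 1, 1, 1, 1],
    [0, 1, 1, 0, 0, 1, 1],
    [1, 0, 1, 0, 1, 0, 1],
]


def syndrome(word):
    """Return the 3-bit syndrome H * word (mod 2) of a 7-bit word."""
    return tuple(sum(h * b for h, b in zip(row, word)) % 2 for row in H)


def decode_with_side_info(sent_syndrome, side_info):
    """Recover the source word from its syndrome and a correlated guess.

    Assumption: side_info differs from the source word in at most one bit.
    """
    # XOR of the two syndromes equals the syndrome of the error pattern.
    diff = [a ^ b for a, b in zip(sent_syndrome, syndrome(side_info))]
    # Read the 3 bits as a number: 0 means no error, otherwise the
    # 1-based position of the single differing bit.
    pos = diff[0] * 4 + diff[1] * 2 + diff[2]
    word = list(side_info)
    if pos:
        word[pos - 1] ^= 1
    return word


source = [1, 0, 1, 1, 0, 0, 1]     # hypothetical word known only to the encoder
side_info = [1, 0, 1, 0, 0, 0, 1]  # decoder's prediction, one bit off
sent = syndrome(source)            # only these 3 bits are transmitted
recovered = decode_with_side_info(sent, side_info)
```

In this toy setting 3 bits are sent instead of 7. The saving depends entirely on how well the side information predicts the source word, which is the assumption stated above.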
Video coding focusing on block partitioning and occlusion
- Authors: Paul, Manoranjan , Murshed, Manzur
- Date: 2010
- Type: Text , Journal article
- Relation: IEEE Transactions on Image Processing Vol. 19, no. 3 (2010), p. 691-701
- Full Text: false
- Reviewed:
- Description: Among the existing block partitioning schemes, the pattern-based video coding (PVC) has already established its superiority at low bit-rate. Its innovative segmentation process with regular-shaped pattern templates is very fast as it avoids handling the exact shape of the moving objects. It also judiciously encodes the pattern-uncovered background segments, capturing a high level of interblock temporal redundancy without any motion compensation, which is favoured by the rate-distortion optimizer at low bit-rates. The existing PVC technique, however, uses a number of content-sensitive thresholds, and setting them to any predefined values risks ignoring some of the macroblocks that would otherwise be encoded with patterns. Furthermore, occluded background can potentially degrade the performance of this technique. In this paper, a robust PVC scheme is proposed by removing all the content-sensitive thresholds, introducing a new similarity metric, considering multiple top-ranked patterns by the rate-distortion optimizer, and refining the Lagrangian multiplier of the H.264 standard for efficient embedding. A novel pattern-based residual encoding approach is also integrated to address the occlusion issue. Once embedded into the H.264 Baseline profile, the proposed PVC scheme yields a perceptually significant image quality improvement of at least 0.5 dB in low bit-rate video coding applications. A similar trend is observed for moderate to high bit-rate applications when the proposed scheme replaces the bi-directional predictive mode in the H.264 High profile.
A novel pattern identification scheme using distributed video coding concepts
- Authors: Paul, Manoranjan , Murshed, Manzur
- Date: 2009
- Type: Text , Conference paper
- Relation: 2009 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP 2009) p. 729-732
- Full Text: false
- Reviewed:
- Description: Pattern-based video coding focusing on the moving region in a macroblock has already established its superiority over the recent H.264 video coding standard at very low bit rate. A larger number of pattern templates approximates the moving regions better. However, beyond a certain limit no coding gain is observed due to the increased number of pattern identification bits. Recently, distributed video coding schemes have used syndrome coding to predict the original information in the decoder using side information. In this paper a novel pattern identification scheme is proposed which predicts the pattern from the syndrome codes and side information in the decoder, so that the actual pattern identification number is not needed in the bitstream. The experimental results confirm that this new scheme improves the rate-distortion performance compared to the existing pattern-based video coding as well as the H.264 standard. This new scheme also opens up a further application of syndrome coding.
A hybrid object detection technique from dynamic background using Gaussian mixture models
- Authors: Haque, Mohammad , Murshed, Manzur , Paul, Manoranjan
- Date: 2008
- Type: Text , Conference paper
- Relation: 2008 IEEE 10th Workshop on Multimedia Signal Processing p. 915-920
- Full Text: false
- Reviewed:
- Description: Object detection techniques based on adaptive background modelling are widely used in machine vision applications to handle the challenges of real-world multimodal backgrounds. However, they are constrained to specific environments because they rely on environment-specific parameters, and their performance also fluctuates across different operating speeds. On the other hand, basic background subtraction (BBS) is not suitable for real applications because it requires manual background initialization and cannot handle repetitive multimodal backgrounds. However, it shows better stability across different operating speeds and can better eliminate noise, shadow, and trailing effect than adaptive techniques, as no model adaptability or environment-related parameters are involved. In this paper, we propose a hybrid object detection technique that incorporates the strengths of both approaches. In our technique, Gaussian mixture models (GMM) are used to maintain an adaptive background model, and both probabilistic and basic subtraction decisions are used to calculate inexpensive neighbourhood statistics that guide the final object detection decision. Experimental results with two benchmark datasets and a comparative analysis with a recent adaptive object detection technique show the strength of the proposed technique in eliminating noise, shadow, and trailing effect while maintaining better stability across variable operating speeds.
Improved Gaussian mixtures for robust object detection by adaptive multi-background generation
- Authors: Haque, Mohammad , Murshed, Manzur , Paul, Manoranjan
- Date: 2008
- Type: Text , Conference paper
- Relation: 19th International Conference on Pattern Recognition p. 1-4
- Full Text: false
- Reviewed:
- Description: Adaptive Gaussian mixtures are widely used to model the dynamic background for real-time object detection. Recently, the convergence speed of this approach was improved and a relatively robust statistical framework was proposed by Lee (PAMI, 2005). However, object quality still remains unacceptable due to poor Gaussian mixture quality, susceptibility to background/foreground data proportion, and inability to handle intrinsic background motion. This paper proposes an effective technique to eliminate these drawbacks by modifying the new model induction logic and using intensity difference thresholding to detect objects from one or more believe-to-be backgrounds. Experimental results on two benchmark datasets confirm that the object quality of the proposed technique is superior to that of Lee's technique at any model learning rate.
On stable dynamic background generation technique using Gaussian mixture models for robust object detection
- Authors: Haque, Mohammad , Murshed, Manzur , Paul, Manoranjan
- Date: 2008
- Type: Text , Conference paper
- Relation: 2008 IEEE Fifth International Conference on Advanced Video and Signal Based Surveillance p. 41-48
- Full Text: false
- Reviewed:
- Description: Gaussian mixture models (GMM) are used to represent the dynamic background in a surveillance video in order to detect moving objects automatically. All the existing GMM-based techniques inherently use the proportion by which a pixel is going to observe the background in any operating environment. In this paper we first show that such a proportion not only varies widely across different scenarios but also forbids using a very fast learning rate. We then propose a dynamic background generation technique, used in conjunction with basic background subtraction, which detects moving objects with improved stability and superior detection quality over a wide range of operating environments in two sets of benchmark surveillance sequences.
Optimal arbitrary shaped pattern-based video coding
- Authors: Paul, Manoranjan , Murshed, Manzur
- Date: 2008
- Type: Text , Conference paper
- Relation: 2008 IEEE 10th Workshop on Multimedia Signal Processing p. 206-211
- Full Text: false
- Reviewed:
- Description: Very low bit-rate video coding algorithms that use content-based generated patterns to segment out moving regions at macroblock level have exhibited good potential for improved coding efficiency when embedded into the H.264 standard as an extra mode. This content-based pattern generation (CPG) algorithm provides a locally optimal result, as only one pattern can be optimally generated from a given set of moving regions. However, it fails to provide optimal results for multiple patterns from entire sets. A globally optimal solution that clusters the set and then generates multiple patterns would enhance the performance further, but such a solution is not achievable due to the non-polynomial nature of the clustering problem. In this paper, we propose a near-optimal content-based pattern generation (OCPG) algorithm which outperforms the existing approach. OCPG generates a set of patterns after clustering the macroblocks into several disjoint sets. Coupled with a direct pattern selection algorithm that allows all the macroblocks in multiple pattern modes, it outperforms the existing pattern-based coding when both are embedded into the H.264.