A coarse representation of frames oriented video coding by leveraging cuboidal partitioning of image data
- Authors: Ahmmed, Ashek , Paul, Manoranjan , Murshed, Manzur , Taubman, David
- Date: 2020
- Type: Text , Conference paper
- Relation: 22nd IEEE International Workshop on Multimedia Signal Processing, MMSP 2020, Virtual Tampere, Finland 21-24 September 2020
- Full Text:
- Reviewed:
- Description: Video coding algorithms attempt to minimize the significant commonality that exists within a video sequence. Each new video coding standard contains tools that can perform this task more efficiently than its predecessors. In this work, we form a coarse representation of the current frame by minimizing commonality within that frame while preserving important structural properties of the frame. The building blocks of this coarse representation are rectangular regions called cuboids, which are computationally simple and have a compact description. We then propose to employ the coarse frame as an additional source for predictive coding of the current frame. Experimental results show an improvement in bit rate savings over a reference codec for HEVC, with a minor increase in the codec's computational complexity. © 2020 IEEE.
A commonality modeling framework for enhanced video coding leveraging on the cuboidal partitioning based representation of frames
- Authors: Ahmmed, Ashek , Murshed, Manzur , Paul, Manoranjan , Taubman, David
- Date: 2022
- Type: Text , Journal article
- Relation: IEEE Transactions on Multimedia Vol. 24, no. (2022), p. 4446-4457
- Full Text: false
- Reviewed:
- Description: Video coding algorithms attempt to minimize the significant commonality that exists within a video sequence. Each new video coding standard contains tools that can perform this task more efficiently than its predecessors. Modern video coding systems are block-based, wherein commonality modeling is carried out only from the perspective of the block that needs to be coded next. In this work, we argue for a commonality modeling approach that can provide a seamless blending between global and local homogeneity information. For this purpose, the frame that needs to be coded is first recursively partitioned into rectangular regions based on the homogeneity information of the entire frame. The feature descriptor of each resulting rectangular region is then taken to be the average intensity of all the pixels within the region. In this way, the proposed approach generates a coarse representation of the current frame by minimizing both global and local commonality. This coarse frame is computationally simple and has a compact representation. It attempts to preserve important structural properties of the current frame, which can be seen subjectively as well as from the improved rate-distortion performance of a reference scalable HEVC coder that employs the coarse frame as a reference frame for encoding the current frame. © 1999-2012 IEEE.
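The partition-then-average idea in the description above can be sketched in a few lines. This is only an illustrative sketch, not the authors' algorithm: the abstract does not state the split criterion or the stopping rule, so the sketch assumes a greedy axis-aligned cut that minimizes the summed squared error of the two halves, and a hypothetical `max_sse` threshold that stops the recursion. The function names are likewise hypothetical.

```python
def region_stats(img, r0, r1, c0, c1):
    """Return (mean, sum of squared errors) for rows r0..r1-1, cols c0..c1-1."""
    vals = [img[r][c] for r in range(r0, r1) for c in range(c0, c1)]
    mean = sum(vals) / len(vals)
    sse = sum((v - mean) ** 2 for v in vals)
    return mean, sse


def best_split(img, r0, r1, c0, c1):
    """Assumed criterion: the axis-aligned cut with the lowest combined SSE."""
    best = None  # (combined_sse, axis, position)
    for r in range(r0 + 1, r1):
        sse = (region_stats(img, r0, r, c0, c1)[1]
               + region_stats(img, r, r1, c0, c1)[1])
        if best is None or sse < best[0]:
            best = (sse, "row", r)
    for c in range(c0 + 1, c1):
        sse = (region_stats(img, r0, r1, c0, c)[1]
               + region_stats(img, r0, r1, c, c1)[1])
        if best is None or sse < best[0]:
            best = (sse, "col", c)
    return best


def coarse_frame(img, max_sse=0.0):
    """Split the frame into rectangles and fill each with its mean intensity.

    max_sse is a hypothetical homogeneity threshold: a region is split
    further only while its squared error exceeds this value.
    """
    rows, cols = len(img), len(img[0])
    out = [[0.0] * cols for _ in range(rows)]
    cuboids = []  # (r0, r1, c0, c1, mean) per rectangular region
    stack = [(0, rows, 0, cols)]
    while stack:
        r0, r1, c0, c1 = stack.pop()
        mean, sse = region_stats(img, r0, r1, c0, c1)
        split = best_split(img, r0, r1, c0, c1) if sse > max_sse else None
        if split is None:
            # Homogeneous enough: the region's descriptor is its mean.
            cuboids.append((r0, r1, c0, c1, mean))
            for r in range(r0, r1):
                for c in range(c0, c1):
                    out[r][c] = mean
        elif split[1] == "row":
            stack.append((r0, split[2], c0, c1))
            stack.append((split[2], r1, c0, c1))
        else:
            stack.append((r0, r1, c0, split[2]))
            stack.append((r0, r1, split[2], c1))
    return out, cuboids
```

Each region is described by four coordinates and one mean value, which is what makes such a representation compact; a codec could then treat `out` as an extra reference frame, as the description proposes.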
A novel pattern identification scheme using distributed video coding concepts
- Authors: Paul, Manoranjan , Murshed, Manzur
- Date: 2009
- Type: Text , Conference paper
- Relation: 2009 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP 2009) p. 729-732
- Full Text: false
- Reviewed:
- Description: Pattern-based video coding, which focuses on the moving region in a macroblock, has already established its superiority over the recent H.264 video coding standard at very low bit rates. A larger number of pattern templates approximates the moving regions better; however, beyond a certain limit no coding gain is observed because of the increased number of pattern identification bits. Recently, distributed video coding schemes have used syndrome coding to predict the original information at the decoder using side information. In this paper, a novel pattern identification scheme is proposed which predicts the pattern from the syndrome codes and side information at the decoder, so that the actual pattern identification number is not needed in the bitstream. The experimental results confirm that this new scheme improves the rate-distortion performance compared to both the existing pattern-based video coding and the H.264 standard. This new scheme also opens another window for syndrome coding applications.
Disparity-adjusted 3D multi-view video coding with dynamic background modelling
- Authors: Paul, Manoranjan , Evans, Christopher , Murshed, Manzur
- Date: 2013
- Type: Text , Conference paper
- Relation: Proceedings of IEEE International Conference on Image Processing (ICIP 2013). 15th-18th Sept, Melbourne, Vic. p.1719-1723
- Full Text: false
- Reviewed:
- Description: Capturing a scene using multiple cameras from different angles is expected to provide the necessary interactivity in the 3D space to satisfy end-users' demands for observing objects and actions from different angles and depths. Existing multiview video coding (MVC) technologies are not sufficiently agile to exploit the interactivity and are inefficient in terms of image quality and computational time. In this paper, a novel technique is proposed using disparity-adjusted 3D MVC (DA-3D-MVC) with 3D motion estimation (ME) and 3D coding to overcome these problems. In the proposed scheme, a 3D frame is formed using the same temporal frames of all disparity-adjusted views, and ME is carried out for the current 3D macroblock using the immediately previous 3D frame as a reference frame. A 3D coding technique is then used for better compression. As the frames of all views at the same temporal position are encoded at the same time, the proposed scheme provides better interactivity and reduced computational time compared to H.264/MVC. To improve the rate-distortion (RD) performance of the proposed technique, an additional reference frame comprising the dynamic background is also used. Experimental results reveal that the proposed scheme outperforms H.264/MVC in terms of RD performance, computational time, and interactivity.
Efficient coding strategy for HEVC performance improvement by exploiting motion features
- Authors: Podder, Pallab , Paul, Manoranjan , Murshed, Manzur
- Date: 2015
- Type: Text , Conference paper
- Relation: ICASSP, IEEE International Conference on Acoustics, Speech and Signal Processing, Brisbane, QLD, 19th-24th April, 2015 p. 1414-1418
- Full Text: false
- Reviewed:
- Description: The striking feature of the High Efficiency Video Coding (HEVC) standard is its 50% bit-rate reduction compared to its predecessor H.264/AVC at the same perceptual image quality. Its time complexity, a congenital issue of HEVC, has also increased in order to achieve this compression ratio. Reducing the encoding time while preserving the expected quality of the video sequences is therefore a demanding task for researchers. Our contribution is to trim down the computational time by efficiently selecting appropriate block-partitioning modes in HEVC using motion features based on phase correlation. In this paper, we use phase correlation between the current and reference blocks to extract three motion features and combine them to determine the binary motion pattern of the current block. The motion pattern is then matched against a codebook of predefined pattern templates to determine a subset of the inter-modes. Only the selected modes are exhaustively motion estimated and compensated for a coding unit. The experimental outcomes demonstrate that the average computational time can be reduced by 30% relative to HEVC while providing improved rate-distortion performance.
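Phase correlation, which the description above builds on, can be illustrated with a small standard-library sketch. It shows only the generic technique (normalized cross-power spectrum followed by an inverse transform and a peak search); the three motion features, the binary motion pattern, and the template codebook are not specified in the abstract and are not reproduced here. The function names are hypothetical, and a naive DFT is used purely to keep the sketch dependency-free, so it is suitable only for small blocks.

```python
import cmath


def dft2(block, inverse=False):
    """Naive 2D discrete Fourier transform of a small block."""
    n, m = len(block), len(block[0])
    sign = 1 if inverse else -1
    out = [[0j] * m for _ in range(n)]
    for u in range(n):
        for v in range(m):
            s = 0j
            for y in range(n):
                for x in range(m):
                    angle = 2 * cmath.pi * (u * y / n + v * x / m)
                    s += block[y][x] * cmath.exp(sign * 1j * angle)
            out[u][v] = s / (n * m) if inverse else s
    return out


def phase_correlation(cur, ref):
    """Return ((dy, dx), peak) for the circular shift of cur relative to ref."""
    f_cur, f_ref = dft2(cur), dft2(ref)
    cross = []
    for row_c, row_r in zip(f_cur, f_ref):
        row = []
        for a, b in zip(row_c, row_r):
            p = a * b.conjugate()
            mag = abs(p)
            # Keep only the phase; drop coefficients that are numerically zero.
            row.append(p / mag if mag > 1e-12 else 0j)
        cross.append(row)
    surface = dft2(cross, inverse=True)
    best_shift, best_peak = (0, 0), float("-inf")
    for y, row in enumerate(surface):
        for x, value in enumerate(row):
            if value.real > best_peak:
                best_shift, best_peak = (y, x), value.real
    return best_shift, best_peak
```

A sharp, dominant peak suggests a single translational motion between the two blocks, while a flat or multi-peaked surface suggests more complex motion; features of this kind are the sort of information a mode-selection scheme could draw on.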
Exploiting spatial smoothness to recover undecoded coefficients for transform domain distributed video coding
- Authors: Ali, Mortuza , Murshed, Manzur
- Date: 2013
- Type: Text , Conference paper
- Relation: IEEE International Conference on Image Processing; Melbourne, Australia; 15th-18th September 2013, p. 1782-1786
- Relation: http://purl.org/au-research/grants/arc/DP1095487
- Full Text: false
- Reviewed:
- Description: In a transform domain distributed video coding scheme, the correlation between the current encoding unit, e.g. block and slice, and the corresponding side-information is modeled using a virtual channel. This correlation model is then used for rate allocation, quantization, and Wyner-Ziv coding. Since the encoder can only have an estimate of the correlation instead of the exact knowledge of the side-information, the decoder will fail to recover the quantized transformed coefficients with a nonzero probability. In this paper, we propose to integrate a scheme at the decoder to recover the undecoded coefficients using the spatial smoothness property of individual video frames. Simulation results demonstrated that, at different decoding failure probabilities, a transformed coefficient recovery scheme can significantly improve the quality of videos in terms of both PSNR and SSIM.
Joint texture and depth coding using cuboid data compression
- Authors: Paul, Manoranjan , Chakraborty, Subrata , Murshed, Manzur , Podder, Pallab
- Date: 2015
- Type: Text , Conference proceedings
- Relation: 2015 18th International Conference on Computer and Information Technology (ICCIT); Dhaka, Bangladesh; 21st-23rd December 2015 p. 138-143
- Full Text:
- Reviewed:
- Description: The latest multiview video coding (MVC) standards, such as 3D-HEVC and H.264/MVC, normally encode texture and depth videos separately. A significant amount of rate-distortion and computational performance is sacrificed by separate encoding because joint information is not exploited. Separate encoding also creates a synchronization issue for 3D scene formation in the decoder. Moreover, the hierarchical frame referencing architecture in MVC creates random access frame delay. In this paper, we develop an encoder and decoder framework in which texture and depth video are encoded jointly by forming and encoding a 3D cuboid using high-dimensional entropy coding. The results from our experiments show that the proposed framework outperforms 3D-HEVC in rate-distortion performance and reduces the computational time significantly by reducing random access frame delay.
Performance scalable motion estimation for video coding : An overview of current status and a promising approach
- Authors: Sorwar, Golam , Murshed, Manzur
- Date: 2013
- Type: Text , Book chapter
- Relation: Multimedia networking and coding Chapter 3 p. 50-75
- Full Text: false
- Reviewed:
- Description: Motion estimation is one of the major bottlenecks in real-time performance scalable video coding applications due to the high computational complexity of exhaustive search. To address this, researchers have so far focused on low-complexity motion estimation and rate-distortion optimization in isolation. The proliferation of power-constrained handheld devices with image capturing capability has created demand for a much smarter approach in which motion estimation is integrated with rate control so that rate-distortion-complexity optimization can be effectively achieved. It is indeed crucial to provide such performance scalability in motion estimation to facilitate complexity management in such devices. This chapter presents an overview of motion estimation. Beginning with an introduction to the importance of motion estimation, it systematically examines various motion estimation techniques and their strengths and weaknesses, focussing primarily on block-based motion search. It then examines the limitations of the existing techniques in accommodating performance scalability, introduces a promising approach, Distance-dependent Thresholding Search (DTS) motion search, to fill this gap, and concludes with future research directions in the field. The authors suggest that the content of the chapter will make a significant contribution and serve as a reference for multimedia signal processing research at postgraduate level.
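To make the cost of exhaustive search mentioned above concrete, the following sketch implements a plain full-search block matcher using the sum of absolute differences (SAD). It is a generic baseline, not the DTS method, whose details are not given in the abstract; the function names and the `radius` parameter are hypothetical.

```python
def sad(cur, ref, by, bx, dy, dx, size):
    """Sum of absolute differences between a block in cur and a displaced block in ref."""
    total = 0
    for y in range(size):
        for x in range(size):
            total += abs(cur[by + y][bx + x] - ref[by + dy + y][bx + dx + x])
    return total


def full_search(cur, ref, by, bx, size, radius):
    """Test every displacement within +/-radius; return (best_sad, dy, dx)."""
    rows, cols = len(ref), len(ref[0])
    best = None
    for dy in range(-radius, radius + 1):
        for dx in range(-radius, radius + 1):
            # Skip candidates that would fall outside the reference frame.
            if by + dy < 0 or by + dy + size > rows:
                continue
            if bx + dx < 0 or bx + dx + size > cols:
                continue
            cost = sad(cur, ref, by, bx, dy, dx, size)
            if best is None or cost < best[0]:
                best = (cost, dy, dx)
    return best
```

Each block requires up to (2 * radius + 1)^2 SAD evaluations of size^2 pixels each, which is why lower-complexity and complexity-scalable search strategies are attractive on power-constrained devices.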
Threshold-free pattern-based low bit rate video coding
- Authors: Paul, Manoranjan , Murshed, Manzur
- Date: 2008
- Type: Text , Conference paper
- Relation: 2008 15th IEEE International Conference on Image Processing p. 1584-1587
- Full Text: false
- Reviewed:
- Description: Pattern-based video coding (PVC) has already established its superiority over the recent video coding standard H.264 at low bit rates because of an extra pattern-mode that segments out the arbitrary shape of the moving region within the macroblock (MB). To determine the pattern-mode, however, PVC uses three thresholds to reduce the number of MBs coded using the pattern-mode. By setting these content-sensitive thresholds to any predefined values, the technique risks ignoring some MBs that would otherwise be selected by the rate-distortion optimization function for this mode. Consequently, the ultimate achievable performance is sacrificed to save motion estimation time. In this paper, a novel PVC scheme is proposed that removes all thresholds used to determine this mode, and hence more efficient performance is achieved without knowing the content of the video sequences. To keep computational complexity in check, pattern motion is approximated from the motion vector of the MB. In addition, an efficient pattern similarity metric and new Lagrangian multipliers are developed. The experimental results confirm that this new scheme improves the image quality by at least 0.5 dB and 1.0 dB compared to the existing PVC and H.264, respectively.
Video coding focusing on block partitioning and occlusion
- Authors: Paul, Manoranjan , Murshed, Manzur
- Date: 2010
- Type: Text , Journal article
- Relation: IEEE Transactions on Image Processing Vol. 19, no. 3 (2010), p. 691-701
- Full Text: false
- Reviewed:
- Description: Among the existing block partitioning schemes, pattern-based video coding (PVC) has already established its superiority at low bit-rates. Its innovative segmentation process with regular-shaped pattern templates is very fast as it avoids handling the exact shape of the moving objects. It also judiciously encodes the pattern-uncovered background segments, capturing a high level of interblock temporal redundancy without any motion compensation, which is favoured by the rate-distortion optimizer at low bit-rates. The existing PVC technique, however, uses a number of content-sensitive thresholds, and setting them to any predefined values risks ignoring some of the macroblocks that would otherwise be encoded with patterns. Furthermore, occluded background can potentially degrade the performance of this technique. In this paper, a robust PVC scheme is proposed by removing all the content-sensitive thresholds, introducing a new similarity metric, considering multiple top-ranked patterns by the rate-distortion optimizer, and refining the Lagrangian multiplier of the H.264 standard for efficient embedding. A novel pattern-based residual encoding approach is also integrated to address the occlusion issue. Once embedded into the H.264 Baseline profile, the proposed PVC scheme yields a perceptually significant image quality improvement of at least 0.5 dB in low bit-rate video coding applications. A similar trend is observed for moderate to high bit-rate applications when the proposed scheme replaces the bi-directional predictive mode in the H.264 High profile.
Video coding using arbitrarily shaped block partitions in globally optimal perspective
- Authors: Paul, Manoranjan , Murshed, Manzur
- Date: 2011
- Type: Text , Journal article
- Relation: EURASIP Journal on Advances in Signal Processing Vol. 16, no. (2011), p.
- Full Text:
- Reviewed:
- Description: Algorithms using content-based patterns to segment moving regions at the macroblock (MB) level have exhibited good potential for improved coding efficiency when embedded into the H.264 standard as an extra mode. The content-based pattern generation (CPG) algorithm provides a locally optimal result, as only one pattern can be optimally generated from a given set of moving regions. However, it fails to provide optimal results for multiple patterns from entire sets. A globally optimal solution for clustering the set and then generating multiple patterns would enhance the performance further, but such a solution is not achievable due to the non-polynomial nature of the clustering problem. In this paper, we propose a near-optimal content-based pattern generation (OCPG) algorithm which outperforms the existing approach. Coupling OCPG, which generates a set of patterns after clustering the MBs into several disjoint sets, with a direct pattern selection algorithm that allows all the MBs in multiple pattern modes outperforms the existing pattern-based coding when embedded into H.264.