Dynamic mesh commonality modeling using the cuboidal partitioning
- Authors: Ahmmed, Ashek , Paul, Manoranjan , Murshed, Manzur , Pickering, Mark
- Date: 2022
- Type: Text , Conference paper
- Relation: 2022 IEEE International Conference on Visual Communications and Image Processing, VCIP 2022, Suzhou, China, 13-16 December 2022, 2022 IEEE International Conference on Visual Communications and Image Processing, VCIP 2022
- Full Text: false
- Reviewed:
- Description: For 3D object representation, volumetric contents like meshes and point clouds provide suitable formats. However, a dynamic mesh sequence may require significantly large amount of data because it consists of information that varies with time. Hence, for the facilitation of storage and transmission of such content, efficient compression technologies are required. MPEG has started standardization activities aiming to develop a mesh compression standard that would be able to handle dynamic meshes with time varying connectivity information and time varying attribute maps. The attribute maps are features associated with the mesh surface and stored as 2D images/videos. In this paper, we propose to capture the commonality information in the dynamic mesh attribute maps using the cuboidal partitioning algorithm. This algorithm is capable of modeling both the global and local commonality within an image in a compact and computationally efficient way. Experimental results show that the proposed approach can outperform the anchor HEVC codec, suggested by MPEG to encode such sequences, with a bit rate savings of up to 3.66%. © 2022 IEEE.
Dynamic point cloud compression using a cuboid oriented discrete cosine based motion model
- Authors: Ahmmed, Ashek , Paul, Manoranjan , Murshed, Manzur , Taubman, David
- Date: 2021
- Type: Text , Conference paper
- Relation: 2021 IEEE International Conference on Acoustics, Speech, and Signal Processing, ICASSP 2021 Vol. 2021-June, p. 1935-1939
- Full Text: false
- Reviewed:
- Description: Immersive media representation format based on point clouds has underpinned significant opportunities for extended reality applications. Point cloud in its uncompressed format require very high data rate for storage and transmission. The video based point cloud compression technique projects a dynamic point cloud into geometry and texture video sequences. The projected texture video is then coded using modern video coding standard like HEVC. Since the properties of projected texture video frames are different from traditional video frames, HEVC-based commonality modeling can be inefficient. An improved commonality modeling technique is proposed that employs discrete cosine basis oriented motion models and the domains of such models are approximated by homogeneous regions called cuboids. Experimental results show that the proposed commonality modeling technique can yield savings in bit rate of up to 4.17%. ©2021 IEEE
Dynamic point cloud geometry compression using cuboid based commonality modelling framework
- Authors: Ahmmed, Ashek , Paul, Manoranjan , Murshed, Manzur , Taubman, David
- Date: 2021
- Type: Text , Conference paper
- Relation: 2021 IEEE International Conference on Image Processing, ICIP 2021, Anchorage, USA, 19-21 September 2021, Proceedings - International Conference on Image Processing, ICIP Vol. 2021-September, p. 2159-2163
- Full Text: false
- Reviewed:
- Description: Point cloud in its uncompressed format require very high data rate for storage and transmission. The video based point cloud compression (V-PCC) technique projects a dynamic point cloud into geometry and texture video sequences. The projected geometry and texture video frames are then encoded using modern video coding standard like HEVC. However, HEVC encoder is unable to exploit the global commonality that exists within a geometry frame and between successive geometry frames to a greater extent. This is because in HEVC, the current frame partitioning starts from a rigid 64 × 64 pixels level without considering the structure of the scene need be coded. In this paper, an improved commonality modeling framework is proposed, by leveraging on cuboid-based frame partitioning, to encode point cloud geometry frames. The associated frame-partitioning scheme is based on statistical properties of the current geometry frame and therefore yields a flexible block partitioning structure composed of cuboids. Additionally, the proposed commonality modeling approach is computationally efficient and has a compact representation. Experimental results show that if the V-PCC reference encoder is augmented by the proposed commonality modeling technique, a bit rate savings of 2.71% and 4.25% are achieved for full body and upper body of human point clouds’ geometry sequences respectively. © 2021 IEEE.
Human-machine collaborative video coding through cuboidal partitioning
- Authors: Ahmmed, Ashek , Paul, Manoranjan , Murshed, Manzur , Taubman, David
- Date: 2021
- Type: Text , Conference paper
- Relation: 2021 IEEE International Conference on Image Processing, ICIP 2021, Anchorage, USA 19-22 September 2021, Proceedings - International Conference on Image Processing, ICIP Vol. 2021-September, p. 2074-2078
- Full Text:
- Reviewed:
- Description: Video coding algorithms encode and decode an entire video frame while feature coding techniques only preserve and communicate the most critical information needed for a given application. This is because video coding targets human perception, while feature coding aims for machine vision tasks. Recently, attempts are being made to bridge the gap between these two domains. In this work, we propose a video coding framework by leveraging on to the commonality that exists between human vision and machine vision applications using cuboids. This is because cuboids, estimated rectangular regions over a video frame, are computationally efficient, has a compact representation and object centric. Such properties are already shown to add value to traditional video coding systems. Herein cuboidal feature descriptors are extracted from the current frame and then employed for accomplishing a machine vision task in the form of object detection. Experimental results show that a trained classifier yields superior average precision when equipped with cuboidal features oriented representation of the current test frame. Additionally, this representation costs 7% less in bit rate if the captured frames are need be communicated to a receiver. © 2021 IEEE.
A coarse representation of frames oriented video coding by leveraging cuboidal partitioning of image data
- Authors: Ahmmed, Ashe , Paul, Manoranjan , Murshed, Manzur , Taubman, David
- Date: 2020
- Type: Text , Conference paper
- Relation: 22nd IEEE International Workshop on Multimedia Signal Processing, MMSP 2020, Virtual Tampere, Finland 21-24 September 2020
- Full Text:
- Reviewed:
- Description: Video coding algorithms attempt to minimize the significant commonality that exists within a video sequence. Each new video coding standard contains tools that can perform this task more efficiently compared to its predecessors. In this work, we form a coarse representation of the current frame by minimizing commonality within that frame while preserving important structural properties of the frame. The building blocks of this coarse representation are rectangular regions called cuboids, which are computationally simple and has a compact description. Then we propose to employ the coarse frame as an additional source for predictive coding of the current frame. Experimental results show an improvement in bit rate savings over a reference codec for HEVC, with minor increase in the codec computational complexity. © 2020 IEEE.
Efficient low bit-rate intra-frame coding using common information for 360-degree video
- Authors: Afsana, Fariha , Paul, Manoranjan , Murshed, Manzur , Taubman, David
- Date: 2020
- Type: Text , Conference paper
- Relation: 22nd IEEE International Workshop on Multimedia Signal Processing, MMSP 2020
- Full Text: false
- Reviewed:
- Description: With the growth of video technologies, super-resolution videos, including 360-degree immersive video has become a reality due to exciting applications such as augmented/virtual/mixed reality for better interaction and a wide-angle user-view experience of a scene compared to traditional video with narrow-focused viewing angle. The new generation video contents are bandwidth-intensive in nature due to high resolution and demand high bit rate as well as low latency delivery requirements that pose challenges in solving the bottleneck of transmission and storage burdens. There is limited optimisation space in traditional video coding schemes for improving video coding efficiency in intra-frame due to the fixed size of processing block. This paper presents a new approach for improving intra-frame coding especially at low bit rate video transmission for 360-degree video for lossy mode of HEVC. Prior to using traditional HEVC intra-prediction, this approach exploits the global redundancy of entire frame by extracting common important information using multi-level discrete wavelet transformation. This paper demonstrates that the proposed method considering only low frequency information of a frame and encoding this can outperform the HEVC standard at low bit rates. The experimental results indicate that the proposed intra-frame coding strategy achieves an average of 54.07% BD-rate reduction and 2.84 dB BD-PSNR gain for low bit rate scenario compared to the HEVC. It also achieves a significant improvement in encoding time reduction of about 66.84% on an average. Moreover, this finding also demonstrates that the existing HEVC block partitioning can be applied in the transform domain for better exploitation of information concentration as we applied HEVC on wavelet frequency domain. © 2020 IEEE.
A novel depth edge prioritization based coding technique to boost-UP HEVC performance
- Authors: Podder, Pallab , Paul, Manoranjan , Murshed, Manzur
- Date: 2016
- Type: Text , Conference paper
- Relation: 2016 IEEE International Conference on Multimedia & Expo Workshops (ICMEW)
- Full Text: false
- Reviewed:
- Description: In addition to the texture, multiview video employs the utilization of depth coding for the reconstruction of 3D video and Free viewpoint video. Standing on some texture-depth correlations, a number of methods in literature reuses texture motion vector for the corresponding depth coding to reduce encoding time by avoiding costly motion estimation process. However, texture similarity metric is not always equivalent to the corresponding depth similarity metric especially at edge levels. Since their approaches could not explicitly detect and encode acute edge motions of depth objects, eventually, could not reach the similar or improved rate-distortion (RD) performance against the High Efficiency Video Coding (HEVC) reference test model (HM). With a view to more accurate motion detection and modeling, the proposed technique exploits an extra Pattern Mode comprising a group of pattern templates (GPTs) with different rectangular and non-rectangular object shapes and edges compared to the existing HEVC block partitioning modes. Moreover, the proposed Pattern Mode only encodes the motion areas and skips the background areas. The experimental results show that the proposed technique could save 30% encoding time and improve average 0.1dB Bjontegard Delta peak signal-to-noise ratio (BD-PSNR) compared to the HM.
Fast inter-mode decision strategy for HEVC on depth videos
- Authors: Podder, Pallab , Paul, Manoranjan , Murshed, Manzur
- Date: 2015
- Type: Text , Conference paper
- Relation: 2015 18th International Conference on Computer and Information Technology (ICCIT) p. 288-293
- Full Text: false
- Reviewed:
- Description: Multiview video employs the utilization of both texture and depth video information from different angles to create a 3D video for more realistic view of a scene. Unlike texture, depth video is a gray scale map that represents the distance between the camera and 3D points in a scene. Existing multiview video coding (MVC) techniques including 3D-High Efficiency Video Coding (HEVC) standard encode both texture and depth videos jointly by exploiting texture video information for the corresponding depth video coding (DVC) to reduce computational time as the texture and depth videos have motion similarity in representing the same scene. This strategy has two limitations: (i) more bits and computational time might be required due to the large residuals for the misalignment between depth and texture edges and (ii) switching between different views may require more times due to the increased dependency between texture and depth. In this paper, we propose an independent DVC technique using HEVC (a video coding standard for single view) so that we can improve the rate distortion (RD) performance and reduce computational time by improving switching speed. For this, we use motion features to reduce a number of motion estimation (ME) and motion compensation (MC) modes in HEVC. As we use motion feature which is the underlying criteria for selecting different modes in the standard and then we select a subset of modes which can provide almost the same RD performance. Experimental outcomes reveal a reduction of 48% encoding time of HEVC encoder with similar RD performance and better interactivity.
Fast intermode selection for HEVC video coding using phase correlation
- Authors: Podder, Pallab , Paul, Manoranjan , Murshed, Manzur , Chakraborty, Subrata
- Date: 2015
- Type: Text , Conference proceedings , Conference paper
- Relation: 2014 International Conference on Digital Image Computing: Techniques and Applications, DICTA 2014; Wollongong, Australia; 25th-27th November 2014 p. 1-8
- Relation: http://purl.org/au-research/grants/arc/DP130103670
- Full Text:
- Reviewed:
- Description: The recent High Efficiency Video Coding (HEVC) Standard demonstrates higher rate-distortion (RD) performance compared to its predecessor H.264/AVC using different new tools especially larger and asymmetric inter-mode variable size motion estimation and compensation. This requires more than 4 times computational time compared to H.264/AVC. As a result it has always been a big concern for the researchers to reduce the amount of time while maintaining the standard quality of the video. The reduction of computational time by smart selection of the appropriate modes in HEVC is our motivation. To accomplish this task in this paper, we use phase correlation to approximate the motion information between current and reference blocks by comparing with a number of different binary pattern templates and then select a subset of motion estimation modes without exhaustively exploring all possible modes. The experimental results exhibit that the proposed HEVC-PC (HEVC with Phase Correlation) scheme outperforms the standard HEVC scheme in terms of computational time while preserving-the same quality of the video sequences. More specifically, around 40% encoding time is reduced compared to the exhaustive mode selection in HEVC. © 2014 IEEE.
- Description: 2014 International Conference on Digital Image Computing: Techniques and Applications, DICTA 2014