A Centroid Algorithm for Stabilization of Turbulence-Degraded Underwater Videos
- Authors: Halder, Kalyan Kumar , Paul, Manoranjan , Tahtali, Murat , Anavatti, Sreenatha G. , Murshed, Manzur
- Date: 2016
- Type: Text , Conference paper
- Relation: 2016 International Conference on Digital Image Computing: Techniques and Applications DICTA 2016 p. 1-6
- Full Text: false
- Reviewed:
- Description: This paper addresses the problem of stabilizing underwater videos with non-uniform geometric deformations or warping due to a wavy water surface. It presents an improved method to correct these geometric deformations of the frames, providing a high-quality stabilized video output. For this purpose, a non-rigid image registration technique is employed to accurately align the warped frames with respect to a prototype frame and to estimate the deformation parameters, which, in turn, are applied in an image dewarping technique. The prototype frame is chosen from the video sequence based on a sharpness assessment. The effectiveness of the proposed method is validated by applying it to both synthetic and real-world sequences using various quality metrics. A performance comparison with an existing method confirms the higher efficacy of the proposed method.
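- Illustrative sketch (not taken from the paper): the abstract states that the prototype frame is chosen by a sharpness assessment but does not name the measure. The snippet below assumes mean squared gradient magnitude as the sharpness score; the function names `sharpness` and `pick_prototype` are hypothetical, and frames are assumed to be 2-D grayscale arrays.

```python
import numpy as np


def sharpness(frame):
    """Mean squared gradient magnitude of a 2-D grayscale frame.

    This measure is an assumption made for illustration; the paper's
    actual sharpness assessment is not given in the abstract.
    """
    gy, gx = np.gradient(frame.astype(float))
    return float(np.mean(gx ** 2 + gy ** 2))


def pick_prototype(frames):
    """Return the index of the frame with the highest sharpness score."""
    return max(range(len(frames)), key=lambda i: sharpness(frames[i]))


if __name__ == "__main__":
    flat = np.zeros((8, 8))                              # no detail at all
    smooth = np.tile(np.linspace(0.0, 255.0, 8), (8, 1))  # gentle ramp
    edge = np.zeros((8, 8))
    edge[:, 4:] = 255.0                                  # one hard vertical edge
    print(pick_prototype([flat, smooth, edge]))
```

The selected frame would then serve as the registration target for the remaining frames; any other no-reference sharpness measure could be substituted for `sharpness`.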
A coarse representation of frames oriented video coding by leveraging cuboidal partitioning of image data
- Authors: Ahmmed, Ashek , Paul, Manoranjan , Murshed, Manzur , Taubman, David
- Date: 2020
- Type: Text , Conference paper
- Relation: 22nd IEEE International Workshop on Multimedia Signal Processing, MMSP 2020, Virtual Tampere, Finland 21-24 September 2020
- Full Text:
- Reviewed:
- Description: Video coding algorithms attempt to minimize the significant commonality that exists within a video sequence. Each new video coding standard contains tools that can perform this task more efficiently compared to its predecessors. In this work, we form a coarse representation of the current frame by minimizing commonality within that frame while preserving important structural properties of the frame. The building blocks of this coarse representation are rectangular regions called cuboids, which are computationally simple and have a compact description. Then we propose to employ the coarse frame as an additional source for predictive coding of the current frame. Experimental results show an improvement in bit rate savings over a reference codec for HEVC, with a minor increase in the codec computational complexity. © 2020 IEEE.
A commonality modeling framework for enhanced video coding leveraging on the cuboidal partitioning based representation of frames
- Authors: Ahmmed, Ashek , Murshed, Manzur , Paul, Manoranjan , Taubman, David
- Date: 2022
- Type: Text , Journal article
- Relation: IEEE Transactions on Multimedia Vol. 24, no. (2022), p. 4446-4457
- Full Text: false
- Reviewed:
- Description: Video coding algorithms attempt to minimize the significant commonality that exists within a video sequence. Each new video coding standard contains tools that can perform this task more efficiently compared to its predecessors. Modern video coding systems are block-based, wherein commonality modeling is carried out only from the perspective of the block that needs to be coded next. In this work, we argue for a commonality modeling approach that can provide a seamless blending between global and local homogeneity information. For this purpose, the frame to be coded is first recursively partitioned into rectangular regions based on the homogeneity information of the entire frame. After that, each obtained rectangular region's feature descriptor is taken to be the average value of all the pixel intensities within the region. In this way, the proposed approach generates a coarse representation of the current frame by minimizing both global and local commonality. This coarse frame is computationally simple and has a compact representation. It attempts to preserve important structural properties of the current frame, which can be observed subjectively as well as in the improved rate-distortion performance of a reference scalable HEVC coder that employs the coarse frame as a reference frame for encoding the current frame. © 1999-2012 IEEE.
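- Illustrative sketch (not taken from the paper): the abstract describes recursively partitioning a frame into rectangular regions by homogeneity and describing each region by its mean pixel intensity. The snippet below is one possible reading under stated assumptions: homogeneity is measured by the sum of squared errors (SSE) around the region mean, and the least homogeneous region is split greedily until a target count is reached. The paper's actual split criterion and stopping rule are not given in the abstract, and the names `partition` and `coarse_frame` are hypothetical.

```python
import numpy as np


def sse(block):
    """Sum of squared errors of a block around its own mean."""
    return float(((block - block.mean()) ** 2).sum())


def best_split(block):
    """Return (cost, axis, index) of the axis-aligned split that minimises
    the combined SSE of the two halves, or None if the block cannot be split."""
    best = None
    for axis in (0, 1):
        for i in range(1, block.shape[axis]):
            a, b = np.split(block, [i], axis=axis)
            cost = sse(a) + sse(b)
            if best is None or cost < best[0]:
                best = (cost, axis, i)
    return best


def partition(frame, max_cuboids):
    """Greedily split the least homogeneous region until max_cuboids exist.

    Regions are (top, left, height, width) tuples.
    """
    frame = frame.astype(float)
    cuboids = [(0, 0, frame.shape[0], frame.shape[1])]
    while len(cuboids) < max_cuboids:
        errors = [sse(frame[t:t + h, l:l + w]) for t, l, h, w in cuboids]
        k = int(np.argmax(errors))
        if errors[k] == 0.0:          # every region is already uniform
            break
        t, l, h, w = cuboids.pop(k)
        _, axis, i = best_split(frame[t:t + h, l:l + w])
        if axis == 0:                 # horizontal cut: top and bottom parts
            cuboids += [(t, l, i, w), (t + i, l, h - i, w)]
        else:                         # vertical cut: left and right parts
            cuboids += [(t, l, h, i), (t, l + i, h, w - i)]
    return cuboids


def coarse_frame(frame, cuboids):
    """Fill each region with its mean intensity (the region's descriptor)."""
    out = np.empty(frame.shape, dtype=float)
    for t, l, h, w in cuboids:
        out[t:t + h, l:l + w] = frame[t:t + h, l:l + w].mean()
    return out
```

For a frame whose left half is 0 and right half is 10, `partition(frame, 2)` yields the two half-frame regions and `coarse_frame` reproduces the frame exactly; on natural images the coarse frame is only an approximation whose fidelity grows with the number of regions.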
A hybrid object detection technique from dynamic background using Gaussian mixture models
- Authors: Haque, Mohammad , Murshed, Manzur , Paul, Manoranjan
- Date: 2008
- Type: Text , Conference paper
- Relation: 2008 IEEE 10th Workshop on Multimedia Signal Processing p. 915-920
- Full Text: false
- Reviewed:
- Description: Adaptive background modelling based object detection techniques are widely used in machine vision applications for handling the challenges of real-world multimodal backgrounds. However, they are constrained to specific environments because they rely on environment-specific parameters, and their performance also fluctuates across different operating speeds. On the other hand, basic background subtraction (BBS) is not suitable for real applications due to its manual background initialization requirement and its inability to handle repetitive multimodal backgrounds. However, it shows better stability across different operating speeds and can better eliminate noise, shadow, and trailing effects than adaptive techniques, as no model adaptability or environment-related parameters are involved. In this paper, we propose a hybrid object detection technique that incorporates the strengths of both approaches. In our technique, Gaussian mixture models (GMM) are used for maintaining an adaptive background model, and both probabilistic and basic subtraction decisions are utilized for calculating inexpensive neighbourhood statistics for guiding the final object detection decision. Experimental results with two benchmark datasets and comparative analysis with a recent adaptive object detection technique show the strength of the proposed technique in eliminating noise, shadow, and trailing effects while maintaining better stability across variable operating speeds.
A novel depth edge prioritization based coding technique to boost-UP HEVC performance
- Authors: Podder, Pallab , Paul, Manoranjan , Murshed, Manzur
- Date: 2016
- Type: Text , Conference paper
- Relation: 2016 IEEE International Conference on Multimedia & Expo Workshops (ICMEW)
- Full Text: false
- Reviewed:
- Description: In addition to texture, multiview video employs depth coding for the reconstruction of 3D video and free viewpoint video. Relying on texture-depth correlations, a number of methods in the literature reuse texture motion vectors for the corresponding depth coding to reduce encoding time by avoiding the costly motion estimation process. However, the texture similarity metric is not always equivalent to the corresponding depth similarity metric, especially at the edge level. Since these approaches cannot explicitly detect and encode acute edge motions of depth objects, they cannot reach similar or improved rate-distortion (RD) performance against the High Efficiency Video Coding (HEVC) reference test model (HM). With a view to more accurate motion detection and modeling, the proposed technique exploits an extra Pattern Mode comprising a group of pattern templates (GPTs) with different rectangular and non-rectangular object shapes and edges compared to the existing HEVC block partitioning modes. Moreover, the proposed Pattern Mode only encodes the motion areas and skips the background areas. The experimental results show that the proposed technique could save 30% encoding time and improve the Bjontegaard Delta peak signal-to-noise ratio (BD-PSNR) by 0.1 dB on average compared to the HM.
A novel depth motion vector coding exploiting spatial and inter-component clustering tendency
- Authors: Shahriyar, Shampa , Murshed, Manzur , Ali, Mortuza , Paul, Manoranjan
- Date: 2015
- Type: Text , Conference proceedings , Conference paper
- Relation: Visual Communications and Image Processing, VCIP 2015; Singapore; 13th-16th December 2015 p. 1-4
- Relation: http://purl.org/au-research/grants/arc/DP130103670
- Full Text: false
- Reviewed:
- Description: Motion vectors of depth-maps in multiview and free-viewpoint videos exhibit strong spatial as well as inter-component clustering tendency. This paper presents a novel coding technique that first compresses the multidimensional bitmaps of macroblock mode and then encodes only the non-zero components of motion vectors. The bitmaps are partitioned into disjoint cuboids using binary tree based decomposition so that the 0's and 1's are either highly polarized or further sub-partitioning is unlikely to achieve any compression. Each cuboid is entropy-coded as a unit using binary arithmetic coding. This technique is capable of exploiting the spatial and inter-component correlations efficiently without the restriction of scanning the bitmap in any specific linear order as needed by run-length coding. As encoding of non-zero component values no longer requires denoting the zero value, further compression efficiency is achieved. Experimental results on standard multiview test video sequences have comprehensively demonstrated the superiority of the proposed technique, achieving overall coding gain against the state-of-the-art in the range [22%, 54%] and on average 38%. © 2015 IEEE.
- Description: 2015 Visual Communications and Image Processing, VCIP 2015
A novel motion classification based intermode selection strategy for HEVC performance improvement
- Authors: Podder, Pallab , Paul, Manoranjan , Murshed, Manzur
- Date: 2015
- Type: Text , Journal article
- Relation: Neurocomputing Vol. 173, no. Part 3 (2015), p. 1211-1220
- Relation: http://purl.org/au-research/grants/arc/DP130103670
- Full Text: false
- Reviewed:
- Description: High Efficiency Video Coding (HEVC) standard adopts several new approaches to achieve higher coding efficiency (approximately 50% bit-rate reduction) compared to its predecessor H.264/AVC with same perceptual image quality. Huge computational time has also increased due to the algorithmic complexity of HEVC compared to H.264/AVC. However, it is really a demanding task to reduce the encoding time while preserving the similar quality of the video sequences. In this paper, we propose a novel efficient intermode selection technique and incorporate into HEVC framework to predict motion estimation and motion compensation modes between current and reference blocks and perform faster inter mode selection based on three dissimilar motion types in divergent video sequences. Instead of exploring and traversing all the modes exhaustively, we merely select a subset of candidate modes and the final mode from the selected subset is determined based on their lowest Lagrangian cost function. The experimental results reveal that average encoding time can be downscaled by 40% with similar rate-distortion performance compared to the exhaustive mode selection strategy in HEVC. © 2015 Elsevier B.V.
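- Illustrative sketch (not taken from the paper): the abstract says the final mode is the one in the selected candidate subset with the lowest Lagrangian cost. The snippet below assumes the standard rate-distortion form J = D + λR; the mode names, distortion and rate figures, and the function names are hypothetical placeholders, and how the candidate subset is derived from the motion classification is not shown.

```python
def lagrangian_cost(distortion, rate_bits, lam):
    """Rate-distortion cost J = D + lambda * R."""
    return distortion + lam * rate_bits


def select_mode(candidates, lam):
    """Pick the mode with the lowest Lagrangian cost.

    candidates maps a mode name to a (distortion, rate_bits) pair and is
    assumed to be the already-reduced subset of modes.
    """
    return min(candidates, key=lambda mode: lagrangian_cost(*candidates[mode], lam))


if __name__ == "__main__":
    # Hypothetical numbers chosen only to show how lambda shifts the choice.
    subset = {"2Nx2N": (100.0, 10), "NxN": (80.0, 40), "SKIP": (150.0, 1)}
    print(select_mode(subset, lam=1.0))   # 2Nx2N (costs 110, 120, 151)
    print(select_mode(subset, lam=0.1))   # NxN   (costs 101, 84, 150.1)
```

Evaluating the cost only over the reduced subset, rather than over every mode, is where the encoding-time saving would come from.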
A novel no-reference subjective quality metric for free viewpoint video using human eye movement
- Authors: Podder, Pallab , Paul, Manoranjan , Murshed, Manzur
- Date: 2018
- Type: Text , Conference proceedings
- Relation: 8th Pacific-Rim Symposium on Image and Video Technology, PSIVT 2017; Wuhan, China; 20th-24th November 2017; published in Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) Vol. 10749 LNCS, p. 237-251
- Full Text:
- Reviewed:
- Description: The free viewpoint video (FVV) allows users to interactively control the viewpoint and generate new views of a dynamic scene from any 3D position for a better 3D visual experience with depth perception. Multiview video coding exploits both texture and depth video information from various angles to encode a number of views to facilitate FVV. The usual practice for single-view or multiview quality assessment relies on objective quality assessment metrics, such as the peak signal-to-noise ratio (PSNR) or the structural similarity index (SSIM), due to their simplicity and suitability for real-time applications. However, the PSNR and SSIM require a reference image for quality evaluation and cannot be successfully employed in FVV, as the new view in FVV does not have any reference view to compare with. Conversely, the widely used subjective estimator, the mean opinion score (MOS), is often biased by the testing environment, viewers' mode, domain knowledge, and many other factors that may actively influence the actual assessment. To address this limitation, in this work, we devise a no-reference subjective quality assessment metric by simply exploiting the pattern of human eye browsing on FVV. Over different quality contents of FVV, the participants' eye-tracker-recorded spatio-temporal gaze data indicate a more concentrated eye-traversing approach for relatively better quality. Thus, we calculate the Length, Angle, Pupil-size, and Gaze-duration features from the recorded gaze trajectory. The content and resolution invariant operation is carried out prior to synthesizing them using an adaptive weighted function to develop a new quality metric using eye traversal (QMET). Tested results reveal that the proposed QMET performs better than the SSIM and MOS in terms of assessing different aspects of coded video quality for a wide range of FVV contents.
- Description: Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
A novel pattern identification scheme using distributed video coding concepts
- Authors: Paul, Manoranjan , Murshed, Manzur
- Date: 2009
- Type: Text , Conference paper
- Relation: 2009 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP 2009) p. 729-732
- Full Text: false
- Reviewed:
- Description: Pattern-based video coding focusing on the moving region in a macroblock has already established its superiority over the recent H.264 video coding standard at very low bit rates. A larger number of pattern templates approximates the moving regions better; however, beyond a certain limit no coding gain is observed due to the increased number of pattern identification bits. Recently, distributed video coding schemes have used syndrome coding to predict the original information at the decoder using side information. In this paper, a novel pattern identification scheme is proposed which predicts the pattern from the syndrome codes and side information at the decoder so that the actual pattern identification number is not needed in the bitstream. The experimental results confirm that this new scheme successfully improves the rate-distortion performance compared to the existing pattern-based video coding as well as the H.264 standard. This new scheme will also open another window of syndrome coding application.
A novel quality metric using spatiotemporal correlational data of human eye maneuver
- Authors: Podder, Pallab , Paul, Manoranjan , Murshed, Manzur
- Date: 2017
- Type: Text , Conference proceedings
- Relation: 2017 International Conference on Digital Image Computing : Techniques and Applications, DICTA 2017; Sydney, Australia; 29th November-1st December 2017 Vol. 2017-December, p. 1-8
- Full Text:
- Reviewed:
- Description: The popularly used subjective estimator, the mean opinion score (MOS), is often biased by the testing environment, viewers' mode, domain expertise, and many other factors that may actively influence the actual assessment. We therefore devise a no-reference subjective quality assessment metric by exploiting the nature of human eye browsing on videos. The participants' eye-tracker-recorded gaze data indicate a more concentrated eye-traversing approach for relatively better quality. We calculate the Length, Angle, Pupil-size, and Gaze-duration features from the recorded gaze trajectory. The content and resolution invariant operation is carried out prior to synthesizing them using an adaptive weighted function to develop a new quality metric using eye traversal (QMET). Tested results reveal that the quality evaluation carried out by QMET demonstrates a strong correlation with the most widely used peak signal-to-noise ratio (PSNR), structural similarity index (SSIM), and the MOS.
- Description: DICTA 2017 - 2017 International Conference on Digital Image Computing: Techniques and Applications
A novel video coding scheme using a scene adaptive non-parametric background model
- Authors: Chakraborty, Subrata , Paul, Manoranjan , Murshed, Manzur , Ali, Mortuza
- Date: 2014
- Type: Text , Conference paper
- Relation: 16th IEEE International Workshop on Multimedia Signal Processing, MMSP 2014 p. 1-6
- Relation: http://purl.org/au-research/grants/arc/DP130103670
- Full Text:
- Reviewed:
- Description: Video coding techniques utilising background frames, provide better rate distortion performance by exploiting coding efficiency in uncovered background areas compared to the latest video coding standard. Parametric approaches such as the mixture of Gaussian (MoG) based background modeling has been widely used however they require prior knowledge about the test videos for parameter estimation. Recently introduced non-parametric (NP) based background modeling techniques successfully improved video coding performance through a HEVC integrated coding scheme. The inherent nature of the NP technique naturally exhibits superior performance in dynamic background scenarios compared to the MoG based technique without a priori knowledge of video data distribution. Although NP based coding schemes showed promising coding performances, they suffer from a number of key challenges - (a) determination of the optimal subset of training frames for generating a suitable background that can be used as a reference frame during coding, (b) incorporating dynamic changes in the background effectively after the initial background frame is generated, (c) managing frequent scene change leading to performance degradation, and (d) optimizing coding quality ratio between an I-frame and other frames under bit rate constraints. In this study we develop a new scene adaptive coding scheme using the NP based technique, capable of solving the current challenges by incorporating a new continuously updating background generation process. Extensive experimental results are also provided to validate the effectiveness of the new scheme.
Adaptive weighted non-parametric background model for efficient video coding
- Authors: Chakraborty, Subrata , Paul, Manoranjan , Murshed, Manzur , Ali, Mortuza
- Date: 2017
- Type: Text , Journal article
- Relation: Neurocomputing Vol. 226, no. (2017), p. 35-45
- Full Text:
- Reviewed:
- Description: Dynamic background frame based video coding using mixture of Gaussian (MoG) based background modelling has achieved better rate distortion performance compared to the H.264 standard. However, they suffer from high computation time, low coding efficiency for dynamic videos, and prior knowledge requirement of video content. In this paper, we introduce the application of the non-parametric (NP) background modelling approach for video coding domain. We present a novel background modelling technique, called weighted non-parametric (WNP) which balances the historical trend and the recent value of the pixel intensities adaptively based on the content and characteristics of any particular video. WNP is successfully embedded into the latest HEVC video coding standard for better rate-distortion performance. Moreover, a novel scene adaptive non-parametric (SANP) technique is also developed to handle video sequences with high dynamic background. Being non-parametric, the proposed techniques naturally exhibit superior performance in dynamic background modelling without a priori knowledge of video data distribution.
An analysis of human engagement behaviour using descriptors from human feedback, eye tracking, and saliency modelling
- Authors: Podder, Pallab , Paul, Manoranjan , Debnath, Tanmoy , Murshed, Manzur
- Date: 2015
- Type: Text , Conference proceedings
- Relation: 2015 International Conference on Digital Image Computing: Techniques and Applications, DICTA 2015, Adelaide, 23-25th Nov 2015 in Digital Image Computing: Techniques and Applications (DICTA), 2015 International Conference
- Relation: http://purl.org/au-research/grants/arc/DP130103670
- Full Text: false
- Reviewed:
- Description: In this paper an analysis of human engagement behaviour with video is presented based on real life experiments. An engagement model could be employed in classroom education, enhancing programming skills, reading etc. Two groups of people, independent of one another, watched eighteen video clips separately at different times. The first group's participants' eye gaze locations, right and left pupil sizes, and eye blinking patterns are recorded by a state of the art Tobii eye tracker. The second group of people who are video experts opined about the most significant attention points of the videos. A well-known bottom-up visual saliency model, Graph-Based Visual Saliency (GBVS), is also utilized to create salient points for the videos. Taking into consideration all the above mentioned descriptors the introduced behaviour analysis demonstrates the level of participants' concentration with the videos.
An efficient video coding technique using a novel non-parametric background model
- Authors: Chakraborty, Subrata , Paul, Manoranjan , Murshed, Manzur , Ali, Mortuza
- Date: 2014
- Type: Text , Conference proceedings
- Relation: 2014 IEEE International Conference on Multimedia and Expo Workshops, ICMEW 2014; Chengdu; China; 14th-18th July 2014 p. 1-6
- Full Text:
- Reviewed:
- Description: Video coding technique with a background frame, extracted from mixture of Gaussian (MoG) based background modeling, provides better rate distortion performance by exploiting coding efficiency in uncovered background areas compared to the latest video coding standard. However, it suffers from high computation time, low coding efficiency for dynamic videos, and prior knowledge requirement of video content. In this paper, we present a novel adaptive weighted non-parametric (WNP) background modeling technique and successfully embed it into HEVC video coding standard. Being non-parametric (NP), the proposed technique naturally exhibits superior performance in dynamic background scenarios compared to MoG-based technique without a priori knowledge of video data distribution. In addition, the WNP technique significantly reduces noise-related drawbacks of existing NP techniques to provide better quality video coding with much lower computation time as demonstrated through extensive comparative studies against NP, MoG and HEVC techniques.
Comparative analysis of machine and deep learning models for soil properties prediction from hyperspectral visual band
- Authors: Datta, Dristi , Paul, Manoranjan , Murshed, Manzur , Teng, Shyh Wei , Schmidtke, Leigh
- Date: 2023
- Type: Text , Journal article
- Relation: Environments Vol. 10, no. 5 (2023), p. 77
- Full Text:
- Reviewed:
- Description: Estimating various properties of soil, including moisture, carbon, and nitrogen, is crucial for studying their correlation with plant health and food production. However, conventional methods such as oven-drying and chemical analysis are laborious, expensive, and only feasible for a limited land area. With the advent of remote sensing technologies like multi/hyperspectral imaging, it is now possible to predict soil properties non-invasively and cost-effectively for a large expanse of bare land. Recent research shows the possibility of predicting those soil contents from a wide range of hyperspectral data using good prediction algorithms. However, these kinds of hyperspectral sensors are expensive and not widely available. Therefore, this paper investigates different machine and deep learning techniques to predict soil nutrient properties using only the red (R), green (G), and blue (B) band data to propose a suitable machine/deep learning model that can be used as a rapid soil test. Another objective of this research is to observe and compare the prediction accuracy in three cases, namely (i) the hyperspectral band, (ii) the full spectrum of the visual band, and (iii) the three-channel RGB band, and to provide a guideline to the user on which spectrum information they should use to predict those soil properties. The outcome of this research helps to develop a mobile application that is easy to use for a quick soil test. This research also explores learning-based algorithms with significant feature combinations and their performance comparisons in predicting soil properties from visual band data. For this, we also explore the impact of dimensional reduction (i.e., principal component analysis) and transformations (i.e., empirical mode decomposition) of features. The results show that the proposed model can comparably predict the soil contents from the three-channel RGB data.
Cuboid coding of depth motion vectors using binary tree based decomposition
- Authors: Shahriyar, Shampa , Murshed, Manzur , Ali, Mortuza , Paul, Manoranjan
- Date: 2015
- Type: Text , Conference paper
- Relation: Data Compression Conference (DCC), 2015 p. 469
- Full Text: false
- Reviewed:
- Description: Motion vectors of depth-maps in multiview and free-viewpoint videos exhibit strong spatial as well as inter-component clustering tendency. This paper presents a novel motion vector coding technique that first compresses the multidimensional bitmaps of macro block mode information and then encodes only the non-zero components of motion vectors. The bitmaps are partitioned into disjoint cuboids using binary tree based decomposition so that the 0's and 1's are either highly polarized or further sub-partitioning is unlikely to achieve any compression. Each cuboid is entropy-coded as a unit using binary arithmetic coding. This technique is capable of exploiting the spatial and inter-component correlations efficiently without the restriction of scanning the bitmap in any specific linear order as needed by run-length coding. As encoding of non-zero component values no longer requires denoting the zero value, further compression efficiency is achieved. Experimental results on standard multiview test video sequences have comprehensively demonstrated the superiority of the proposed technique, achieving overall coding gain against the state-of-the-art in the range [17%,51%] and on average 31%.
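- Illustrative sketch (not taken from the paper): the abstract describes partitioning bitmaps into disjoint cuboids with a binary tree so that each cuboid is either highly polarized or not worth splitting further. The snippet below is a rough reading under stated assumptions: the coding cost of a cuboid is estimated by its zeroth-order empirical entropy rather than by an actual binary arithmetic coder, a split is accepted only when it saves more than a fixed `overhead_bits` (a hypothetical parameter standing in for the cost of signalling the split), and the function names are hypothetical.

```python
import math

import numpy as np


def entropy_bits(block):
    """Estimated bits to code a 0/1 block with a single fixed probability."""
    n = block.size
    ones = int(block.sum())
    if ones == 0 or ones == n:        # fully polarized: essentially free
        return 0.0
    p = ones / n
    return -n * (p * math.log2(p) + (1 - p) * math.log2(1 - p))


def decompose(bitmap, overhead_bits=2.0):
    """Recursively split a multidimensional bitmap into leaf cuboids.

    At each node every axis-aligned cut is tried; the cut with the lowest
    combined entropy is kept only if it beats the unsplit cost by more
    than overhead_bits. Returns the list of leaf sub-arrays.
    """
    whole = entropy_bits(bitmap)
    best = None
    for axis in range(bitmap.ndim):
        for i in range(1, bitmap.shape[axis]):
            a, b = np.split(bitmap, [i], axis=axis)
            cost = entropy_bits(a) + entropy_bits(b)
            if best is None or cost < best[0]:
                best = (cost, a, b)
    if best is None or whole - best[0] <= overhead_bits:
        return [bitmap]               # polarized enough, or no gain from splitting
    return decompose(best[1], overhead_bits) + decompose(best[2], overhead_bits)
```

Because cuts are tried along every axis, the decomposition is not tied to a linear scan order, which is the property the abstract contrasts with run-length coding.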
Depth sequence coding with hierarchical partitioning and spatial-domain quantization
- Authors: Shahriyar, Shampa , Murshed, Manzur , Ali, Mortuza , Paul, Manoranjan
- Date: 2020
- Type: Text , Journal article
- Relation: IEEE Transactions on Circuits and Systems for Video Technology Vol. 30, no. 3 (2020), p. 835-849
- Full Text:
- Reviewed:
- Description: Depth coding in 3D-HEVC deforms object shapes due to block-level edge-approximation and lacks efficient techniques to exploit the statistical redundancy, due to the frame-level clustering tendency in depth data, for higher coding gain at near-lossless quality. This paper presents a standalone mono-view depth sequence coder, which preserves edges implicitly by limiting quantization to the spatial-domain and exploits the frame-level clustering tendency efficiently with a novel binary tree-based decomposition (BTBD) technique. The BTBD can exploit the statistical redundancy in frame-level syntax, motion components, and residuals efficiently with fewer block-level prediction/coding modes and simpler context modeling for context-adaptive arithmetic coding. Compared with the depth coder in 3D-HEVC, the proposed one has achieved significantly lower bitrate at lossless to near-lossless quality range for mono-view coding and rendered superior quality synthetic views from the depth maps, compressed at the same bitrate, and the corresponding texture frames. © 1991-2012 IEEE.
Determination of munsell soil colour using smartphones
- Authors: Nodi, Sadia , Paul, Manoranjan , Robinson, Nathan , Wang, Liang , Rehman, Sabih
- Date: 2023
- Type: Text , Journal article
- Relation: Sensors Vol. 23, no. 6 (2023), p.
- Full Text:
- Reviewed:
- Description: Soil colour is one of the most important factors in agriculture for monitoring soil health and determining its properties. For this purpose, Munsell soil colour charts are widely used by archaeologists, scientists, and farmers. The process of determining soil colour from the chart is subjective and error-prone. In this study, we used popular smartphones to capture soil colours from images in the Munsell Soil Colour Book (MSCB) to determine the colour digitally. These captured soil colours are then compared with the true colour determined using a commonly used sensor (Nix Pro-2). We have observed that there are colour reading discrepancies between smartphone and Nix Pro-provided readings. To address this issue, we investigated different colour models and finally introduced a colour-intensity relationship between the images captured by Nix Pro and smartphones by exploring different distance functions. Thus, the aim of this study is to determine the Munsell soil colour accurately from the MSCB by adjusting the pixel intensity of the smartphone-captured images. Without any adjustment when the accuracy of individual Munsell soil colour determination is only (Formula presented.) for the top 5 predictions, the accuracy of the proposed method is (Formula presented.), which is significant. © 2023 by the authors.
Disparity-adjusted 3D multi-view video coding with dynamic background modelling
- Authors: Paul, Manoranjan , Evans, Christopher , Murshed, Manzur
- Date: 2013
- Type: Text , Conference paper
- Relation: Proceedings of IEEE International Conference on Image Processing (ICIP 2013). 15th-18th Sept, Melbourne, Vic. p.1719-1723
- Full Text: false
- Reviewed:
- Description: Capturing a scene using multiple cameras from different angles is expected to provide the necessary interactivity in the 3D space to satisfy end-users' demands for observing objects and actions from different angles and depths. Existing multiview video coding (MVC) technologies are not sufficiently agile to exploit the interactivity and are inefficient in terms of image quality and computational time. In this paper, a novel technique is proposed using disparity-adjusted 3D MVC (DA-3D-MVC) with 3D motion estimation (ME) and 3D coding to overcome these problems. In the proposed scheme, a 3D frame is formed using the same temporal frames of all disparity-adjusted views, and ME is carried out for the current 3D macroblock using the immediate previous 3D frame as a reference frame. Then, a 3D coding technique is used for better compression. As all the same temporal position frames of all views are encoded at the same time, the proposed scheme provides better interactivity and reduced computational time compared to the H.264/MVC. To improve the rate-distortion (RD) performance of the proposed technique, an additional reference frame comprising dynamic background is also used. Experimental results reveal that the proposed scheme outperforms the H.264/MVC in terms of RD performance, computational time, and interactivity.
Dynamic mesh commonality modeling using the cuboidal partitioning
- Authors: Ahmmed, Ashek , Paul, Manoranjan , Murshed, Manzur , Pickering, Mark
- Date: 2022
- Type: Text , Conference paper
- Relation: 2022 IEEE International Conference on Visual Communications and Image Processing, VCIP 2022, Suzhou, China, 13-16 December 2022, 2022 IEEE International Conference on Visual Communications and Image Processing, VCIP 2022
- Full Text: false
- Reviewed:
- Description: For 3D object representation, volumetric contents like meshes and point clouds provide suitable formats. However, a dynamic mesh sequence may require a significantly large amount of data because it consists of information that varies with time. Hence, for the facilitation of storage and transmission of such content, efficient compression technologies are required. MPEG has started standardization activities aiming to develop a mesh compression standard that would be able to handle dynamic meshes with time-varying connectivity information and time-varying attribute maps. The attribute maps are features associated with the mesh surface and stored as 2D images/videos. In this paper, we propose to capture the commonality information in the dynamic mesh attribute maps using the cuboidal partitioning algorithm. This algorithm is capable of modeling both the global and local commonality within an image in a compact and computationally efficient way. Experimental results show that the proposed approach can outperform the anchor HEVC codec, suggested by MPEG to encode such sequences, with bit rate savings of up to 3.66%. © 2022 IEEE.