Joint texture and depth coding using cuboid data compression
- Paul, Manoranjan, Chakraborty, Subrata, Murshed, Manzur, Podder, Pallab
- Authors: Paul, Manoranjan , Chakraborty, Subrata , Murshed, Manzur , Podder, Pallab
- Date: 2015
- Type: Text , Conference proceedings
- Relation: 2015 18th International Conference on Computer and Information Technology (ICCIT); Dhaka, Bangladesh; 21st-23rd December 2015 p. 138-143
- Full Text:
- Reviewed:
- Description: The latest multiview video coding (MVC) standards such as 3D-HEVC and H.264/MVC normally encodes texture and depth videos separately. Significant amount of rate-distortion performance and computational performance are sacrificed due to separate encoding due to the lack of exploitation of joint information. Obviously, separate encoding also creates synchronization issue for 3D scene formation in the decoder. Moreover, the hierarchical frame referencing architecture in the MVC creates random access frame delay. In this paper we develop an encoder and decoder framework where we can encode texture and depth video jointly by forming and encoding 3D cuboid using high dimensional entropy coding. The results from our experiments show that our proposed framework outperforms the 3D-HEVC in rate-distortion performance and reduces the computational time significantly by reducing random access frame delay.
- Authors: Paul, Manoranjan , Chakraborty, Subrata , Murshed, Manzur , Podder, Pallab
- Date: 2015
- Type: Text , Conference proceedings
- Relation: 2015 18th International Conference on Computer and Information Technology (ICCIT); Dhaka, Bangladesh; 21st-23rd December 2015 p. 138-143
- Full Text:
- Reviewed:
- Description: The latest multiview video coding (MVC) standards such as 3D-HEVC and H.264/MVC normally encodes texture and depth videos separately. Significant amount of rate-distortion performance and computational performance are sacrificed due to separate encoding due to the lack of exploitation of joint information. Obviously, separate encoding also creates synchronization issue for 3D scene formation in the decoder. Moreover, the hierarchical frame referencing architecture in the MVC creates random access frame delay. In this paper we develop an encoder and decoder framework where we can encode texture and depth video jointly by forming and encoding 3D cuboid using high dimensional entropy coding. The results from our experiments show that our proposed framework outperforms the 3D-HEVC in rate-distortion performance and reduces the computational time significantly by reducing random access frame delay.
QMET : A new quality assessment metric for no-reference video coding by using human eye traversal
- Podder, Pallab, Paul, Manoranjan, Murshed, Manzur
- Authors: Podder, Pallab , Paul, Manoranjan , Murshed, Manzur
- Date: 2016
- Type: Text , Conference proceedings
- Relation: 2016 International Conference on Image and Vision Computing New Zealand, IVCNZ 2016; Palmerston North, New Zealand; 21st-22nd November 2016 p. 1-6
- Full Text:
- Reviewed:
- Description: The subjective quality assessment (SQA) is an ever demanding approach due to its in-depth interactivity to the human cognition. The addition of no-reference based scheme could equip the SQA techniques to tackle further challenges. Existing widely used objective metrics-peak signal-to-noise ratio (PSNR), structural similarity index (SSIM) or the subjective estimator-mean opinion score (MOS) requires original image for quality evaluation that limits their uses for the situation having no-reference. In this work, we present a no-reference based SQA technique that could be an impressive substitute to the reference-based approaches for quality evaluation. The High Efficiency Video Coding (HEVC) reference test model (HM15.0) is first exploited to generate five different qualities of the HEVC recommended eight class sequences. To assess different aspects of coded video quality, a group of ten participants are employed and their eye-tracker (ET) recorded data demonstrate closer correlation among gaze plots for relatively better quality video contents. Therefore, we innovatively calculate the amount of approximation of smooth eye traversal (ASET) by using distance, angle, and pupil-size feature from recorded gaze trajectory data and develop a new-quality metric based on eye traversal (QMET). Experimental results show that the quality evaluation carried out by QMET is highly correlated to the HM recommended coding quality. The performance of the QMET is also compared with the PSNR and SSIM metrics to justify the effectiveness of each other.
- Description: International Conference Image and Vision Computing New Zealand
- Authors: Podder, Pallab , Paul, Manoranjan , Murshed, Manzur
- Date: 2016
- Type: Text , Conference proceedings
- Relation: 2016 International Conference on Image and Vision Computing New Zealand, IVCNZ 2016; Palmerston North, New Zealand; 21st-22nd November 2016 p. 1-6
- Full Text:
- Reviewed:
- Description: The subjective quality assessment (SQA) is an ever demanding approach due to its in-depth interactivity to the human cognition. The addition of no-reference based scheme could equip the SQA techniques to tackle further challenges. Existing widely used objective metrics-peak signal-to-noise ratio (PSNR), structural similarity index (SSIM) or the subjective estimator-mean opinion score (MOS) requires original image for quality evaluation that limits their uses for the situation having no-reference. In this work, we present a no-reference based SQA technique that could be an impressive substitute to the reference-based approaches for quality evaluation. The High Efficiency Video Coding (HEVC) reference test model (HM15.0) is first exploited to generate five different qualities of the HEVC recommended eight class sequences. To assess different aspects of coded video quality, a group of ten participants are employed and their eye-tracker (ET) recorded data demonstrate closer correlation among gaze plots for relatively better quality video contents. Therefore, we innovatively calculate the amount of approximation of smooth eye traversal (ASET) by using distance, angle, and pupil-size feature from recorded gaze trajectory data and develop a new-quality metric based on eye traversal (QMET). Experimental results show that the quality evaluation carried out by QMET is highly correlated to the HM recommended coding quality. The performance of the QMET is also compared with the PSNR and SSIM metrics to justify the effectiveness of each other.
- Description: International Conference Image and Vision Computing New Zealand
Video coding using arbitrarily shaped block partitions in globally optimal perspective
- Paul, Manoranjan, Murshed, Manzur
- Authors: Paul, Manoranjan , Murshed, Manzur
- Date: 2011
- Type: Text , Journal article
- Relation: EURASIP Journal on Advances in Signal Processing Vol. 16, no. (2011), p.
- Full Text:
- Reviewed:
- Description: Algorithms using content-based patterns to segment moving regions at the macroblock (MB) level have exhibited good potential for improved coding efficiency when embedded into the H.264 standard as an extra mode. The content-based pattern generation (CPG) algorithm provides local optimal result as only one pattern can be optimally generated from a given set of moving regions. But, it failed to provide optimal results for multiple patterns from entire sets. Obviously, a global optimal solution for clustering the set and then generation of multiple patterns enhances the performance farther. But a global optimal solution is not achievable due to the non-polynomial nature of the clustering problem. In this paper, we propose a near-optimal content-based pattern generation (OCPG) algorithm which outperforms the existing approach. Coupling OCPG, generating a set of patterns after clustering the MBs into several disjoint sets, with a direct pattern selection algorithm by allowing all the MBs in multiple pattern modes outperforms the existing pattern-based coding when embedded into the H.264.
- Authors: Paul, Manoranjan , Murshed, Manzur
- Date: 2011
- Type: Text , Journal article
- Relation: EURASIP Journal on Advances in Signal Processing Vol. 16, no. (2011), p.
- Full Text:
- Reviewed:
- Description: Algorithms using content-based patterns to segment moving regions at the macroblock (MB) level have exhibited good potential for improved coding efficiency when embedded into the H.264 standard as an extra mode. The content-based pattern generation (CPG) algorithm provides local optimal result as only one pattern can be optimally generated from a given set of moving regions. But, it failed to provide optimal results for multiple patterns from entire sets. Obviously, a global optimal solution for clustering the set and then generation of multiple patterns enhances the performance farther. But a global optimal solution is not achievable due to the non-polynomial nature of the clustering problem. In this paper, we propose a near-optimal content-based pattern generation (OCPG) algorithm which outperforms the existing approach. Coupling OCPG, generating a set of patterns after clustering the MBs into several disjoint sets, with a direct pattern selection algorithm by allowing all the MBs in multiple pattern modes outperforms the existing pattern-based coding when embedded into the H.264.
Fast mode decision in the HEVC Video coding standard by exploiting region with dominated motion and saliency features
- Podder, Pallab, Paul, Manoranjan, Murshed, Manzur
- Authors: Podder, Pallab , Paul, Manoranjan , Murshed, Manzur
- Date: 2012
- Type: Text , Journal article
- Relation: PLoS ONE Vol. Vol.11, no. 3 (2012), p. p.e0150673
- Full Text:
- Reviewed:
- Description: The emerging High Efficiency Video Coding (HEVC) standard introduces a number of innovative and powerful coding tools to acquire better compression efficiency compared to its predecessor H.264. The encoding time complexities have also increased multiple times that is not suitable for realtime video coding applications. To address this limitation, this paper employs a novel coding strategy to reduce the time complexity in HEVC encoder by efficient selection of appropriate block-partitioning modes based on human visual features (HVF). The HVF in the proposed technique comprise with human visual attention modelling-based saliency feature and phase correlation-based motion features. The features are innovatively combined through a fusion process by developing a content-based adaptive weighted cost function to determine the region with dominated motion/saliency (RDMS)- based binary pattern for the current block. The generated binary pattern is then compared with a codebook of predefined binary pattern templates aligned to the HEVC recommended block-paritioning to estimate a subset of inter-prediction modes. Without exhaustive exploration of all modes available in the HEVC standard, only the selected subset of modes are motion estimated and motion compensated for a particular coding unit. The experimental evaluation reveals that the proposed technique notably down-scales the average computational time of the latest HEVC reference encoder by 34% while providing similar rate-distortion (RD) performance for a wide range of video sequences.
- Authors: Podder, Pallab , Paul, Manoranjan , Murshed, Manzur
- Date: 2012
- Type: Text , Journal article
- Relation: PLoS ONE Vol. Vol.11, no. 3 (2012), p. p.e0150673
- Full Text:
- Reviewed:
- Description: The emerging High Efficiency Video Coding (HEVC) standard introduces a number of innovative and powerful coding tools to acquire better compression efficiency compared to its predecessor H.264. The encoding time complexities have also increased multiple times that is not suitable for realtime video coding applications. To address this limitation, this paper employs a novel coding strategy to reduce the time complexity in HEVC encoder by efficient selection of appropriate block-partitioning modes based on human visual features (HVF). The HVF in the proposed technique comprise with human visual attention modelling-based saliency feature and phase correlation-based motion features. The features are innovatively combined through a fusion process by developing a content-based adaptive weighted cost function to determine the region with dominated motion/saliency (RDMS)- based binary pattern for the current block. The generated binary pattern is then compared with a codebook of predefined binary pattern templates aligned to the HEVC recommended block-paritioning to estimate a subset of inter-prediction modes. Without exhaustive exploration of all modes available in the HEVC standard, only the selected subset of modes are motion estimated and motion compensated for a particular coding unit. The experimental evaluation reveals that the proposed technique notably down-scales the average computational time of the latest HEVC reference encoder by 34% while providing similar rate-distortion (RD) performance for a wide range of video sequences.
Adaptive weighted non-parametric background model for efficient video coding
- Chakraborty, Subrata, Paul, Manoranjan, Murshed, Manzur, Ali, Mortuza
- Authors: Chakraborty, Subrata , Paul, Manoranjan , Murshed, Manzur , Ali, Mortuza
- Date: 2017
- Type: Text , Journal article
- Relation: Neurocomputing Vol. 226, no. (2017), p. 35-45
- Full Text:
- Reviewed:
- Description: Dynamic background frame based video coding using mixture of Gaussian (MoG) based background modelling has achieved better rate distortion performance compared to the H.264 standard. However, they suffer from high computation time, low coding efficiency for dynamic videos, and prior knowledge requirement of video content. In this paper, we introduce the application of the non-parametric (NP) background modelling approach for video coding domain. We present a novel background modelling technique, called weighted non-parametric (WNP) which balances the historical trend and the recent value of the pixel intensities adaptively based on the content and characteristics of any particular video. WNP is successfully embedded into the latest HEVC video coding standard for better rate-distortion performance. Moreover, a novel scene adaptive non-parametric (SANP) technique is also developed to handle video sequences with high dynamic background. Being non-parametric, the proposed techniques naturally exhibit superior performance in dynamic background modelling without a priori knowledge of video data distribution.
- Authors: Chakraborty, Subrata , Paul, Manoranjan , Murshed, Manzur , Ali, Mortuza
- Date: 2017
- Type: Text , Journal article
- Relation: Neurocomputing Vol. 226, no. (2017), p. 35-45
- Full Text:
- Reviewed:
- Description: Dynamic background frame based video coding using mixture of Gaussian (MoG) based background modelling has achieved better rate distortion performance compared to the H.264 standard. However, they suffer from high computation time, low coding efficiency for dynamic videos, and prior knowledge requirement of video content. In this paper, we introduce the application of the non-parametric (NP) background modelling approach for video coding domain. We present a novel background modelling technique, called weighted non-parametric (WNP) which balances the historical trend and the recent value of the pixel intensities adaptively based on the content and characteristics of any particular video. WNP is successfully embedded into the latest HEVC video coding standard for better rate-distortion performance. Moreover, a novel scene adaptive non-parametric (SANP) technique is also developed to handle video sequences with high dynamic background. Being non-parametric, the proposed techniques naturally exhibit superior performance in dynamic background modelling without a priori knowledge of video data distribution.
Lossless hyperspectral image compression using binary tree based decomposition
- Shahriyar, Shampa, Paul, Manoranjan, Murshed, Manzur, Ali, Mortuza
- Authors: Shahriyar, Shampa , Paul, Manoranjan , Murshed, Manzur , Ali, Mortuza
- Date: 2016
- Type: Text , Conference proceedings
- Relation: 2016 International Conference on Digital Image Computing: Techniques and Applications (Dicta); Gold Coast, Australia; 30th November-2nd December 2016 p. 428-435
- Full Text:
- Reviewed:
- Description: A Hyperspectral (HS) image provides observational powers beyond human vision capability but represents more than 100 times data compared to a traditional image. To transmit and store the huge volume of an HS image, we argue that a fundamental shift is required from the existing "original pixel intensity"based coding approaches using traditional image coders (e.g. JPEG) to the "residual" based approaches using a predictive coder exploiting band-wise correlation for better compression performance. Moreover, as HS images are used in detection or classification they need to be in original form; lossy schemes can trim off uninteresting data along with compression, which can be important to specific analysis purposes. A modified lossless HS coder is required to exploit spatial- spectral redundancy using predictive residual coding. Every spectral band of an HS image can be treated like they are the individual frame of a video to impose inter band prediction. In this paper, we propose a binary tree based lossless predictive HS coding scheme that arranges the residual frame into integer residual bitmap. High spatial correlation in HS residual frame is exploited by creating large homogeneous blocks of adaptive size, which are then coded as a unit using context based arithmetic coding. On the standard HS data set, the proposed lossless predictive coding has achieved compression ratio in the range of 1.92 to 7.94. In this paper, we compare the proposed method with mainstream lossless coders (JPEG-LS and lossless HEVC). For JPEG-LS, HEVCIntra and HEVCMain, proposed technique has reduced bit-rate by 35%, 40% and 6.79% respectively by exploiting spatial correlation in predicted HS residuals.
- Authors: Shahriyar, Shampa , Paul, Manoranjan , Murshed, Manzur , Ali, Mortuza
- Date: 2016
- Type: Text , Conference proceedings
- Relation: 2016 International Conference on Digital Image Computing: Techniques and Applications (Dicta); Gold Coast, Australia; 30th November-2nd December 2016 p. 428-435
- Full Text:
- Reviewed:
- Description: A Hyperspectral (HS) image provides observational powers beyond human vision capability but represents more than 100 times data compared to a traditional image. To transmit and store the huge volume of an HS image, we argue that a fundamental shift is required from the existing "original pixel intensity"based coding approaches using traditional image coders (e.g. JPEG) to the "residual" based approaches using a predictive coder exploiting band-wise correlation for better compression performance. Moreover, as HS images are used in detection or classification they need to be in original form; lossy schemes can trim off uninteresting data along with compression, which can be important to specific analysis purposes. A modified lossless HS coder is required to exploit spatial- spectral redundancy using predictive residual coding. Every spectral band of an HS image can be treated like they are the individual frame of a video to impose inter band prediction. In this paper, we propose a binary tree based lossless predictive HS coding scheme that arranges the residual frame into integer residual bitmap. High spatial correlation in HS residual frame is exploited by creating large homogeneous blocks of adaptive size, which are then coded as a unit using context based arithmetic coding. On the standard HS data set, the proposed lossless predictive coding has achieved compression ratio in the range of 1.92 to 7.94. In this paper, we compare the proposed method with mainstream lossless coders (JPEG-LS and lossless HEVC). For JPEG-LS, HEVCIntra and HEVCMain, proposed technique has reduced bit-rate by 35%, 40% and 6.79% respectively by exploiting spatial correlation in predicted HS residuals.
Fast coding strategy for HEVC by motion features and saliency applied on difference between successive image blocks
- Podder, Pallab, Paul, Manoranjan, Murshed, Manzur
- Authors: Podder, Pallab , Paul, Manoranjan , Murshed, Manzur
- Date: 2015
- Type: Text , Conference proceedings
- Relation: ConferencePacific-Rim Symposium on Image and Video Technology, Auckland, 23-27th Nov, 2016, In Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics).9431 p. 175-186
- Relation: http://purl.org/au-research/grants/arc/DP130103670
- Full Text:
- Reviewed:
- Description: Introducing a number of innovative and powerful coding tools, the High Efficiency Video Coding (HEVC) standard promises double compression efficiency, compared to its predecessor H.264, with similar perceptual quality. The increased computational time complexity is an important issue for the video coding research community as well. An attempt to reduce this complexity of HEVC is adopted in this paper, by efficient selection of appropriate block-partitioning modes based on motion features and the saliency applied to the difference between successive image blocks. As this difference gives us the explicit visible motion and salient information, we develop a cost function by combining the motion features and image difference salient feature. The combined features are then converted into area of interest (AOI) based binary pattern for the current block. This pattern is then compared with a previously defined codebook of binary pattern templates for a subset of mode selection. Motion estimation (ME) and motion compensation (MC) are performed only on the selected subset of modes, without exhaustive exploration of all modes available in HEVC. The experimental results reveal a reduction of 42% encoding time complexity of HEVC encoder with similar subjective and objective image quality.
- Authors: Podder, Pallab , Paul, Manoranjan , Murshed, Manzur
- Date: 2015
- Type: Text , Conference proceedings
- Relation: ConferencePacific-Rim Symposium on Image and Video Technology, Auckland, 23-27th Nov, 2016, In Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics).9431 p. 175-186
- Relation: http://purl.org/au-research/grants/arc/DP130103670
- Full Text:
- Reviewed:
- Description: Introducing a number of innovative and powerful coding tools, the High Efficiency Video Coding (HEVC) standard promises double compression efficiency, compared to its predecessor H.264, with similar perceptual quality. The increased computational time complexity is an important issue for the video coding research community as well. An attempt to reduce this complexity of HEVC is adopted in this paper, by efficient selection of appropriate block-partitioning modes based on motion features and the saliency applied to the difference between successive image blocks. As this difference gives us the explicit visible motion and salient information, we develop a cost function by combining the motion features and image difference salient feature. The combined features are then converted into area of interest (AOI) based binary pattern for the current block. This pattern is then compared with a previously defined codebook of binary pattern templates for a subset of mode selection. Motion estimation (ME) and motion compensation (MC) are performed only on the selected subset of modes, without exhaustive exploration of all modes available in HEVC. The experimental results reveal a reduction of 42% encoding time complexity of HEVC encoder with similar subjective and objective image quality.
An efficient video coding technique using a novel non-parametric background model
- Chakraborty, Subrata, Paul, Manoranjan, Murshed, Manzur, Ali, Mortuza
- Authors: Chakraborty, Subrata , Paul, Manoranjan , Murshed, Manzur , Ali, Mortuza
- Date: 2014
- Type: Text , Conference proceedings
- Relation: 2014 IEEE International Conference on Multimedia and Expo Workshops, ICMEW 2014; Chengdu; China; 14th-18th July 2014 p. 1-6
- Full Text:
- Reviewed:
- Description: Video coding technique with a background frame, extracted from mixture of Gaussian (MoG) based background modeling, provides better rate distortion performance by exploiting coding efficiency in uncovered background areas compared to the latest video coding standard. However, it suffers from high computation time, low coding efficiency for dynamic videos, and prior knowledge requirement of video content. In this paper, we present a novel adaptive weighted non-parametric (WNP) background modeling technique and successfully embed it into HEVC video coding standard. Being non-parametric (NP), the proposed technique naturally exhibits superior performance in dynamic background scenarios compared to MoG-based technique without a priori knowledge of video data distribution. In addition, the WNP technique significantly reduces noise-related drawbacks of existing NP techniques to provide better quality video coding with much lower computation time as demonstrated through extensive comparative studies against NP, MoG and HEVC techniques.
- Authors: Chakraborty, Subrata , Paul, Manoranjan , Murshed, Manzur , Ali, Mortuza
- Date: 2014
- Type: Text , Conference proceedings
- Relation: 2014 IEEE International Conference on Multimedia and Expo Workshops, ICMEW 2014; Chengdu; China; 14th-18th July 2014 p. 1-6
- Full Text:
- Reviewed:
- Description: Video coding technique with a background frame, extracted from mixture of Gaussian (MoG) based background modeling, provides better rate distortion performance by exploiting coding efficiency in uncovered background areas compared to the latest video coding standard. However, it suffers from high computation time, low coding efficiency for dynamic videos, and prior knowledge requirement of video content. In this paper, we present a novel adaptive weighted non-parametric (WNP) background modeling technique and successfully embed it into HEVC video coding standard. Being non-parametric (NP), the proposed technique naturally exhibits superior performance in dynamic background scenarios compared to MoG-based technique without a priori knowledge of video data distribution. In addition, the WNP technique significantly reduces noise-related drawbacks of existing NP techniques to provide better quality video coding with much lower computation time as demonstrated through extensive comparative studies against NP, MoG and HEVC techniques.
Improved depth coding for HEVC focusing on depth edge approximation
- Podder, Pallab, Paul, Manoranjan, Rahaman, Motiur, Murshed, Manzur
- Authors: Podder, Pallab , Paul, Manoranjan , Rahaman, Motiur , Murshed, Manzur
- Date: 2017
- Type: Text , Journal article , acceptedVersion
- Relation: Signal Processing: Image Communication Vol. 55, no. (2017), p. 80-92
- Full Text:
- Reviewed:
- Description: The latest High Efficiency Video Coding (HEVC) standard has greatly improved the coding efficiency compared to its predecessor H.264. An important share of which is the adoption of hierarchical block partitioning structures and an extended number of modes. The structure of existing inter-modes is appropriate mainly to handle the rectangular and square aligned motion patterns. However, they could not be suitable for the block partitioning of depth objects having partial foreground motion with irregular edges and background. In such cases, the HEVC reference test model (HM) normally explores finer level block partitioning that requires more bits and encoding time to compensate large residuals. Since motion detection is the underlying criteria for mode selection, in this work, we use the energy concentration ratio feature of phase correlation to capture different types of motion in depth object. For better motion modeling focusing at depth edges, the proposed technique also uses an extra pattern mode comprising a group of templates with various rectangular and non-rectangular object shapes and edges. As the pattern mode could save bits by encoding only the foreground areas and beat all other inter-modes in a block once selected, the proposed technique could improve the rate-distortion performance. It could also reduce encoding time by skipping further branching using the pattern mode and selecting a subset of modes using innovative pre-processing criteria. Experimentally it could save 29% average encoding time and improve 0.10 dB Bjontegaard Delta peak signal-to-noise ratio compared to the HM.
- Authors: Podder, Pallab , Paul, Manoranjan , Rahaman, Motiur , Murshed, Manzur
- Date: 2017
- Type: Text , Journal article , acceptedVersion
- Relation: Signal Processing: Image Communication Vol. 55, no. (2017), p. 80-92
- Full Text:
- Reviewed:
- Description: The latest High Efficiency Video Coding (HEVC) standard has greatly improved the coding efficiency compared to its predecessor H.264. An important share of which is the adoption of hierarchical block partitioning structures and an extended number of modes. The structure of existing inter-modes is appropriate mainly to handle the rectangular and square aligned motion patterns. However, they could not be suitable for the block partitioning of depth objects having partial foreground motion with irregular edges and background. In such cases, the HEVC reference test model (HM) normally explores finer level block partitioning that requires more bits and encoding time to compensate large residuals. Since motion detection is the underlying criteria for mode selection, in this work, we use the energy concentration ratio feature of phase correlation to capture different types of motion in depth object. For better motion modeling focusing at depth edges, the proposed technique also uses an extra pattern mode comprising a group of templates with various rectangular and non-rectangular object shapes and edges. As the pattern mode could save bits by encoding only the foreground areas and beat all other inter-modes in a block once selected, the proposed technique could improve the rate-distortion performance. It could also reduce encoding time by skipping further branching using the pattern mode and selecting a subset of modes using innovative pre-processing criteria. Experimentally it could save 29% average encoding time and improve 0.10 dB Bjontegaard Delta peak signal-to-noise ratio compared to the HM.
Lossless image coding using hierarchical decomposition and recursive partitioning
- Ali, Mortuza, Murshed, Manzur, Shahriyar, Shampa, Paul, Manoranjan
- Authors: Ali, Mortuza , Murshed, Manzur , Shahriyar, Shampa , Paul, Manoranjan
- Date: 2016
- Type: Text , Journal article
- Relation: APSIPA Transactions on Signal and Information Processing Vol. 5, no. (2016), p. 1-11
- Relation: http://purl.org/au-research/grants/arc/DP130103670
- Full Text:
- Reviewed:
- Description: State-Of-The-Art lossless image compression schemes, such as JPEG-LS and CALIC, have been proposed in the context-adaptive predictive coding framework. These schemes involve a prediction step followed by context-adaptive entropy coding of the residuals. However, the models for context determination proposed in the literature, have been designed using ad-hoc techniques. In this paper, we take an alternative approach where we fix a simpler context model and then rely on a systematic technique to efficiently exploit spatial correlation to achieve efficient compression. The essential idea is to decompose the image into binary bitmaps such that the spatial correlation that exists among non-binary symbols is captured as the correlation among few bit positions. The proposed scheme then encodes the bitmaps in a particular order based on the simple context model. However, instead of encoding a bitmap as a whole, we partition it into rectangular blocks, induced by a binary tree, and then independently encode the blocks. The motivation for partitioning is to explicitly identify the blocks within which the statistical correlation remains the same. On a set of standard test images, the proposed scheme, using the same predictor as JPEG-LS, achieved an overall bit-rate saving of 1.56% against JPEG-LS. © 2016 The Authors.
- Authors: Ali, Mortuza , Murshed, Manzur , Shahriyar, Shampa , Paul, Manoranjan
- Date: 2016
- Type: Text , Journal article
- Relation: APSIPA Transactions on Signal and Information Processing Vol. 5, no. (2016), p. 1-11
- Relation: http://purl.org/au-research/grants/arc/DP130103670
- Full Text:
- Reviewed:
- Description: State-Of-The-Art lossless image compression schemes, such as JPEG-LS and CALIC, have been proposed in the context-adaptive predictive coding framework. These schemes involve a prediction step followed by context-adaptive entropy coding of the residuals. However, the models for context determination proposed in the literature, have been designed using ad-hoc techniques. In this paper, we take an alternative approach where we fix a simpler context model and then rely on a systematic technique to efficiently exploit spatial correlation to achieve efficient compression. The essential idea is to decompose the image into binary bitmaps such that the spatial correlation that exists among non-binary symbols is captured as the correlation among few bit positions. The proposed scheme then encodes the bitmaps in a particular order based on the simple context model. However, instead of encoding a bitmap as a whole, we partition it into rectangular blocks, induced by a binary tree, and then independently encode the blocks. The motivation for partitioning is to explicitly identify the blocks within which the statistical correlation remains the same. On a set of standard test images, the proposed scheme, using the same predictor as JPEG-LS, achieved an overall bit-rate saving of 1.56% against JPEG-LS. © 2016 The Authors.
A novel no-reference subjective quality metric for free viewpoint video using human eye movement
- Podder, Pallab, Paul, Manoranjan, Murshed, Manzur
- Authors: Podder, Pallab , Paul, Manoranjan , Murshed, Manzur
- Date: 2018
- Type: Text , Conference proceedings
- Relation: 8th Pacific-Rim Symposium on Image and Video Technology, PSIVT 2017; Wuhan, China; 20th-24th November 2017; published in Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) Vol. 10749 LNCS, p. 237-251
- Full Text:
- Reviewed:
- Description: The free viewpoint video (FVV) allows users to interactively control the viewpoint and generate new views of a dynamic scene from any 3D position for better 3D visual experience with depth perception. Multiview video coding exploits both texture and depth video information from various angles to encode a number of views to facilitate FVV. The usual practice for the single view or multiview quality assessment is characterized by evolving the objective quality assessment metrics due to their simplicity and real time applications such as the peak signal-to-noise ratio (PSNR) or the structural similarity index (SSIM). However, the PSNR or SSIM requires reference image for quality evaluation and could not be successfully employed in FVV as the new view in FVV does not have any reference view to compare with. Conversely, the widely used subjective estimator- mean opinion score (MOS) is often biased by the testing environment, viewers mode, domain knowledge, and many other factors that may actively influence on actual assessment. To address this limitation, in this work, we devise a no-reference subjective quality assessment metric by simply exploiting the pattern of human eye browsing on FVV. Over different quality contents of FVV, the participants eye-tracker recorded spatio-temporal gaze-data indicate more concentrated eye-traversing approach for relatively better quality. Thus, we calculate the Length, Angle, Pupil-size, and Gaze-duration features from the recorded gaze trajectory. The content and resolution invariant operation is carried out prior to synthesizing them using an adaptive weighted function to develop a new quality metric using eye traversal (QMET). Tested results reveal that the proposed QMET performs better than the SSIM and MOS in terms of assessing different aspects of coded video quality for a wide range of FVV contents.
- Description: Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
- Authors: Podder, Pallab , Paul, Manoranjan , Murshed, Manzur
- Date: 2018
- Type: Text , Conference proceedings
- Relation: 8th Pacific-Rim Symposium on Image and Video Technology, PSIVT 2017; Wuhan, China; 20th-24th November 2017; published in Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) Vol. 10749 LNCS, p. 237-251
- Full Text:
- Reviewed:
- Description: The free viewpoint video (FVV) allows users to interactively control the viewpoint and generate new views of a dynamic scene from any 3D position for better 3D visual experience with depth perception. Multiview video coding exploits both texture and depth video information from various angles to encode a number of views to facilitate FVV. The usual practice for the single view or multiview quality assessment is characterized by evolving the objective quality assessment metrics due to their simplicity and real time applications such as the peak signal-to-noise ratio (PSNR) or the structural similarity index (SSIM). However, the PSNR or SSIM requires reference image for quality evaluation and could not be successfully employed in FVV as the new view in FVV does not have any reference view to compare with. Conversely, the widely used subjective estimator- mean opinion score (MOS) is often biased by the testing environment, viewers mode, domain knowledge, and many other factors that may actively influence on actual assessment. To address this limitation, in this work, we devise a no-reference subjective quality assessment metric by simply exploiting the pattern of human eye browsing on FVV. Over different quality contents of FVV, the participants eye-tracker recorded spatio-temporal gaze-data indicate more concentrated eye-traversing approach for relatively better quality. Thus, we calculate the Length, Angle, Pupil-size, and Gaze-duration features from the recorded gaze trajectory. The content and resolution invariant operation is carried out prior to synthesizing them using an adaptive weighted function to develop a new quality metric using eye traversal (QMET). Tested results reveal that the proposed QMET performs better than the SSIM and MOS in terms of assessing different aspects of coded video quality for a wide range of FVV contents.
- Description: Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
Efficient video coding using visual sensitive information for HEVC coding standard
- Podder, Pallab, Paul, Manoranjan, Murshed, Manzur
- Authors: Podder, Pallab , Paul, Manoranjan , Murshed, Manzur
- Date: 2018
- Type: Text , Journal article
- Relation: IEEE Access Vol. 6, no. (2018), p. 75695-75708
- Full Text:
- Reviewed:
- Description: The latest high efficiency video coding (HEVC) standard introduces a large number of inter-mode block partitioning modes. The HEVC reference test model (HM) uses partially exhaustive tree-structured mode selection, which still explores a large number of prediction unit (PU) modes for a coding unit (CU). This impacts on encoding time rise which deprives a number of electronic devices having limited processing resources to use various features of HEVC. By analyzing the homogeneity, residual, and different statistical correlation among modes, many researchers speed-up the encoding process through the number of PU mode reduction. However, these approaches could not demonstrate the similar rate-distortion (RD) performance with the HM due to their dependency on existing Lagrangian cost function (LCF) within the HEVC framework. In this paper, to avoid the complete dependency on LCF in the initial phase, we exploit visual sensitive foreground motion and spatial salient metric (FMSSM) in a block. To capture its motion and saliency features, we use the dynamic background and visual saliency modeling, respectively. According to the FMSSM values, a subset of PU modes is then explored for encoding the CU. This preprocessing phase is independent from the existing LCF. As the proposed coding technique further reduces the number of PU modes using two simple criteria (i.e., motion and saliency), it outperforms the HM in terms of encoding time reduction. As it also encodes the uncovered and static background areas using the dynamic background frame as a substituted reference frame, it does not sacrifice quality. Tested results reveal that the proposed method achieves 32% average encoding time reduction of the HM without any quality loss for a wide range of videos.
- Authors: Podder, Pallab , Paul, Manoranjan , Murshed, Manzur
- Date: 2018
- Type: Text , Journal article
- Relation: IEEE Access Vol. 6, no. (2018), p. 75695-75708
- Full Text:
- Reviewed:
- Description: The latest high efficiency video coding (HEVC) standard introduces a large number of inter-mode block partitioning modes. The HEVC reference test model (HM) uses partially exhaustive tree-structured mode selection, which still explores a large number of prediction unit (PU) modes for a coding unit (CU). This impacts on encoding time rise which deprives a number of electronic devices having limited processing resources to use various features of HEVC. By analyzing the homogeneity, residual, and different statistical correlation among modes, many researchers speed-up the encoding process through the number of PU mode reduction. However, these approaches could not demonstrate the similar rate-distortion (RD) performance with the HM due to their dependency on existing Lagrangian cost function (LCF) within the HEVC framework. In this paper, to avoid the complete dependency on LCF in the initial phase, we exploit visual sensitive foreground motion and spatial salient metric (FMSSM) in a block. To capture its motion and saliency features, we use the dynamic background and visual saliency modeling, respectively. According to the FMSSM values, a subset of PU modes is then explored for encoding the CU. This preprocessing phase is independent from the existing LCF. As the proposed coding technique further reduces the number of PU modes using two simple criteria (i.e., motion and saliency), it outperforms the HM in terms of encoding time reduction. As it also encodes the uncovered and static background areas using the dynamic background frame as a substituted reference frame, it does not sacrifice quality. Tested results reveal that the proposed method achieves 32% average encoding time reduction of the HM without any quality loss for a wide range of videos.
Fast intermode selection for HEVC video coding using phase correlation
- Podder, Pallab, Paul, Manoranjan, Murshed, Manzur, Chakraborty, Subrata
- Authors: Podder, Pallab , Paul, Manoranjan , Murshed, Manzur , Chakraborty, Subrata
- Date: 2015
- Type: Text , Conference proceedings , Conference paper
- Relation: 2014 International Conference on Digital Image Computing: Techniques and Applications, DICTA 2014; Wollongong, Australia; 25th-27th November 2014 p. 1-8
- Relation: http://purl.org/au-research/grants/arc/DP130103670
- Full Text:
- Reviewed:
- Description: The recent High Efficiency Video Coding (HEVC) Standard demonstrates higher rate-distortion (RD) performance compared to its predecessor H.264/AVC using different new tools especially larger and asymmetric inter-mode variable size motion estimation and compensation. This requires more than 4 times computational time compared to H.264/AVC. As a result it has always been a big concern for the researchers to reduce the amount of time while maintaining the standard quality of the video. The reduction of computational time by smart selection of the appropriate modes in HEVC is our motivation. To accomplish this task in this paper, we use phase correlation to approximate the motion information between current and reference blocks by comparing with a number of different binary pattern templates and then select a subset of motion estimation modes without exhaustively exploring all possible modes. The experimental results exhibit that the proposed HEVC-PC (HEVC with Phase Correlation) scheme outperforms the standard HEVC scheme in terms of computational time while preserving-the same quality of the video sequences. More specifically, around 40% encoding time is reduced compared to the exhaustive mode selection in HEVC. © 2014 IEEE.
- Description: 2014 International Conference on Digital Image Computing: Techniques and Applications, DICTA 2014
- Authors: Podder, Pallab , Paul, Manoranjan , Murshed, Manzur , Chakraborty, Subrata
- Date: 2015
- Type: Text , Conference proceedings , Conference paper
- Relation: 2014 International Conference on Digital Image Computing: Techniques and Applications, DICTA 2014; Wollongong, Australia; 25th-27th November 2014 p. 1-8
- Relation: http://purl.org/au-research/grants/arc/DP130103670
- Full Text:
- Reviewed:
- Description: The recent High Efficiency Video Coding (HEVC) Standard demonstrates higher rate-distortion (RD) performance compared to its predecessor H.264/AVC using different new tools especially larger and asymmetric inter-mode variable size motion estimation and compensation. This requires more than 4 times computational time compared to H.264/AVC. As a result it has always been a big concern for the researchers to reduce the amount of time while maintaining the standard quality of the video. The reduction of computational time by smart selection of the appropriate modes in HEVC is our motivation. To accomplish this task in this paper, we use phase correlation to approximate the motion information between current and reference blocks by comparing with a number of different binary pattern templates and then select a subset of motion estimation modes without exhaustively exploring all possible modes. The experimental results exhibit that the proposed HEVC-PC (HEVC with Phase Correlation) scheme outperforms the standard HEVC scheme in terms of computational time while preserving-the same quality of the video sequences. More specifically, around 40% encoding time is reduced compared to the exhaustive mode selection in HEVC. © 2014 IEEE.
- Description: 2014 International Conference on Digital Image Computing: Techniques and Applications, DICTA 2014
A novel quality metric using spatiotemporal correlational data of human eye maneuver
- Podder, Pallab, Paul, Manoranjan, Murshed, Manzur
- Authors: Podder, Pallab , Paul, Manoranjan , Murshed, Manzur
- Date: 2017
- Type: Text , Conference proceedings
- Relation: 2017 International Conference on Digital Image Computing : Techniques and Applications, DICTA 2017; Sydney, Australia; 29th November-1st December 2017 Vol. 2017-December, p. 1-8
- Full Text:
- Reviewed:
- Description: The popularly used subjective estimator- mean opinion score (MOS) is often biased by the testing environment, viewers mode, domain expertise, and many other factors that may actively influence on actual assessment. We therefore, devise a no- reference subjective quality assessment metric by exploiting the nature of human eye browsing on videos. The participants' eye-tracker recorded gaze-data indicate more concentrated eye- traversing approach for relatively better quality. We calculate the Length, Angle, Pupil-size, and Gaze-duration features from the recorded gaze trajectory. The content and resolution invariant operation is carried out prior to synthesizing them using an adaptive weighted function to develop a new quality metric using eye traversal (QMET). Tested results reveal that the quality evaluation carried out by QMET demonstrates a strong correlation with the most widely used peak signal-to-noise ratio (PSNR), structural similarity index (SSIM), and the MOS.
- Description: DICTA 2017 - 2017 International Conference on Digital Image Computing: Techniques and Applications
- Authors: Podder, Pallab , Paul, Manoranjan , Murshed, Manzur
- Date: 2017
- Type: Text , Conference proceedings
- Relation: 2017 International Conference on Digital Image Computing : Techniques and Applications, DICTA 2017; Sydney, Australia; 29th November-1st December 2017 Vol. 2017-December, p. 1-8
- Full Text:
- Reviewed:
- Description: The popularly used subjective estimator- mean opinion score (MOS) is often biased by the testing environment, viewers mode, domain expertise, and many other factors that may actively influence on actual assessment. We therefore, devise a no- reference subjective quality assessment metric by exploiting the nature of human eye browsing on videos. The participants' eye-tracker recorded gaze-data indicate more concentrated eye- traversing approach for relatively better quality. We calculate the Length, Angle, Pupil-size, and Gaze-duration features from the recorded gaze trajectory. The content and resolution invariant operation is carried out prior to synthesizing them using an adaptive weighted function to develop a new quality metric using eye traversal (QMET). Tested results reveal that the quality evaluation carried out by QMET demonstrates a strong correlation with the most widely used peak signal-to-noise ratio (PSNR), structural similarity index (SSIM), and the MOS.
- Description: DICTA 2017 - 2017 International Conference on Digital Image Computing: Techniques and Applications
Efficient HEVC scheme using motion type categorization
- Podder, Pallab, Paul, Manoranjan, Murshed, Manzur
- Authors: Podder, Pallab , Paul, Manoranjan , Murshed, Manzur
- Date: 2014
- Type: Text , Conference proceedings
- Relation: 10th International Conference on emerging Networking EXperiments and Technologies (CoNEXT); Sydney, Australia; 2nd-5th December 2014; published in Proceedings of the 2014 Workshop on Design, Quality and Deployment of Adaptive Video Streaming p. 41-42
- Relation: http://purl.org/au-research/grants/arc/DP130103670
- Full Text:
- Reviewed:
- Description: High Efficiency Video Coding (HEVC) standard introduces a number of innovative tools which can reduce approximately 50% bit-rate compared to its predecessor H.264/AVC at the same perceptual video quality whereas the computational time has increased multiple times. To reduce the encoding time while preserving the expected video quality has become a real challenge today for video transmission and streaming especially using low-powered devices. Motion estimation (ME) and motion compensation (MC) using variable-size blocks (i.e., intermodes) require 60-80% of total computational time. In this paper we propose a new efficient intermode selection technique based on phase correlation and incorporate into HEVC framework to predict ME and MC modes and perform faster intermode selection based on three dissimilar motion types in different videos. Instead of exploring all the modes exhaustively we select a subset of modes using motion type and the final mode is selected based on the Lagrangian cost function. The experimental results show that compared to HEVC the average computational time can be downscaled by 34% while providing the similar rate-distortion (RD) performance.
- Authors: Podder, Pallab , Paul, Manoranjan , Murshed, Manzur
- Date: 2014
- Type: Text , Conference proceedings
- Relation: 10th International Conference on emerging Networking EXperiments and Technologies (CoNEXT); Sydney, Australia; 2nd-5th December 2014; published in Proceedings of the 2014 Workshop on Design, Quality and Deployment of Adaptive Video Streaming p. 41-42
- Relation: http://purl.org/au-research/grants/arc/DP130103670
- Full Text:
- Reviewed:
- Description: High Efficiency Video Coding (HEVC) standard introduces a number of innovative tools which can reduce approximately 50% bit-rate compared to its predecessor H.264/AVC at the same perceptual video quality whereas the computational time has increased multiple times. To reduce the encoding time while preserving the expected video quality has become a real challenge today for video transmission and streaming especially using low-powered devices. Motion estimation (ME) and motion compensation (MC) using variable-size blocks (i.e., intermodes) require 60-80% of total computational time. In this paper we propose a new efficient intermode selection technique based on phase correlation and incorporate into HEVC framework to predict ME and MC modes and perform faster intermode selection based on three dissimilar motion types in different videos. Instead of exploring all the modes exhaustively we select a subset of modes using motion type and the final mode is selected based on the Lagrangian cost function. The experimental results show that compared to HEVC the average computational time can be downscaled by 34% while providing the similar rate-distortion (RD) performance.
Depth sequence coding with hierarchical partitioning and spatial-domain quantization
- Shahriyar, Shampa, Murshed, Manzur, Ali, Mortuza, Paul, Manoranjan
- Authors: Shahriyar, Shampa , Murshed, Manzur , Ali, Mortuza , Paul, Manoranjan
- Date: 2020
- Type: Text , Journal article
- Relation: IEEE Transactions on Circuits and Systems for Video Technology Vol. 30, no. 3 (2020), p. 835-849
- Full Text:
- Reviewed:
- Description: Depth coding in 3D-HEVC deforms object shapes due to block-level edge-approximation and lacks efficient techniques to exploit the statistical redundancy, due to the frame-level clustering tendency in depth data, for higher coding gain at near-lossless quality. This paper presents a standalone mono-view depth sequence coder, which preserves edges implicitly by limiting quantization to the spatial-domain and exploits the frame-level clustering tendency efficiently with a novel binary tree-based decomposition (BTBD) technique. The BTBD can exploit the statistical redundancy in frame-level syntax, motion components, and residuals efficiently with fewer block-level prediction/coding modes and simpler context modeling for context-adaptive arithmetic coding. Compared with the depth coder in 3D-HEVC, the proposed one has achieved significantly lower bitrate at lossless to near-lossless quality range for mono-view coding and rendered superior quality synthetic views from the depth maps, compressed at the same bitrate, and the corresponding texture frames. © 1991-2012 IEEE.
- Authors: Shahriyar, Shampa , Murshed, Manzur , Ali, Mortuza , Paul, Manoranjan
- Date: 2020
- Type: Text , Journal article
- Relation: IEEE Transactions on Circuits and Systems for Video Technology Vol. 30, no. 3 (2020), p. 835-849
- Full Text:
- Reviewed:
- Description: Depth coding in 3D-HEVC deforms object shapes due to block-level edge-approximation and lacks efficient techniques to exploit the statistical redundancy, due to the frame-level clustering tendency in depth data, for higher coding gain at near-lossless quality. This paper presents a standalone mono-view depth sequence coder, which preserves edges implicitly by limiting quantization to the spatial-domain and exploits the frame-level clustering tendency efficiently with a novel binary tree-based decomposition (BTBD) technique. The BTBD can exploit the statistical redundancy in frame-level syntax, motion components, and residuals efficiently with fewer block-level prediction/coding modes and simpler context modeling for context-adaptive arithmetic coding. Compared with the depth coder in 3D-HEVC, the proposed one has achieved significantly lower bitrate at lossless to near-lossless quality range for mono-view coding and rendered superior quality synthetic views from the depth maps, compressed at the same bitrate, and the corresponding texture frames. © 1991-2012 IEEE.
A novel video coding scheme using a scene adaptive non-parametric background model
- Chakraborty, Subrata, Paul, Manoranjan, Murshed, Manzur, Ali, Mortuza
- Authors: Chakraborty, Subrata , Paul, Manoranjan , Murshed, Manzur , Ali, Mortuza
- Date: 2014
- Type: Text , Conference paper
- Relation: 16th IEEE International Workshop on Multimedia Signal Processing, MMSP 2014 p. 1-6
- Relation: http://purl.org/au-research/grants/arc/DP130103670
- Full Text:
- Reviewed:
- Description: Video coding techniques utilising background frames, provide better rate distortion performance by exploiting coding efficiency in uncovered background areas compared to the latest video coding standard. Parametric approaches such as the mixture of Gaussian (MoG) based background modeling has been widely used however they require prior knowledge about the test videos for parameter estimation. Recently introduced non-parametric (NP) based background modeling techniques successfully improved video coding performance through a HEVC integrated coding scheme. The inherent nature of the NP technique naturally exhibits superior performance in dynamic background scenarios compared to the MoG based technique without a priori knowledge of video data distribution. Although NP based coding schemes showed promising coding performances, they suffer from a number of key challenges - (a) determination of the optimal subset of training frames for generating a suitable background that can be used as a reference frame during coding, (b) incorporating dynamic changes in the background effectively after the initial background frame is generated, (c) managing frequent scene change leading to performance degradation, and (d) optimizing coding quality ratio between an I-frame and other frames under bit rate constraints. In this study we develop a new scene adaptive coding scheme using the NP based technique, capable of solving the current challenges by incorporating a new continuously updating background generation process. Extensive experimental results are also provided to validate the effectiveness of the new scheme.
- Authors: Chakraborty, Subrata , Paul, Manoranjan , Murshed, Manzur , Ali, Mortuza
- Date: 2014
- Type: Text , Conference paper
- Relation: 16th IEEE International Workshop on Multimedia Signal Processing, MMSP 2014 p. 1-6
- Relation: http://purl.org/au-research/grants/arc/DP130103670
- Full Text:
- Reviewed:
- Description: Video coding techniques utilising background frames, provide better rate distortion performance by exploiting coding efficiency in uncovered background areas compared to the latest video coding standard. Parametric approaches such as the mixture of Gaussian (MoG) based background modeling has been widely used however they require prior knowledge about the test videos for parameter estimation. Recently introduced non-parametric (NP) based background modeling techniques successfully improved video coding performance through a HEVC integrated coding scheme. The inherent nature of the NP technique naturally exhibits superior performance in dynamic background scenarios compared to the MoG based technique without a priori knowledge of video data distribution. Although NP based coding schemes showed promising coding performances, they suffer from a number of key challenges - (a) determination of the optimal subset of training frames for generating a suitable background that can be used as a reference frame during coding, (b) incorporating dynamic changes in the background effectively after the initial background frame is generated, (c) managing frequent scene change leading to performance degradation, and (d) optimizing coding quality ratio between an I-frame and other frames under bit rate constraints. In this study we develop a new scene adaptive coding scheme using the NP based technique, capable of solving the current challenges by incorporating a new continuously updating background generation process. Extensive experimental results are also provided to validate the effectiveness of the new scheme.
Efficient high-resolution video compression scheme using background and foreground layers
- Afsana, Fariha, Paul, Manoranjan, Murshed, Manzur, Taubman, David
- Authors: Afsana, Fariha , Paul, Manoranjan , Murshed, Manzur , Taubman, David
- Date: 2021
- Type: Text , Journal article
- Relation: IEEE Access Vol. 9, no. (2021), p. 157411-157421
- Full Text:
- Reviewed:
- Description: Video coding using dynamic background frame achieves better compression compared to the traditional techniques by encoding background and foreground separately. This process reduces coding bits for the overall frame significantly; however, encoding background still requires many bits that can be compressed further for achieving better coding efficiency. The cuboid coding framework has been proven to be one of the most effective methods of image compression which exploits homogeneous pixel correlation within a frame and has better alignment with object boundary compared to traditional block-based coding. In a video sequence, the cuboid-based frame partitioning varies with the changes of the foreground. However, since the background remains static for a group of pictures, the cuboid coding exploits better spatial pixel homogeneity. In this work, the impact of cuboid coding on the background frame for high-resolution videos (Ultra-High-Definition (UHD) and 360-degree videos) is investigated using the multilayer framework of SHVC. After the cuboid partitioning, the method of coarse frame generation has been improved with a novel idea by keeping human-visual sensitive information. Unlike the traditional SHVC scheme, in the proposed method, cuboid coded background and the foreground are encoded in separate layers in an implicit manner. Simulation results show that the proposed video coding method achieves an average BD-Rate reduction of 26.69% and BD-PSNR gain of 1.51 dB against SHVC with significant encoding time reduction for both UHD and 360 videos. It also achieves an average of 13.88% BD-Rate reduction and 0.78 dB BD-PSNR gain compared to the existing relevant method proposed by X. Hoang Van. © 2013 IEEE.
- Authors: Afsana, Fariha , Paul, Manoranjan , Murshed, Manzur , Taubman, David
- Date: 2021
- Type: Text , Journal article
- Relation: IEEE Access Vol. 9, no. (2021), p. 157411-157421
- Full Text:
- Reviewed:
- Description: Video coding using dynamic background frame achieves better compression compared to the traditional techniques by encoding background and foreground separately. This process reduces coding bits for the overall frame significantly; however, encoding background still requires many bits that can be compressed further for achieving better coding efficiency. The cuboid coding framework has been proven to be one of the most effective methods of image compression which exploits homogeneous pixel correlation within a frame and has better alignment with object boundary compared to traditional block-based coding. In a video sequence, the cuboid-based frame partitioning varies with the changes of the foreground. However, since the background remains static for a group of pictures, the cuboid coding exploits better spatial pixel homogeneity. In this work, the impact of cuboid coding on the background frame for high-resolution videos (Ultra-High-Definition (UHD) and 360-degree videos) is investigated using the multilayer framework of SHVC. After the cuboid partitioning, the method of coarse frame generation has been improved with a novel idea by keeping human-visual sensitive information. Unlike the traditional SHVC scheme, in the proposed method, cuboid coded background and the foreground are encoded in separate layers in an implicit manner. Simulation results show that the proposed video coding method achieves an average BD-Rate reduction of 26.69% and BD-PSNR gain of 1.51 dB against SHVC with significant encoding time reduction for both UHD and 360 videos. It also achieves an average of 13.88% BD-Rate reduction and 0.78 dB BD-PSNR gain compared to the existing relevant method proposed by X. Hoang Van. © 2013 IEEE.
A coarse representation of frames oriented video coding by leveraging cuboidal partitioning of image data
- Ahmmed, Ashe, Paul, Manoranjan, Murshed, Manzur, Taubman, David
- Authors: Ahmmed, Ashe , Paul, Manoranjan , Murshed, Manzur , Taubman, David
- Date: 2020
- Type: Text , Conference paper
- Relation: 22nd IEEE International Workshop on Multimedia Signal Processing, MMSP 2020, Virtual Tampere, Finland 21-24 September 2020
- Full Text:
- Reviewed:
- Description: Video coding algorithms attempt to minimize the significant commonality that exists within a video sequence. Each new video coding standard contains tools that can perform this task more efficiently compared to its predecessors. In this work, we form a coarse representation of the current frame by minimizing commonality within that frame while preserving important structural properties of the frame. The building blocks of this coarse representation are rectangular regions called cuboids, which are computationally simple and has a compact description. Then we propose to employ the coarse frame as an additional source for predictive coding of the current frame. Experimental results show an improvement in bit rate savings over a reference codec for HEVC, with minor increase in the codec computational complexity. © 2020 IEEE.
- Authors: Ahmmed, Ashe , Paul, Manoranjan , Murshed, Manzur , Taubman, David
- Date: 2020
- Type: Text , Conference paper
- Relation: 22nd IEEE International Workshop on Multimedia Signal Processing, MMSP 2020, Virtual Tampere, Finland 21-24 September 2020
- Full Text:
- Reviewed:
- Description: Video coding algorithms attempt to minimize the significant commonality that exists within a video sequence. Each new video coding standard contains tools that can perform this task more efficiently compared to its predecessors. In this work, we form a coarse representation of the current frame by minimizing commonality within that frame while preserving important structural properties of the frame. The building blocks of this coarse representation are rectangular regions called cuboids, which are computationally simple and has a compact description. Then we propose to employ the coarse frame as an additional source for predictive coding of the current frame. Experimental results show an improvement in bit rate savings over a reference codec for HEVC, with minor increase in the codec computational complexity. © 2020 IEEE.
Soil moisture, organic carbon, and nitrogen content prediction with hyperspectral data using regression models
- Datta, Dristi, Paul, Manoranjan, Murshed, Manzur, Teng, Shyh Wei, Schmidtke, Leigh
- Authors: Datta, Dristi , Paul, Manoranjan , Murshed, Manzur , Teng, Shyh Wei , Schmidtke, Leigh
- Date: 2022
- Type: Text , Journal article
- Relation: Sensors (Basel, Switzerland) Vol. 22, no. 20 (2022), p.
- Full Text:
- Reviewed:
- Description: Soil moisture, soil organic carbon, and nitrogen content prediction are considered significant fields of study as they are directly related to plant health and food production. Direct estimation of these soil properties with traditional methods, for example, the oven-drying technique and chemical analysis, is a time and resource-consuming approach and can predict only smaller areas. With the significant development of remote sensing and hyperspectral (HS) imaging technologies, soil moisture, carbon, and nitrogen can be estimated over vast areas. This paper presents a generalized approach to predicting three different essential soil contents using a comprehensive study of various machine learning (ML) models by considering the dimensional reduction in feature spaces. In this study, we have used three popular benchmark HS datasets captured in Germany and Sweden. The efficacy of different ML algorithms is evaluated to predict soil content, and significant improvement is obtained when a specific range of bands is selected. The performance of ML models is further improved by applying principal component analysis (PCA), a dimensional reduction method that works with an unsupervised learning method. The effect of soil temperature on soil moisture prediction is evaluated in this study, and the results show that when the soil temperature is considered with the HS band, the soil moisture prediction accuracy does not improve. However, the combined effect of band selection and feature transformation using PCA significantly enhances the prediction accuracy for soil moisture, carbon, and nitrogen content. This study represents a comprehensive analysis of a wide range of established ML regression models using data preprocessing, effective band selection, and data dimension reduction and attempt to understand which feature combinations provide the best accuracy. The outcomes of several ML models are verified with validation techniques and the best- and worst-case scenarios in terms of soil content are noted. The proposed approach outperforms existing estimation techniques.
- Authors: Datta, Dristi , Paul, Manoranjan , Murshed, Manzur , Teng, Shyh Wei , Schmidtke, Leigh
- Date: 2022
- Type: Text , Journal article
- Relation: Sensors (Basel, Switzerland) Vol. 22, no. 20 (2022), p.
- Full Text:
- Reviewed:
- Description: Soil moisture, soil organic carbon, and nitrogen content prediction are considered significant fields of study as they are directly related to plant health and food production. Direct estimation of these soil properties with traditional methods, for example, the oven-drying technique and chemical analysis, is a time and resource-consuming approach and can predict only smaller areas. With the significant development of remote sensing and hyperspectral (HS) imaging technologies, soil moisture, carbon, and nitrogen can be estimated over vast areas. This paper presents a generalized approach to predicting three different essential soil contents using a comprehensive study of various machine learning (ML) models by considering the dimensional reduction in feature spaces. In this study, we have used three popular benchmark HS datasets captured in Germany and Sweden. The efficacy of different ML algorithms is evaluated to predict soil content, and significant improvement is obtained when a specific range of bands is selected. The performance of ML models is further improved by applying principal component analysis (PCA), a dimensional reduction method that works with an unsupervised learning method. The effect of soil temperature on soil moisture prediction is evaluated in this study, and the results show that when the soil temperature is considered with the HS band, the soil moisture prediction accuracy does not improve. However, the combined effect of band selection and feature transformation using PCA significantly enhances the prediction accuracy for soil moisture, carbon, and nitrogen content. This study represents a comprehensive analysis of a wide range of established ML regression models using data preprocessing, effective band selection, and data dimension reduction and attempt to understand which feature combinations provide the best accuracy. The outcomes of several ML models are verified with validation techniques and the best- and worst-case scenarios in terms of soil content are noted. The proposed approach outperforms existing estimation techniques.