Comparative analysis of machine and deep learning models for soil properties prediction from hyperspectral visual band
- Authors: Datta, Dristi , Paul, Manoranjan , Murshed, Manzur , Teng, Shyh Wei , Schmidtke, Leigh
- Date: 2023
- Type: Text , Journal article
- Relation: Environments Vol. 10, no. 5 (2023), p. 77
- Full Text:
- Reviewed:
- Description: Estimating various properties of soil, including moisture, carbon, and nitrogen, is crucial for studying their correlation with plant health and food production. However, conventional methods such as oven-drying and chemical analysis are laborious, expensive, and only feasible for a limited land area. With the advent of remote sensing technologies like multi/hyperspectral imaging, it is now possible to predict soil properties non-invasively and cost-effectively over a large expanse of bare land. Recent research shows that these soil contents can be predicted from a wide range of hyperspectral data using suitable prediction algorithms. However, such hyperspectral sensors are expensive and not widely available. Therefore, this paper investigates different machine and deep learning techniques for predicting soil nutrient properties using only the red (R), green (G), and blue (B) band data, in order to propose a machine/deep learning model suitable for use as a rapid soil test. Another objective of this research is to observe and compare the prediction accuracy in three cases: (i) the hyperspectral bands, (ii) the full spectrum of the visual band, and (iii) the three RGB channels, and to provide a guideline on which spectral information should be used to predict these soil properties. The outcome of this research supports the development of an easy-to-use mobile application for quick soil tests. This research also explores learning-based algorithms with significant feature combinations, comparing their performance in predicting soil properties from visual band data. To this end, we also explore the impact of dimensional reduction (i.e., principal component analysis) and transformation (i.e., empirical mode decomposition) of features. The results show that the proposed model can comparably predict the soil contents from the three-channel RGB data.
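The RGB-only pipeline the abstract describes can be sketched as fitting a regressor on per-sample R, G, B features. The sketch below uses synthetic data and an ordinary least-squares fit as a stand-in for the machine/deep learning models the paper compares; the moisture-darkness relationship is an illustrative assumption, not the paper's data.

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic stand-in data: mean R, G, B reflectance per soil sample
# (the paper's real inputs are visual-band reflectance measurements).
n_samples = 200
rgb = rng.uniform(0.0, 1.0, size=(n_samples, 3))
# Hypothetical ground truth: moisture loosely tied to low reflectance
moisture = 40.0 - 25.0 * rgb.mean(axis=1) + rng.normal(0.0, 1.0, n_samples)

# Ordinary least squares on [R, G, B, 1] -- the simplest possible
# stand-in for the learned prediction models.
X = np.column_stack([rgb, np.ones(n_samples)])
coef, *_ = np.linalg.lstsq(X, moisture, rcond=None)
pred = X @ coef

# Coefficient of determination (R^2) as a quick accuracy check
ss_res = np.sum((moisture - pred) ** 2)
ss_tot = np.sum((moisture - moisture.mean()) ** 2)
r2 = 1.0 - ss_res / ss_tot
print(f"R^2 = {r2:.3f}")
```

On this synthetic data the linear fit recovers most of the variance; the paper's contribution is comparing far richer models on real soil spectra.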
Depth-based sampling and steering constraints for memoryless local planners
- Authors: Nguyen, Binh , Nguyen, Linh , Choudhury, Tanveer , Keogh, Kathleen , Murshed, Manzur
- Date: 2023
- Type: Text , Journal article
- Relation: Journal of Intelligent and Robotic Systems: Theory and Applications Vol. 109, no. 3 (2023), p.
- Full Text:
- Reviewed:
- Description: Using only depth information, this paper introduces a novel two-stage planning approach that enhances computational efficiency and planning performance for memoryless local planners. First, a depth-based sampling technique is proposed to identify and eliminate a specific type of in-collision trajectory among the sampled candidates. Specifically, all trajectories with obscured endpoints are found by querying the depth values and are then excluded from the sampled set, which can significantly reduce the computational workload required for collision checking. Subsequently, we apply a tailored local planning algorithm that employs a direction cost function and a depth-based steering mechanism to prevent the robot from being trapped in local minima. Our planning algorithm is theoretically proven to be complete in convex obstacle scenarios. To validate the effectiveness of our DEpth-based Sampling and Steering (DESS) approach, we conducted experiments in simulated environments where a quadrotor flew through cluttered regions with multiple obstacles of various sizes. The experimental results show that DESS significantly reduces computation time in local planning compared to the uniform sampling method, resulting in planned trajectories with a lower minimized cost. More importantly, our success rates for navigation to different destinations in the testing scenarios improved considerably compared to the fixed-yawing approach. © 2023, The Author(s).
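The depth-query idea above can be illustrated with a minimal sketch: project each candidate trajectory endpoint into the depth image, and discard any endpoint that lies behind the measured depth surface (an obscured endpoint implies a collision on the way there). The function, camera intrinsics, and toy scene below are hypothetical illustrations, not the authors' exact formulation.

```python
import numpy as np

def filter_by_depth(endpoints, depth_image, fx, fy, cx, cy, margin=0.1):
    """Keep candidate endpoints (camera frame, z forward) that project in
    front of the measured depth surface; obscured ones are discarded
    before any expensive collision check. Hypothetical helper."""
    keep = []
    h, w = depth_image.shape
    for p in endpoints:
        x, y, z = p
        if z <= 0:
            continue                      # behind the camera
        u = int(round(fx * x / z + cx))   # pinhole projection
        v = int(round(fy * y / z + cy))
        if not (0 <= u < w and 0 <= v < h):
            continue                      # outside the field of view
        if z <= depth_image[v, u] - margin:
            keep.append(p)                # endpoint visibly in free space
    return keep

# Toy depth image: a wall at 3 m with a deeper opening (5 m) in the middle
depth = np.full((10, 10), 3.0)
depth[4:6, 4:6] = 5.0
fx = fy = 5.0; cx = cy = 5.0
candidates = [(0.0, 0.0, 4.0),   # through the opening: kept
              (1.5, 0.0, 4.0)]   # behind the wall: discarded
print(filter_by_depth(candidates, depth, fx, fy, cx, cy))
```

Only the first candidate survives, since the second projects onto the wall pixel whose depth (3 m) is closer than the endpoint (4 m).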
Bidirectional mapping coupled GAN for generalized zero-shot learning
- Authors: Shermin, Tasfia , Teng, Shyh , Sohel, Ferdous , Murshed, Manzur , Lu, Guojun
- Date: 2022
- Type: Text , Journal article
- Relation: IEEE Transactions on Image Processing Vol. 31, no. (2022), p. 721-733
- Full Text:
- Reviewed:
- Description: Bidirectional mapping-based generalized zero-shot learning (GZSL) methods rely on the quality of synthesized features to recognize seen and unseen data. Therefore, learning a joint distribution of seen and unseen classes and preserving the distinction between them is crucial for GZSL methods. However, existing methods learn only the underlying distribution of seen data, even though unseen class semantics are available in the GZSL problem setting. Most methods neglect to retain the seen-unseen class distinction and use the learned distribution to recognize seen and unseen data; consequently, they do not perform well. In this work, we utilize the available unseen class semantics alongside seen class semantics and learn a joint distribution through strong visual-semantic coupling. We propose a bidirectional mapping coupled generative adversarial network (BMCoGAN) by extending the concept of the coupled generative adversarial network into a bidirectional mapping model. We further integrate Wasserstein generative adversarial optimization to supervise the joint distribution learning. We design a loss optimization that retains distinctive information of seen and unseen classes in the synthesized features and reduces bias towards seen classes, pushing synthesized seen features towards real seen features and pulling synthesized unseen features away from real seen features. We evaluate BMCoGAN on benchmark datasets and demonstrate its superior performance against contemporary methods. © 1992-2012 IEEE.
Integrated generalized zero-shot learning for fine-grained classification
- Authors: Shermin, Tasfia , Teng, Shyh , Sohel, Ferdous , Murshed, Manzur , Lu, Guojun
- Date: 2022
- Type: Text , Journal article
- Relation: Pattern Recognition Vol. 122, no. (2022), p.
- Full Text:
- Reviewed:
- Description: Embedding learning (EL) and feature synthesizing (FS) are two popular categories of fine-grained GZSL methods. EL or FS using global features cannot discriminate fine details in the absence of local features. On the other hand, EL or FS methods exploiting local features neglect either direct attribute guidance or global information. Consequently, neither approach performs well. In this paper, we propose to explore global and direct attribute-supervised local visual features for both the EL and FS categories in an integrated manner for fine-grained GZSL. The proposed integrated network has an EL sub-network and an FS sub-network and can therefore be tested in two ways. We propose a novel two-step dense attention mechanism to discover attribute-guided local visual features. We introduce new mutual learning between the sub-networks to exploit mutually beneficial information for optimization. Moreover, we propose to compute source-target class similarity based on mutual information and to transfer-learn the target classes to reduce bias towards the source domain during testing. We demonstrate that our proposed method outperforms contemporary methods on benchmark datasets. © 2021 Elsevier Ltd
Soil moisture, organic carbon, and nitrogen content prediction with hyperspectral data using regression models
- Authors: Datta, Dristi , Paul, Manoranjan , Murshed, Manzur , Teng, Shyh Wei , Schmidtke, Leigh
- Date: 2022
- Type: Text , Journal article
- Relation: Sensors (Basel, Switzerland) Vol. 22, no. 20 (2022), p.
- Full Text:
- Reviewed:
- Description: Soil moisture, soil organic carbon, and nitrogen content prediction are significant fields of study, as these properties are directly related to plant health and food production. Direct estimation of these soil properties with traditional methods, for example the oven-drying technique and chemical analysis, is a time- and resource-consuming approach and is feasible only for smaller areas. With the significant development of remote sensing and hyperspectral (HS) imaging technologies, soil moisture, carbon, and nitrogen can be estimated over vast areas. This paper presents a generalized approach to predicting three essential soil contents through a comprehensive study of various machine learning (ML) models that considers dimensional reduction of the feature space. In this study, we have used three popular benchmark HS datasets captured in Germany and Sweden. The efficacy of different ML algorithms in predicting soil content is evaluated, and significant improvement is obtained when a specific range of bands is selected. The performance of the ML models is further improved by applying principal component analysis (PCA), an unsupervised dimensionality reduction method. The effect of soil temperature on soil moisture prediction is also evaluated, and the results show that considering soil temperature alongside the HS bands does not improve soil moisture prediction accuracy. However, the combined effect of band selection and feature transformation using PCA significantly enhances the prediction accuracy for soil moisture, carbon, and nitrogen content. This study represents a comprehensive analysis of a wide range of established ML regression models using data preprocessing, effective band selection, and data dimension reduction, and attempts to determine which feature combinations provide the best accuracy. The outcomes of several ML models are verified with validation techniques, and the best- and worst-case scenarios in terms of soil content are noted. The proposed approach outperforms existing estimation techniques.
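The band-reduction step the abstract describes can be sketched in numpy: PCA by SVD compresses high-dimensional spectra into a few component scores, and a regressor is fit on those scores. The synthetic spectra, mixing matrix, and "carbon" target below are illustrative stand-ins for the benchmark HS datasets, and plain least squares stands in for the paper's regression models.

```python
import numpy as np

rng = np.random.default_rng(1)

# Synthetic stand-in for a hyperspectral dataset: 150 samples x 120 bands
n, bands = 150, 120
latent = rng.normal(size=(n, 3))                 # a few underlying factors
mixing = rng.normal(size=(3, bands))
spectra = latent @ mixing + 0.05 * rng.normal(size=(n, bands))
carbon = latent @ np.array([2.0, -1.0, 0.5]) + 0.1 * rng.normal(size=n)

# PCA by SVD of the centred spectra: keep the first k component scores
k = 5
mu = spectra.mean(axis=0)
U, S, Vt = np.linalg.svd(spectra - mu, full_matrices=False)
scores = (spectra - mu) @ Vt[:k].T               # reduced feature space

# Linear regression on the PCA scores
X = np.column_stack([scores, np.ones(n)])
coef, *_ = np.linalg.lstsq(X, carbon, rcond=None)
pred = X @ coef
r2 = 1.0 - np.sum((carbon - pred) ** 2) / np.sum((carbon - carbon.mean()) ** 2)
print(f"R^2 with {k} components: {r2:.3f}")
```

Because the synthetic spectra have only three underlying factors, a handful of components captures nearly all the predictive signal, mirroring the paper's finding that PCA-transformed features improve accuracy over raw bands.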
Adversarial network with multiple classifiers for open set domain adaptation
- Authors: Shermin, Tasfia , Lu, Guojun , Teng, Shyh , Murshed, Manzur , Sohel, Ferdous
- Date: 2021
- Type: Text , Journal article
- Relation: IEEE Transactions on Multimedia Vol. 23, no. (2021), p. 2732-2744
- Full Text:
- Reviewed:
- Description: Domain adaptation aims to transfer knowledge from a domain with adequate labeled samples to a domain with scarce labeled samples. Prior research has introduced various open set domain adaptation settings to extend the applications of domain adaptation methods to real-world scenarios. This paper focuses on the open set domain adaptation setting in which the target domain has both a private ('unknown classes') label space and the shared ('known classes') label space, while the source domain has only the 'known classes' label space. Prevalent distribution-matching domain adaptation methods are inadequate in such a setting, which demands adaptation from a smaller source domain to a larger and more diverse target domain with more classes. To address this specific setting, prior research introduced a domain adversarial model that uses a fixed threshold for distinguishing known from unknown target samples and does not handle negative transfers well. We extend that adversarial model and propose a novel adversarial domain adaptation model with multiple auxiliary classifiers. The proposed multi-classifier structure introduces a weighting module that evaluates distinctive domain characteristics and assigns the target samples weights that better represent whether they are likely to belong to the known or unknown classes, encouraging positive transfers during adversarial training while simultaneously reducing the domain gap between the shared classes of the source and target domains. A thorough experimental investigation shows that our proposed method outperforms existing domain adaptation methods on a number of domain adaptation datasets. © 1999-2012 IEEE.
Efficient high-resolution video compression scheme using background and foreground layers
- Authors: Afsana, Fariha , Paul, Manoranjan , Murshed, Manzur , Taubman, David
- Date: 2021
- Type: Text , Journal article
- Relation: IEEE Access Vol. 9, no. (2021), p. 157411-157421
- Full Text:
- Reviewed:
- Description: Video coding using a dynamic background frame achieves better compression than traditional techniques by encoding the background and foreground separately. This process significantly reduces the coding bits for the overall frame; however, encoding the background still requires many bits that can be compressed further for better coding efficiency. The cuboid coding framework has proven to be one of the most effective methods of image compression: it exploits homogeneous pixel correlation within a frame and aligns better with object boundaries than traditional block-based coding. In a video sequence, the cuboid-based frame partitioning varies with changes in the foreground. However, since the background remains static for a group of pictures, cuboid coding exploits spatial pixel homogeneity there more effectively. In this work, the impact of cuboid coding on the background frame for high-resolution videos (Ultra-High-Definition (UHD) and 360-degree videos) is investigated using the multilayer framework of SHVC. After the cuboid partitioning, the method of coarse frame generation has been improved with a novel approach that preserves information sensitive to human vision. Unlike the traditional SHVC scheme, the proposed method implicitly encodes the cuboid-coded background and the foreground in separate layers. Simulation results show that the proposed video coding method achieves an average BD-Rate reduction of 26.69% and a BD-PSNR gain of 1.51 dB against SHVC, with a significant encoding time reduction for both UHD and 360-degree videos. It also achieves an average of 13.88% BD-Rate reduction and 0.78 dB BD-PSNR gain compared to the existing relevant method proposed by X. Hoang Van. © 2013 IEEE.
Human-machine collaborative video coding through cuboidal partitioning
- Authors: Ahmmed, Ashek , Paul, Manoranjan , Murshed, Manzur , Taubman, David
- Date: 2021
- Type: Text , Conference paper
- Relation: 2021 IEEE International Conference on Image Processing, ICIP 2021, Anchorage, USA 19-22 September 2021, Proceedings - International Conference on Image Processing, ICIP Vol. 2021-September, p. 2074-2078
- Full Text:
- Reviewed:
- Description: Video coding algorithms encode and decode an entire video frame, while feature coding techniques preserve and communicate only the most critical information needed for a given application. This is because video coding targets human perception, while feature coding aims at machine vision tasks. Recently, attempts have been made to bridge the gap between these two domains. In this work, we propose a video coding framework that leverages the commonality between human vision and machine vision applications using cuboids. Cuboids, estimated rectangular regions over a video frame, are computationally efficient, have a compact representation, and are object-centric; such properties have already been shown to add value to traditional video coding systems. Here, cuboidal feature descriptors are extracted from the current frame and then employed to accomplish a machine vision task in the form of object detection. Experimental results show that a trained classifier yields superior average precision when equipped with a cuboidal-feature-oriented representation of the current test frame. Additionally, this representation costs 7% less in bit rate if the captured frames need to be communicated to a receiver. © 2021 IEEE.
A coarse representation of frames oriented video coding by leveraging cuboidal partitioning of image data
- Authors: Ahmmed, Ashek , Paul, Manoranjan , Murshed, Manzur , Taubman, David
- Date: 2020
- Type: Text , Conference paper
- Relation: 22nd IEEE International Workshop on Multimedia Signal Processing, MMSP 2020, Virtual Tampere, Finland 21-24 September 2020
- Full Text:
- Reviewed:
- Description: Video coding algorithms attempt to minimize the significant commonality that exists within a video sequence. Each new video coding standard contains tools that can perform this task more efficiently than its predecessors. In this work, we form a coarse representation of the current frame by minimizing commonality within that frame while preserving its important structural properties. The building blocks of this coarse representation are rectangular regions called cuboids, which are computationally simple and have a compact description. We then propose to employ the coarse frame as an additional source for predictive coding of the current frame. Experimental results show an improvement in bit rate savings over a reference HEVC codec, with a minor increase in codec computational complexity. © 2020 IEEE.
A robust forgery detection method for copy-move and splicing attacks in images
- Islam, Mohammad, Karmakar, Gour, Kamruzzaman, Joarder, Murshed, Manzur
- Authors: Islam, Mohammad , Karmakar, Gour , Kamruzzaman, Joarder , Murshed, Manzur
- Date: 2020
- Type: Text , Journal article
- Relation: Electronics Vol. 9, no. 9 (2020), p. 1-22
- Full Text:
- Reviewed:
- Description: Internet of Things (IoT) image sensors, social media, and smartphones generate huge volumes of digital images every day. The easy availability and usability of photo-editing tools have made forgery attacks, primarily splicing and copy-move attacks, effortless, causing cybercrimes to rise. While several models have been proposed in the literature for detecting these attacks, their robustness has not been investigated when (i) only a small number of tampered images are available for model building or (ii) images from IoT sensors are distorted by rotation or scaling caused by unwanted or unexpected changes in the sensors' physical set-up. Moreover, further improvement in detection accuracy is needed for real-world security management systems. To address these limitations, this paper proposes an innovative image forgery detection method based on the Discrete Cosine Transformation (DCT), the Local Binary Pattern (LBP), and a new feature extraction method using the mean operator. First, images are divided into non-overlapping fixed-size blocks and 2D block DCT is applied to capture changes due to image forgery. Then LBP is applied to the magnitude of the DCT array to enhance forgery artifacts. Finally, the mean value of a particular cell across all LBP blocks is computed, which yields a fixed number of features and a more computationally efficient method. Using a Support Vector Machine (SVM), the proposed method has been extensively tested on four well-known publicly available grayscale and color image forgery datasets, and additionally on an IoT-based image forgery dataset that we built. Experimental results reveal the superiority of the proposed method over recent state-of-the-art methods in terms of widely used performance metrics and computational time, and demonstrate robustness against low availability of forged training samples.
- Description: This research was funded by Research Priority Area (RPA) scholarship of Federation University Australia.
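The pipeline in the abstract above (block-wise 2D DCT, LBP over the DCT magnitudes, then per-cell means across all blocks) can be sketched compactly. The following is a minimal numpy-only illustration, not the authors' implementation: the 8x8 block size, the basic non-interpolated LBP variant, and the omission of the final SVM stage are simplifying assumptions.

```python
import numpy as np

def dct2(block):
    """Orthonormal 2D DCT-II of a square block (numpy-only)."""
    n = block.shape[0]
    k = np.arange(n)
    C = np.sqrt(2.0 / n) * np.cos(np.pi * (2 * k[None, :] + 1) * k[:, None] / (2 * n))
    C[0, :] /= np.sqrt(2.0)
    return C @ block @ C.T

def block_dct_magnitude(gray, block=8):
    """Magnitude of the 2D DCT of each non-overlapping block."""
    h = gray.shape[0] - gray.shape[0] % block
    w = gray.shape[1] - gray.shape[1] % block
    out = np.zeros((h, w))
    for y in range(0, h, block):
        for x in range(0, w, block):
            out[y:y + block, x:x + block] = np.abs(
                dct2(gray[y:y + block, x:x + block].astype(float)))
    return out

def lbp(img):
    """Basic 8-neighbour local binary pattern (no interpolation)."""
    c = img[1:-1, 1:-1]
    code = np.zeros(c.shape, dtype=np.uint8)
    shifts = [(-1, -1), (-1, 0), (-1, 1), (0, 1), (1, 1), (1, 0), (1, -1), (0, -1)]
    for bit, (dy, dx) in enumerate(shifts):
        nb = img[1 + dy:img.shape[0] - 1 + dy, 1 + dx:img.shape[1] - 1 + dx]
        code |= ((nb >= c).astype(np.uint8) << bit)
    return code

def mean_cell_features(lbp_img, block=8):
    """Mean of each cell position across all LBP blocks: a fixed-length
    (block*block) feature vector regardless of image size."""
    h = lbp_img.shape[0] - lbp_img.shape[0] % block
    w = lbp_img.shape[1] - lbp_img.shape[1] % block
    cells = lbp_img[:h, :w].reshape(h // block, block, w // block, block)
    return cells.mean(axis=(0, 2)).ravel()
```

A classifier such as scikit-learn's `SVC` would then be trained on these fixed-length vectors, one per image, to separate forged from authentic images.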
Depth sequence coding with hierarchical partitioning and spatial-domain quantization
- Shahriyar, Shampa, Murshed, Manzur, Ali, Mortuza, Paul, Manoranjan
- Authors: Shahriyar, Shampa , Murshed, Manzur , Ali, Mortuza , Paul, Manoranjan
- Date: 2020
- Type: Text , Journal article
- Relation: IEEE Transactions on Circuits and Systems for Video Technology Vol. 30, no. 3 (2020), p. 835-849
- Full Text:
- Reviewed:
- Description: Depth coding in 3D-HEVC deforms object shapes due to block-level edge approximation and lacks efficient techniques to exploit the statistical redundancy arising from the frame-level clustering tendency in depth data for higher coding gain at near-lossless quality. This paper presents a standalone mono-view depth sequence coder that preserves edges implicitly by limiting quantization to the spatial domain and exploits the frame-level clustering tendency efficiently with a novel binary tree-based decomposition (BTBD) technique. BTBD can exploit the statistical redundancy in frame-level syntax, motion components, and residuals efficiently with fewer block-level prediction/coding modes and simpler context modeling for context-adaptive arithmetic coding. Compared with the depth coder in 3D-HEVC, the proposed coder achieves a significantly lower bitrate in the lossless to near-lossless quality range for mono-view coding and renders superior-quality synthetic views from depth maps compressed at the same bitrate and the corresponding texture frames. © 1991-2012 IEEE.
An efficient RANSAC hypothesis evaluation using sufficient statistics for RGB-D pose estimation
- Senthooran, Ilankalkone, Murshed, Manzur, Barca, Jan, Kamruzzaman, Joarder, Chung, Hoam
- Authors: Senthooran, Ilankalkone , Murshed, Manzur , Barca, Jan , Kamruzzaman, Joarder , Chung, Hoam
- Date: 2019
- Type: Text , Journal article
- Relation: Autonomous Robots Vol. 43, no. 5 (2019), p. 1257-1270
- Full Text:
- Reviewed:
- Description: Achieving autonomous flight in GPS-denied environments begins with pose estimation in three-dimensional space, which is much more challenging for an MAV in a swarm robotic system due to limited computational resources. In vision-based pose estimation, outlier detection is the most time-consuming step. It usually involves a RANSAC procedure that uses the reprojection-error method for hypothesis evaluation. Realignment-based hypothesis evaluation is observed to be more accurate, but its considerably slower speed makes it unsuitable for robots with limited resources. We use sufficient statistics of least-squares minimisation to speed up this process. The additive nature of these sufficient statistics makes it possible to compute pose estimates in each evaluation by reusing previously computed statistics, so estimates need not be calculated from scratch each time. The proposed method is tested on standard RANSAC, Preemptive RANSAC, and R-RANSAC using benchmark datasets. The results show that sufficient statistics speed up outlier detection with realignment hypothesis evaluation for all RANSAC variants, achieving a speed-up of up to 6.72 times.
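The speed-up described in the abstract above rests on the additive nature of least-squares sufficient statistics: statistics accumulated for subsets of points can simply be summed, so a fit over a new inlier set need not restart from raw data. The paper applies this to 3D pose realignment; the sketch below illustrates the same principle on a toy 1D line fit, with a statistics layout that is illustrative rather than taken from the paper.

```python
import numpy as np

def point_stats(x, y):
    """Sufficient statistics contributed by one (x, y) point to a
    least-squares line fit: [n, sum_x, sum_y, sum_xx, sum_xy]."""
    return np.array([1.0, x, y, x * x, x * y])

def fit_from_stats(s):
    """Recover slope a and intercept b of y = a*x + b from accumulated
    statistics alone, without touching the raw points."""
    n, sx, sy, sxx, sxy = s
    a = (n * sxy - sx * sy) / (n * sxx - sx * sx)
    b = (sy - a * sx) / n
    return a, b
```

Because the statistics vectors add, a RANSAC hypothesis evaluation can combine previously computed sums for overlapping inlier sets instead of refitting from scratch each time.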
Hierarchical colour image segmentation by leveraging RGB channels independently
- Tania, Sheikh, Murshed, Manzur, Teng, Shyh, Karmakar, Gour
- Authors: Tania, Sheikh , Murshed, Manzur , Teng, Shyh , Karmakar, Gour
- Date: 2019
- Type: Text , Conference paper
- Relation: 9th Pacific-Rim Symposium on Image and Video Technology, PSIVT 2019 Vol. 11854 LNCS, p. 197-210
- Full Text:
- Reviewed:
- Description: In this paper, we introduce a hierarchical colour image segmentation technique based on cuboid partitioning that uses simple statistical features of the pixel intensities in the RGB channels. Estimating the difference between any two colours is a challenging task, and as most colour models are not perceptually uniform, an alternative strategy is highly desirable. To address this issue, our proposed technique presents a new colour distance measure based on the inconsistency of pixel intensities of an image, which is more compliant with human perception. Constructing a reliable set of superpixels from an image is fundamental for further merging. As cuboid partitioning is a superior candidate for producing superpixels, we apply agglomerative merging to the outcome of our proposed cuboid partitioning to yield the final segmentation results. The proposed cuboid-segmentation-based algorithm significantly outperforms not only quadtree-based segmentation but also existing state-of-the-art segmentation algorithms in terms of segmentation quality on benchmark image segmentation datasets. © 2019, Springer Nature Switzerland AG.
Improved image analysis methodology for detecting changes in evidence positioning at crime scenes
- Petty, Mark, Teng, Shyh, Murshed, Manzur
- Authors: Petty, Mark , Teng, Shyh , Murshed, Manzur
- Date: 2019
- Type: Text , Conference proceedings , Conference paper
- Relation: 2019 International Conference on Digital Image Computing: Techniques and Applications, DICTA 2019
- Full Text:
- Reviewed:
- Description: This paper proposes an improved methodology to assist forensic investigators in detecting the positional change of objects due to crime scene contamination. Crime scene contamination can occur, either intentionally or by accident, during the investigation and documentation process. The proposed methodology utilises an ASIFT-based feature detection algorithm that compares pre- and post-contamination images of the same scene taken from different viewpoints. The contention is that the ASIFT registration technique is better suited to real-world crime scene photography, being more robust to the affine distortion that occurs when capturing images from different viewpoints. The proposed methodology was tested with both the SIFT and ASIFT registration techniques to show that (1) it can identify missing, planted, and displaced objects using both SIFT and ASIFT, and (2) ASIFT is superior to SIFT in terms of error in displacement estimation, especially for larger viewpoint discrepancies between the pre- and post-contamination images. This supports the contention that our methodology in combination with ASIFT is better suited to handle real-world crime scene photography. © 2019 IEEE.
- Description: E1
A novel no-reference subjective quality metric for free viewpoint video using human eye movement
- Podder, Pallab, Paul, Manoranjan, Murshed, Manzur
- Authors: Podder, Pallab , Paul, Manoranjan , Murshed, Manzur
- Date: 2018
- Type: Text , Conference proceedings
- Relation: 8th Pacific-Rim Symposium on Image and Video Technology, PSIVT 2017; Wuhan, China; 20th-24th November 2017; published in Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) Vol. 10749 LNCS, p. 237-251
- Full Text:
- Reviewed:
- Description: Free viewpoint video (FVV) allows users to interactively control the viewpoint and generate new views of a dynamic scene from any 3D position, for a better 3D visual experience with depth perception. Multiview video coding exploits both texture and depth video information from various angles to encode a number of views to facilitate FVV. The usual practice for single-view or multiview quality assessment is to use objective quality assessment metrics, such as the peak signal-to-noise ratio (PSNR) or the structural similarity index (SSIM), due to their simplicity and real-time applicability. However, PSNR and SSIM require a reference image for quality evaluation and cannot be employed successfully in FVV, as a newly synthesised view has no reference view to compare with. Conversely, the widely used subjective estimator, the mean opinion score (MOS), is often biased by the testing environment, the viewer's mood, domain knowledge, and many other factors that may influence the actual assessment. To address this limitation, in this work, we devise a no-reference subjective quality assessment metric by exploiting the pattern of human eye traversal over FVV. Over FVV contents of different quality, the participants' eye-tracker-recorded spatio-temporal gaze data indicate a more concentrated eye-traversal pattern for relatively better quality. We therefore calculate length, angle, pupil-size, and gaze-duration features from the recorded gaze trajectories. A content- and resolution-invariant operation is carried out prior to synthesising these features with an adaptive weighted function into a new quality metric using eye traversal (QMET). Test results reveal that the proposed QMET performs better than SSIM and MOS in assessing different aspects of coded video quality for a wide range of FVV contents.
- Description: Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
Cuboid colour image segmentation using intuitive distance measure
- Tania, Sheikh, Murshed, Manzur, Teng, Shyh, Karmakar, Gour
- Authors: Tania, Sheikh , Murshed, Manzur , Teng, Shyh , Karmakar, Gour
- Date: 2018
- Type: Text , Conference proceedings
- Relation: 2018 International Conference on Image and Vision Computing New Zealand, IVCNZ 2018; Auckland, New Zealand; 19th-21st November 2018 Vol. 2018-November, p. 1-6
- Full Text:
- Reviewed:
- Description: In this paper, an improved algorithm for cuboid image segmentation is proposed. To address the two main limitations of the recently proposed cuboid segmentation algorithm, the improved algorithm replaces colour quantization in the HCL colour space with an infinity-norm distance in the RGB colour space, along with a different way of imposing area thresholding. We also propose a new metric to evaluate segmentation quality. Experimental results show that the proposed cuboid segmentation algorithm significantly outperforms the existing one in terms of segmentation quality.
- Description: International Conference Image and Vision Computing New Zealand
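The infinity-norm colour distance mentioned in the abstract above has a direct expression: the largest per-channel absolute difference between two RGB colours. The sketch below, including the merge threshold, is illustrative rather than the authors' exact formulation.

```python
import numpy as np

def linf_distance(c1, c2):
    """Infinity-norm (Chebyshev) distance between two RGB colours:
    the largest per-channel absolute difference."""
    return float(np.max(np.abs(np.asarray(c1, float) - np.asarray(c2, float))))

def should_merge(region_a, region_b, threshold=20.0):
    """Merge two regions when their mean RGB colours are within the
    infinity-norm threshold (the threshold value here is illustrative)."""
    mean_a = np.asarray(region_a, float).reshape(-1, 3).mean(axis=0)
    mean_b = np.asarray(region_b, float).reshape(-1, 3).mean(axis=0)
    return linf_distance(mean_a, mean_b) <= threshold
```

Unlike a Euclidean distance, the infinity norm flags two colours as far apart as soon as any single channel differs strongly, which avoids averaging away a large difference in one channel.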
Detecting splicing and copy-move attacks in color images
- Islam, Mohammad, Karmakar, Gour, Kamruzzaman, Joarder, Murshed, Manzur, Kahandawa, Gayan, Parvin, Nahida
- Authors: Islam, Mohammad , Karmakar, Gour , Kamruzzaman, Joarder , Murshed, Manzur , Kahandawa, Gayan , Parvin, Nahida
- Date: 2018
- Type: Text , Conference proceedings
- Relation: 2018 International Conference on Digital Image Computing: Techniques and Applications, DICTA 2018; Canberra, Australia; 10th-13th December 2018 p. 1-7
- Full Text:
- Reviewed:
- Description: Image sensors generate limitless digital images every day. Image forgeries such as splicing and copy-move are very common types of attacks that are easy to execute using sophisticated photo-editing tools. As a result, digital forensics has attracted much attention for identifying such tampering in digital images. In this paper, a passive (blind) image tampering identification method based on the Discrete Cosine Transformation (DCT) and the Local Binary Pattern (LBP) is proposed. First, the chroma components of an image are divided into fixed-size non-overlapping blocks and 2D block DCT is applied to identify the changes that forgery causes in the local frequency distribution of the image. Then a texture descriptor, LBP, is applied to the magnitude component of the 2D-DCT array to enhance the artifacts introduced by the tampering operation. The resulting LBP image is again divided into non-overlapping blocks. Finally, the sums of corresponding inter-cell values across all the LBP blocks are computed and arranged as a feature vector. These features are fed into a Support Vector Machine (SVM) with a Radial Basis Function (RBF) kernel to distinguish forged images from authentic ones. The proposed method has been evaluated extensively on three publicly available, well-known image splicing and copy-move detection benchmark datasets of color images. Results demonstrate the superiority of the proposed method over recently proposed state-of-the-art approaches in terms of well-accepted performance metrics such as accuracy, area under the ROC curve, and others.
- Description: 2018 International Conference on Digital Image Computing: Techniques and Applications, DICTA 2018
Efficient video coding using visual sensitive information for HEVC coding standard
- Podder, Pallab, Paul, Manoranjan, Murshed, Manzur
- Authors: Podder, Pallab , Paul, Manoranjan , Murshed, Manzur
- Date: 2018
- Type: Text , Journal article
- Relation: IEEE Access Vol. 6, no. (2018), p. 75695-75708
- Full Text:
- Reviewed:
- Description: The latest High Efficiency Video Coding (HEVC) standard introduces a large number of inter-mode block partitioning modes. The HEVC reference test model (HM) uses partially exhaustive tree-structured mode selection, which still explores a large number of prediction unit (PU) modes for a coding unit (CU). The resulting rise in encoding time prevents many electronic devices with limited processing resources from using various features of HEVC. By analyzing homogeneity, residuals, and different statistical correlations among modes, many researchers speed up the encoding process by reducing the number of PU modes. However, these approaches could not demonstrate rate-distortion (RD) performance similar to the HM's due to their dependency on the existing Lagrangian cost function (LCF) within the HEVC framework. In this paper, to avoid complete dependency on the LCF in the initial phase, we exploit a visually sensitive foreground motion and spatial salient metric (FMSSM) in a block. To capture its motion and saliency features, we use dynamic background and visual saliency modeling, respectively. According to the FMSSM values, a subset of PU modes is then explored for encoding the CU. This preprocessing phase is independent of the existing LCF. As the proposed coding technique further reduces the number of PU modes using two simple criteria (i.e., motion and saliency), it outperforms the HM in terms of encoding time reduction. As it also encodes uncovered and static background areas using the dynamic background frame as a substituted reference frame, it does not sacrifice quality. Test results reveal that the proposed method achieves a 32% average encoding time reduction over the HM without any quality loss for a wide range of videos.
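The mode-subset idea in the abstract above can be caricatured in a few lines: given a block's motion and saliency scores, pick a subset of HEVC PU partition modes to explore. The thresholds and the exact subsets below are hypothetical placeholders, not values from the paper.

```python
def select_pu_modes(motion_score, saliency_score, t_m=0.5, t_s=0.5):
    """Choose a subset of HEVC inter PU partition modes for a CU from
    foreground-motion and saliency scores. Thresholds t_m and t_s and
    the subsets returned are illustrative assumptions."""
    all_modes = ["2Nx2N", "2NxN", "Nx2N", "NxN",
                 "2NxnU", "2NxnD", "nLx2N", "nRx2N"]
    if motion_score < t_m and saliency_score < t_s:
        # Static, inconspicuous region: the single 2Nx2N mode usually suffices.
        return ["2Nx2N"]
    if motion_score >= t_m and saliency_score >= t_s:
        # Salient moving region: explore the full mode set.
        return all_modes
    # Mixed case: restrict to the symmetric partitions.
    return ["2Nx2N", "2NxN", "Nx2N", "NxN"]
```

Skipping most candidate modes for low-activity blocks is what cuts encoding time, while salient moving regions keep the full search and thus lose no quality.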
Enhanced colour image retrieval with cuboid segmentation
- Murshed, Manzur, Karmakar, Priyabrata, Teng, Shyh, Lu, Guojun
- Authors: Murshed, Manzur , Karmakar, Priyabrata , Teng, Shyh , Lu, Guojun
- Date: 2018
- Type: Text , Conference proceedings , Conference paper
- Relation: 2018 International Conference on Digital Image Computing: Techniques and Applications, DICTA 2018; Canberra, Australia; 10th-13th December 2018
- Full Text:
- Reviewed:
- Description: In this paper, we further investigate our recently proposed cuboid image segmentation algorithm for effective image retrieval. Instead of using all cuboids (i.e., segments), we propose two approaches for appropriately choosing different subsets of cuboids. Experimental results on the eBay dataset show that our proposals outperform the existing technique in retrieval performance. In addition, we investigate how many segments are required for the most effective image retrieval and provide a quick method to determine a suitable number of cuboids.
- Description: 2018 International Conference on Digital Image Computing: Techniques and Applications, DICTA 2018
Exploiting user provided information in dynamic consolidation of virtual machines to minimize energy consumption of cloud data centers
- Khan, Anit, Paplinski, Andrew, Khan, Abdul, Murshed, Manzur, Buyya, Rajkumar
- Authors: Khan, Anit , Paplinski, Andrew , Khan, Abdul , Murshed, Manzur , Buyya, Rajkumar
- Date: 2018
- Type: Text , Conference proceedings , Conference paper
- Relation: 3rd International Conference on Fog and Mobile Edge Computing, FMEC 2018; Barcelona, Spain; 23rd-26th April 2018; p. 105-114
- Full Text:
- Reviewed:
- Description: Dynamic consolidation of Virtual Machines (VMs) can effectively enhance the resource utilization and energy efficiency of Cloud Data Centers (CDCs). Existing research on Cloud resource reservation and scheduling signifies that Cloud Service Users (CSUs) can play a crucial role in improving resource utilization by providing valuable information to Cloud service providers. However, using CSU-provided information to minimize the energy consumption of a CDC is a novel research direction. The challenges herein are twofold: first, finding the right benign information to receive from a CSU that can complement the energy efficiency of the CDC; second, applying such information smartly to significantly reduce the CDC's energy consumption. To address these research challenges, we propose a novel heuristic Dynamic VM Consolidation algorithm, RTDVMC, which minimizes the energy consumption of a CDC by exploiting CSU-provided information. Our research exemplifies the fact that if VMs are dynamically consolidated based on the time when a VM can be removed from the CDC, a useful piece of information received from the respective CSU, then more physical machines can be put into sleep state, yielding lower energy consumption. We have simulated the performance of RTDVMC with real Cloud workload traces originating from more than 800 PlanetLab VMs. The empirical results affirm the superiority of RTDVMC over existing prominent static and adaptive threshold-based DVMC algorithms.