Efficient coding of depth map by exploiting temporal correlation
- Authors: Shahriyar, Shampa , Murshed, Manzur , Ali, Mortuza , Paul, Manoranjan
- Date: 2014
- Type: Text , Conference proceedings
- Relation: 2014 International Conference on Digital Image Computing : Techniques and Applications (DICTA); Wollongong, Australia; 25th-27th November 2014
- Relation: http://purl.org/au-research/grants/arc/DP130103670
- Full Text: false
- Description: With the growing demand for 3D and multi-view video content, efficient depth data coding has become a vital issue in image and video coding. In this paper, we propose a simple depth coding scheme that uses multiple prediction modes to exploit the temporal correlation of depth maps. Current depth coding techniques mostly depend on intra-coding modes, which cannot take advantage of the temporal redundancy in depth maps or the higher spatial redundancy in inter-predicted depth residuals. Depth maps are characterized by smooth regions with sharp edges that play an important role in the view synthesis process. As depth maps are more sensitive to coding errors, the use of transformation or approximation of edges by explicit edge modelling has an impact on view synthesis quality. Moreover, lossy compression of depth maps brings additional geometrical distortion to the synthetic view. In this paper, we demonstrate that encoding inter-coded depth block residuals with quantization in the pixel domain is more efficient than intra-coding techniques relying on explicit edge preservation. On standard 3D video sequences, the proposed depth coding achieved superior image quality of synthesized views against the new 3D-HEVC standard for depth map bit rates of 0.25 bpp or higher.
Exploiting user provided information in dynamic consolidation of virtual machines to minimize energy consumption of cloud data centers
- Authors: Khan, Anit , Paplinski, Andrew , Khan, Abdul , Murshed, Manzur , Buyya, Rajkumar
- Date: 2018
- Type: Text , Conference proceedings , Conference paper
- Relation: 3rd International Conference on Fog and Mobile Edge Computing, FMEC 2018; Barcelona, Spain; 23rd-26th April 2018; p. 105-114
- Full Text:
- Reviewed:
- Description: Dynamic consolidation of Virtual Machines (VMs) can effectively enhance the resource utilization and energy efficiency of Cloud Data Centers (CDCs). Existing research on Cloud resource reservation and scheduling signifies that Cloud Service Users (CSUs) can play a crucial role in improving resource utilization by providing valuable information to Cloud service providers. However, using CSU-provided information to minimize the energy consumption of a CDC is a novel research direction. The challenges herein are twofold. The first is finding the right benign information to be received from a CSU that can complement the energy efficiency of the CDC. The second is the smart application of such information to significantly reduce the energy consumption of the CDC. To address those research challenges, we have proposed a novel heuristic Dynamic VM Consolidation algorithm, RTDVMC, which minimizes the energy consumption of the CDC by exploiting CSU-provided information. Our research exemplifies the fact that if VMs are dynamically consolidated based on the time when a VM can be removed from the CDC, a useful piece of information to be received from the respective CSU, then more physical machines can be put into the sleep state, yielding lower energy consumption. We have simulated the performance of RTDVMC with real Cloud workload traces originating from more than 800 PlanetLab VMs. The empirical figures affirm the superiority of RTDVMC over existing prominent Static and Adaptive Threshold based DVMC algorithms.
Cuboid segmentation for effective image retrieval
- Authors: Murshed, Manzur , Teng, Shyh , Lu, Guojun
- Date: 2017
- Type: Text , Conference proceedings
- Relation: 2017 International Conference on Digital Image Computing : Techniques and Applications (DICTA); Sydney, Australia; 29th November-1st December 2017 p. 884-891
- Full Text: false
- Reviewed:
- Description: Region-based image retrieval has been proven to be effective in finding relevant images. In this paper, we propose a cuboid image segmentation method which results in rectangular image partitions. Rectangular partitions are more suitable for image compression, retrieval and other image operations. We apply partitions in image retrieval in this paper. Our experimental results have shown that (1) the proposed partitioning method is effective in segmenting images into meaningful rectangles; (2) using colour partitions for image retrieval is more effective than using whole images; and (3) the partitioned approach has the additional advantage of letting users select certain objects/colours as queries to find more relevant images/objects. These three advantages could be important in crime scene investigation image indexing and retrieval. Moreover, the proposed technique is amenable to compressed-domain applications.
Lossless image coding using binary tree decomposition of prediction residuals
- Authors: Ali, Mortuza , Murshed, Manzur , Shahriyar, Shampa , Paul, Manoranjan
- Date: 2015
- Type: Text , Conference proceedings
- Full Text: false
- Description: State-of-the-art lossless image compression schemes, such as JPEG-LS and CALIC, have been proposed in the context adaptive predictive coding framework. These schemes involve a prediction step followed by context adaptive entropy coding of the residuals. It can be observed that there exists significant spatial correlation among the residuals after prediction. The efficient schemes proposed in the literature rely on context adaptive entropy coding to exploit this spatial correlation. In this paper, we propose an alternative approach to exploit this spatial correlation. The proposed scheme also involves a prediction stage. However, we resort to a binary tree based hierarchical decomposition technique to efficiently exploit the spatial correlation. On a set of standard test images, the proposed scheme, using the same predictor as JPEG-LS, achieved an overall compression gain of 2.1% against JPEG-LS. © 2015 IEEE.
Efficient video coding using visual sensitive information for HEVC coding standard
- Authors: Podder, Pallab , Paul, Manoranjan , Murshed, Manzur
- Date: 2018
- Type: Text , Journal article
- Relation: IEEE Access Vol. 6, no. (2018), p. 75695-75708
- Full Text:
- Reviewed:
- Description: The latest high efficiency video coding (HEVC) standard introduces a large number of inter-mode block partitioning modes. The HEVC reference test model (HM) uses partially exhaustive tree-structured mode selection, which still explores a large number of prediction unit (PU) modes for a coding unit (CU). This increases encoding time, which prevents a number of electronic devices with limited processing resources from using various features of HEVC. By analyzing the homogeneity, residual, and different statistical correlations among modes, many researchers speed up the encoding process by reducing the number of PU modes. However, these approaches could not demonstrate rate-distortion (RD) performance similar to the HM because of their dependency on the existing Lagrangian cost function (LCF) within the HEVC framework. In this paper, to avoid complete dependency on the LCF in the initial phase, we exploit a visually sensitive foreground motion and spatial salient metric (FMSSM) in a block. To capture its motion and saliency features, we use dynamic background and visual saliency modeling, respectively. According to the FMSSM values, a subset of PU modes is then explored for encoding the CU. This preprocessing phase is independent of the existing LCF. As the proposed coding technique further reduces the number of PU modes using two simple criteria (i.e., motion and saliency), it outperforms the HM in terms of encoding time reduction. As it also encodes the uncovered and static background areas using the dynamic background frame as a substitute reference frame, it does not sacrifice quality. Test results reveal that the proposed method reduces the encoding time of the HM by 32% on average without any quality loss for a wide range of videos.
Fast intermode selection for HEVC video coding using phase correlation
- Authors: Podder, Pallab , Paul, Manoranjan , Murshed, Manzur , Chakraborty, Subrata
- Date: 2015
- Type: Text , Conference proceedings , Conference paper
- Relation: 2014 International Conference on Digital Image Computing: Techniques and Applications, DICTA 2014; Wollongong, Australia; 25th-27th November 2014 p. 1-8
- Relation: http://purl.org/au-research/grants/arc/DP130103670
- Full Text:
- Reviewed:
- Description: The recent High Efficiency Video Coding (HEVC) Standard demonstrates higher rate-distortion (RD) performance compared to its predecessor H.264/AVC by using different new tools, especially larger and asymmetric inter-mode variable size motion estimation and compensation. This requires more than 4 times the computational time of H.264/AVC. As a result, reducing the amount of time while maintaining the standard quality of the video has always been a big concern for researchers. The reduction of computational time by smart selection of the appropriate modes in HEVC is our motivation. To accomplish this task in this paper, we use phase correlation to approximate the motion information between current and reference blocks by comparing with a number of different binary pattern templates and then select a subset of motion estimation modes without exhaustively exploring all possible modes. The experimental results show that the proposed HEVC-PC (HEVC with Phase Correlation) scheme outperforms the standard HEVC scheme in terms of computational time while preserving the same quality of the video sequences. More specifically, encoding time is reduced by around 40% compared to the exhaustive mode selection in HEVC. © 2014 IEEE.
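The phase correlation step mentioned in the abstract above can be illustrated with a minimal sketch. It only estimates a pure translation between a reference block and a current block; the binary pattern templates and the mode-subset selection described by the authors are not reproduced here. The function name `phase_correlation_shift` and the small stabilising constant are assumptions made for illustration.

```python
import numpy as np

def phase_correlation_shift(ref, cur):
    """Estimate the (dy, dx) translation that maps `ref` onto `cur`.

    Hypothetical helper for illustration; assumes both blocks have the
    same shape and that the motion is a (circular) translation.
    """
    f_ref = np.fft.fft2(ref)
    f_cur = np.fft.fft2(cur)
    # Normalised cross-power spectrum: keep only the phase difference.
    cross = f_cur * np.conj(f_ref)
    cross /= np.abs(cross) + 1e-12  # small constant avoids division by zero
    corr = np.fft.ifft2(cross).real
    # The location of the correlation peak gives the displacement.
    dy, dx = np.unravel_index(np.argmax(corr), corr.shape)
    peak = corr[dy, dx]
    h, w = corr.shape
    # Map indices in the upper half of the range to negative shifts.
    if dy > h // 2:
        dy -= h
    if dx > w // 2:
        dx -= w
    return int(dy), int(dx), float(peak)

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    block = rng.random((16, 16))
    moved = np.roll(block, shift=(3, -2), axis=(0, 1))
    print(phase_correlation_shift(block, moved))
```

A sharp, high peak suggests a single dominant translation in the block, while a flat or multi-peaked surface suggests more complex motion; how such cues are turned into mode decisions is specific to the paper and is not shown.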
A novel quality metric using spatiotemporal correlational data of human eye maneuver
- Authors: Podder, Pallab , Paul, Manoranjan , Murshed, Manzur
- Date: 2017
- Type: Text , Conference proceedings
- Relation: 2017 International Conference on Digital Image Computing : Techniques and Applications, DICTA 2017; Sydney, Australia; 29th November-1st December 2017 Vol. 2017-December, p. 1-8
- Full Text:
- Reviewed:
- Description: The popularly used subjective estimator, the mean opinion score (MOS), is often biased by the testing environment, viewers' mode, domain expertise, and many other factors that may actively influence the actual assessment. We therefore devise a no-reference subjective quality assessment metric by exploiting the nature of human eye browsing on videos. The participants' eye-tracker recorded gaze data indicate a more concentrated eye-traversing approach for relatively better quality. We calculate the Length, Angle, Pupil-size, and Gaze-duration features from the recorded gaze trajectory. A content- and resolution-invariant operation is carried out prior to synthesizing them using an adaptive weighted function to develop a new quality metric using eye traversal (QMET). Test results reveal that the quality evaluation carried out by QMET demonstrates a strong correlation with the most widely used peak signal-to-noise ratio (PSNR), structural similarity index (SSIM), and the MOS.
Measuring trustworthiness of IoT image sensor data using other sensors' complementary multimodal data
- Authors: Islam, Mohammad , Karmakar, Gour , Kamruzzaman, Joarder , Murshed, Manzur
- Date: 2019
- Type: Text , Conference proceedings , Conference paper
- Relation: 18th IEEE International Conference on Trust, Security and Privacy in Computing and Communications/13th IEEE International Conference on Big Data Science and Engineering, TrustCom/BigDataSE 2019 p. 775-780
- Full Text: false
- Reviewed:
- Description: Trust in image sensor data is becoming increasingly important as Internet of Things (IoT) applications grow from home appliances to surveillance. To our knowledge, there exists only one work in the literature that estimates the trustworthiness of digital images applied to forensic applications, based on a machine learning technique. The efficacy of this technique is heavily dependent on the availability of an appropriate training set and adequate variation of IoT sensor data with noise, interference and environmental conditions, but the availability of such data cannot always be assured. Therefore, to overcome this limitation, a robust method capable of estimating a trustworthiness measure with high accuracy is needed. The lowering cost of sensors allows many IoT applications to use multiple types of sensors to observe the same event. In such cases, the complementary multimodal data of one sensor can be exploited to measure the trust level of another sensor's data. In this paper, for the first time, we introduce a completely new approach to estimate the trustworthiness of an image sensor's data using another sensor's numerical data. We develop a theoretical model using the Dempster-Shafer theory (DST) framework. The efficacy of the proposed model in estimating the trust level of an image sensor's data is analyzed by observing a fire event using IoT image and temperature sensor data in a residential setup under different scenarios. The proposed model produces a highly accurate trust level in all scenarios with authentic and forged image data. © 2019 IEEE.
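For readers unfamiliar with the Dempster-Shafer framework named in the abstract above, the following sketch shows Dempster's rule of combination for two sensors observing the same event. It is a generic illustration, not the authors' trust model: the frame of discernment, the mass values and the names `combine`, `m_image` and `m_temp` are all assumptions chosen for the example.

```python
from itertools import product

# Frame of discernment for a hypothetical fire-detection scenario.
FIRE = frozenset({"fire"})
NO_FIRE = frozenset({"no_fire"})
EITHER = FIRE | NO_FIRE  # total ignorance

def combine(m1, m2):
    """Combine two mass functions with Dempster's rule.

    Each mass function maps a frozenset of hypotheses to a mass in [0, 1],
    and the masses of each function are assumed to sum to 1.
    """
    combined = {}
    conflict = 0.0
    for (a, mass_a), (b, mass_b) in product(m1.items(), m2.items()):
        intersection = a & b
        if intersection:
            combined[intersection] = combined.get(intersection, 0.0) + mass_a * mass_b
        else:
            # Mass assigned to contradictory hypotheses.
            conflict += mass_a * mass_b
    if conflict >= 1.0:
        raise ValueError("Sources are in total conflict; the rule is undefined.")
    # Renormalise by the non-conflicting mass.
    return {k: v / (1.0 - conflict) for k, v in combined.items()}

# Illustrative (made-up) evidence from an image sensor and a temperature sensor.
m_image = {FIRE: 0.7, EITHER: 0.3}
m_temp = {FIRE: 0.6, NO_FIRE: 0.1, EITHER: 0.3}

if __name__ == "__main__":
    for hypothesis, mass in combine(m_image, m_temp).items():
        print(sorted(hypothesis), round(mass, 4))
```

The conflict term measures how strongly the two sources disagree, which is one natural ingredient when one sensor's data is used to judge another's; how the paper maps such quantities to a trust level is not stated in the abstract.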
A novel depth motion vector coding exploiting spatial and inter-component clustering tendency
- Authors: Shahriyar, Shampa , Murshed, Manzur , Ali, Mortuza , Paul, Manoranjan
- Date: 2015
- Type: Text , Conference proceedings , Conference paper
- Relation: Visual Communications and Image Processing, VCIP 2015; Singapore; 13th-16th December 2015 p. 1-4
- Relation: http://purl.org/au-research/grants/arc/DP130103670
- Full Text: false
- Reviewed:
- Description: Motion vectors of depth-maps in multiview and free-viewpoint videos exhibit strong spatial as well as inter-component clustering tendency. This paper presents a novel coding technique that first compresses the multidimensional bitmaps of macroblock mode and then encodes only the non-zero components of motion vectors. The bitmaps are partitioned into disjoint cuboids using binary tree based decomposition so that the 0's and 1's are either highly polarized or further sub-partitioning is unlikely to achieve any compression. Each cuboid is entropy-coded as a unit using binary arithmetic coding. This technique is capable of exploiting the spatial and inter-component correlations efficiently without the restriction of scanning the bitmap in any specific linear order as needed by run-length coding. As encoding of non-zero component values no longer requires denoting the zero value, further compression efficiency is achieved. Experimental results on standard multiview test video sequences have comprehensively demonstrated the superiority of the proposed technique, achieving overall coding gain against the state-of-the-art in the range [22%, 54%] and on average 38%. © 2015 IEEE.
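The binary tree based decomposition of a bitmap into disjoint rectangular partitions, as described in the abstract above, can be sketched as follows. This is a simplified two-dimensional illustration under stated assumptions: it always splits the longer side at its midpoint and stops only when a region is entirely 0s or entirely 1s (or reaches a minimum size), whereas the paper stops when a region is highly polarized or further splitting is unlikely to help, and entropy-codes each cuboid. The name `decompose` and its parameters are hypothetical.

```python
import numpy as np

def decompose(bitmap, r0=0, c0=0, min_size=1):
    """Recursively split a 2D binary bitmap into uniform rectangles.

    Returns a list of leaves (row, col, height, width, ones_count).
    Simplified stopping rule: a leaf is all 0s, all 1s, or at most
    `min_size` cells (with min_size >= 1).
    """
    h, w = bitmap.shape
    ones = int(bitmap.sum())
    if ones == 0 or ones == h * w or h * w <= min_size:
        return [(r0, c0, h, w, ones)]
    if h >= w:
        # Split the longer (vertical) side at its midpoint.
        mid = h // 2
        return (decompose(bitmap[:mid], r0, c0, min_size)
                + decompose(bitmap[mid:], r0 + mid, c0, min_size))
    mid = w // 2
    return (decompose(bitmap[:, :mid], r0, c0, min_size)
            + decompose(bitmap[:, mid:], r0, c0 + mid, min_size))

if __name__ == "__main__":
    demo = np.zeros((4, 4), dtype=np.uint8)
    demo[:2, :2] = 1  # a clustered patch of non-zero flags
    for leaf in decompose(demo):
        print(leaf)
```

Because clustered bitmaps yield a few large uniform leaves, only the split decisions and one symbol per leaf would need to be signalled in this simplified variant, which hints at why the approach does not depend on a particular linear scan order.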
An efficient cooperative lane-changing algorithm for sensor- and communication-enabled automated vehicles
- Authors: Awal, Tanveer , Murshed, Manzur , Ali, Mortuza
- Date: 2015
- Type: Text , Conference proceedings
- Full Text: false
- Description: A key goal in transportation systems is to attain efficient road traffic through minimization of trip time, fuel consumption and pollutant emission without compromising safety. In dense traffic, lane changes and merging are often key causes of safety hazards, traffic breakdowns and travel delays. In this paper, we propose an efficient cooperative lane-changing algorithm, CLA, for sensor- and communication-enabled automated vehicles to reduce lane-changing bottlenecks. For discretionary lane-changing, we consider the advantages of the subject vehicle, the follower in the current lane and k (an integer) lag vehicles in the target lane to maximize speed gains. Our algorithm simultaneously minimizes the impact of lane changes on traffic flow and the overall trip time, fuel consumption and pollutant emission. For mandatory lane-changing, CLA dissociates the decision-making point from the actual mandatory lane-changing point and computes a suitable lane-changing slot in order to minimize lane-changing (merging) time. Our algorithm outperforms the potential cooperative lane-changing algorithm MOBIL proposed by Kesting et al. [1] in terms of merging time and rate, waiting time, fuel consumption, average velocity and flow (especially at the point in front of the merging point), at the cost of a slightly increased average trip time for the main-road vehicles. We also highlight important directions for further research. © 2015 IEEE.
Depth sequence coding with hierarchical partitioning and spatial-domain quantization
- Authors: Shahriyar, Shampa , Murshed, Manzur , Ali, Mortuza , Paul, Manoranjan
- Date: 2020
- Type: Text , Journal article
- Relation: IEEE Transactions on Circuits and Systems for Video Technology Vol. 30, no. 3 (2020), p. 835-849
- Full Text:
- Reviewed:
- Description: Depth coding in 3D-HEVC deforms object shapes due to block-level edge-approximation and lacks efficient techniques to exploit the statistical redundancy, due to the frame-level clustering tendency in depth data, for higher coding gain at near-lossless quality. This paper presents a standalone mono-view depth sequence coder, which preserves edges implicitly by limiting quantization to the spatial-domain and exploits the frame-level clustering tendency efficiently with a novel binary tree-based decomposition (BTBD) technique. The BTBD can exploit the statistical redundancy in frame-level syntax, motion components, and residuals efficiently with fewer block-level prediction/coding modes and simpler context modeling for context-adaptive arithmetic coding. Compared with the depth coder in 3D-HEVC, the proposed one has achieved significantly lower bitrate at lossless to near-lossless quality range for mono-view coding and rendered superior quality synthetic views from the depth maps, compressed at the same bitrate, and the corresponding texture frames. © 1991-2012 IEEE.
Improved image analysis methodology for detecting changes in evidence positioning at crime scenes
- Authors: Petty, Mark , Teng, Shyh , Murshed, Manzur
- Date: 2019
- Type: Text , Conference proceedings , Conference paper
- Relation: 2019 International Conference on Digital Image Computing: Techniques and Applications, DICTA 2019
- Full Text:
- Reviewed:
- Description: This paper proposes an improved methodology to assist forensic investigators in detecting positional changes of objects due to crime scene contamination. Either intentionally or by accident, crime scene contamination can occur during the investigation and documentation process. The proposed methodology utilises an ASIFT-based feature detection algorithm that compares pre- and post-contamination images of the same scene, taken from different viewpoints. The contention is that the ASIFT registration technique is better suited to real-world crime scene photography, being more robust to the affine distortion that occurs when capturing images from different viewpoints. The proposed methodology was tested with both the SIFT and ASIFT registration techniques to show that (1) it could identify missing, planted and displaced objects using both SIFT and ASIFT and (2) ASIFT is superior to SIFT in terms of error in displacement estimation, especially for larger viewpoint discrepancies between the pre- and post-contamination images. This supports the contention that our proposed methodology in combination with ASIFT is better suited to handle real-world crime scene photography. © 2019 IEEE.
Efficient low bit-rate intra-frame coding using common information for 360-degree video
- Authors: Afsana, Fariha , Paul, Manoranjan , Murshed, Manzur , Taubman, David
- Date: 2020
- Type: Text , Conference paper
- Relation: 22nd IEEE International Workshop on Multimedia Signal Processing, MMSP 2020
- Full Text: false
- Reviewed:
- Description: With the growth of video technologies, super-resolution videos, including 360-degree immersive video, have become a reality due to exciting applications such as augmented/virtual/mixed reality, which offer better interaction and a wide-angle user-view experience of a scene compared to traditional video with a narrow-focused viewing angle. New-generation video content is bandwidth-intensive due to its high resolution and demands a high bit rate as well as low-latency delivery, which poses challenges for transmission and storage. There is limited optimisation space in traditional video coding schemes for improving intra-frame coding efficiency due to the fixed size of the processing block. This paper presents a new approach for improving intra-frame coding, especially for low bit rate transmission of 360-degree video in the lossy mode of HEVC. Prior to using traditional HEVC intra-prediction, this approach exploits the global redundancy of the entire frame by extracting common important information using multi-level discrete wavelet transformation. This paper demonstrates that the proposed method, which considers and encodes only the low-frequency information of a frame, can outperform the HEVC standard at low bit rates. The experimental results indicate that the proposed intra-frame coding strategy achieves an average of 54.07% BD-rate reduction and 2.84 dB BD-PSNR gain for the low bit rate scenario compared to HEVC. It also reduces encoding time by about 66.84% on average. Moreover, this finding also demonstrates that the existing HEVC block partitioning can be applied in the transform domain for better exploitation of information concentration, as we applied HEVC in the wavelet frequency domain. © 2020 IEEE.
A novel video coding scheme using a scene adaptive non-parametric background model
- Authors: Chakraborty, Subrata , Paul, Manoranjan , Murshed, Manzur , Ali, Mortuza
- Date: 2014
- Type: Text , Conference paper
- Relation: 16th IEEE International Workshop on Multimedia Signal Processing, MMSP 2014 p. 1-6
- Relation: http://purl.org/au-research/grants/arc/DP130103670
- Full Text:
- Reviewed:
- Description: Video coding techniques utilising background frames provide better rate-distortion performance than the latest video coding standard by exploiting coding efficiency in uncovered background areas. Parametric approaches such as mixture of Gaussian (MoG) based background modeling have been widely used; however, they require prior knowledge about the test videos for parameter estimation. Recently introduced non-parametric (NP) based background modeling techniques successfully improved video coding performance through an HEVC integrated coding scheme. The NP technique naturally exhibits superior performance in dynamic background scenarios compared to the MoG based technique without a priori knowledge of the video data distribution. Although NP based coding schemes showed promising coding performance, they suffer from a number of key challenges: (a) determination of the optimal subset of training frames for generating a suitable background that can be used as a reference frame during coding, (b) incorporating dynamic changes in the background effectively after the initial background frame is generated, (c) managing frequent scene changes leading to performance degradation, and (d) optimizing the coding quality ratio between an I-frame and other frames under bit rate constraints. In this study, we develop a new scene adaptive coding scheme using the NP based technique, capable of solving the current challenges by incorporating a new continuously updating background generation process. Extensive experimental results are also provided to validate the effectiveness of the new scheme.
Efficient high-resolution video compression scheme using background and foreground layers
- Authors: Afsana, Fariha , Paul, Manoranjan , Murshed, Manzur , Taubman, David
- Date: 2021
- Type: Text , Journal article
- Relation: IEEE Access Vol. 9, no. (2021), p. 157411-157421
- Full Text:
- Reviewed:
- Description: Video coding using dynamic background frame achieves better compression compared to the traditional techniques by encoding background and foreground separately. This process reduces coding bits for the overall frame significantly; however, encoding background still requires many bits that can be compressed further for achieving better coding efficiency. The cuboid coding framework has been proven to be one of the most effective methods of image compression which exploits homogeneous pixel correlation within a frame and has better alignment with object boundary compared to traditional block-based coding. In a video sequence, the cuboid-based frame partitioning varies with the changes of the foreground. However, since the background remains static for a group of pictures, the cuboid coding exploits better spatial pixel homogeneity. In this work, the impact of cuboid coding on the background frame for high-resolution videos (Ultra-High-Definition (UHD) and 360-degree videos) is investigated using the multilayer framework of SHVC. After the cuboid partitioning, the method of coarse frame generation has been improved with a novel idea by keeping human-visual sensitive information. Unlike the traditional SHVC scheme, in the proposed method, cuboid coded background and the foreground are encoded in separate layers in an implicit manner. Simulation results show that the proposed video coding method achieves an average BD-Rate reduction of 26.69% and BD-PSNR gain of 1.51 dB against SHVC with significant encoding time reduction for both UHD and 360 videos. It also achieves an average of 13.88% BD-Rate reduction and 0.78 dB BD-PSNR gain compared to the existing relevant method proposed by X. Hoang Van. © 2013 IEEE.
Detecting splicing and copy-move attacks in color images
- Authors: Islam, Mohammad , Karmakar, Gour , Kamruzzaman, Joarder , Murshed, Manzur , Kahandawa, Gayan , Parvin, Nahida
- Date: 2018
- Type: Text , Conference proceedings
- Relation: 2018 International Conference on Digital Image Computing: Techniques and Applications, DICTA 2018; Canberra, Australia; 10th-13th December 2018 p. 1-7
- Full Text:
- Reviewed:
- Description: Image sensors are generating limitless digital images every day. Image forgeries such as splicing and copy-move are very common types of attacks that are easy to execute using sophisticated photo editing tools. As a result, digital forensics has attracted much attention as a means of identifying such tampering in digital images. In this paper, a passive (blind) image tampering identification method based on Discrete Cosine Transformation (DCT) and Local Binary Pattern (LBP) has been proposed. First, the chroma components of an image are divided into fixed-size non-overlapping blocks and 2D block DCT is applied to identify the changes due to forgery in the local frequency distribution of the image. Then a texture descriptor, LBP, is applied to the magnitude component of the 2D-DCT array to enhance the artifacts introduced by the tampering operation. The resulting LBP image is again divided into non-overlapping blocks. Finally, summations of corresponding inter-cell values of all the LBP blocks are computed and arranged as a feature vector. These features are fed into a Support Vector Machine (SVM) with a Radial Basis Function (RBF) kernel to distinguish forged images from authentic ones. The proposed method has been evaluated extensively on three publicly available, well-known image splicing and copy-move detection benchmark datasets of color images. Results demonstrate the superiority of the proposed method over recently proposed state-of-the-art approaches in terms of well-accepted performance metrics such as accuracy, area under the ROC curve and others.
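The feature extraction pipeline outlined in the abstract above (block DCT, LBP on the DCT magnitudes, then cell-wise sums over LBP blocks) can be sketched with NumPy alone. This is a rough reading of the abstract, not the authors' implementation: the 8x8 block size, the basic 8-neighbour LBP variant, the cropping of partial blocks and all function names are assumptions, and the final SVM classification stage is omitted.

```python
import numpy as np

def dct_matrix(n):
    """Orthonormal DCT-II basis matrix of size n x n."""
    k = np.arange(n).reshape(-1, 1)
    i = np.arange(n).reshape(1, -1)
    c = np.sqrt(2.0 / n) * np.cos(np.pi * (2 * i + 1) * k / (2 * n))
    c[0, :] = np.sqrt(1.0 / n)
    return c

def block_dct_magnitude(channel, block=8):
    """Apply a 2D DCT to each non-overlapping block and keep magnitudes."""
    h, w = channel.shape
    h, w = h - h % block, w - w % block  # drop partial blocks (assumption)
    c = dct_matrix(block)
    out = np.empty((h, w))
    for r in range(0, h, block):
        for col in range(0, w, block):
            tile = channel[r:r + block, col:col + block].astype(float)
            out[r:r + block, col:col + block] = np.abs(c @ tile @ c.T)
    return out

def lbp_codes(img):
    """Basic 8-neighbour LBP code for every interior pixel."""
    center = img[1:-1, 1:-1]
    codes = np.zeros(center.shape, dtype=np.uint8)
    offsets = [(-1, -1), (-1, 0), (-1, 1), (0, 1),
               (1, 1), (1, 0), (1, -1), (0, -1)]
    for bit, (dr, dc) in enumerate(offsets):
        neigh = img[1 + dr:img.shape[0] - 1 + dr, 1 + dc:img.shape[1] - 1 + dc]
        codes |= (neigh >= center).astype(np.uint8) << bit
    return codes

def feature_vector(channel, block=8):
    """Sum corresponding cells across all LBP blocks (block*block features)."""
    lbp = lbp_codes(block_dct_magnitude(channel, block))
    h, w = lbp.shape
    h, w = h - h % block, w - w % block
    tiles = lbp[:h, :w].reshape(h // block, block, w // block, block)
    return tiles.astype(np.int64).sum(axis=(0, 2)).ravel()

if __name__ == "__main__":
    rng = np.random.default_rng(1)
    chroma = rng.integers(0, 256, size=(34, 34))  # stand-in chroma channel
    print(feature_vector(chroma).shape)
```

In a full pipeline, one such vector per chroma channel would be passed to a classifier such as an RBF-kernel SVM trained on labelled authentic and forged images.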
Adversarial network with multiple classifiers for open set domain adaptation
- Authors: Shermin, Tasfia , Lu, Guojun , Teng, Shyh , Murshed, Manzur , Sohel, Ferdous
- Date: 2021
- Type: Text , Journal article
- Relation: IEEE Transactions on Multimedia Vol. 23, no. (2021), p. 2732-2744
- Full Text:
- Reviewed:
- Description: Domain adaptation aims to transfer knowledge from a domain with adequate labeled samples to a domain with scarce labeled samples. Prior research has introduced various open set domain adaptation settings to extend the applications of domain adaptation methods in real-world scenarios. This paper focuses on the type of open set domain adaptation setting where the target domain has both a private ('unknown classes') label space and the shared ('known classes') label space, whereas the source domain has only the 'known classes' label space. Prevalent distribution-matching domain adaptation methods are inadequate in such a setting, which demands adaptation from a smaller source domain to a larger and more diverse target domain with more classes. To address this specific open set domain adaptation setting, prior research introduced a domain adversarial model that uses a fixed threshold for distinguishing known from unknown target samples and falls short in handling negative transfer. We extend that adversarial model and propose a novel adversarial domain adaptation model with multiple auxiliary classifiers. The proposed multi-classifier structure introduces a weighting module that evaluates distinctive domain characteristics to assign the target samples weights that are more representative of whether they are likely to belong to the known or unknown classes, which encourages positive transfer during adversarial training and simultaneously reduces the domain gap between the shared classes of the source and target domains. A thorough experimental investigation shows that our proposed method outperforms existing domain adaptation methods on a number of domain adaptation datasets. © 1999-2012 IEEE.
Detection of Malleefowl Mounds from Point Cloud Data
- Authors: Parvin, Nahida , Awrangjeb, Mohammad , Irvin, Marc , Florentine, Singarayer , Murshed, Manzur , Lu, Guojun
- Date: 2021
- Type: Text , Conference paper
- Relation: 2021 International Conference on Digital Image Computing: Techniques and Applications, DICTA 2021, Gold Coast, 29 November to 1 December 2021
- Full Text: false
- Reviewed:
- Description: Airborne light detection and ranging (LiDAR) data have become cost and time-efficient means for estimating the size of timid fauna populations through the identification of artefacts that evidence their occurrence in a large, hostile geographic area. The unobtrusive detection method helps conservation managers to assess the stability of a population and to design appropriate conservation programs. Here we propose a mound (nest) detection method for Australia's native iconic bird, the Malleefowl, from point cloud data, which can be manipulated to act as a surrogate for population data. Existing detection methods are largely through manual observations, and are therefore not efficient for covering large and remote areas. The proposed mound detection method can identify mound feature based on height and intensity values provided by the point cloud data. Each candidate mound point is initially selected by applying a height threshold utilising the classified ground points and their corresponding digital elevation model (DEM). Then, another threshold based on intensity range derived from ground truth mound area analysis is applied on the extracted initial mound points to find the final candidate mound points. These extracted points are then used to generate a binary mask where the potential mound points are found sparse. To connect those points, a morphological filter is applied on the binary image and found the mound separated from other remaining non-mound objects. To obtain the mound from other non-mound objects, a morphological cleaning operation and a connected component analysis are carried out on the mask. The non-mound objects are removed from the mask utilising the area property of mound derived from the empirical analysis of ground-truth observations. Finally, the effectiveness of the proposed technique is calculated based on ground truth. 
Although the mound shapes and structures are highly variable in nature, our height and intensity-based mound point extraction method detected 55% of the ground-truthed mounds. © 2021 IEEE.
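The detection steps described in this abstract can be sketched roughly as follows. This is an illustrative sketch only, not the authors' implementation: it assumes the point cloud has already been rasterised into two grids (height above the DEM and return intensity), it uses a 3x3 morphological closing as the "morphological filter", and every function name and threshold value is hypothetical.

```python
from collections import deque

def threshold_mask(height, intensity, h_min, i_lo, i_hi):
    """Keep cells that pass both the height and the intensity threshold."""
    return [[h >= h_min and i_lo <= i <= i_hi for h, i in zip(h_row, i_row)]
            for h_row, i_row in zip(height, intensity)]

def _window(mask, r, c):
    """Yield the in-bounds values of the 3x3 neighbourhood around (r, c)."""
    for rr in range(max(r - 1, 0), min(r + 2, len(mask))):
        for cc in range(max(c - 1, 0), min(c + 2, len(mask[0]))):
            yield mask[rr][cc]

def dilate(mask):
    """Set a cell if any cell in its 3x3 neighbourhood is set."""
    return [[any(_window(mask, r, c)) for c in range(len(mask[0]))]
            for r in range(len(mask))]

def erode(mask):
    """Keep a cell only if every in-bounds 3x3 neighbour is set."""
    return [[all(_window(mask, r, c)) for c in range(len(mask[0]))]
            for r in range(len(mask))]

def components(mask):
    """Label 8-connected foreground regions; return each as a set of cells."""
    seen, found = set(), []
    for r, row in enumerate(mask):
        for c, on in enumerate(row):
            if not on or (r, c) in seen:
                continue
            region, queue = set(), deque([(r, c)])
            seen.add((r, c))
            while queue:
                cr, cc = queue.popleft()
                region.add((cr, cc))
                for nr in range(max(cr - 1, 0), min(cr + 2, len(mask))):
                    for nc in range(max(cc - 1, 0), min(cc + 2, len(row))):
                        if mask[nr][nc] and (nr, nc) not in seen:
                            seen.add((nr, nc))
                            queue.append((nr, nc))
            found.append(region)
    return found

def detect_mounds(height, intensity, h_min, i_lo, i_hi, area_min, area_max):
    """Threshold, close the sparse mask, then filter regions by area."""
    mask = threshold_mask(height, intensity, h_min, i_lo, i_hi)
    closed = erode(dilate(mask))  # morphological closing (assumed filter)
    return [reg for reg in components(closed)
            if area_min <= len(reg) <= area_max]
```

Calling `detect_mounds` with two equally sized grids returns one set of `(row, column)` cells per retained region. The area bounds stand in for the empirically derived mound area mentioned in the abstract and would need to be set from ground-truth observations.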
A coarse representation of frames oriented video coding by leveraging cuboidal partitioning of image data
- Authors: Ahmmed, Ashe , Paul, Manoranjan , Murshed, Manzur , Taubman, David
- Date: 2020
- Type: Text , Conference paper
- Relation: 22nd IEEE International Workshop on Multimedia Signal Processing, MMSP 2020, Virtual Tampere, Finland 21-24 September 2020
- Full Text:
- Reviewed:
- Description: Video coding algorithms attempt to minimize the significant commonality that exists within a video sequence. Each new video coding standard contains tools that can perform this task more efficiently than its predecessors. In this work, we form a coarse representation of the current frame by minimizing commonality within that frame while preserving its important structural properties. The building blocks of this coarse representation are rectangular regions called cuboids, which are computationally simple and have a compact description. We then propose to employ the coarse frame as an additional source for predictive coding of the current frame. Experimental results show an improvement in bit rate savings over a reference codec for HEVC, with a minor increase in the codec computational complexity. © 2020 IEEE.
Efficient scalable UHD/360-video coding by exploiting common information with cuboid-based partitioning
- Authors: Afsana, Fariha , Paul, Manoranjan , Murshed, Manzur , Taubman, David
- Date: 2022
- Type: Text , Journal article
- Relation: IEEE Transactions on Circuits and Systems for Video Technology Vol. 32, no. 6 (2022), p. 3961-3977
- Full Text: false
- Reviewed:
- Description: SHVC, the scalable extension of High Efficiency Video Coding, can code Ultra High-Definition (UHD) video, including 360-degree video, so that a single bitstream serves various devices with different display resolutions and qualities. To improve the SHVC compression efficiency, this paper proposes a novel intra- and inter-frame coding scheme that first separates the common/visually important information and then applies a cuboid-based variable-size block partitioning and coding process to that information in the base layer. In cuboid-based partitioning, a video frame is partitioned into arbitrarily shaped rectangular regions, known as cuboids, based on the distribution of relatively homogeneous pixel values. As the cuboid adopts a variable block partitioning based on the homogeneity of the data values, the partitioned blocks align better with object boundaries. Moreover, in the cuboid coding process, only the partitioning tree information and a single value for each block need to be coded, which takes fewer bits and less computational time than the traditional SHVC base layer. To verify the performance of the proposed method, we embedded the proposed scheme as a base layer into the standard SHVC reference software and used several popular UHD/360-degree videos. The experimental results indicate that the proposed scalable coding strategy achieves an average of 14.04% BD-Rate reduction and 0.61 dB BD-PSNR gain for UHD/360-video compared to the operation points provided by an SHVC conforming encoder. © 1991-2012 IEEE.
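The idea of partitioning a frame into homogeneous rectangles and coding one value per block can be illustrated with the sketch below. It is not the paper's algorithm: the split rule (choose the row or column cut that minimises the summed squared error of the two halves), the stopping tolerance `tol`, and all function names are assumptions made for illustration, and the partitioning tree is represented only implicitly by the recursion order.

```python
def block_stats(img, r0, r1, c0, c1):
    """Return (mean, sum of squared error) of img[r0:r1][c0:c1]."""
    vals = [img[r][c] for r in range(r0, r1) for c in range(c0, c1)]
    mean = sum(vals) / len(vals)
    return mean, sum((v - mean) ** 2 for v in vals)

def split_cuboids(img, r0, r1, c0, c1, tol, leaves):
    """Recursively split a rectangle until each piece is homogeneous."""
    mean, err = block_stats(img, r0, r1, c0, c1)
    cuts = [("row", r) for r in range(r0 + 1, r1)]
    cuts += [("col", c) for c in range(c0 + 1, c1)]
    if err <= tol or not cuts:
        leaves.append((r0, r1, c0, c1, mean))  # one value per cuboid
        return

    def cost(cut):
        # Assumed criterion: total squared error of the two halves.
        axis, pos = cut
        if axis == "row":
            return (block_stats(img, r0, pos, c0, c1)[1]
                    + block_stats(img, pos, r1, c0, c1)[1])
        return (block_stats(img, r0, r1, c0, pos)[1]
                + block_stats(img, r0, r1, pos, c1)[1])

    axis, pos = min(cuts, key=cost)
    if axis == "row":
        split_cuboids(img, r0, pos, c0, c1, tol, leaves)
        split_cuboids(img, pos, r1, c0, c1, tol, leaves)
    else:
        split_cuboids(img, r0, r1, c0, pos, tol, leaves)
        split_cuboids(img, r0, r1, pos, c1, tol, leaves)

def coarse_frame(img, tol):
    """Return the cuboid list and the frame rebuilt from block means."""
    leaves = []
    split_cuboids(img, 0, len(img), 0, len(img[0]), tol, leaves)
    out = [[0.0] * len(img[0]) for _ in img]
    for r0, r1, c0, c1, mean in leaves:
        for r in range(r0, r1):
            for c in range(c0, c1):
                out[r][c] = mean
    return leaves, out
```

Each leaf is a tuple `(r0, r1, c0, c1, mean)` with half-open row and column ranges. A larger `tol` yields fewer, coarser cuboids, which is the trade-off between description size and fidelity that the abstract alludes to.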