Integrated generalized zero-shot learning for fine-grained classification
- Authors: Shermin, Tasfia , Teng, Shyh , Sohel, Ferdous , Murshed, Manzur , Lu, Guojun
- Date: 2022
- Type: Text , Journal article
- Relation: Pattern Recognition Vol. 122, no. (2022), p.
- Description: Embedding learning (EL) and feature synthesizing (FS) are two popular categories of fine-grained GZSL methods. EL or FS using global features cannot discriminate fine details in the absence of local features. On the other hand, EL or FS methods exploiting local features neglect either direct attribute guidance or global information. Consequently, neither method performs well. In this paper, we propose to explore global and direct attribute-supervised local visual features for both EL and FS categories in an integrated manner for fine-grained GZSL. The proposed integrated network has an EL sub-network and an FS sub-network. Consequently, the proposed integrated network can be tested in two ways. We propose a novel two-step dense attention mechanism to discover attribute-guided local visual features. We introduce new mutual learning between the sub-networks to exploit mutually beneficial information for optimization. Moreover, we propose to compute source-target class similarity based on mutual information and transfer-learn the target classes to reduce bias towards the source domain during testing. We demonstrate that our proposed method outperforms contemporary methods on benchmark datasets. © 2021 Elsevier Ltd
A robust forgery detection method for copy-move and splicing attacks in images
- Authors: Islam, Mohammad , Karmakar, Gour , Kamruzzaman, Joarder , Murshed, Manzur
- Date: 2020
- Type: Text , Journal article
- Relation: Electronics Vol. 9, no. 9 (2020), p. 1-22
- Description: Internet of Things (IoT) image sensors, social media, and smartphones generate huge volumes of digital images every day. Easy availability and usability of photo editing tools have made forgery attacks, primarily splicing and copy-move attacks, effortless, causing cybercrimes to be on the rise. While several models have been proposed in the literature for detecting these attacks, the robustness of those models has not been investigated when (i) a low number of tampered images are available for model building or (ii) images from IoT sensors are distorted due to image rotation or scaling caused by unwanted or unexpected changes in sensors' physical set-up. Moreover, further improvement in detection accuracy is needed for real-world security management systems. To address these limitations, in this paper, an innovative image forgery detection method has been proposed based on Discrete Cosine Transformation (DCT) and Local Binary Pattern (LBP) and a new feature extraction method using the mean operator. First, images are divided into non-overlapping fixed-size blocks and 2D block DCT is applied to capture changes due to image forgery. Then LBP is applied to the magnitude of the DCT array to enhance forgery artifacts. Finally, the mean value of a particular cell across all LBP blocks is computed, which yields a fixed number of features and presents a more computationally efficient method. Using Support Vector Machine (SVM), the proposed method has been extensively tested on four well-known publicly available grayscale and color image forgery datasets, and additionally on an IoT-based image forgery dataset that we built. Experimental results reveal the superiority of our proposed method over recent state-of-the-art methods in terms of widely used performance metrics and computational time and demonstrate robustness against low availability of forged training samples.
- Description: This research was funded by Research Priority Area (RPA) scholarship of Federation University Australia.
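The three feature-extraction steps named in the abstract above (block DCT, LBP on the DCT magnitudes, per-cell mean across blocks) can be sketched roughly as follows. This is only an illustration under stated assumptions, not the authors' implementation: the 8x8 block size, the basic 3x3 eight-neighbour LBP variant, the single-channel input, and the function names `dct_matrix`, `lbp_8neighbour`, and `forgery_features` are all hypothetical choices, and the SVM classification stage is omitted.

```python
import numpy as np

def dct_matrix(n):
    """Orthonormal DCT-II basis matrix of size n x n."""
    k = np.arange(n).reshape(-1, 1)
    i = np.arange(n).reshape(1, -1)
    m = np.sqrt(2.0 / n) * np.cos(np.pi * (2 * i + 1) * k / (2 * n))
    m[0, :] = np.sqrt(1.0 / n)
    return m

def lbp_8neighbour(block):
    """Basic 3x3 LBP codes (0..255) for the interior cells of a 2D array."""
    h, w = block.shape
    centre = block[1:-1, 1:-1]
    codes = np.zeros(centre.shape, dtype=np.int64)
    # Clockwise neighbour offsets, starting at the top-left neighbour.
    offsets = [(-1, -1), (-1, 0), (-1, 1), (0, 1),
               (1, 1), (1, 0), (1, -1), (0, -1)]
    for bit, (dy, dx) in enumerate(offsets):
        neighbour = block[1 + dy:h - 1 + dy, 1 + dx:w - 1 + dx]
        codes += (neighbour >= centre).astype(np.int64) << bit
    return codes

def forgery_features(image, block_size=8):
    """Block DCT -> LBP on |DCT| -> mean of each LBP cell across all blocks.

    Assumes a single-channel image at least block_size pixels on each side;
    partial blocks at the right and bottom edges are ignored.
    """
    image = np.asarray(image, dtype=np.float64)
    d = dct_matrix(block_size)
    rows = image.shape[0] // block_size
    cols = image.shape[1] // block_size
    lbp_blocks = []
    for r in range(rows):
        for c in range(cols):
            blk = image[r * block_size:(r + 1) * block_size,
                        c * block_size:(c + 1) * block_size]
            coeffs = d @ blk @ d.T          # 2D DCT of the block
            lbp_blocks.append(lbp_8neighbour(np.abs(coeffs)))
    # Averaging cell-wise gives a fixed-length vector whatever the image size.
    return np.mean(lbp_blocks, axis=0).ravel()
```

With these assumed settings, every image maps to a vector of (8 - 2)^2 = 36 features, which is the kind of fixed-length input a classifier such as an SVM expects.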
Depth sequence coding with hierarchical partitioning and spatial-domain quantization
- Authors: Shahriyar, Shampa , Murshed, Manzur , Ali, Mortuza , Paul, Manoranjan
- Date: 2020
- Type: Text , Journal article
- Relation: IEEE Transactions on Circuits and Systems for Video Technology Vol. 30, no. 3 (2020), p. 835-849
- Description: Depth coding in 3D-HEVC deforms object shapes due to block-level edge-approximation and lacks efficient techniques to exploit the statistical redundancy, due to the frame-level clustering tendency in depth data, for higher coding gain at near-lossless quality. This paper presents a standalone mono-view depth sequence coder, which preserves edges implicitly by limiting quantization to the spatial-domain and exploits the frame-level clustering tendency efficiently with a novel binary tree-based decomposition (BTBD) technique. The BTBD can exploit the statistical redundancy in frame-level syntax, motion components, and residuals efficiently with fewer block-level prediction/coding modes and simpler context modeling for context-adaptive arithmetic coding. Compared with the depth coder in 3D-HEVC, the proposed one has achieved significantly lower bitrate at lossless to near-lossless quality range for mono-view coding and rendered superior quality synthetic views from the depth maps, compressed at the same bitrate, and the corresponding texture frames. © 1991-2012 IEEE.
Improved depth coding for HEVC focusing on depth edge approximation
- Authors: Podder, Pallab , Paul, Manoranjan , Rahaman, Motiur , Murshed, Manzur
- Date: 2017
- Type: Text , Journal article , acceptedVersion
- Relation: Signal Processing: Image Communication Vol. 55, no. (2017), p. 80-92
- Description: The latest High Efficiency Video Coding (HEVC) standard has greatly improved coding efficiency compared to its predecessor, H.264. An important share of this gain comes from the adoption of hierarchical block partitioning structures and an extended number of modes. The structure of the existing inter-modes is appropriate mainly for handling rectangular and square aligned motion patterns. However, they may not be suitable for the block partitioning of depth objects having partial foreground motion with irregular edges and background. In such cases, the HEVC reference test model (HM) normally explores finer level block partitioning that requires more bits and encoding time to compensate for large residuals. Since motion detection is the underlying criterion for mode selection, in this work, we use the energy concentration ratio feature of phase correlation to capture different types of motion in depth objects. For better motion modeling focusing on depth edges, the proposed technique also uses an extra pattern mode comprising a group of templates with various rectangular and non-rectangular object shapes and edges. As the pattern mode could save bits by encoding only the foreground areas and beat all other inter-modes in a block once selected, the proposed technique could improve the rate-distortion performance. It could also reduce encoding time by skipping further branching using the pattern mode and selecting a subset of modes using innovative pre-processing criteria. Experimentally, it saves 29% of encoding time on average and improves the Bjontegaard Delta peak signal-to-noise ratio by 0.10 dB compared to the HM.
A hybrid wireless sensor network framework for range-free event localization
- Authors: Iqbal, Anindya , Murshed, Manzur
- Date: 2015
- Type: Text , Journal article
- Relation: Ad Hoc Networks Vol. 27, no. (2015), p. 81-98
- Full Text: false
- Description: In event localization, wireless sensors try to locate the source of an event from its emitted power. This is more challenging than sensor localization as the power level at the source of an event can neither be predicted with precision nor controlled. Considering the emerging trend of long sensing range for cost-effective sensor deployment, locating events within a region much smaller than the sensing area of a single sensor has gained research interest. This paper proposes the first range-free event localization framework, which avoids the expensive hardware needed by its range-based counterparts. Our approach first develops a sensing range model from the statistical information on the emitted power of a type of event so that user-defined event-detection quality can be provisioned using a minimal network of static sensors. Then an accurate event location boundary estimation technique is developed from the sensing feedback, which also facilitates guided expansion of the area of possible event location (APEL) to deal with sensing errors. Finally, a user-defined event-localization quality guarantee is provisioned cost-effectively by inviting mobile sensors on-demand to target positions. Analytical solutions are provided whenever appropriate and comprehensive simulations are carried out to evaluate localization performance. The proposed event localization technique outperforms the state-of-the-art range-based counterpart (Xu et al., 2011) in a realistic environment with path loss, shadow fading, and sensor positioning errors.
Symbol coding of Laplacian distributed prediction residuals
- Authors: Ali, Mortuza , Murshed, Manzur
- Date: 2015
- Type: Text , Journal article
- Relation: Digital Signal Processing: A Review Journal Vol. 44, no. 1 (2015), p. 76-87
- Relation: http://purl.org/au-research/grants/arc/DP130103670
- Full Text: false
- Description: Predictive coding schemes proposed in the literature essentially model the residuals with discrete distributions. However, real-valued residuals can arise in predictive coding, for example, from the use of an r-th order linear predictor specified by r real-valued coefficients. In this paper, we propose a symbol-by-symbol coding scheme for the Laplace distribution, which closely models the distribution of real-valued residuals in practice. To efficiently exploit the real-valued predictions at a given precision, the proposed scheme essentially combines the process of residual computation and coding, in contrast to conventional schemes that separate these two processes. In the context of an adaptive predictive coding framework, where the source statistics must be learnt from the data, the proposed scheme has the advantage of lower 'model cost' as it involves learning only one parameter. In this paper, we also analyze the proposed parametric coding scheme to establish the relationship between the optimal value of the coding parameter and the scale parameter of the Laplace distribution. Our experimental results demonstrated the compression efficiency and computational simplicity of the proposed scheme in adaptive coding of residuals against the widely used arithmetic coding, Rice-Golomb coding, and the Merhav-Seroussi-Weinberger scheme adopted in JPEG-LS. © 2015 Elsevier Inc. All rights reserved.
Perception-inspired background subtraction
- Authors: Haque, Mahfuzul , Murshed, Manzur
- Date: 2013
- Type: Text , Journal article
- Relation: IEEE Transactions on Circuits and Systems for Video Technology Vol. 23, no. 12 (2013), p. 2127-2140
- Full Text: false
- Description: Developing universal and context-invariant methods is one of the hardest challenges in computer vision. Background subtraction (BS), an essential precursor in most machine vision applications used for foreground detection, is no exception. Due to overreliance on statistical observations, most BS techniques show unpredictable behavior in dynamic unconstrained scenarios in which the characteristics of the operating environment are either unknown or change drastically. To achieve superior foreground detection quality across unconstrained scenarios, we propose a new technique, called perception-inspired background subtraction (PBS), which avoids overreliance on statistical observations by making key modeling decisions based on the characteristics of human visual perception. PBS exploits the human perception-inspired confidence interval to associate an observed intensity value with another intensity value during both model learning and background-foreground classification. The concept of perception-inspired confidence interval is also used for identifying redundant samples, thus ensuring the optimal number of samples in the background model. Furthermore, PBS dynamically varies the model adaptation speed (learning rate) at pixel level based on observed scene dynamics to ensure faster adaptation of changed background regions, as well as longer retention of stationary foregrounds. Extensive experimental evaluations on a wide range of benchmark datasets validate the efficacy of PBS compared to the state of the art for unconstrained video analytics.
Video coding using arbitrarily shaped block partitions in globally optimal perspective
- Authors: Paul, Manoranjan , Murshed, Manzur
- Date: 2011
- Type: Text , Journal article
- Relation: EURASIP Journal on Advances in Signal Processing Vol. 16, no. (2011), p.
- Description: Algorithms using content-based patterns to segment moving regions at the macroblock (MB) level have exhibited good potential for improved coding efficiency when embedded into the H.264 standard as an extra mode. The content-based pattern generation (CPG) algorithm provides a locally optimal result, as only one pattern can be optimally generated from a given set of moving regions. However, it fails to provide optimal results for multiple patterns from entire sets. A globally optimal solution for clustering the set and then generating multiple patterns would enhance the performance further, but such a solution is not achievable due to the non-polynomial nature of the clustering problem. In this paper, we propose a near-optimal content-based pattern generation (OCPG) algorithm which outperforms the existing approach. Coupling OCPG, which generates a set of patterns after clustering the MBs into several disjoint sets, with a direct pattern selection algorithm that allows all the MBs in multiple pattern modes outperforms the existing pattern-based coding when embedded into H.264.
Video coding focusing on block partitioning and occlusion
- Authors: Paul, Manoranjan , Murshed, Manzur
- Date: 2010
- Type: Text , Journal article
- Relation: IEEE Transactions on Image Processing Vol. 19, no. 3 (2010), p. 691-701
- Full Text: false
- Description: Among the existing block partitioning schemes, pattern-based video coding (PVC) has already established its superiority at low bit-rates. Its innovative segmentation process with regular-shaped pattern templates is very fast as it avoids handling the exact shape of the moving objects. It also judiciously encodes the pattern-uncovered background segments, capturing a high level of interblock temporal redundancy without any motion compensation, which is favoured by the rate-distortion optimizer at low bit-rates. The existing PVC technique, however, uses a number of content-sensitive thresholds, and setting them to any predefined values risks ignoring some of the macroblocks that would otherwise be encoded with patterns. Furthermore, occluded background can potentially degrade the performance of this technique. In this paper, a robust PVC scheme is proposed by removing all the content-sensitive thresholds, introducing a new similarity metric, considering multiple top-ranked patterns by the rate-distortion optimizer, and refining the Lagrangian multiplier of the H.264 standard for efficient embedding. A novel pattern-based residual encoding approach is also integrated to address the occlusion issue. Once embedded into the H.264 Baseline profile, the proposed PVC scheme improves the image quality by a perceptually significant margin of at least 0.5 dB in low bit-rate video coding applications. A similar trend is observed for moderate to high bit-rate applications when the proposed scheme replaces the bi-directional predictive mode in the H.264 High profile.
Detection of multiple dynamic textures using feature space mapping
- Authors: Rahman, Ashfaqur , Murshed, Manzur
- Date: 2009
- Type: Text , Journal article
- Relation: IEEE Transactions on Circuits and Systems for Video Technology Vol. 19, no. 5 (2009), p. 766-771
- Full Text: false
- Description: Image sequences of smoke, fire, etc. are known as dynamic textures. Research is mostly limited to characterization of single dynamic textures. In this paper, we address the problem of detecting the presence of multiple dynamic textures in an image sequence by establishing a correspondence between the feature space of dynamic textures and that of their mixture in an image sequence. Accuracy of our proposed technique is both analytically and empirically established with detection experiments yielding 92.5% average accuracy on a diverse set of dynamic texture mixtures in synthetically generated as well as real-world image sequences.
Prefix coding of integers with real-valued predictions using cosets
- Authors: Ali, Mortuza , Murshed, Manzur
- Date: 2007
- Type: Text , Journal article
- Relation: IEEE Communications Letters, vol. 11, no. 10, IEEE Communications Society, p. 814-816
- Full Text: false
- Description: In predictive coding of integers, real-valued residuals are mapped to integers before encoding, leaving room for improvement by reducing the loss due to rounding. In this paper, we propose a new prefix coding scheme where actual integer values, instead of the residuals, are encoded using cosets with real domain predictions as the side information. This novel coding scheme outperforms Golomb-based coding by reducing the rounding loss with similar computational and memory complexity.
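For orientation, the conventional baseline that the abstract above contrasts against (round the real-valued prediction, form an integer residual, then apply a Golomb-type prefix code) might look like the sketch below. It is not the coset scheme proposed in the paper, whose details the abstract does not give. The Rice parameter `k`, the zigzag sign mapping, and all function names are assumptions made for illustration.

```python
def zigzag(n):
    """Map a signed integer to a non-negative one: 0, -1, 1, -2, 2 -> 0, 1, 2, 3, 4."""
    return 2 * n if n >= 0 else -2 * n - 1

def unzigzag(u):
    """Inverse of zigzag."""
    return u // 2 if u % 2 == 0 else -(u + 1) // 2

def rice_encode(u, k):
    """Rice code of u >= 0: quotient in unary (ones then a zero), then k remainder bits."""
    q = u >> k
    bits = "1" * q + "0"
    if k > 0:
        bits += format(u & ((1 << k) - 1), "0{}b".format(k))
    return bits

def rice_decode(bits, k):
    """Decode one Rice codeword from the start of a bit string."""
    q = bits.index("0")
    r = int(bits[q + 1:q + 1 + k], 2) if k > 0 else 0
    return (q << k) | r

def encode_sample(x, prediction, k):
    """Baseline: the real-valued prediction is rounded before the residual is formed."""
    residual = x - round(prediction)
    return rice_encode(zigzag(residual), k)

def decode_sample(bits, prediction, k):
    """The decoder repeats the same rounding, so the integer x is recovered exactly."""
    return unzigzag(rice_decode(bits, k)) + round(prediction)
```

For example, `encode_sample(7, 5.6, 1)` rounds the prediction to 6, codes the residual 1, and `decode_sample` with the same prediction returns 7. The fractional part of the prediction (0.6 here) is discarded by the rounding step, which is the loss the abstract refers to.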