Anti-aliasing deep image classifiers using novel depth adaptive blurring and activation function
- Authors: Hossain, Md Tahmid , Teng, Shyh , Lu, Guojun , Rahman, Mohammad Arifur , Sohel, Ferdous
- Date: 2023
- Type: Text , Journal article
- Relation: Neurocomputing Vol. 536, no. (2023), p. 164-174
- Full Text: false
- Reviewed:
- Description: Deep convolutional networks are vulnerable to image translation or shift, partly due to common down-sampling layers, e.g., max-pooling and strided convolution. These operations violate the Nyquist sampling rate and cause aliasing. The textbook solution is low-pass filtering (blurring) before down-sampling, which can benefit deep networks as well. Even so, non-linearity units, such as ReLU, often re-introduce the problem, suggesting that blurring alone may not suffice. In this work, first, we analyse deep features with Fourier transform and show that Depth Adaptive Blurring is more effective, as opposed to monotonic blurring. To this end, we propose a novel Depth Adaptive Blur-pool (DAB-pool) module to replace existing down-sampling methods. Second, we introduce a novel activation function – with a built-in low pass filter, as an additional measure, to keep the problem from reappearing. From experiments, we observe generalisation on other forms of transformations and corruptions as well, e.g., rotation, scale, and noise. We evaluate our method under three challenging settings: (1) a variety of image translations; (2) adversarial attacks – both
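The aliasing problem this abstract describes can be illustrated with a minimal one-dimensional sketch. It assumes a fixed 3-tap binomial kernel and circular padding, both chosen here purely for illustration; it is not the DAB-pool module or the activation function proposed in the paper, and the names `blur` and `downsample` are hypothetical.

```python
def blur(signal, kernel=(0.25, 0.5, 0.25)):
    """Low-pass filter a 1-D signal, wrapping around at the borders."""
    n = len(signal)
    half = len(kernel) // 2
    return [
        sum(k * signal[(i + j - half) % n] for j, k in enumerate(kernel))
        for i in range(n)
    ]


def downsample(signal, stride=2):
    """Keep every `stride`-th sample (what a strided layer does spatially)."""
    return signal[::stride]


# Highest-frequency signal a stride-2 sampler cannot represent.
signal = [1.0, 0.0] * 4
shifted = signal[1:] + signal[:1]  # circular shift by one sample

# Without blurring, a one-sample shift changes the output completely.
naive_a, naive_b = downsample(signal), downsample(shifted)

# With blurring first, both versions give the same down-sampled output.
blurred_a, blurred_b = downsample(blur(signal)), downsample(blur(shifted))
```

In this toy case the naive outputs disagree on every sample, while the blurred outputs are identical, which is the shift sensitivity that low-pass filtering before down-sampling is meant to reduce.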
Bidirectional mapping coupled GAN for generalized zero-shot learning
- Authors: Shermin, Tasfia , Teng, Shyh , Sohel, Ferdous , Murshed, Manzur , Lu, Guojun
- Date: 2022
- Type: Text , Journal article
- Relation: IEEE Transactions on Image Processing Vol. 31, no. (2022), p. 721-733
- Full Text:
- Reviewed:
- Description: Bidirectional mapping-based generalized zero-shot learning (GZSL) methods rely on the quality of synthesized features to recognize seen and unseen data. Therefore, learning a joint distribution of seen-unseen classes and preserving the distinction between seen-unseen classes is crucial for GZSL methods. However, existing methods only learn the underlying distribution of seen data, although unseen class semantics are available in the GZSL problem setting. Most methods neglect retaining seen-unseen classes distinction and use the learned distribution to recognize seen and unseen data. Consequently, they do not perform well. In this work, we utilize the available unseen class semantics alongside seen class semantics and learn joint distribution through a strong visual-semantic coupling. We propose a bidirectional mapping coupled generative adversarial network (BMCoGAN) by extending the concept of the coupled generative adversarial network into a bidirectional mapping model. We further integrate a Wasserstein generative adversarial optimization to supervise the joint distribution learning. We design a loss optimization for retaining distinctive information of seen-unseen classes in the synthesized features and reducing bias towards seen classes, which pushes synthesized seen features towards real seen features and pulls synthesized unseen features away from real seen features. We evaluate BMCoGAN on benchmark datasets and demonstrate its superior performance against contemporary methods. © 1992-2012 IEEE.
Integrated generalized zero-shot learning for fine-grained classification
- Authors: Shermin, Tasfia , Teng, Shyh , Sohel, Ferdous , Murshed, Manzur , Lu, Guojun
- Date: 2022
- Type: Text , Journal article
- Relation: Pattern Recognition Vol. 122, no. (2022), p.
- Full Text:
- Reviewed:
- Description: Embedding learning (EL) and feature synthesizing (FS) are two of the popular categories of fine-grained GZSL methods. EL or FS using global features cannot discriminate fine details in the absence of local features. On the other hand, EL or FS methods exploiting local features either neglect direct attribute guidance or global information. Consequently, neither method performs well. In this paper, we propose to explore global and direct attribute-supervised local visual features for both EL and FS categories in an integrated manner for fine-grained GZSL. The proposed integrated network has an EL sub-network and a FS sub-network. Consequently, the proposed integrated network can be tested in two ways. We propose a novel two-step dense attention mechanism to discover attribute-guided local visual features. We introduce new mutual learning between the sub-networks to exploit mutually beneficial information for optimization. Moreover, we propose to compute source-target class similarity based on mutual information and transfer-learn the target classes to reduce bias towards the source domain during testing. We demonstrate that our proposed method outperforms contemporary methods on benchmark datasets. © 2021 Elsevier Ltd
A novel fusion approach in the extraction of kernel descriptor with improved effectiveness and efficiency
- Authors: Karmakar, Priyabrata , Teng, Shyh , Lu, Guojun , Zhang, Dengsheng
- Date: 2021
- Type: Text , Journal article
- Relation: Multimedia Tools and Applications Vol. 80, no. 10 (Apr 2021), p. 14545-14564
- Full Text:
- Reviewed:
- Description: Image representation using feature descriptors is crucial. A number of histogram-based descriptors are widely used for this purpose. However, histogram-based descriptors have certain limitations and kernel descriptors (KDES) are proven to overcome them. Moreover, the combination of more than one KDES performs better than an individual KDES. Conventionally, KDES fusion is performed by concatenating them after the gradient, colour and shape descriptors have been extracted. This approach has limitations in regard to the efficiency as well as the effectiveness. In this paper, we propose a novel approach to fuse different image features before the descriptor extraction, resulting in a compact descriptor which is efficient and effective. In addition, we have investigated the effect on the proposed descriptor when texture-based features are fused along with the conventionally used features. Our proposed descriptor is examined on two publicly available image databases and shown to provide outstanding performances.
Adversarial network with multiple classifiers for open set domain adaptation
- Authors: Shermin, Tasfia , Lu, Guojun , Teng, Shyh , Murshed, Manzur , Sohel, Ferdous
- Date: 2021
- Type: Text , Journal article
- Relation: IEEE Transactions on Multimedia Vol. 23, no. (2021), p. 2732-2744
- Full Text:
- Reviewed:
- Description: Domain adaptation aims to transfer knowledge from a domain with adequate labeled samples to a domain with scarce labeled samples. Prior research has introduced various open set domain adaptation settings in the literature to extend the applications of domain adaptation methods in real-world scenarios. This paper focuses on the type of open set domain adaptation setting where the target domain has both private ('unknown classes') label space and the shared ('known classes') label space. However, the source domain only has the 'known classes' label space. Prevalent distribution-matching domain adaptation methods are inadequate in such a setting that demands adaptation from a smaller source domain to a larger and diverse target domain with more classes. For addressing this specific open set domain adaptation setting, prior research introduces a domain adversarial model that uses a fixed threshold for distinguishing known from unknown target samples and does not handle negative transfers well. We extend their adversarial model and propose a novel adversarial domain adaptation model with multiple auxiliary classifiers. The proposed multi-classifier structure introduces a weighting module that evaluates distinctive domain characteristics to assign target samples weights that better represent whether they are likely to belong to the known or unknown classes. This encourages positive transfers during adversarial training and simultaneously reduces the domain gap between the shared classes of the source and target domains. A thorough experimental investigation shows that our proposed method outperforms existing domain adaptation methods on a number of domain adaptation datasets. © 1999-2012 IEEE.
Integrating line weber local descriptor and deep feature for tire indentation mark image classification
- Authors: Liu, Ying , Che, Xin , Dong, Haitao , Li, Daxiang , Teng, Shyh , Lu, Guojun
- Date: 2021
- Type: Text , Conference paper
- Relation: 4th International Conference on Artificial Intelligence and Pattern Recognition, AIPR 2021, Virtual, Online, 17-19 September 2021, ACM International Conference Proceeding Series p. 56-61
- Full Text: false
- Reviewed:
- Description: Tire indentation mark matching is an essential tool used for the investigation of criminal cases and traffic incidents. As such images are unique and uncommon, there is a lack of dedicated databases and relevant research on this topic. This paper presents a feature extraction algorithm effective for tire indentation mark image description. The main contributions include: (1) Line feature Weber local descriptor (LWLD) is proposed, which uses the Gabor orientations instead of the original gradient orientation. This feature can describe texture information of tire indentation mark image more efficiently. (2) An attention model is constructed to produce attention feature map of tire indentation mark image. This attention feature map is then fused with LWLD resulting in a feature with more powerful representation capability. Experimental results prove that the combined use of LWLD and attention model greatly enhances the performance of tire indentation mark image matching tasks. © 2021 ACM.
Robust image classification using a low-pass activation function and DCT augmentation
- Authors: Hossain, Md Tahmid , Teng, Shyh , Sohel, Ferdous , Lu, Guojun
- Date: 2021
- Type: Text , Journal article
- Relation: IEEE Access Vol. 9, no. (2021), p. 86460-86474
- Full Text:
- Reviewed:
- Description: Convolutional Neural Network's (CNN's) performance disparity on clean and corrupted datasets has recently come under scrutiny. In this work, we analyse common corruptions in the frequency domain, i.e., High Frequency corruptions (HFc, e.g., noise) and Low Frequency corruptions (LFc, e.g., blur). Although a simple solution to HFc is low-pass filtering, ReLU - a widely used Activation Function (AF), does not have any filtering mechanism. In this work, we instill low-pass filtering into the AF (LP-ReLU) to improve robustness against HFc. To deal with LFc, we complement LP-ReLU with Discrete Cosine Transform based augmentation. LP-ReLU, coupled with DCT augmentation, enables a deep network to tackle the entire spectrum of corruption. We use CIFAR-10-C and Tiny ImageNet-C for evaluation and demonstrate improvements of 5% and 7.3% in accuracy respectively, compared to the State-Of-The-Art (SOTA). We further evaluate our method's stability on a variety of perturbations in CIFAR-10-P and Tiny ImageNet-P, achieving new SOTA in these experiments as well. To further strengthen our understanding regarding CNN's lack of robustness, a decision space visualisation process is proposed and presented in this work. © 2013 IEEE.
An enhancement to the spatial pyramid matching for image classification and retrieval
- Authors: Karmakar, Priyabrata , Teng, Shyh , Lu, Guojun , Zhang, Dengsheng
- Date: 2020
- Type: Text , Journal article
- Relation: IEEE Access Vol. 8, no. (2020), p. 22463-22472
- Full Text:
- Reviewed:
- Description: Spatial pyramid matching (SPM) is one of the widely used methods to incorporate spatial information into the image representation. Despite its effectiveness, the traditional SPM is not rotation invariant. A rotation invariant SPM has been proposed in the literature but it has many limitations regarding the effectiveness. In this paper, we investigate how to make SPM robust to rotation by addressing those limitations. In an SPM framework, an image is divided into an increasing number of partitions at different pyramid levels. In this paper, our main focus is on how to partition images in such a way that the resulting structure can deal with image-level rotations. To do that, we investigate three concentric ring partitioning schemes. Apart from image partitioning, another important component of the SPM framework is a weight function. To apportion the contribution of each pyramid level to the final matching between two images, the weight function is needed. In this paper, we propose a new weight function which is suitable for the rotation-invariant SPM structure. Experiments based on image classification and retrieval are performed on five image databases. The detailed result analysis shows that we are successful in enhancing the effectiveness of SPM for image classification and retrieval. © 2013 IEEE.
A detector of structural similarity for multi-modal microscopic image registration
- Authors: Lv, Guohua , Teng, Shyh , Lu, Guojun
- Date: 2018
- Type: Text , Journal article
- Relation: Multimedia Tools and Applications Vol. 77, no. 6 (2018), p. 7675-7701
- Full Text: false
- Reviewed:
- Description: This paper presents a Detector of Structural Similarity (DSS) to minimize the visual differences between brightfield and confocal microscopic images. The context of this work is that it is very challenging to effectively register such images due to a low structural similarity in image contents. To address this issue, DSS aims to maximize the structural similarity by utilizing the intensity relationships among red-green-blue (RGB) channels in images. Technically, DSS can be combined with any multi-modal image registration technique in registering brightfield and confocal microscopic images. Our experimental results show that DSS significantly increases the visual similarity in such images, thereby improving the registration performance of an existing state-of-the-art multi-modal image registration technique by up to approximately 27%. © 2017, Springer Science+Business Media New York.
A kernel-based approach for content-based image retrieval
- Authors: Karmakar, Priyabrata , Teng, Shyh , Lu, Guojun , Zhang, Dengsheng
- Date: 2018
- Type: Text , Conference proceedings , Conference paper
- Relation: 2018 International Conference on Image and Vision Computing New Zealand; Auckland, New Zealand; 19th-21st November 2018 p. 1-6
- Full Text: false
- Reviewed:
- Description: Content-based image retrieval (CBIR) is a popular approach to retrieve images based on a query. In CBIR, retrieval is executed based on the properties of image contents (e.g. gradient, shape, color, texture) which are generally encoded into image descriptors. Among the various image descriptors, histogram-based descriptors are very popular. However, they suffer from the limitation of coarse quantization. In contrast, the use of kernel descriptors (KDES) is proven to be more effective than histogram-based descriptors in other applications, e.g. image classification. This is because, in the KDES framework, instead of the quantization of pixel attributes, each pixel equally takes part in the similarity measurement between two images. In this paper, we propose an approach for how the conventional KDES and its improved version can be used for CBIR. In addition, we have provided a detailed insight into the effectiveness of improved kernel descriptors. Finally, our experiment results will show that kernel descriptors are significantly more effective than histogram-based descriptors in CBIR.
A new image dissimilarity measure incorporating human perception
- Authors: Shojanazeri, Hamid , Teng, Shyh , Aryal, Sunil , Zhang, Dengsheng , Lu, Guojun
- Date: 2018
- Type: Text , Unpublished work
- Full Text:
- Description: Pairwise (dis)similarity measure of data objects is central to many applications of image analytics, such as image retrieval and classification. Geometric distance, particularly Euclidean distance ((
A novel perceptual dissimilarity measure for image retrieval
- Authors: Shojanazeri, Hamid , Zhang, Dengsheng , Teng, Shyh , Aryal, Sunil , Lu, Guojun
- Date: 2018
- Type: Text , Conference proceedings , Conference paper
- Relation: 2018 International Conference on Image and Vision Computing New Zealand, IVCNZ 2018; Auckland, New Zealand; 19th-21st November 2018 Vol. 2018-November, p. 1-6
- Full Text: false
- Reviewed:
- Description: Similarity measure is an important research topic in image classification and retrieval. Given a type of image features, a good similarity measure should be able to retrieve similar images from the database while discarding irrelevant images from the retrieval. Similarity measures in the literature are typically distance based, measuring the spatial distance between two feature vectors in a high dimensional feature space. However, this type of similarity measure does not have any perceptual meaning and ignores the neighborhood influence in the similarity decision making process. In this paper, we propose a novel dissimilarity measure, which can measure both the distance and perceptual similarity of two image features in feature space. Results show the proposed similarity measure has a significant improvement over the traditional distance based similarity measure commonly used in the literature.
- Description: International Conference Image and Vision Computing New Zealand
COREG : A corner based registration technique for multimodal images
- Authors: Lv, Guohua , Teng, Shyh , Lu, Guojun
- Date: 2018
- Type: Text , Journal article
- Relation: Multimedia Tools and Applications Vol. 77, no. 10 (2018), p. 12607-12634
- Full Text: false
- Reviewed:
- Description: This paper presents a COrner based REGistration technique for multimodal images (referred to as COREG). The proposed technique focuses on addressing large content and scale differences in multimodal images. Unlike traditional multimodal image registration techniques that rely on intensities or gradients for feature representation, we propose to use contour-based corners. First, curvature similarity between corners is explored for the first time for the purpose of multimodal image registration. Second, a novel local descriptor called Distribution of Edge Pixels Along Contour (DEPAC) is proposed to represent the edges in the neighborhood of corners. Third, a simple yet effective way of estimating scale difference is proposed by making use of geometric relationships between corner triplets from the reference and target images. Using a set of benchmark multimodal images and multimodal microscopic images, we demonstrate that our proposed technique outperforms a state-of-the-art multimodal image registration technique. © 2017, Springer Science+Business Media, LLC.
Enhanced colour image retrieval with cuboid segmentation
- Authors: Murshed, Manzur , Karmakar, Priyabrata , Teng, Shyh , Lu, Guojun
- Date: 2018
- Type: Text , Conference proceedings , Conference paper
- Relation: 2018 International Conference on Digital Image Computing: Techniques and Applications, DICTA 2018; Canberra, Australia; 10th-13th December 2018
- Full Text:
- Reviewed:
- Description: In this paper, we further investigate our recently proposed cuboid image segmentation algorithm for effective image retrieval. Instead of using all cuboids (i.e. segments), we propose two approaches to choose different subsets of cuboids appropriately. With experimental results on the eBay dataset, we show that our proposals outperform the existing technique in retrieval performance. In addition, we investigate how many segments are required for the most effective image retrieval and provide a quick method to determine the suitable number of cuboids.
- Description: 2018 International Conference on Digital Image Computing: Techniques and Applications, DICTA 2018
Enhancing image registration performance by incorporating distribution and spatial distance of local descriptors
- Authors: Lv, Guohua , Teng, Shyh , Lu, Guojun
- Date: 2018
- Type: Text , Journal article
- Relation: Pattern Recognition Letters Vol. 103, no. (2018), p. 46-52
- Full Text: false
- Reviewed:
- Description: A data dependency similarity measure called mp-dissimilarity has been recently proposed. Unlike ℓp-norm distance which is widely used in calculating the similarity between vectors, mp-dissimilarity takes into account the relative positions of the two vectors with respect to the rest of the data. This paper investigates the potential of mp-dissimilarity in matching local image descriptors. Moreover, three new matching strategies are proposed by considering both ℓp-norm distance and mp-dissimilarity. Our proposed matching strategies are extensively evaluated against ℓp-norm distance and mp-dissimilarity on a few benchmark datasets. Experimental results show that mp-dissimilarity is a promising alternative to ℓp-norm distance in matching local descriptors. The proposed matching strategies outperform both ℓp-norm distance and mp-dissimilarity in matching accuracy. One of our proposed matching strategies is comparable to ℓp-norm distance in terms of recall vs 1-precision. © 2018 Elsevier B.V.
Enhancing the effectiveness of local descriptor based image matching
- Authors: Hossain, Md Tahmid , Teng, Shyh , Zhang, Dengsheng , Lim, Suryani , Lu, Guojun
- Date: 2018
- Type: Text , Conference proceedings , Conference paper
- Relation: 2018 International Conference on Digital Image Computing: Techniques and Applications, DICTA 2018; Canberra, Australia; 10th-13th December 2018 p. 1-8
- Full Text: false
- Reviewed:
- Description: Image registration has received great attention from researchers over the last few decades. SIFT (Scale Invariant Feature Transform), a local descriptor-based technique, is widely used for registering and matching images. To establish correspondences between images, SIFT uses a Euclidean Distance ratio metric. However, this approach leads to many incorrect matches, and eliminating these inaccurate matches has been a challenge. Various methods have been proposed attempting to mitigate this problem. In this paper, we propose a scale and orientation harmony-based pruning method that improves the image matching process by successfully eliminating incorrect SIFT descriptor matches. Moreover, our technique can predict the image transformation parameters based on a novel adaptive clustering method with much higher matching accuracy. Our experimental results have shown that the proposed method has achieved averages of approximately 16% and 10% higher matching accuracy compared to the traditional SIFT and a contemporary method respectively.
- Description: 2018 International Conference on Digital Image Computing: Techniques and Applications, DICTA 2018
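The Euclidean distance ratio metric mentioned in this abstract can be sketched as follows. This is a minimal illustration of the conventional nearest-to-second-nearest ratio test on toy 2-D descriptors, not the pruning or adaptive clustering method proposed in the paper; the function name `ratio_test_matches` and the threshold of 0.8 are assumptions made for the example.

```python
import math


def ratio_test_matches(desc_a, desc_b, ratio=0.8):
    """Match each descriptor in desc_a to its nearest neighbour in desc_b,
    keeping the match only if it is clearly closer than the runner-up."""
    matches = []
    for i, a in enumerate(desc_a):
        # Sort candidate descriptors by Euclidean distance to `a`.
        dists = sorted((math.dist(a, b), j) for j, b in enumerate(desc_b))
        if len(dists) < 2:
            continue  # the ratio needs a second-nearest neighbour
        (d1, j1), (d2, _) = dists[0], dists[1]
        if d1 < ratio * d2:
            matches.append((i, j1))
    return matches


desc_a = [[0.0, 0.0], [5.0, 5.0]]
desc_b = [[0.0, 0.1], [10.0, 10.0], [5.1, 5.0], [4.9, 5.0]]
matches = ratio_test_matches(desc_a, desc_b)
```

The first descriptor has one clearly nearest candidate and is kept; the second has two almost equally close candidates, so the ratio test discards it as ambiguous. Matches that pass this test can still be wrong, which is the problem the abstract's pruning step targets.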
Image clustering using a similarity measure incorporating human perception
- Authors: Shojanazeri, Hamid , Aryal, Sunil , Teng, Shyh , Zhang, Dengsheng , Lu, Guojun
- Date: 2018
- Type: Text , Conference proceedings , Conference paper
- Relation: 2018 International Conference on Image and Vision Computing New Zealand, IVCNZ 2018; Auckland, New Zealand; 19th-21st November 2018 p. 1-6
- Full Text: false
- Reviewed:
- Description: Clustering similar images is an important task in image processing and computer vision. It requires a measure to quantify pairwise similarities of images. The performance of a clustering algorithm depends on the choice of similarity measure. In this paper, we investigate the effectiveness of data-independent (distance-based), data-dependent (mass-based) and hybrid (dis)similarity measures in the image clustering task using three benchmark image collections with different sets of features. Our results of K-Medoids clustering show that using the hybrid Perceptual Dissimilarity Measure (PMD) produces better clustering results than distance-based ℓp-norm and mass-based mp-dissimilarity.
A Hybrid data dependent dissimilarity measure for image retrieval
- Authors: Shojanazeri, Hamid , Teng, Shyh , Lu, Guojun
- Date: 2017
- Type: Text , Unpublished work
- Full Text:
- Description: In image retrieval, an effective dissimilarity measure is required to retrieve the perceptually similar images. Minkowski-type (lp) distance is widely used for image retrieval; however, it has its limitations. It focuses on distance between image features and ignores the data distribution of the image features, which can play an important role in measuring perceptual similarity of images. lp also favours the most dominant components in calculating the total dissimilarity. A data dependent measure, named mp-dissimilarity, which estimates the dissimilarity using the data distribution, has been proposed recently. Rather than relying on geometric distance, it measures the dissimilarity between two instances in each dimension as a probability mass in a region that encloses the two instances. It considers two instances in a sparse region to be more similar than in a dense region. Using the probability of data mass enables all the dimensions of feature vectors to contribute to the final estimate of dissimilarity, so it is not heavily biased towards the most dominant components. However, relying only on data distribution and completely ignoring the geometric distance raises another limitation. This can result in finding two instances similar only due to being in a sparse region; however, if the geometric distance between them is large then they are not perceptually similar. To address this limitation we proposed a new hybrid data dependent dissimilarity (HDDD) measure that considers both data distribution as well as geometric distance. Our experimental results using Corel database and Caltech 101 show that HDDD leads to higher image retrieval performance than lp distance (lpD) and mp.
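The data dependent measure described in the entry above can be sketched as below. This is a minimal illustration, assuming the enclosing region in each dimension is the closed interval between the two values and that the per-dimension masses are combined with a power mean; both choices, and the name `mass_dissimilarity`, are assumptions for the example rather than details given in the abstract. It does not implement the proposed HDDD measure.

```python
def mass_dissimilarity(x, y, data, p=2.0):
    """Dissimilarity of x and y from the share of data lying between them.

    For each dimension, count the data points whose value falls inside the
    interval spanned by x and y, convert the count to a probability mass,
    and combine the masses with a power mean of order p.
    """
    n = len(data)
    dims = len(x)
    total = 0.0
    for i in range(dims):
        lo, hi = min(x[i], y[i]), max(x[i], y[i])
        # Number of data points inside the enclosing interval in dimension i.
        count = sum(1 for row in data if lo <= row[i] <= hi)
        total += (count / n) ** p
    return (total / dims) ** (1.0 / p)


# One-dimensional toy data: four points packed together, two far apart.
data = [[0.0], [0.1], [0.2], [0.3], [5.0], [9.0]]
dense = mass_dissimilarity([0.0], [0.3], data)   # interval holds 4 of 6 points
sparse = mass_dissimilarity([5.0], [9.0], data)  # interval holds 2 of 6 points
```

Here the pair in the sparse region is scored as more similar than the pair in the dense region even though its geometric distance is much larger, which is exactly the behaviour the abstract identifies as both the strength and the limitation of a purely data dependent measure.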
A hybrid data dependent dissimilarity measure for image retrieval
- Authors: Shojanazeri, Hamid , Teng, Shyh , Zhang, Dengsheng , Lu, Guojun
- Date: 2017
- Type: Text , Conference proceedings
- Relation: 2017 International Conference on Digital Image Computing - Techniques and Applications (DICTA); Sydney, Australia; 29th November-1st December 2017 p. 141-148
- Full Text: false
- Reviewed:
- Description: In image retrieval, an effective dissimilarity (or similarity) measure is required to retrieve the perceptually similar images. Minkowski-type distance is widely used for image retrieval; however, it has its limitations. It focuses on distance between image features and ignores the data distribution of the image features, which can play an important role in measuring perceptual similarity of images. To address this limitation, a data dependent measure named m-p, which calculates the dissimilarity using the data distribution rather than geometric distance, has been proposed recently. It considers two instances in a sparse region to be more similar than in a dense region. Relying only on data distribution and completely ignoring the geometric distance raises other limitations. This may result in finding two perceptually dissimilar instances similar due to being located in a sparse region or vice versa. We proposed a new hybrid dissimilarity measure and experimental results show that it addresses these limitations.
Cuboid segmentation for effective image retrieval
- Authors: Murshed, Manzur , Teng, Shyh , Lu, Guojun
- Date: 2017
- Type: Text , Conference proceedings
- Relation: 2017 International Conference on Digital Image Computing : Techniques and Applications (DICTA); Sydney, Australia; 29th November-1st December 2017 p. 884-891
- Full Text: false
- Reviewed:
- Description: Region-based image retrieval has been proven to be effective in finding relevant images. In this paper, we propose a cuboid image segmentation method which results in rectangle image partitions. Rectangle partitions are more suitable for image compression, retrieval and other image operations. We apply partitions in image retrieval in this paper. Our experimental results have shown that (1) the proposed partitioning method is effective in segmenting images into meaningful rectangles; (2) using colour partitions for image retrieval is more effective than using whole images; and (3) the partitioned approach has the additional advantage of letting users select certain objects/colours as queries to find more relevant images/objects. These three advantages could be important in crime scene investigation image indexing and retrieval. Moreover, the proposed technique is amenable to compressed-domain applications.