Integrated generalized zero-shot learning for fine-grained classification
- Authors: Shermin, Tasfia , Teng, Shyh , Sohel, Ferdous , Murshed, Manzur , Lu, Guojun
- Date: 2022
- Type: Text , Journal article
- Relation: Pattern Recognition Vol. 122, no. (2022), p.
- Full Text:
- Reviewed:
- Description: Embedding learning (EL) and feature synthesizing (FS) are two of the popular categories of fine-grained GZSL methods. EL or FS using global features cannot discriminate fine details in the absence of local features. On the other hand, EL or FS methods exploiting local features either neglect direct attribute guidance or global information. Consequently, neither method performs well. In this paper, we propose to explore global and direct attribute-supervised local visual features for both EL and FS categories in an integrated manner for fine-grained GZSL. The proposed integrated network has an EL sub-network and a FS sub-network. Consequently, the proposed integrated network can be tested in two ways. We propose a novel two-step dense attention mechanism to discover attribute-guided local visual features. We introduce new mutual learning between the sub-networks to exploit mutually beneficial information for optimization. Moreover, we propose to compute source-target class similarity based on mutual information and transfer-learn the target classes to reduce bias towards the source domain during testing. We demonstrate that our proposed method outperforms contemporary methods on benchmark datasets. © 2021 Elsevier Ltd
A detector of structural similarity for multi-modal microscopic image registration
- Authors: Lv, Guohua , Teng, Shyh , Lu, Guojun
- Date: 2018
- Type: Text , Journal article
- Relation: Multimedia Tools and Applications Vol. 77, no. 6 (2018), p. 7675-7701
- Full Text: false
- Reviewed:
- Description: This paper presents a Detector of Structural Similarity (DSS) to minimize the visual differences between brightfield and confocal microscopic images. The context of this work is that it is very challenging to effectively register such images due to a low structural similarity in image contents. To address this issue, DSS aims to maximize the structural similarity by utilizing the intensity relationships among red-green-blue (RGB) channels in images. Technically, DSS can be combined with any multi-modal image registration technique in registering brightfield and confocal microscopic images. Our experimental results show that DSS significantly increases the visual similarity in such images, thereby improving the registration performance of an existing state-of-the-art multi-modal image registration technique by up to approximately 27%. © 2017, Springer Science+Business Media New York.
COREG : A corner based registration technique for multimodal images
- Authors: Lv, Guohua , Teng, Shyh , Lu, Guojun
- Date: 2018
- Type: Text , Journal article
- Relation: Multimedia Tools and Applications Vol. 77, no. 10 (2018), p. 12607-12634
- Full Text: false
- Reviewed:
- Description: This paper presents a COrner based REGistration technique for multimodal images (referred to as COREG). The proposed technique focuses on addressing large content and scale differences in multimodal images. Unlike traditional multimodal image registration techniques that rely on intensities or gradients for feature representation, we propose to use contour-based corners. First, curvature similarity between corners is explored for the first time for the purpose of multimodal image registration. Second, a novel local descriptor called Distribution of Edge Pixels Along Contour (DEPAC) is proposed to represent the edges in the neighborhood of corners. Third, a simple yet effective way of estimating scale difference is proposed by making use of geometric relationships between corner triplets from the reference and target images. Using a set of benchmark multimodal images and multimodal microscopic images, we demonstrate that our proposed technique outperforms a state-of-the-art multimodal image registration technique. © 2017, Springer Science+Business Media, LLC.
Enhancing image registration performance by incorporating distribution and spatial distance of local descriptors
- Authors: Lv, Guohua , Teng, Shyh , Lu, Guojun
- Date: 2018
- Type: Text , Journal article
- Relation: Pattern Recognition Letters Vol. 103, no. (2018), p. 46-52
- Full Text: false
- Reviewed:
- Description: A data dependency similarity measure called mp-dissimilarity has been recently proposed. Unlike ℓp-norm distance which is widely used in calculating the similarity between vectors, mp-dissimilarity takes into account the relative positions of the two vectors with respect to the rest of the data. This paper investigates the potential of mp-dissimilarity in matching local image descriptors. Moreover, three new matching strategies are proposed by considering both ℓp-norm distance and mp-dissimilarity. Our proposed matching strategies are extensively evaluated against ℓp-norm distance and mp-dissimilarity on a few benchmark datasets. Experimental results show that mp-dissimilarity is a promising alternative to ℓp-norm distance in matching local descriptors. The proposed matching strategies outperform both ℓp-norm distance and mp-dissimilarity in matching accuracy. One of our proposed matching strategies is comparable to ℓp-norm distance in terms of recall vs 1-precision. © 2018 Elsevier B.V.
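As an illustration only: the abstract does not give the formula for mp-dissimilarity, so the sketch below assumes the commonly cited form, in which the dissimilarity of two vectors is a p-mean, over dimensions, of the fraction of data points lying between them in each dimension. The function name `mp_dissimilarity`, the absence of any margin around the interval, and the toy data are assumptions, not details from the paper.

```python
def mp_dissimilarity(x, y, data, p=2.0):
    """Data-dependent dissimilarity between vectors x and y (assumed form).

    For each dimension, count the fraction of rows in `data` whose value
    lies in the closed interval spanned by x and y, then combine the
    per-dimension fractions with a p-mean.
    """
    n = len(data)
    d = len(x)
    total = 0.0
    for i in range(d):
        lo, hi = min(x[i], y[i]), max(x[i], y[i])
        # Probability mass of the data between x and y in dimension i.
        mass = sum(1 for row in data if lo <= row[i] <= hi) / n
        total += mass ** p
    return (total / d) ** (1.0 / p)


# Toy 1-D data set: dense near 0, sparse elsewhere (hypothetical values).
toy = [[0.0], [0.1], [0.2], [0.3], [5.0], [10.0]]
dense_pair = mp_dissimilarity([0.0], [0.3], toy)   # interval covers 4 of 6 points
sparse_pair = mp_dissimilarity([5.0], [5.3], toy)  # interval covers 1 of 6 points
```

Both pairs are 0.3 apart geometrically, yet the pair in the dense region is judged more dissimilar than the pair in the sparse region, which is the sense in which the measure depends on the rest of the data.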
Enhancing SIFT-based image registration performance by building and selecting highly discriminating descriptors
- Authors: Lv, Guohua , Teng, Shyh , Lu, Guojun
- Date: 2016
- Type: Text , Journal article
- Relation: Pattern Recognition Letters Vol. 84, no. (2016), p. 156-162
- Full Text: false
- Reviewed:
- Description: In this paper we will investigate the gradient utilization in building SIFT (Scale Invariant Feature Transform)-like descriptors for image registration. There are generally two types of gradient information, i.e. gradient magnitude and gradient occurrence, which can be used for building SIFT-like descriptors. We will provide a theoretical analysis on the effectiveness of each of the two types of gradient information when used individually. Based on our analysis, we will propose a novel technique which systematically uses both types of gradient information together for image registration. Moreover, we will propose a strategy to select keypoint matches with a higher discrimination. The proposed technique can be used for both mono-modal and multi-modal image registration. Our experimental results show that the proposed technique improves registration accuracy over existing SIFT-like descriptors. © 2016 Elsevier B.V.
Effective and efficient contour-based corner detectors
- Authors: Teng, Shyh , Najmus Sadat, Rafi , Lu, Guojun
- Date: 2015
- Type: Text , Journal article
- Relation: Pattern Recognition Vol. 48, no. 7 (2015), p. 2185-2197
- Full Text: false
- Reviewed:
- Description: Corner detection is an essential operation in many computer vision applications. Among the contour-based corner detectors in the literature, the Chord-to-Point Distance Accumulation (CPDA) detector is reported to have one of the highest repeatability rates in detecting robust corners and the lowest localization error. However, based on our analysis, we found that the CPDA detector often fails to accurately detect the true corners when a curve has multiple corners but the sharpness of one or a few of them is much more prominent than the rest. This detector also might not perform well when the corners are closely located. Furthermore, the CPDA detector is computationally very expensive. To overcome these weaknesses, we propose two effective and efficient corner detectors using simple triangular theory and distance calculation. Our experimental results show that our proposed detectors outperform CPDA and nine other existing corner detectors in terms of repeatability. Our proposed detectors also have a relatively low or comparable localization error and are computationally more efficient. © 2015 Elsevier Ltd.
Multimodal image registration technique based on improved local feature descriptors
- Authors: Teng, Shyh , Hossain, Tanvir , Lu, Guojun
- Date: 2015
- Type: Text , Journal article
- Relation: Journal of Electronic Imaging Vol. 24, no. 1 (2015), p.
- Full Text:
- Reviewed:
- Description: Multimodal image registration has received significant research attention over the past decade, and the majority of the techniques are global in nature. Although local techniques are widely used for general image registration, there are only limited studies on them for multimodal image registration. Scale invariant feature transform (SIFT) is a well-known general image registration technique. However, SIFT descriptors are not invariant to multimodality. We propose a SIFT-based technique that is modality invariant and still retains the strengths of local techniques. Moreover, our proposed histogram weighting strategies also improve the accuracy of descriptor matching, which is an important image registration step. As a result, our proposed strategies can not only improve the multimodal registration accuracy but also have the potential to improve the performance of all SIFT-based applications, e.g., general image registration and object recognition.
Maximizing structural similarity in multimodal biomedical microscopic images for effective registration
- Authors: Lv, Guohua , Teng, Shyh , Lu, Guojun , Lackmann, Martin
- Date: 2013
- Type: Text , Conference paper
- Relation: 2013 IEEE International Conference on Multimedia and Expo (ICME)
- Full Text: false
- Reviewed:
- Description: Multimodal image registration (MMIR) is the alignment of contents in images captured from different sensors or instruments. MMIR is important in medical applications as it enables the visualization of the complementary contents in biomedical microscopic images. The registration for such images can be challenging as the structures of their contents are usually only partially similar. Thus in this paper, we propose a new method to maximize the structural similarity of the contents in such images by utilizing intensity relationships among Red-Green-Blue color channels. Our experimental results will demonstrate that our proposed method substantially improves the accuracy of registering such images as compared to the state-of-the-art methods.
Achieving high multi-modal registration performance using simplified Hough-transform with improved symmetric-SIFT
- Authors: Hossain, Md Tanvir , Teng, Shyh , Lu, Guojun
- Date: 2012
- Type: Text , Conference paper
- Relation: 14th International Conference on Digital Image Computing Techniques and Applications, DICTA 2012
- Full Text: false
- Reviewed:
- Description: The traditional way of using Hough Transform with SIFT is for the purpose of reliable object recognition. However, it cannot be effectively applied to image registration in the same way as the recall rate can be significantly lower. In this paper, we propose an alternative implementation of Hough Transform that can be used with Improved Symmetric-SIFT for multi-modal image registration. Our experimental results show that the proposed technique of applying Hough Transform can significantly improve the key-point matching as well as registration accuracy by utilizing aggregated information from key-points throughout the input images.
Clustering gene expression data using ant-based heuristics
- Authors: Tan, Swee , Ting, Kaiming , Teng, Shyh
- Date: 2011
- Type: Text , Conference paper
- Relation: IEEE Congress on Evolutionary Computation (IEEE CEC) 2011 p. 1-8
- Full Text: false
- Reviewed:
- Description: We consider the problem of finding the clusters in novel datasets in which the number of clusters is not known a priori, and little or no additional information is available for users to adjust the parameters in a clustering algorithm. We address this problem using a stochastic algorithm named SATTA (Simplified Adaptive Time Dependent Transporter), which attempts to find clusters without requiring users to specify the number of clusters or adjust any parameters. SATTA is then compared with Expectation Maximization Clustering (EMC), which is also able to estimate the number of clusters using the principle of maximum likelihood and find the underlying clusters without any human intervention. Our results on seven gene expression datasets show that SATTA significantly outperforms EMC in terms of clustering accuracy and efficiency. We discuss the conceptual differences between SATTA and EMC, which suggest that SATTA is a more promising approach than EMC when little or no additional information is available for clustering novel datasets.
Feature-subspace aggregating: ensembles for stable and unstable learners
- Authors: Ting, Kaiming , Wells, Jonathan , Tan, Swee , Teng, Shyh , Webb, Geoffrey
- Date: 2011
- Type: Text , Journal article
- Relation: Machine Learning Vol. 82, no. 3 (2011), p. 375-397
- Full Text: false
- Reviewed:
- Description: This paper introduces a new ensemble approach, Feature-Subspace Aggregating (Feating), which builds local models instead of global models. Feating is a generic ensemble approach that can enhance the predictive performance of both stable and unstable learners. In contrast, most existing ensemble approaches can improve the predictive performance of unstable learners only. Our analysis shows that the new approach reduces the execution time to generate a model in an ensemble through an increased level of localisation in Feating. Our empirical evaluation shows that Feating performs significantly better than Boosting, Random Subspace and Bagging in terms of predictive accuracy, when a stable learner SVM is used as the base learner. The speed up achieved by Feating makes feasible SVM ensembles that would otherwise be infeasible for large data sets. When SVM is the preferred base learner, we show that Feating SVM performs better than Boosting decision trees and Random Forests. We further demonstrate that Feating also substantially reduces the error of another stable learner, k-nearest neighbour, and an unstable learner, decision tree.
Improved symmetric-SIFT for Multi-modal image registration
- Authors: Hossain, Md. Tanvir , Lv, Guohua , Teng, Shyh , Lu, Guojun , Lackmann, Martin
- Date: 2011
- Type: Text , Conference paper
- Relation: 2011 International Conference on Digital Image Computing: Techniques and Applications p. 197-202
- Full Text: false
- Reviewed:
- Description: Multi-modal image registration has received significant research attention over the past decade. Symmetric-SIFT is a recently proposed local description technique that can be used for registering multi-modal images. It is based on a well-known general image registration technique named Scale Invariant Feature Transform (SIFT). Symmetric-SIFT, however, achieves this invariance to multi-modality at the cost of losing important information. In this paper, we show how this loss may adversely affect the accuracy of registration results. We then propose an improvement to Symmetric-SIFT to overcome the problem. Our experimental results show that the proposed technique can improve the number of true matches by up to 10 times and overall matching accuracy by up to 30%.
Improving SIFT's performance by incorporating appropriate gradient information
- Authors: Lv, Guohua , Hossain, Md. Tanvir , Teng, Shyh , Lu, Guojun , Lackmann, Martin
- Date: 2011
- Type: Text , Conference paper
- Relation: 26th Image and Vision Computing New Zealand Conference (IVCNZ 2011) p. 381 - 386
- Full Text: false
- Reviewed:
- Description: Scale Invariant Feature Transform (SIFT) has been applied in numerous applications especially in the domain of computer vision. In these applications, image information used for building the SIFT descriptor can have a significant impact on its performance. When building orientation histograms for descriptors, a critical step is how to increment the values in the orientation bins. The original scheme for this step in SIFT was improved in [6]. Two different types of gradient information are used for building orientation histograms. The limitations of the two schemes are identified in this paper and we then propose three new schemes which use both types of gradient information in the feature description and matching stages. Our experimental results show that the proposed schemes can achieve better registration performances than the schemes proposed in SIFT and [6].
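To make the bin-increment step concrete, here is a minimal sketch of an orientation histogram built in two ways. It assumes that the two types of gradient information correspond to weighting each pixel's vote by its gradient magnitude versus counting each pixel once (occurrence); that reading, the function name `orientation_histogram`, the 8-bin layout, and the sample gradients are assumptions, and the three schemes proposed in the paper are not reproduced here.

```python
import math


def orientation_histogram(gradients, bins=8, use_magnitude=True):
    """Accumulate gradient orientations into `bins` equal angular bins.

    gradients: iterable of (gx, gy) pairs, one per pixel.
    use_magnitude=True  -> each pixel adds its gradient magnitude to its bin.
    use_magnitude=False -> each pixel adds 1 to its bin (occurrence count).
    """
    hist = [0.0] * bins
    for gx, gy in gradients:
        mag = math.hypot(gx, gy)
        if mag == 0.0:
            continue  # no defined orientation for a zero gradient
        angle = math.atan2(gy, gx) % (2 * math.pi)
        idx = int(angle / (2 * math.pi) * bins) % bins
        hist[idx] += mag if use_magnitude else 1.0
    return hist


# Hypothetical gradients: two pointing along +x, one pointing up and to the right.
sample = [(3.0, 0.0), (1.0, 0.0), (1.0, 2.0)]
by_magnitude = orientation_histogram(sample, use_magnitude=True)
by_occurrence = orientation_histogram(sample, use_magnitude=False)
```

With magnitude weighting a single strong edge can dominate a bin, while occurrence counting treats every pixel equally regardless of contrast; the sketch only shows that difference in the histograms.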
Texture classification using multimodal invariant local binary pattern
- Authors: Sadat, Rafi , Teng, Shyh , Lu, Guojun , Hasan, Sheikh
- Date: 2011
- Type: Text , Conference paper
- Relation: IEEE Workshop on Applications of Computer Vision (WACV) p. 315-320
- Full Text: false
- Reviewed:
- Description: As texture information among pixels can be effectively represented using local binary patterns (LBPs), image descriptors built using LBPs or their variants have been frequently used for various image analysis applications, e.g. medical image and texture image classification and retrieval. However, neither LBP nor any of its existing variants can be used to build descriptors for classifying multimodal images effectively. This is because the same object, when captured in different modalities, may result in opposite pixel intensities in some corresponding parts of the images, which in turn will cause their descriptors to be very different. To solve this problem, we propose a novel modality invariant texture descriptor which is built by modifying the standard procedure for building LBP. In this paper, we explain how the proposed descriptor can be built efficiently. We also demonstrate empirically that, compared to all the state-of-the-art LBP-based descriptors, the proposed descriptor achieves better accuracy for classifying multimodal images.
A comparative study of practical stochastic clustering method with traditional methods
- Authors: Tan, Swee , Ting, Kaiming , Teng, Shyh
- Date: 2010
- Type: Text , Conference paper
- Relation: Proceedings of the 23rd Australasian Joint Conference on Artificial Intelligence p. 112-121
- Full Text: false
- Reviewed:
- Description: In many real-world clustering problems, there usually exists little information about the clusters underlying a certain dataset. For example, the number of clusters hidden in many datasets is usually not known a priori. This is an issue because many traditional clustering methods require such information as input. This paper examines a practical stochastic clustering method (PSCM) that has the ability to find clusters in datasets without requiring users to specify the centroids or the number of clusters. By comparing with traditional methods (k-means, self-organising map and hierarchical clustering methods), the performance of PSCM is found to be robust against overlapping clusters and clusters with uneven sizes. The proposed method also scales well with datasets having varying numbers of clusters and dimensions. Finally, our experimental results on real-world data confirm that the proposed method performs competitively against the traditional clustering methods in terms of clustering accuracy and efficiency.
An enhancement to SIFT-based techniques for image registration
- Authors: Hossain, Tanvir , Teng, Shyh , Lu, Guojun , Lackmann, Martin
- Date: 2010
- Type: Text , Conference paper
- Relation: Proceedings of the 2010 Digital Image Computing: Techniques and Applications p. 166-171
- Full Text: false
- Reviewed:
- Description: Symmetric-SIFT is a recently proposed local technique used for registering multimodal images. It is based on a well-known general image registration technique named Scale Invariant Feature Transform (SIFT). Symmetric-SIFT makes use of the gradient magnitude information at the image's key regions to build the descriptors. In this paper, we highlight an issue with how the magnitude information is used in this process. This issue may result in similar descriptors being built to represent regions in images that are visually different. To address this issue, we propose two new strategies for weighting the descriptors. Our experimental results show that Symmetric-SIFT descriptors built using our proposed strategies can lead to better registration accuracy than descriptors built using the original Symmetric-SIFT technique. The issue highlighted and the two strategies proposed are also applicable to the general SIFT technique.
FaSS : Ensembles for stable learners
- Authors: Ting, Kaiming , Wells, Jonathan , Tan, Swee , Teng, Shyh , Webb, Geoffrey
- Date: 2009
- Type: Text , Conference paper
- Relation: 8th International Workshop on Multiple Classifier Systems (MCS 2009)
- Full Text: false
- Reviewed:
- Description: This paper introduces a new ensemble approach, Feature-Space Subdivision (FaSS), which builds local models instead of global models. FaSS is a generic ensemble approach that can use either stable or unstable models as its base models. In contrast, existing ensemble approaches which employ randomisation can only use unstable models. Our analysis shows that the new approach reduces the execution time to generate a model in an ensemble with an increased level of localisation in FaSS. Our empirical evaluation shows that FaSS performs significantly better than boosting in terms of predictive accuracy when a stable learner, SVM, is used as the base learner. The speed-up achieved by FaSS makes feasible SVM ensembles that would otherwise be infeasible for large data sets, and FaSS SVM performs better than Boosting J48 and Random Forests when SVM is the preferred base learner.
Issues of grid-cluster retrievals in swarm-based clustering
- Authors: Tan, Swee , Ting, Kaiming , Teng, Shyh
- Date: 2008
- Type: Text , Conference paper
- Relation: Proceedings of the 2008 IEEE World Congress on Computational Intelligence p. 511-518
- Full Text: false
- Reviewed:
- Description: One common approach in swarm-based clustering is to use agents to create a set of clusters on a two-dimensional grid, and then use an existing clustering method to retrieve the clusters on the grid. The second step, which we call grid-cluster retrieval, is an essential step to obtain an explicit partitioning of data. In this study, we highlight the issues in grid-cluster retrievals commonly neglected by researchers, and demonstrate the non-trivial difficulties involved. To tackle the issues, we then evaluate three methods: K-means, hierarchical clustering (Weighted Single-link) and density-based clustering (DBScan). Among the three methods, DBScan is the only method which has not been previously used for grid-cluster retrievals, yet it is shown to be the most suitable method in terms of effectiveness and efficiency.
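The grid-cluster retrieval step can be sketched with a small density-based clustering routine. This is a plain, textbook-style DBSCAN written for illustration, not the implementation evaluated in the paper; the function name `dbscan_grid`, the parameter values, and the toy grid positions are assumptions.

```python
def dbscan_grid(points, eps=1.5, min_pts=3):
    """Label each (row, col) grid position with a cluster id; -1 marks noise."""

    def neighbours(i):
        # Indices of all points within Euclidean distance eps of point i
        # (the point itself is included).
        r, c = points[i]
        return [j for j, (r2, c2) in enumerate(points)
                if (r - r2) ** 2 + (c - c2) ** 2 <= eps ** 2]

    labels = [None] * len(points)  # None = not yet visited
    cluster = -1
    for i in range(len(points)):
        if labels[i] is not None:
            continue
        seeds = neighbours(i)
        if len(seeds) < min_pts:
            labels[i] = -1  # provisionally noise; may become a border point
            continue
        cluster += 1
        labels[i] = cluster
        queue = [j for j in seeds if j != i]
        while queue:
            j = queue.pop()
            if labels[j] == -1:
                labels[j] = cluster  # noise reached from a core point: border
            if labels[j] is not None:
                continue  # already assigned; do not expand again
            labels[j] = cluster
            nbrs = neighbours(j)
            if len(nbrs) >= min_pts:
                queue.extend(nbrs)  # j is a core point, so keep growing
    return labels


# Hypothetical grid: two 2x2 heaps of items far apart, plus one stray item.
grid_items = [(0, 0), (0, 1), (1, 0), (1, 1),
              (10, 10), (10, 11), (11, 10), (11, 11),
              (5, 5)]
grid_labels = dbscan_grid(grid_items)
```

Unlike K-means, this routine does not need the number of clusters in advance, and isolated items left on the grid are reported as noise rather than being forced into a cluster.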