Structural image retrieval using automatic image annotation and region based inverted file
- Authors: Zhang, Dengsheng , Islam, Md , Lu, Guojun
- Date: 2013
- Type: Text , Journal article
- Relation: Journal of Visual Communication and Image Representation Vol. 24, no. 7 (2013), p. 1087-1098
- Full Text: false
- Reviewed:
- Description: Image retrieval has lagged far behind text retrieval despite more than two decades of intensive research effort. Most of the research on image retrieval in the last two decades are on content based image retrieval or image retrieval based on low level features. Recent research in this area focuses on semantic image retrieval using automatic image annotation. Most semantic image retrieval techniques in literature, however, treat an image as a bag of features/words while ignore the structural or spatial information in the image. In this paper, we propose a structural image retrieval method based on automatic image annotation and region based inverted file. In the proposed system, regions in an image are treated the same way as keywords in a structural text document, semantic concepts are learnt from image data to label image regions as keywords and weight is assigned to each keyword according to spatial position and relationship. As the result, images are indexed and retrieved in the same way as structural document retrieval. Specifically, images are broken down to regions which are represented using colour, texture and shape features. Region features are then quantized to create visual dictionaries which are similar to monolingual dictionaries like English or Chinese dictionaries. In the next step, a semantic dictionary similar to a bilingual dictionary like the English–Chinese dictionary is learnt to mapping image regions to semantic concepts. Finally, images are then indexed and retrieved using a novel region based inverted file data structure. Results show the proposed method has significant advantage over the widely used Bayesian annotation models.
Rotation invariant curvelet features for region based image retrieval
- Authors: Zhang, Dengsheng , Islam, Md , Lu, Guojun , Sumana, Ishrat
- Date: 2011
- Type: Text , Journal article
- Relation: International Journal of Computer Vision Vol. 98, no. 2 (2011), p. 187-201
- Full Text: false
- Reviewed:
- Description: There have been much interest and a large amount of research on content based image retrieval (CBIR) in recent years due to the ever increasing number of digital images. Texture features play a key role in CBIR. Many texture features exist in literature, however, most of them are neither rotation invariant nor robust to scale and other variations. Texture features based on Gabor filters have been shown with significant advantages over other methods, and they are adopted by MPEG-7 as one of the texture descriptors for image retrieval. In this paper, we propose a rotation invariant curvelet features for texture representation. With systematic analysis and rigorous experiments, we show that the proposed curvelet texture features significantly outperforms the widely used Gabor texture features. A novel region padding method is also proposed to apply curvelet transform to region based image retrieval. Retrieval results from standard image databases show that curvelet features are promising for both texture and region representation.
Multimodal image registration technique based on improved local feature descriptors
- Authors: Teng, Shyh , Hossain, Tanvir , Lu, Guojun
- Date: 2015
- Type: Text , Journal article
- Relation: Journal of Electronic Imaging Vol. 24, no. 1 (2015), p.
- Full Text:
- Reviewed:
- Description: Multimodal image registration has received significant research attention over the past decade, and the majority of the techniques are global in nature. Although local techniques are widely used for general image registration, there are only limited studies on them for multimodal image registration. Scale invariant feature transform (SIFT) is a well-known general image registration technique. However, SIFT descriptors are not invariant to multimodality. We propose a SIFT-based technique that is modality invariant and still retains the strengths of local techniques. Moreover, our proposed histogram weighting strategies also improve the accuracy of descriptor matching, which is an important image registration step. As a result, our proposed strategies can not only improve the multimodal registration accuracy but also have the potential to improve the performance of all SIFT-based applications, e.g., general image registration and object recognition.
Efficient nonlinear classification via low-rank regularised least squares
- Authors: Fu, Zhouyu , Lu, Guojun , Ting, Kaiming , Zhang, Dengsheng
- Date: 2013
- Type: Text , Journal article
- Relation: Neural Computing and Applications Vol. 22, no. 7-8(2013), p. 1279-1289
- Full Text: false
- Reviewed:
- Description: We revisit the classical technique of regularised least squares (RLS) for nonlinear classification in this paper. Specifically, we focus on a low-rank formulation of the RLS, which has linear time complexity in the size of data set only, independent of both the number of classes and number of features. This makes low-rank RLS particularly suitable for problems with large data and moderate feature dimensions. Moreover, we have proposed a general theorem for obtaining the closed-form estimation of prediction values on a holdout validation set given the low-rank RLS classifier trained on the whole training data. It is thus possible to obtain an error estimate for each parameter setting without retraining and greatly accelerate the process of cross-validation for parameter selection. Experimental results on several large-scale benchmark data sets have shown that low-rank RLS achieves comparable classification performance while being much more efficient than standard kernel SVM for nonlinear classification. The improvement in efficiency is more evident for data sets with higher dimensions.
Effective and efficient contour-based corner detectors
- Authors: Teng, Shyh , Najmus Sadat, Rafi , Lu, Guojun
- Date: 2015
- Type: Text , Journal article
- Relation: Pattern Recognition Vol. 48, no. 7 (2015), p. 2185-2197
- Full Text: false
- Reviewed:
- Description: Corner detection is an essential operation in many computer vision applications. Among the contour-based corner detectors in the literature, the Chord-to-Point Distance Accumulation (CPDA) detector is reported to have one of the highest repeatability in detecting robust corners and the lowest localization error. However, based on our analysis, we found that the CPDA detector often fails to accurately detect the true corners when a curve has multiple corners but the sharpness of one or a few of them is much more prominent than the rest. This detector also might not perform well when the corners are closely located. Furthermore, the CPDA detector is also computationally very expensive. To overcome these weaknesses, we propose two effective and efficient corner detectors using simple triangular theory and distance calculation. Our experimental results show that our proposed detectors outperform CPDA and nine other existing corner detectors in terms of repeatability. Our proposed detectors also have a relatively low or comparable localization error and are computationally more efficient. © 2015 Elsevier Ltd.
Semantic image retrieval using region based inverted file
- Authors: Zhang, Dengsheng , Islam, Md , Lu, Guojun , Hou, Jin
- Date: 2009
- Type: Text , Journal article
- Relation: Journal of Visual Communication and Image Representation Vol. 24, no. 7 (2009), p.242-249
- Full Text: false
- Reviewed:
- Description: Image retrieval has lagged far behind text retrieval despite more than two decades of intensive research effort. Most of the research on image retrieval in the last two decades are on content based image retrieval or image retrieval based on low level features. Recent research in this area focuses on semantic image retrieval using automatic image annotation. Most semantic image retrieval techniques in literature, however, treat an image as a bag of features/words while ignore the structural or spatial information in the image. In this paper, we propose a structural image retrieval method based on automatic image annotation and region based inverted file. In the proposed system, regions in an image are treated the same way as keywords in a structural text document, semantic concepts are learnt from image data to label image regions as keywords and weight is assigned to each keyword according to spatial position and relationship. As the result, images are indexed and retrieved in the same way as structural document retrieval. Specifically, images are broken down to regions which are represented using colour, texture and shape features. Region features are then quantized to create visual dictionaries which are similar to monolingual dictionaries like English or Chinese dictionaries. In the next step, a semantic dictionary similar to a bilingual dictionary like the English–Chinese dictionary is learnt to mapping image regions to semantic concepts. Finally, images are then indexed and retrieved using a novel region based inverted file data structure. Results show the proposed method has significant advantage over the widely used Bayesian annotation models.
Connectivity-based shape descriptors
- Authors: Sajjanhar, Atul , Lu, Guojun , Zhang, Dengsheng , Zhou, Wanle
- Date: 2010
- Type: Text , Journal article
- Relation: International Journal of Computers and Applications Vol. 32, no. 1 (2010), p. 93-98
- Full Text: false
- Reviewed:
- Description: In this paper, we propose a method for indexing and retrieval of images based on shapes of objects. The concept of connectivity is introduced. 3D models are used to represent 2D images. 2D images are decomposed a priori using connectivity which is followed by 3D model construction. 3D model descriptors are obtained for 3D models and used to represent the underlying 2D shapes. We have used spherical harmonics descriptors as the 3D model descriptors. Difference between two images is computed as the Euclidean distance between their descriptors. Experiments are performed to test the effectiveness of spherical harmonics for retrieval of 2D images. The proposed method is compared with methods based on principal components analysis (PCA) and generic Fourier descriptors (GFD). It is found that the proposed method is effective. Item S8 within the MPEG-7 still images content set is used for performing experiments.
Robust image corner detection based on the chord-to-point distance accumulation technique
- Authors: Awrangjeb, Mohammad , Lu, Guojun
- Date: 2008
- Type: Text , Journal article
- Relation: IEEE Transactions on Multimedia, vol. 10, no. 6, IEEE, p. 1059-1072
- Full Text:
A review on automatic image annotation techniques
- Authors: Zhang, Dengsheng , Islam, Md , Lu, Guojun
- Date: 2012
- Type: Text , Journal article
- Relation: Pattern Recognition Letters Vol. 45, no. 1 (2012), p. 346-362
- Full Text: false
- Reviewed:
- Description: Nowadays, more and more images are available. However, to find a required image for an ordinary user is a challenging task. Large amount of researches on image retrieval have been carried out in the past two decades. Traditionally, research in this area focuses on content based image retrieval. However, recent research shows that there is a semantic gap between content based image retrieval and image semantics understandable by humans. As a result, research in this area has shifted to bridge the semantic gap between low level image features and high level semantics. The typical method of bridging the semantic gap is through the automatic image annotation (AIA) which extracts semantic features using machine learning techniques. In this paper, we focus on this latest development in image retrieval and provide a comprehensive survey on automatic image annotation. We analyse key aspects of the various AIA methods, including both feature extraction and semantic learning methods. Major methods are discussed and illustrated in details. We report our findings and provide future research directions in the AIA area in the conclusions
A survey of audio-based music classification and annotation
- Authors: Fu, Zhouyu , Lu, Guojun , Ting, Kaiming , Zhang, Dengsheng
- Date: 2011
- Type: Text , Journal article
- Relation: IEEE Transactions on Multimedia Vol. 13, no. 2 (2011), p. 303-319
- Full Text: false
- Reviewed:
- Description: Music information retrieval (MIR) is an emerging research area that receives growing attention from both the research community and music industry. It addresses the problem of querying and retrieving certain types of music from large music data set. Classification is a fundamental problem in MIR. Many tasks in MIR can be naturally cast in a classification setting, such as genre classification, mood classification, artist recognition, instrument recognition, etc. Music annotation, a new research area in MIR that has attracted much attention in recent years, is also a classification problem in the general sense. Due to the importance of music classification in MIR research, rapid development of new methods, and lack of review papers on recent progress of the field, we provide a comprehensive review on audio-based classification in this paper and systematically summarize the state-of-the-art techniques for music classification. Specifically, we have stressed the difference in the features and the types of classifiers used for different classification tasks. This survey emphasizes on recent development of the techniques and discusses several open issues for future research.
Region-based image retrieval with high-level semantics using decision tree learning
- Authors: Liu, Ying , Zhang, Dengsheng , Lu, Guojun
- Date: 2008
- Type: Text , Journal article
- Relation: Pattern Recognition Vol. 41, no. 8 (2008), p. 2554-2570
- Full Text: false
- Reviewed:
- Description: Semantic-based image retrieval has attracted great interest in recent years. This paper proposes a region-based image retrieval system with high-level semantic learning. The key features of the system are: (1) it supports both query by keyword and query by region of interest. The system segments an image into different regions and extracts low-level features of each region. From these features, high-level concepts are obtained using a proposed decision tree-based learning algorithm named DT-ST. During retrieval, a set of images whose semantic concept matches the query is returned. Experiments on a standard real-world image database confirm that the proposed system significantly improves the retrieval performance, compared with a conventional content-based image retrieval system. (2) The proposed decision tree induction method DT-ST for image semantic learning is different from other decision tree induction algorithms in that it makes use of the semantic templates to discretize continuous-valued region features and avoids the difficult image feature discretization problem. Furthermore, it introduces a hybrid tree simplification method to handle the noise and tree fragmentation problems, thereby improving the classification performance of the tree. Experimental results indicate that DT-ST outperforms two well-established decision tree induction algorithms ID3 and C4.5 in image semantic learning.
Performance comparisons of contour-based corner detectors
- Authors: Awrangjeb, Mohammad , Lu, Guojun , Fraser, Clive
- Date: 2012
- Type: Text , Journal article
- Relation: IEEE Transactions on Image Processing Vol. 21, no. 9 (2012), p. 4167-4179
- Full Text: false
- Reviewed:
- Description: Abstract— Corner detectors have many applications in computer vision and image identification and retrieval. Contour-based corner detectors directly or indirectly estimate a significance measure (e.g., curvature) on the points of a planar curve, and select the curvature extrema points as corners. While an extensive number of contour-based corner detectors have been proposed over the last four decades, there is no comparative study of recently proposed detectors. This paper is an attempt to fill this gap. The general framework of contour-based corner detection is presented, and two major issues – curve smoothing and curvature estimation, which have major impacts on the corner detection performance, are discussed. A number of promising detectors are compared using both automatic and manual evaluation systems on two large datasets. It is observed that while the detectors using indirect curvature estimation techniques are more robust, the detectors using direct curvature estimation techniques are faster.
An improved curvature scale-space corner detector and a robust corner matching approach for transformed image identification
- Authors: Awrangjeb, Mohammad , Lu, Guojun
- Date: 2008
- Type: Text , Journal article
- Relation: Image Processing, IEEE Transactions Vol. 17, no. 12 (2008), p. 2425-2441
- Full Text: false
- Reviewed:
- Description: There are many applications, such as image copyright protection, where transformed images of a given test image need to be identified. The solution to this identification problem consists of two main stages. In stage one, certain representative features, such as corners, are detected in all images. In stage two, the representative features of the test image and the stored images are compared to identify the transformed images for the test image. Curvature scale-space (CSS) corner detectors look for curvature maxima or inflection points on planar curves. However, the arc-length used to parameterize the planar curves by the existing CSS detectors is not invariant to geometric transformations such as scaling. As a solution to stage one, this paper presents an improved CSS corner detector using the affine-length parameterization which is relatively invariant to affine transformations. We then present an improved corner matching technique as a solution to the stage two. Finally, we apply the proposed corner detection and matching techniques to identify the transformed images for a given image and report the promising results.
Learning sparse kernel classifiers for multi-instance classification
- Authors: Fu, Zhouyu , Lu, Guojun , Ting, Kaiming , Zhang, Dengsheng
- Date: 2013
- Type: Text , Journal article
- Relation: IEEE Transactions on Neural Networks and Learning Systems Vol. 24, no. 9 (2013), p. 1377-1389
- Full Text: false
- Reviewed:
- Description: We propose a direct approach to learning sparse kernel classifiers for multi-instance (MI) classification to improve efficiency while maintaining predictive accuracy. The proposed method builds on a convex formulation for MI classification by considering the average score of individual instances for bag-level prediction. In contrast, existing formulations used the maximum score of individual instances in each bag, which leads to nonconvex optimization problems. Based on the convex MI framework, we formulate a sparse kernel learning algorithm by imposing additional constraints on the objective function to enforce the maximum number of expansions allowed in the prediction function. The formulated sparse learning problem for the MI classification is convex with respect to the classifier weights. Therefore, we can employ an effective optimization strategy to solve the optimization problem that involves the joint learning of both the classifier and the expansion vectors. In addition, the proposed formulation can explicitly control the complexity of the prediction model while still maintaining competitive predictive performance. Experimental results on benchmark data sets demonstrate that our proposed approach is effective in building very sparse kernel classifiers while achieving comparable performance to the state-of-the-art MI classifiers.
Music classification via the bag-of-features approach
- Authors: Fu, Zhouyu , Lu, Guojun , Ting, Kaiming , Zhang, Dengsheng
- Date: 2011
- Type: Text , Journal article
- Relation: Pattern Recognition Letters Vol. 32, no. 14 (2011), p. 1768-1777
- Full Text: false
- Reviewed:
- Description: A central problem in music information retrieval is audio-based music classification. Current music classification systems follow a frame-based analysis model. A whole song is split into frames, where a feature vector is extracted from each local frame. Each song can then be represented by a set of feature vectors. How to utilize the feature set for global song-level classification is an important problem in music classification. Previous studies have used summary features and probability models which are either overly restrictive in modeling power or numerically too difficult to solve. In this paper, we investigate the bag-of-features approach for music classification which can effectively aggregate the local features for song-level feature representation. Moreover, we have extended the standard bag-of-features approach by proposing a multiple codebook model to exploit the randomness in the generation of codebooks. Experimental results for genre classification and artist identification on benchmark data sets show that the proposed classification system is highly competitive against the standard methods.
Enhancing image registration performance by incorporating distribution and spatial distance of local descriptors
- Authors: Lv, Guohua , Teng, Shyh , Lu, Guojun
- Date: 2018
- Type: Text , Journal article
- Relation: Pattern Recognition Letters Vol. 103, no. (2018), p. 46-52
- Full Text: false
- Reviewed:
- Description: A data dependency similarity measure called mp-dissimilarity has been recently proposed. Unlike ℓp-norm distance which is widely used in calculating the similarity between vectors, mp-dissimilarity takes into account the relative positions of the two vectors with respect to the rest of the data. This paper investigates the potential of mp-dissimilarity in matching local image descriptors. Moreover, three new matching strategies are proposed by considering both ℓp-norm distance and mp-dissimilarity. Our proposed matching strategies are extensively evaluated against ℓp-norm distance and mp-dissimilarity on a few benchmark datasets. Experimental results show that mp-dissimilarity is a promising alternative to ℓp-norm distance in matching local descriptors. The proposed matching strategies outperform both ℓp-norm distance and mp-dissimilarity in matching accuracy. One of our proposed matching strategies is comparable to ℓp-norm distance in terms of recall vs 1-precision. © 2018 Elsevier B.V.
COREG : A corner based registration technique for multimodal images
- Authors: Lv, Guohua , Teng, Shyh , Lu, Guojun
- Date: 2018
- Type: Text , Journal article
- Relation: Multimedia Tools and Applications Vol. 77, no. 10 (2018), p. 12607-12634
- Full Text: false
- Reviewed:
- Description: This paper presents a COrner based REGistration technique for multimodal images (referred to as COREG). The proposed technique focuses on addressing large content and scale differences in multimodal images. Unlike traditional multimodal image registration techniques that rely on intensities or gradients for feature representation, we propose to use contour-based corners. First, curvature similarity between corners are for the first time explored for the purpose of multimodal image registration. Second, a novel local descriptor called Distribution of Edge Pixels Along Contour (DEPAC) is proposed to represent the edges in the neighborhood of corners. Third, a simple yet effective way of estimating scale difference is proposed by making use of geometric relationships between corner triplets from the reference and target images. Using a set of benchmark multimodal images and multimodal microscopic images, we will demonstrate that our proposed technique outperforms a state-of-the-art multimodal image registration technique. © 2017, Springer Science+Business Media, LLC.
A detector of structural similarity for multi-modal microscopic image registration
- Authors: Lv, Guohua , Teng, Shyh , Lu, Guojun
- Date: 2018
- Type: Text , Journal article
- Relation: Multimedia Tools and Applications Vol. 77, no. 6 (2018), p. 7675-7701
- Full Text: false
- Reviewed:
- Description: This paper presents a Detector of Structural Similarity (DSS) to minimize the visual differences between brightfield and confocal microscopic images. The context of this work is that it is very challenging to effectively register such images due to a low structural similarity in image contents. To address this issue, DSS aims to maximize the structural similarity by utilizing the intensity relationships among red-green-blue (RGB) channels in images. Technically, DSS can be combined with any multi-modal image registration technique in registering brightfield and confocal microscopic images. Our experimental results show that DSS significantly increases the visual similarity in such images, thereby improving the registration performance of an existing state-of-the-art multi-modal image registration technique by up to approximately 27%. © 2017, Springer Science+Business Media New York.
An automatic building extraction and regularisation technique using LiDAR point cloud data and orthoimage
- Authors: Gilani, Sayed Ali Naqi , Awrangjeb, Mohammad , Lu, Guojun
- Date: 2016
- Type: Text , Journal article
- Relation: Remote Sensing Vol. 8, no. 3 (2016), p. 1-27
- Full Text:
- Reviewed:
- Description: The development of robust and accurate methods for automatic building detection and regularisation using multisource data continues to be a challenge due to point cloud sparsity, high spectral variability, urban objects differences, surrounding complexity, and data misalignment. To address these challenges, constraints on object's size, height, area, and orientation are generally benefited which adversely affect the detection performance. Often the buildings either small in size, under shadows or partly occluded are ousted during elimination of superfluous objects. To overcome the limitations, a methodology is developed to extract and regularise the buildings using features from point cloud and orthoimagery. The building delineation process is carried out by identifying the candidate building regions and segmenting them into grids. Vegetation elimination, building detection and extraction of their partially occluded parts are achieved by synthesising the point cloud and image data. Finally, the detected buildings are regularised by exploiting the image lines in the building regularisation process. Detection and regularisation processes have been evaluated using the ISPRS benchmark and four Australian data sets which differ in point density (1 to 29 points/m2), building sizes, shadows, terrain, and vegetation. Results indicate that there is 83% to 93% per-area completeness with the correctness of above 95%, demonstrating the robustness of the approach. The absence of over- and many-to-many segmentation errors in the ISPRS data set indicate that the technique has higher per-object accuracy. While compared with six existing similar methods, the proposed detection and regularisation approach performs significantly better on more complex data sets (Australian) in contrast to the ISPRS benchmark, where it does better or equal to the counterparts. © 2016 by the authors.
Enhancing SIFT-based image registration performance by building and selecting highly discriminating descriptors
- Authors: Lv, Guohua , Teng, Shyh , Lu, Guojun
- Date: 2016
- Type: Text , Journal article
- Relation: Pattern Recognition Letters Vol. 84, no. (2016), p. 156-162
- Full Text: false
- Reviewed:
- Description: In this paper we will investigate the gradient utilization in building SIFT (Scale Invariant Feature Transform)-like descriptors for image registration. There are generally two types of gradient information, i.e. gradient magnitude and gradient occurrence, which can be used for building SIFT-like descriptors. We will provide a theoretical analysis on the effectiveness of each of the two types of gradient information when used individually. Based on our analysis, we will propose a novel technique which systematically uses both types of gradient information together for image registration. Moreover, we will propose a strategy to select keypoint matches with a higher discrimination. The proposed technique can be used for both mono-modal and multi-modal image registration. Our experimental results show that the proposed technique improves registration accuracy over existing SIFT-like descriptors. © 2016 Elsevier B.V.