List of Titles

Efficient nonlinear classification via low-rank regularised least squares

Authors: Fu, Zhouyu , Lu, Guojun , Ting, Kaiming , Zhang, Dengsheng
Date: 2013
Type: Text , Journal article
Relation: Neural Computing and Applications Vol. 22, no. 7-8(2013), p. 1279-1289
Full Text: false
Reviewed:
Description: We revisit the classical technique of regularised least squares (RLS) for nonlinear classification in this paper. Specifically, we focus on a low-rank formulation of the RLS, which has linear time complexity in the size of data set only, independent of both the number of classes and number of features. This makes low-rank RLS particularly suitable for problems with large data and moderate feature dimensions. Moreover, we have proposed a general theorem for obtaining the closed-form estimation of prediction values on a holdout validation set given the low-rank RLS classifier trained on the whole training data. It is thus possible to obtain an error estimate for each parameter setting without retraining and greatly accelerate the process of cross-validation for parameter selection. Experimental results on several large-scale benchmark data sets have shown that low-rank RLS achieves comparable classification performance while being much more efficient than standard kernel SVM for nonlinear classification. The improvement in efficiency is more evident for data sets with higher dimensions.

High quality region-of-interest coding for video conferencing based remote general practitioner training

Authors: Murshed, Manzur , Siddique, Md Atiur Rahman , Islam, Saikat , Ali, Mortuza , Lu, Guojun , Villanueva, Elmer , Brown, James
Date: 2013
Type: Text , Conference paper
Relation: Proceedings of the International Conference on eHealth, Telemedicine, and Social Medicine (eTELEMED 2013), Wilmington, DE, 1st October 2013 pg 240-245
Full Text: false
Reviewed:

Integration of LIDAR data and orthoimage for automatic 3D building roof plane extraction

Authors: Awrangjeb, Mohammad , Fraser, Clive , Lu, Guojun
Date: 2013
Type: Text , Conference paper
Relation: 2013 IEEE International Conference on Multimedia and Expo (ICME)
Full Text:
Reviewed:
Description: Automatic 3D extraction of building roofs from remotely sensed data is important for many applications including city modeling. This paper proposes a new method for automatic 3D roof extraction through an effective integration of LIDAR (Light Detection And Ranging) data and multispectral orthoimagery. Using the ground height from a DEM (Digital Elevation Model), the raw LIDAR points are separated into two groups. The first group contains the ground points that are exploited to constitute a 'ground mask'. The second group contains the non-ground points which are segmented using an innovative image line guided segmentation technique to extract the roof planes. The image lines extracted from the grey-scale version of the orthoimage are classified into several classes such as 'ground', 'tree', 'roof edge' and 'roof ridge' using the ground mask and colour and texture information from the orthoimagery. During roof plane extraction the lines from the latter two classes are used to fit roof planes to the neighbouring non-ground LIDAR points. Finally, a new rule-based procedure is applied to remove planes constructed on trees. Experimental results show that the proposed method successfully removes vegetation and offers high extraction rates.

Learning sparse kernel classifiers for multi-instance classification

Authors: Fu, Zhouyu , Lu, Guojun , Ting, Kaiming , Zhang, Dengsheng
Date: 2013
Type: Text , Journal article
Relation: IEEE Transactions on Neural Networks and Learning Systems Vol. 24, no. 9 (2013), p. 1377-1389
Full Text: false
Reviewed:
Description: We propose a direct approach to learning sparse kernel classifiers for multi-instance (MI) classification to improve efficiency while maintaining predictive accuracy. The proposed method builds on a convex formulation for MI classification by considering the average score of individual instances for bag-level prediction. In contrast, existing formulations used the maximum score of individual instances in each bag, which leads to nonconvex optimization problems. Based on the convex MI framework, we formulate a sparse kernel learning algorithm by imposing additional constraints on the objective function to enforce the maximum number of expansions allowed in the prediction function. The formulated sparse learning problem for the MI classification is convex with respect to the classifier weights. Therefore, we can employ an effective optimization strategy to solve the optimization problem that involves the joint learning of both the classifier and the expansion vectors. In addition, the proposed formulation can explicitly control the complexity of the prediction model while still maintaining competitive predictive performance. Experimental results on benchmark data sets demonstrate that our proposed approach is effective in building very sparse kernel classifiers while achieving comparable performance to the state-of-the-art MI classifiers.

Maximizing structural similarity in multimodal biomedical microscopic images for effective registration

Authors: Lv, Guohua , Teng, Shyh , Lu, Guojun , Lackmann, Martin
Date: 2013
Type: Text , Conference paper
Relation: 2013 IEEE International Conference on Multimedia and Expo (ICME)
Full Text: false
Reviewed:
Description: Multimodal image registration (MMIR) is the alignment of contents in images captured from different sensors or instruments. MMIR is important in medical applications as it enables the visualization of the complementary contents in biomedical microscopic images. The registration of such images can be challenging as the structures of their contents are usually only partially similar. Thus, in this paper, we propose a new method to maximize the structural similarity of the contents in such images by utilizing intensity relationships among Red-Green-Blue color channels. Our experimental results demonstrate that our proposed method substantially improves the accuracy of registering such images as compared to the state-of-the-art methods.

Optimizing cepstral features for audio classification

Authors: Fu, Zhouyu , Lu, Guojun , Ting, Kaiming , Zhang, Dengsheng
Date: 2013
Type: Text , Conference paper
Relation: International Joint Conference on Artificial Intelligence p. 1330-1336
Full Text: false
Reviewed:
Description: Cepstral features have been widely used in audio applications. Domain knowledge has played an important role in designing different types of cepstral features proposed in the literature. In this paper, we present a novel approach for learning optimized cepstral features directly from audio data to better discriminate between different categories of signals in classification tasks. We employ multi-layer feedforward neural networks to model the cepstral feature extraction process. The network weights are initialized to replicate a reference cepstral feature like the mel frequency cepstral coefficient. We then propose an embedded approach that integrates feature learning with the training of a support vector machine (SVM) classifier. A single optimization problem is formulated where the feature and classifier variables are optimized simultaneously so as to refine the initial features and minimize the classification risk. Experimental results have demonstrated the effectiveness of the proposed feature learning approach, outperforming competing methods by a large margin on benchmark data.

Structural image retrieval using automatic image annotation and region based inverted file

Authors: Zhang, Dengsheng , Islam, Md , Lu, Guojun
Date: 2013
Type: Text , Journal article
Relation: Journal of Visual Communication and Image Representation Vol. 24, no. 7 (2013), p. 1087-1098
Full Text: false
Reviewed:
Description: Image retrieval has lagged far behind text retrieval despite more than two decades of intensive research effort. Most of the research on image retrieval in the last two decades is on content based image retrieval or image retrieval based on low level features. Recent research in this area focuses on semantic image retrieval using automatic image annotation. Most semantic image retrieval techniques in the literature, however, treat an image as a bag of features/words while ignoring the structural or spatial information in the image. In this paper, we propose a structural image retrieval method based on automatic image annotation and a region based inverted file. In the proposed system, regions in an image are treated the same way as keywords in a structural text document, semantic concepts are learnt from image data to label image regions as keywords, and a weight is assigned to each keyword according to spatial position and relationship. As a result, images are indexed and retrieved in the same way as in structural document retrieval. Specifically, images are broken down into regions which are represented using colour, texture and shape features. Region features are then quantized to create visual dictionaries which are similar to monolingual dictionaries like English or Chinese dictionaries. In the next step, a semantic dictionary similar to a bilingual dictionary like the English–Chinese dictionary is learnt to map image regions to semantic concepts. Finally, images are indexed and retrieved using a novel region based inverted file data structure. Results show the proposed method has a significant advantage over the widely used Bayesian annotation models.

The impact of global and local features on multiple sequence alignment clustering-based near-duplicate video retrieval

Authors: Wang, Yandan , Lu, Guojun , Belkhatir, Mohammed , Messom, Christopher
Date: 2013
Type: Text , Conference paper
Relation: 14th Pacific-Rim Conference on Multimedia p. 669-677
Full Text: false
Reviewed:
Description: Traditionally, the performance of Near-Duplicate Video Retrieval (NDVR) is enhanced through different video features, matching schemes and indexing methods. Video features have been intensively investigated, and it has been shown that local features outperform global features in terms of accuracy. However, local features are computationally expensive. Indexing structures are therefore introduced to assist in scaling up, although accuracy usually drops, slightly or dramatically, when indexing approaches are used. Recent progress shows that clustering-based NDVR can reduce the search space while maintaining retrieval accuracy equivalent to that of non-clustering-based NDVR. In this paper, we continue to evaluate clustering-based NDVR, but using popular global and local features. Before NDVR is conducted, the dataset is pre-processed offline into groups by a clustering algorithm so that near-duplicate videos (NDVs) are assembled in the same cluster. Each cluster is represented by a member video or the centroid. The query video is then compared to the representative videos instead of all videos in the database (as in non-clustering-based NDVR). Our experiment shows that clustering-based NDVR using global and local features outperforms non-clustering-based NDVR in terms of both retrieval accuracy and speed.

A class centric feature and classifier ensemble selection approach for music genre classification

Authors: Ariyaratne, Hasitha Bimsara , Zhang, Dengsheng , Lu, Guojun
Date: 2012
Type: Text , Conference paper
Relation: Joint IAPR International Workshop SSPR & SPR 2012 p. 666-674
Full Text: false
Reviewed:
Description: Music genre classification has attracted a lot of research interest due to the rapid growth of digital music. Despite the availability of a vast number of audio features and classification techniques, genre classification still remains a challenging task. In this work we propose a class centric feature and classifier ensemble selection method which deviates from the conventional practice of employing a single, or an ensemble of classifiers trained with a selected set of audio features. We adopt a binary decomposition technique to divide the multiclass problem into a set of binary problems which are then treated in a class specific manner. This differs from the traditional techniques which operate on the naive assumption that a specific set of features and/or classifiers can perform equally well in identifying all the classes. Experimental results obtained on a popular genre dataset and a newly created dataset suggest significant improvements over traditional techniques.

A review on automatic image annotation techniques

Authors: Zhang, Dengsheng , Islam, Md , Lu, Guojun
Date: 2012
Type: Text , Journal article
Relation: Pattern Recognition Vol. 45, no. 1 (2012), p. 346-362
Full Text: false
Reviewed:
Description: Nowadays, more and more images are available. However, finding a required image is a challenging task for an ordinary user. A large amount of research on image retrieval has been carried out in the past two decades. Traditionally, research in this area focuses on content based image retrieval. However, recent research shows that there is a semantic gap between content based image retrieval and image semantics understandable by humans. As a result, research in this area has shifted to bridging the semantic gap between low level image features and high level semantics. The typical method of bridging the semantic gap is through automatic image annotation (AIA), which extracts semantic features using machine learning techniques. In this paper, we focus on this latest development in image retrieval and provide a comprehensive survey on automatic image annotation. We analyse key aspects of the various AIA methods, including both feature extraction and semantic learning methods. Major methods are discussed and illustrated in detail. We report our findings and provide future research directions in the AIA area in the conclusions.

Achieving high multi-modal registration performance using simplified Hough-transform with improved symmetric-SIFT

Authors: Hossain, Md Tanvir , Teng, Shyh , Lu, Guojun
Date: 2012
Type: Text , Conference paper
Relation: 14th International Conference on Digital Image Computing Techniques and Applications, DICTA 2012
Full Text: false
Reviewed:
Description: The traditional way of using Hough Transform with SIFT is for the purpose of reliable object recognition. However, it cannot be effectively applied to image registration in the same way as the recall rate can be significantly lower. In this paper, we propose an alternative implementation of Hough Transform that can be used with Improved Symmetric-SIFT for multi-modal image registration. Our experimental results show that the proposed technique of applying Hough Transform can significantly improve the key-point matching as well as registration accuracy by utilizing aggregated information from key-points throughout the input images.

An effective method of estimating scale-invariant interest region for representing corner features

Authors: Sadat, Rafi , Teng, Shyh , Lu, Guojun
Date: 2012
Type: Text , Conference paper
Relation: 27th Conference on Image and Vision Computing New Zealand p. 73-78
Full Text: false
Reviewed:
Description: To achieve scale-invariance, the approach used by many corner detection and description methods is to derive an appropriate scale as part of the process of detecting each corner and then use this scale for estimating region(s) around the corner to build the descriptor(s). However, this approach is not suitable for methods that do not derive such scale information in their corner detection process. This paper proposes a new method for selecting regions around a corner so that descriptors, which are invariant to scale changes and other image transformations, can be built to represent the corner. Our experimental results show that our proposed method achieves better precision-and-recall results than existing methods.

Comparison of curvelet and wavelet texture features for content based image retrieval

Authors: Sumana, Ishrat , Lu, Guojun , Zhang, Dengsheng
Date: 2012
Type: Text , Conference paper
Relation: 2012 IEEE International Conference on Multimedia and Expo (ICME) p. 290-295
Full Text: false
Reviewed:
Description: Texture feature plays a vital role in content based image retrieval (CBIR). Wavelet texture feature modeled by generalized Gaussian density (GGD) [1] performs better than discrete wavelet texture feature. Curvelet texture feature was proposed in [2]. In this paper, we compute a new texture feature by applying the generalized Gaussian density to the distribution of curvelet coefficients which we call curvelet GGD texture feature. The purpose of this paper is to investigate curvelet GGD texture feature and compare its retrieval performance with that of curvelet, wavelet and wavelet GGD texture features. Experimental results show that both curvelet and curvelet GGD features perform significantly better than wavelet and wavelet GGD texture features. Among the two types of curvelet based features, curvelet feature shows better performance in CBIR than curvelet GGD texture feature. The findings are discussed in the paper.

Learning sparse kernel classifiers in the primal

Authors: Fu, Zhouyu , Lu, Guojun , Ting, Kaiming , Zhang, Dengsheng
Date: 2012
Type: Text , Conference paper
Relation: Joint IAPR International Workshop, SSPR&SPR 2012; Hiroshima, Japan; 7th-9th November 2012; published in Structural, Syntactic, and Statistical Pattern Recognition (part of the Lecture Notes in Computer Science) Vol. 7626, p. 60-69
Full Text: false
Reviewed:
Description: The increasing number of classification applications in large data sets demands that efficient classifiers be designed not only in training but also for prediction. In this paper, we address the problem of learning kernel classifiers with reduced complexity and improved efficiency for prediction in comparison to those trained by standard methods. A single optimisation problem is formulated for classifier learning which optimises both classifier weights and eXpansion Vectors (XVs) that define the classification function in a joint fashion. Unlike the existing approach of Wu et al., which performs optimisation in the dual formulation, our approach solves the primal problem directly. The primal problem is much more efficient to solve, as it can be converted to the training of a linear classifier in each iteration, which scales linearly with the size of the data set and the number of expansions. This makes our primal approach highly desirable for large-scale applications, where the dual approach is inadequate and prohibitively slow due to the solution of a cubic-time kernel SVM involved in each iteration. Experimental results have demonstrated the efficiency and effectiveness of the proposed primal approach for learning sparse kernel classifiers that clearly outperform the alternatives.

Performance comparisons of contour-based corner detectors

Authors: Awrangjeb, Mohammad , Lu, Guojun , Fraser, Clive
Date: 2012
Type: Text , Journal article
Relation: IEEE Transactions on Image Processing Vol. 21, no. 9 (2012), p. 4167-4179
Full Text: false
Reviewed:
Description: Corner detectors have many applications in computer vision and image identification and retrieval. Contour-based corner detectors directly or indirectly estimate a significance measure (e.g., curvature) on the points of a planar curve, and select the curvature extrema points as corners. While an extensive number of contour-based corner detectors have been proposed over the last four decades, there is no comparative study of recently proposed detectors. This paper is an attempt to fill this gap. The general framework of contour-based corner detection is presented, and two major issues, curve smoothing and curvature estimation, which have major impacts on the corner detection performance, are discussed. A number of promising detectors are compared using both automatic and manual evaluation systems on two large datasets. It is observed that while the detectors using indirect curvature estimation techniques are more robust, the detectors using direct curvature estimation techniques are faster.

A survey of audio-based music classification and annotation

Authors: Fu, Zhouyu , Lu, Guojun , Ting, Kaiming , Zhang, Dengsheng
Date: 2011
Type: Text , Journal article
Relation: IEEE Transactions on Multimedia Vol. 13, no. 2 (2011), p. 303-319
Full Text: false
Reviewed:
Description: Music information retrieval (MIR) is an emerging research area that receives growing attention from both the research community and music industry. It addresses the problem of querying and retrieving certain types of music from large music data sets. Classification is a fundamental problem in MIR. Many tasks in MIR can be naturally cast in a classification setting, such as genre classification, mood classification, artist recognition, instrument recognition, etc. Music annotation, a new research area in MIR that has attracted much attention in recent years, is also a classification problem in the general sense. Due to the importance of music classification in MIR research, rapid development of new methods, and lack of review papers on recent progress of the field, we provide a comprehensive review on audio-based classification in this paper and systematically summarize the state-of-the-art techniques for music classification. Specifically, we have stressed the difference in the features and the types of classifiers used for different classification tasks. This survey emphasizes recent developments in the techniques and discusses several open issues for future research.

An effective and efficient contour-based corner detector using simple triangular theory

Authors: Sadat, Rafi , Teng, Shyh , Lu, Guojun
Date: 2011
Type: Text , Conference paper
Relation: 19th Pacific Conference on Computer Graphics and Applications p. 37-42
Full Text: false
Reviewed:
Description: Corner detection is an important operation in many computer vision applications. Among the contour-based corner detectors in the literature, the Chord-to-Point Distance Accumulation (CPDA) detector is reported to have one of the best repeatabilities and lowest localization errors. However, we found that the CPDA detector often fails to accurately detect the true corners in some situations. Furthermore, the CPDA detector is also computationally expensive. To overcome these weaknesses of the CPDA detector, we propose an effective yet efficient corner detector using a simple triangular theory. Our experimental results show that our proposed detector outperforms CPDA and six other existing detectors in terms of repeatability. Our proposed detector also has one of the lowest localization errors. Finally, it is computationally the most efficient.

Building sparse support vector machines for multi-instance classification

Authors: Fu, Zhouyu , Lu, Guojun , Ting, Kaiming , Zhang, Dengsheng
Date: 2011
Type: Text , Conference paper
Relation: European Conference on Machine Learning Knowledge Discovery in Databases (ECML PKDD) p. 471-486
Full Text: false
Reviewed:
Description: We propose a direct approach to learning sparse Support Vector Machine (SVM) prediction models for Multi-Instance (MI) classification. The proposed sparse SVM is based on a “label-mean” formulation of MI classification which takes the average of predictions of individual instances for bag-level prediction. This leads to a convex optimization problem, which is essential for the tractability of the optimization problem arising from the sparse SVM formulation we derived subsequently, as well as the validity of the optimization strategy we employed to solve it. Based on the “label-mean” formulation, we can build sparse SVM models for MI classification and explicitly control their sparsities by enforcing the maximum number of expansions allowed in the prediction function. An effective optimization strategy is adopted to solve the formulated sparse learning problem which involves the learning of both the classifier and the expansion vectors. Experimental results on benchmark data sets have demonstrated that the proposed approach is effective in building very sparse SVM models while achieving comparable performance to the state-of-the-art MI classifiers.

Improved symmetric-SIFT for Multi-modal image registration

Authors: Hossain, Md. Tanvir , Lv, Guohua , Teng, Shyh , Lu, Guojun , Lackmann, Martin
Date: 2011
Type: Text , Conference paper
Relation: 2011 International Conference on Digital Image Computing: Techniques and Applications p. 197-202
Full Text: false
Reviewed:
Description: Multi-modal image registration has received significant research attention over the past decade. Symmetric-SIFT is a recently proposed local description technique that can be used for registering multi-modal images. It is based on a well-known general image registration technique named Scale Invariant Feature Transform (SIFT). Symmetric-SIFT, however, achieves this invariance to multi-modality at the cost of losing important information. In this paper, we show how this loss may adversely affect the accuracy of registration results. We then propose an improvement to Symmetric-SIFT to overcome the problem. Our experimental results show that the proposed technique can improve the number of true matches by up to 10 times and overall matching accuracy by up to 30%.

Improving SIFT's performance by incorporating appropriate gradient information

Authors: Lv, Guohua , Hossain, Md. Tanvir , Teng, Shyh , Lu, Guojun , Lackmann, Martin
Date: 2011
Type: Text , Conference paper
Relation: 26th Image and Vision Computing New Zealand Conference (IVCNZ 2011) p. 381 - 386
Full Text: false
Reviewed:
Description: Scale Invariant Feature Transform (SIFT) has been applied in numerous applications especially in the domain of computer vision. In these applications, image information used for building the SIFT descriptor can have a significant impact on its performance. When building orientation histograms for descriptors, a critical step is how to increment the values in the orientation bins. The original scheme for this step in SIFT was improved in [6]. Two different types of gradient information are used for building orientation histograms. The limitations of the two schemes are identified in this paper and we then propose three new schemes which use both types of gradient information in the feature description and matching stages. Our experimental results show that the proposed schemes can achieve better registration performances than the schemes proposed in SIFT and [6].
