A class centric feature and classifier ensemble selection approach for music genre classification
- Authors: Ariyaratne, Hasitha Bimsara , Zhang, Dengsheng , Lu, Guojun
- Date: 2012
- Type: Text , Conference paper
- Relation: Joint IAPR International Workshop SSPR & SPR 2012 p. 666-674
- Full Text: false
- Reviewed:
- Description: Music genre classification has attracted a lot of research interest due to the rapid growth of digital music. Despite the availability of a vast number of audio features and classification techniques, genre classification still remains a challenging task. In this work we propose a class centric feature and classifier ensemble selection method which deviates from the conventional practice of employing a single, or an ensemble of classifiers trained with a selected set of audio features. We adopt a binary decomposition technique to divide the multiclass problem into a set of binary problems which are then treated in a class specific manner. This differs from the traditional techniques which operate on the naive assumption that a specific set of features and/or classifiers can perform equally well in identifying all the classes. Experimental results obtained on a popular genre dataset and a newly created dataset suggest significant improvements over traditional techniques.
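Illustrative note: the binary decomposition with class-specific treatment described above can be sketched as follows. This is a minimal sketch under stated assumptions, not the authors' method: the one-vs-rest decomposition, the nearest-centroid scorer, the hand-picked `features_per_class` mapping, and the toy data are all hypothetical stand-ins for the paper's feature and classifier selection procedure.

```python
from math import dist

def centroid(rows):
    """Component-wise mean of a list of equal-length vectors."""
    return [sum(col) / len(rows) for col in zip(*rows)]

def project(row, idx):
    """Keep only the feature columns listed in idx."""
    return [row[i] for i in idx]

def train(X, y, features_per_class):
    """Train one binary (class vs. rest) model per class.

    Each class uses its own feature subset (hypothetical, hand-picked here).
    """
    models = {}
    for cls, idx in features_per_class.items():
        pos = [project(x, idx) for x, lab in zip(X, y) if lab == cls]
        neg = [project(x, idx) for x, lab in zip(X, y) if lab != cls]
        models[cls] = (idx, centroid(pos), centroid(neg))
    return models

def predict(models, x):
    """Pick the class whose binary model gives the largest margin."""
    def margin(cls):
        idx, pos_c, neg_c = models[cls]
        z = project(x, idx)
        # Positive when z is closer to the class centroid than to the rest.
        return dist(z, neg_c) - dist(z, pos_c)
    return max(models, key=margin)

# Toy data: three features, three genres (values are made up).
X = [[9, 1, 0], [8, 2, 1], [1, 9, 0], [2, 8, 1], [1, 1, 9], [2, 2, 8]]
y = ["rock", "rock", "jazz", "jazz", "classical", "classical"]
features_per_class = {"rock": [0], "jazz": [1], "classical": [2]}

models = train(X, y, features_per_class)
```

Calling `predict(models, [7, 1, 1])` selects the class whose own binary model is most confident. Margins computed on different feature subsets are not necessarily on a comparable scale, so a practical system would likely calibrate them first.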
A geometric method to compute directionality features for texture images
- Authors: Islam, Md , Zhang, Dengsheng , Lu, Guojun
- Date: 2008
- Type: Text , Conference paper
- Relation: Proceedings of the 2008 IEEE International Conference on Multimedia and Expo p. 1521-1524
- Full Text: false
- Reviewed:
- Description: In content based image analysis and retrieval, texture is an essential feature because of its strong discriminative power. Directionality is one of the most significant texture features and is well perceived by the human visual system. This paper proposes a new method to calculate the directionality of an image. In contrast to the Tamura method, which uses the statistical properties of the directional histogram of an image to calculate its directionality, the proposed method makes use of the geometric properties of the directional histogram. Both subjective and objective analyses show that the proposed method outperforms the conventional Tamura method. The proposed directionality is also shown to have better retrieval performance than the conventional Tamura directionality.
A hybrid data dependent dissimilarity measure for image retrieval
- Authors: Shojanazeri, Hamid , Teng, Shyh , Zhang, Dengsheng , Lu, Guojun
- Date: 2017
- Type: Text , Conference proceedings
- Relation: 2017 International Conference on Digital Image Computing - Techniques and Applications (DICTA); Sydney, Australia; 29th November-1st December 2017 p. 141-148
- Full Text: false
- Reviewed:
- Description: In image retrieval, an effective dissimilarity (or similarity) measure is required to retrieve perceptually similar images. Minkowski-type distance is widely used for image retrieval; however, it has a limitation. It focuses on the distance between image features and ignores the data distribution of the image features, which can play an important role in measuring the perceptual similarity of images. To address this limitation, a data dependent measure named m-p, which calculates the dissimilarity using the data distribution rather than the geometric distance, has been proposed recently. It considers two instances in a sparse region to be more similar than two instances in a dense region. However, relying only on the data distribution and completely ignoring the geometric distance raises other limitations: two perceptually dissimilar instances may be judged similar because they are located in a sparse region, or vice versa. We propose a new hybrid dissimilarity measure, and experimental results show that it addresses these limitations.
A kernel-based approach for content-based image retrieval
- Authors: Karmakar, Priyabrata , Teng, Shyh , Lu, Guojun , Zhang, Dengsheng
- Date: 2018
- Type: Text , Conference proceedings , Conference paper
- Relation: 2018 International Conference on Image and Vision Computing New Zealand; Auckland, New Zealand; 19th-21st November 2018 p. 1-6
- Full Text: false
- Reviewed:
- Description: Content-based image retrieval (CBIR) is a popular approach to retrieve images based on a query. In CBIR, retrieval is executed based on the properties of image contents (e.g. gradient, shape, color, texture), which are generally encoded into image descriptors. Among the various image descriptors, histogram-based descriptors are very popular. However, they suffer from the limitation of coarse quantization. In contrast, kernel descriptors (KDES) have proven more effective than histogram-based descriptors in other applications, e.g. image classification. This is because, in the KDES framework, pixel attributes are not quantized; instead, each pixel takes part equally in the similarity measurement between two images. In this paper, we propose an approach for using the conventional KDES and its improved version for CBIR. In addition, we provide detailed insight into the effectiveness of improved kernel descriptors. Finally, our experimental results show that kernel descriptors are significantly more effective than histogram-based descriptors in CBIR.
A new image dissimilarity measure incorporating human perception
- Authors: Shojanazeri, Hamid , Teng, Shyh , Aryal, Sunil , Zhang, Dengsheng , Lu, Guojun
- Date: 2018
- Type: Text , Unpublished work
- Full Text:
- Description: Pairwise (dis)similarity measure of data objects is central to many applications of image analytics, such as image retrieval and classification. Geometric distance, particularly Euclidean distance ((
A novel automatic hierachical approach to music genre classification
- Authors: Ariyaratne, Hasitha , Zhang, Dengsheng
- Date: 2012
- Type: Text , Conference paper
- Relation: 2012 IEEE International Conference on Multimedia and Expo Workshops (ICMEW)
- Full Text: false
- Reviewed:
- Description: Automatic music genre classification is an important component in Music Information Retrieval (MIR). It has gained a lot of attention lately due to the rapid growth in the use of digital music. Past work in this area has already produced a number of audio features and classification techniques; however, genre classification still remains an unsolved problem. In this paper we explore a hybrid unsupervised/supervised top-down hierarchical classification approach. Most existing work on hierarchical music genre classification relies on human-built trees and taxonomies; however, these hierarchies may not always translate well into machine classification problems. Therefore, we explore an automatic approach to construct a classification tree through subspace cluster analysis. Experimental results validate the tree building algorithm and provide a new research direction for automatic genre classification. We also address the scarcity of publicly available music datasets by introducing a new dataset containing genre, artist and album labels.
A novel fusion approach in the extraction of kernel descriptor with improved effectiveness and efficiency
- Authors: Karmakar, Priyabrata , Teng, Shyh , Lu, Guojun , Zhang, Dengsheng
- Date: 2021
- Type: Text , Journal article
- Relation: Multimedia Tools and Applications Vol. 80, no. 10 (Apr 2021), p. 14545-14564
- Full Text:
- Reviewed:
- Description: Image representation using feature descriptors is crucial. A number of histogram-based descriptors are widely used for this purpose. However, histogram-based descriptors have certain limitations and kernel descriptors (KDES) are proven to overcome them. Moreover, the combination of more than one KDES performs better than an individual KDES. Conventionally, KDES fusion is performed by concatenating them after the gradient, colour and shape descriptors have been extracted. This approach has limitations in regard to the efficiency as well as the effectiveness. In this paper, we propose a novel approach to fuse different image features before the descriptor extraction, resulting in a compact descriptor which is efficient and effective. In addition, we have investigated the effect on the proposed descriptor when texture-based features are fused along with the conventionally used features. Our proposed descriptor is examined on two publicly available image databases and shown to provide outstanding performances.
A novel perceptual dissimilarity measure for image retrieval
- Authors: Shojanazeri, Hamid , Zhang, Dengsheng , Teng, Shyh , Aryal, Sunil , Lu, Guojun
- Date: 2018
- Type: Text , Conference proceedings , Conference paper
- Relation: 2018 International Conference on Image and Vision Computing New Zealand, IVCNZ 2018; Auckland, New Zealand; 19th-21st November 2018 Vol. 2018-November, p. 1-6
- Full Text: false
- Reviewed:
- Description: Similarity measure is an important research topic in image classification and retrieval. Given a type of image feature, a good similarity measure should be able to retrieve similar images from the database while discarding irrelevant images from the retrieval. Similarity measures in the literature are typically distance based: they measure the spatial distance between two feature vectors in a high dimensional feature space. However, this type of similarity measure does not have any perceptual meaning and ignores the neighborhood influence in the similarity decision making process. In this paper, we propose a novel dissimilarity measure, which can measure both the distance and the perceptual similarity of two image features in feature space. Results show that the proposed similarity measure gives a significant improvement over the traditional distance based similarity measures commonly used in the literature.
A review on automatic image annotation techniques
- Authors: Zhang, Dengsheng , Islam, Md , Lu, Guojun
- Date: 2012
- Type: Text , Journal article
- Relation: Pattern Recognition Vol. 45, no. 1 (2012), p. 346-362
- Full Text: false
- Reviewed:
- Description: Nowadays, more and more images are available. However, finding a required image is a challenging task for an ordinary user. A large amount of research on image retrieval has been carried out in the past two decades. Traditionally, research in this area has focused on content based image retrieval. However, recent research shows that there is a semantic gap between content based image retrieval and image semantics understandable by humans. As a result, research in this area has shifted to bridging the semantic gap between low level image features and high level semantics. The typical method of bridging the semantic gap is automatic image annotation (AIA), which extracts semantic features using machine learning techniques. In this paper, we focus on this latest development in image retrieval and provide a comprehensive survey on automatic image annotation. We analyse key aspects of the various AIA methods, including both feature extraction and semantic learning methods. Major methods are discussed and illustrated in detail. We report our findings and provide future research directions in the AIA area in the conclusions.
A Rotation invariant HOG descriptor for tire pattern image classification
- Authors: Liu, Ying , Ge, Yuxiang , Wang, Fuping , Liu, Qiqi , Lei, Yanbo , Zhang, Dengsheng , Lu, Guojun
- Date: 2019
- Type: Text , Conference proceedings
- Relation: ICASSP 2019 - 2019 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP); Brighton, UK, 12-17 May 2019. p. 2412-2416
- Full Text: false
- Reviewed:
- Description: Texture features are important in describing tire pattern images, which provide useful clues in solving crime cases and traffic accidents. In this paper, we propose a novel texture feature extraction method based on HOG (Histogram of Oriented Gradient) and dominant gradient (DG) in tire pattern images, named HOG-DG. The proposed HOG-DG is not only robust to illumination and scale changes but also rotation-invariant. In the proposed HOG-DG, HOG features are first computed from circular local cells, and the HOG features from an image are then concatenated and normalized using the DG to construct the HOG-DG feature. HOG-DG is used to train a support-vector-machine (SVM) classifier for tire pattern classification. Experimental results demonstrate its outstanding performance for tire pattern description.
A survey of audio-based music classification and annotation
- Authors: Fu, Zhouyu , Lu, Guojun , Ting, Kaiming , Zhang, Dengsheng
- Date: 2011
- Type: Text , Journal article
- Relation: IEEE Transactions on Multimedia Vol. 13, no. 2 (2011), p. 303-319
- Full Text: false
- Reviewed:
- Description: Music information retrieval (MIR) is an emerging research area that receives growing attention from both the research community and the music industry. It addresses the problem of querying and retrieving certain types of music from large music data sets. Classification is a fundamental problem in MIR. Many tasks in MIR can be naturally cast in a classification setting, such as genre classification, mood classification, artist recognition, instrument recognition, etc. Music annotation, a new research area in MIR that has attracted much attention in recent years, is also a classification problem in the general sense. Due to the importance of music classification in MIR research, the rapid development of new methods, and the lack of review papers on recent progress in the field, we provide a comprehensive review of audio-based classification in this paper and systematically summarize the state-of-the-art techniques for music classification. Specifically, we stress the differences in the features and the types of classifiers used for different classification tasks. This survey emphasizes recent developments in the techniques and discusses several open issues for future research.
A survey on image classification of lightweight convolutional neural network
- Authors: Liu, Ying , Xiao, Peng , Fang, Jie , Zhang, Dengsheng
- Date: 2023
- Type: Text , Conference paper
- Relation: 2023 19th International Conference on Natural Computation, Fuzzy Systems and Knowledge Discovery (ICNC-FSKD); Harbin, China; 29-31 July 2023
- Full Text: false
- Reviewed:
- Description: In recent years, deep neural networks have achieved tremendous success in image classification in both academic and industrial settings. However, the high hardware requirements imposed by their intensive and complex computations pose a challenge for deployment on low-storage devices. To address this challenge, lightweight networks provide a viable solution. This paper provides a detailed review of recent lightweight image classification algorithms, which can be categorized into low-redundancy network model design and neural network compression algorithms. The former reduces network computations by replacing traditional convolution with efficient lightweight convolution, while the latter reduces redundancy in the network by employing methods such as network pruning, knowledge distillation, and parameter quantization. We summarize the experimental results of some classical models and algorithms on ImageNet2012 and CIFAR-10 datasets, and analyze the characteristics, advantages and disadvantages of these models respectively. Finally, future research directions for lightweight algorithms in the field of image classification are identified.
An annotation rule extraction algorithm for image retrieval
- Authors: Chen, Zeng , Hou, Jin , Zhang, Dengsheng , Qin, Xue
- Date: 2012
- Type: Text , Journal article
- Relation: Pattern Recognition Letters Vol. 33, no. 10 (2012), p. 1257-1268
- Full Text: false
- Reviewed:
- Description: Automatic image annotation can be used to facilitate semantic search in large image databases. However, the retrieval performance of existing annotation schemes is far from users’ expectations. In this paper, we propose a novel method to automatically annotate images through rules generated by support vector machines and decision trees. In order to obtain the rules, we collect a set of training regions by image segmentation, feature extraction and discretization. We first employ a support vector machine as a preprocessing technique to refine the input training data and then use it to improve the rules generated by decision tree learning. The preprocessing can also effectively deal with similar regions in an image. Moreover, we integrate the original rules with the modified ones, so as to formulate complete and effective annotation rules. This algorithm can translate an unknown image into text, and the proposed system can retrieve images queried by both images and keywords. Experiments are carried out on a standard Corel dataset and on images collected from the Web to test the accuracy and robustness of the proposed system. Experimental results show that the proposed algorithm can annotate and retrieve images more efficiently than traditional learning algorithms.
An enhancement to closed-form method for natural image matting
- Authors: Zhu, Jun , Zhang, Dengsheng , Lu, Guojun
- Date: 2010
- Type: Text , Conference paper
- Relation: Proceedings of the 2010 Digital Image Computing: Techniques and Applications p. 629-634
- Full Text: false
- Reviewed:
- Description: Natural image matting is the task of estimating the fractional opacity of the foreground layer of an image. Many matting methods have been proposed, and most of them are trimap-based. Among these methods, closed-form matting offers both trimap-based and scribble-based matting. However, the closed-form method causes significant errors at background-hole regions due to over-smoothing. In this paper, we identify the source of the problem and propose a solution to enhance the closed-form method. Experiments show that our enhanced method can improve the accuracy for trimap-based images and obtain similar results to the closed-form method for scribble-based matting.
An enhancement to the spatial pyramid matching for image classification and retrieval
- Authors: Karmakar, Priyabrata , Teng, Shyh , Lu, Guojun , Zhang, Dengsheng
- Date: 2020
- Type: Text , Journal article
- Relation: IEEE Access Vol. 8 (2020), p. 22463-22472
- Full Text:
- Reviewed:
- Description: Spatial pyramid matching (SPM) is one of the widely used methods to incorporate spatial information into the image representation. Despite its effectiveness, the traditional SPM is not rotation invariant. A rotation invariant SPM has been proposed in the literature but it has many limitations regarding the effectiveness. In this paper, we investigate how to make SPM robust to rotation by addressing those limitations. In an SPM framework, an image is divided into an increasing number of partitions at different pyramid levels. In this paper, our main focus is on how to partition images in such a way that the resulting structure can deal with image-level rotations. To do that, we investigate three concentric ring partitioning schemes. Apart from image partitioning, another important component of the SPM framework is a weight function. To apportion the contribution of each pyramid level to the final matching between two images, the weight function is needed. In this paper, we propose a new weight function which is suitable for the rotation-invariant SPM structure. Experiments based on image classification and retrieval are performed on five image databases. The detailed result analysis shows that we are successful in enhancing the effectiveness of SPM for image classification and retrieval.
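Illustrative note: the idea that concentric ring partitions tolerate image-level rotation can be sketched as follows. This is a minimal sketch assuming equal-width rings around the image centre and a toy image of integer labels. It is not one of the three partitioning schemes investigated in the paper, and the function names are hypothetical.

```python
import math

def ring_index(row, col, height, width, num_rings):
    """Map a pixel to a ring by its distance from the image centre.

    Doubled integer offsets keep the squared distance exact, so a pixel
    and its rotated counterpart always land in the same ring.
    """
    dy2 = 2 * row - (height - 1)
    dx2 = 2 * col - (width - 1)
    d2 = dy2 * dy2 + dx2 * dx2
    max_d2 = (height - 1) ** 2 + (width - 1) ** 2
    if max_d2 == 0:
        return 0
    ring = int(math.sqrt(d2 / max_d2) * num_rings)
    return min(ring, num_rings - 1)

def ring_histograms(image, num_rings, num_bins):
    """Build one label histogram per ring (image holds ints in [0, num_bins))."""
    height, width = len(image), len(image[0])
    hists = [[0] * num_bins for _ in range(num_rings)]
    for r in range(height):
        for c in range(width):
            hists[ring_index(r, c, height, width, num_rings)][image[r][c]] += 1
    return hists

def rotate90(image):
    """Rotate a 2-D list by 90 degrees clockwise."""
    return [list(row) for row in zip(*image[::-1])]

# Toy 4x4 "image" of visual-word labels (made-up values).
image = [
    [0, 1, 2, 0],
    [1, 2, 2, 1],
    [0, 1, 0, 2],
    [2, 0, 1, 1],
]
```

Because a rotation about the centre preserves each pixel's distance from the centre, `ring_histograms(image, 2, 3)` and `ring_histograms(rotate90(image), 2, 3)` are identical, whereas a grid partition would generally change.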
Automatic categorization of image regions using dominant color based vector quantization
- Authors: Islam, Md , Zhang, Dengsheng , Lu, Guojun
- Date: 2008
- Type: Text , Conference paper
- Relation: Proceedings of the Digital Image Computing: Techniques and Applications p. 191-198
- Full Text: false
- Reviewed:
- Description: This paper proposes a dominant color based vector quantization algorithm that automatically categorizes image regions. In contrast to the conventional vector quantization algorithm, the new algorithm effectively handles variable-length feature vectors such as dominant color descriptors. Furthermore, the algorithm is guided by a novel splitting and stopping criterion which is specially designed for dominant color descriptors. This criterion helps the algorithm not only to learn the number of clusters, but also to avoid unnecessary over-fragmentation of region clusters. Experimental results show that the proposed approach categorizes image regions with very high accuracy.
Automatic image annotation based on decision tree machine learning
- Authors: Jiang, Lixing , Hou, Jin , Zeng, Chen , Zhang, Dengsheng
- Date: 2009
- Type: Text , Conference paper
- Relation: Proceedings of the International Conference on Cyber-Enabled Distributed Computing and Knowledge Discovery p. 170-175
- Full Text: false
- Reviewed:
- Description: With the rapid development of digital imaging technology, image annotation has become an important and challenging task in image retrieval. At present, many machine learning methods have been applied to the problem of automatic image annotation (AIA). However, there is an enormous semantic gap between low-level image features and high-level semantic concepts. Because of this gap, the annotation performance of existing methods is not satisfactory and needs to be further improved. This paper proposes an automatic annotation framework based on a novel decision tree-based Bayesian (DTB) machine learning algorithm. It is a hybrid approach that attempts to utilize the advantages of both the decision tree (DT) and Naive-Bayesian (NB) methods. We first segment an image into different regions and extract low-level features of each region. From these features, high-level semantic concepts are obtained using the DTB learning algorithm. Experiments conducted on the Corel dataset demonstrate the effectiveness of DTB machine learning. DTB can not only enhance the classification accuracy, but also associate low-level region features with high-level image concepts. Moreover, this semantic interpretation capability is a natural simulation of human learning.
Automatic image search based on improved feature descriptors and decision tree
- Authors: Hou, Jin , Chen, Zeng , Qin, Xue , Zhang, Dengsheng
- Date: 2011
- Type: Text , Journal article
- Relation: Integrated Computer-Aided Engineering Vol. 18, no. 2 (2011), p. 167-180
- Full Text: false
- Reviewed:
- Description: There has been a growing interest in implementing image search engines at the semantic level. However, most existing practical systems, including popular commercial image search engines like Google and Yahoo!, are either text-based or a simple hybrid of text and visual features. This paper proposes a novel image search system based on automatic image annotation. We develop a technology which learns semantic image concepts from image contents and transforms unstructured images into textual documents, so that images are indexed and retrieved in the same way as textual documents. Existing database management systems can then be used to effectively manage image contents, and image search can be as efficient as text search. Experiments on both the Corel dataset and a real Web dataset are performed to validate our system, and the results are promising. This system suggests a new combination of text and visual features to achieve semantic image search, and is expected to serve as a re-ranking system for existing Internet image search results.
Building sparse support vector machines for multi-instance classification
- Authors: Fu, Zhouyu , Lu, Guojun , Ting, Kaiming , Zhang, Dengsheng
- Date: 2011
- Type: Text , Conference paper
- Relation: European Conference on Machine Learning Knowledge Discovery in Databases (ECML PKDD) p. 471-486
- Full Text: false
- Reviewed:
- Description: We propose a direct approach to learning sparse Support Vector Machine (SVM) prediction models for Multi-Instance (MI) classification. The proposed sparse SVM is based on a “label-mean” formulation of MI classification which takes the average of predictions of individual instances for bag-level prediction. This leads to a convex optimization problem, which is essential for the tractability of the optimization problem arising from the sparse SVM formulation we derived subsequently, as well as the validity of the optimization strategy we employed to solve it. Based on the “label-mean” formulation, we can build sparse SVM models for MI classification and explicitly control their sparsities by enforcing the maximum number of expansions allowed in the prediction function. An effective optimization strategy is adopted to solve the formulated sparse learning problem which involves the learning of both the classifier and the expansion vectors. Experimental results on benchmark data sets have demonstrated that the proposed approach is effective in building very sparse SVM models while achieving comparable performance to the state-of-the-art MI classifiers.
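Illustrative note: the "label-mean" idea described above, where the bag-level prediction is the average of the instance-level predictions, can be sketched as follows. This is a minimal sketch assuming a fixed linear instance scorer with made-up weights. The sparse SVM training and the learning of expansion vectors are not shown, and all names are hypothetical.

```python
from statistics import mean

def linear_score(weights, bias):
    """Return an instance-level scoring function f(x) = w . x + b."""
    def score(x):
        return sum(w * xi for w, xi in zip(weights, x)) + bias
    return score

def bag_score(bag, instance_score):
    """Label-mean rule: average the instance predictions within the bag."""
    return mean(instance_score(x) for x in bag)

def predict_bag(bag, instance_score):
    """Classify the bag by the sign of its averaged score."""
    return 1 if bag_score(bag, instance_score) >= 0 else -1

# Hypothetical scorer and toy bags (values are made up).
score = linear_score([1.0, -1.0], 0.0)
positive_bag = [[2.0, 0.0], [1.0, 0.5], [0.0, 0.5]]
negative_bag = [[0.0, 1.0], [0.5, 2.0]]
```

Here `predict_bag(positive_bag, score)` averages the three instance scores before taking the sign, so one weakly negative instance does not flip the bag label.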
Combining pyramid match kernel and spatial pyramid for image classification
- Authors: Karmakar, Priyabrata , Teng, Shyh , Zhang, Dengsheng , Lu, Guojun , Liu, Ying
- Date: 2016
- Type: Text
- Relation: 2016 International Conference on Digital Image Computing: Techniques and Applications (DICTA); Gold Coast, Australia; 30th November-2nd December 2016 p. 486-493
- Full Text: false
- Reviewed:
- Description: This paper proposes a new approach for image classification by combining pyramid match kernel (PMK) with spatial pyramid. Unlike the conventional spatial pyramid matching (SPM) approach which only uses a single-resolution feature vector to represent an image, we use a multi-resolution feature vector to represent an image for SPM. We then calculate the match scores at each resolution of SPM representation and finally compute the matching between two images by applying the concept of PMK using the match scores obtained from the multiple resolutions. Our experimental results show that the proposed combined pyramid matching achieves a significant improvement on classification performance.
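Illustrative note: the pyramid match kernel (PMK) concept mentioned above can be sketched for one-dimensional features as follows. This is a minimal sketch of the standard PMK weighting, in which matches first found at a coarser level count for less. It assumes non-negative integer features and bin widths that double at each level, and it does not reproduce the paper's combination with the spatial pyramid.

```python
from collections import Counter

def histogram(points, bin_width):
    """Count how many points fall into each bin of the given width."""
    return Counter(p // bin_width for p in points)

def intersection(h1, h2):
    """Histogram intersection: number of points matched at this resolution."""
    return sum(min(count, h2[b]) for b, count in h1.items())

def pyramid_match(xs, ys, num_levels):
    """Sum the new matches at each level, weighted by 1 / bin width."""
    score = 0.0
    previous = 0
    for level in range(num_levels):
        width = 2 ** level
        matches = intersection(histogram(xs, width), histogram(ys, width))
        # Only matches that did not already exist at a finer level are new.
        score += (matches - previous) / width
        previous = matches
    return score

# Toy feature sets (made-up values).
xs = [0, 5, 9]
ys = [1, 5, 12]
```

With four levels, `pyramid_match(xs, ys, 4)` finds one match at bin width 1, one new match at width 2, none at width 4, and one at width 8, giving 1 + 0.5 + 0.125 = 1.625.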