Image clustering using a similarity measure incorporating human perception
- Authors: Shojanazeri, Hamid , Aryal, Sunil , Teng, Shyh , Zhang, Dengsheng , Lu, Guojun
- Date: 2018
- Type: Text , Conference proceedings , Conference paper
- Relation: 2018 International Conference on Image and Vision Computing New Zealand, IVCNZ 2018; Auckland, New Zealand; 19th-21st November 2018 p. 1-6
- Full Text: false
- Reviewed:
- Description: Clustering similar images is an important task in image processing and computer vision. It requires a measure to quantify pairwise similarities of images. The performance of a clustering algorithm depends on the choice of similarity measure. In this paper, we investigate the effectiveness of data-independent (distance-based), data-dependent (mass-based) and hybrid (dis)similarity measures in the image clustering task using three benchmark image collections with different sets of features. Our K-Medoids clustering results show that using the hybrid Perceptual Dissimilarity Measure (PDM) produces better clustering results than the distance-based l_p-norm and the mass-based m_p-dissimilarity.
A novel automatic hierachical approach to music genre classification
- Authors: Ariyaratne, Hasitha , Zhang, Dengsheng
- Date: 2012
- Type: Text , Conference paper
- Relation: 2012 IEEE International Conference on Multimedia and Expo Workshops (ICMEW)
- Full Text: false
- Reviewed:
- Description: Automatic music genre classification is an important component in Music Information Retrieval (MIR). It has gained a lot of attention lately due to the rapid growth in the use of digital music. Past work in this area has already produced a number of audio features and classification techniques; however, genre classification remains an unsolved problem. In this paper, we explore a hybrid unsupervised/supervised top-down hierarchical classification approach. Most existing work on hierarchical music genre classification relies on human-built trees and taxonomies; however, these hierarchies may not always translate well into machine classification problems. Therefore, we explore an automatic approach to constructing a classification tree through subspace cluster analysis. Experimental results validate the tree-building algorithm and provide a new research direction for automatic genre classification. We also address the scarcity of publicly available music datasets by introducing a new dataset containing genre, artist and album labels.
Comparison of curvelet and wavelet texture features for content based image retrieval
- Authors: Sumana, Ishrat , Lu, Guojun , Zhang, Dengsheng
- Date: 2012
- Type: Text , Conference paper
- Relation: 2012 IEEE International Conference on Multimedia and Expo (ICME) p. 290-295
- Full Text: false
- Reviewed:
- Description: Texture features play a vital role in content-based image retrieval (CBIR). The wavelet texture feature modeled by generalized Gaussian density (GGD) [1] performs better than the discrete wavelet texture feature. A curvelet texture feature was proposed in [2]. In this paper, we compute a new texture feature by applying the generalized Gaussian density to the distribution of curvelet coefficients, which we call the curvelet GGD texture feature. The purpose of this paper is to investigate the curvelet GGD texture feature and compare its retrieval performance with that of the curvelet, wavelet and wavelet GGD texture features. Experimental results show that both the curvelet and curvelet GGD features perform significantly better than the wavelet and wavelet GGD texture features. Of the two curvelet-based features, the curvelet feature shows better performance in CBIR than the curvelet GGD texture feature. The findings are discussed in the paper.
Combining pyramid match kernel and spatial pyramid for image classification
- Authors: Karmakar, Priyabrata , Teng, Shyh , Zhang, Dengsheng , Lu, Guojun , Liu, Ying
- Date: 2016
- Type: Text
- Relation: 2016 International Conference on Digital Image Computing: Techniques and Applications (DICTA); Gold Coast, Australia; 30th November-2nd December 2016 p. 486-493
- Full Text: false
- Reviewed:
- Description: This paper proposes a new approach to image classification that combines the pyramid match kernel (PMK) with the spatial pyramid. Unlike the conventional spatial pyramid matching (SPM) approach, which uses only a single-resolution feature vector to represent an image, we use a multi-resolution feature vector to represent an image for SPM. We then calculate the match scores at each resolution of the SPM representation and finally compute the matching between two images by applying the concept of PMK to the match scores obtained from the multiple resolutions. Our experimental results show that the proposed combined pyramid matching achieves a significant improvement in classification performance.
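The PMK idea mentioned in this description can be illustrated with a minimal sketch. It follows the standard pyramid match weighting (matches first found at coarser levels count for less), not the paper's specific combination with SPM, which the abstract does not spell out. The function names, the finest-to-coarsest ordering and the 1/2^level weights are assumptions made for illustration.

```python
import numpy as np

def histogram_intersection(h1, h2):
    """Number of matches between two histograms with the same binning."""
    return float(np.minimum(h1, h2).sum())

def pyramid_match(hists_a, hists_b):
    """Pyramid match score for two images (illustrative, assumed weighting).

    hists_a and hists_b are lists of histograms ordered from the finest
    level (index 0) to the coarsest. Only matches that are new at a level
    are counted there, and they are weighted by 1 / 2**level.
    """
    score, previous = 0.0, 0.0
    for level, (ha, hb) in enumerate(zip(hists_a, hists_b)):
        matches = histogram_intersection(ha, hb)
        score += (matches - previous) / (2 ** level)  # new matches only
        previous = matches
    return score

# Two toy images described at three resolutions (4, 2 and 1 bins).
image_a = [np.array([1, 0, 1, 0]), np.array([1, 1]), np.array([2])]
image_b = [np.array([0, 1, 1, 0]), np.array([1, 1]), np.array([2])]

score_ab = pyramid_match(image_a, image_b)
score_aa = pyramid_match(image_a, image_a)
```

In this toy case the two images share one match at the finest level and gain one more at the middle level, so the cross-image score is lower than the score of an image matched against itself.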
A hybrid data dependent dissimilarity measure for image retrieval
- Authors: Shojanazeri, Hamid , Teng, Shyh , Zhang, Dengsheng , Lu, Guojun
- Date: 2017
- Type: Text , Conference proceedings
- Relation: 2017 International Conference on Digital Image Computing - Techniques and Applications (DICTA); Sydney, Australia; 29th November-1st December 2017 p. 141-148
- Full Text: false
- Reviewed:
- Description: In image retrieval, an effective dissimilarity (or similarity) measure is required to retrieve perceptually similar images. Minkowski-type distance is widely used for image retrieval; however, it has a limitation. It focuses on the distance between image features and ignores the data distribution of those features, which can play an important role in measuring the perceptual similarity of images. To address this limitation, a data-dependent measure named m_p, which calculates dissimilarity using the data distribution rather than geometric distance, has been proposed recently. It considers two instances in a sparse region to be more similar than two instances in a dense region. Relying only on the data distribution and completely ignoring geometric distance raises other limitations: two perceptually dissimilar instances may be judged similar because they are located in a sparse region, or vice versa. We propose a new hybrid dissimilarity measure, and experimental results show that it addresses these limitations.
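The contrast drawn in this description between geometric and data-dependent dissimilarity can be sketched as follows. The mass-based function is an assumed formulation (the fraction of data points falling in the per-dimension interval spanned by the two instances, aggregated with exponent p); the abstract does not give the formula, and the hybrid measure is not sketched because the abstract does not describe how it combines the two. All names and the toy data are hypothetical.

```python
import numpy as np

def minkowski(x, y, p=2.0):
    """Geometric l_p distance between two feature vectors."""
    return float(np.sum(np.abs(x - y) ** p) ** (1.0 / p))

def mass_dissimilarity(x, y, data, p=2.0):
    """Data-dependent dissimilarity (assumed formulation).

    For each dimension, take the fraction of data points lying in the
    interval spanned by x and y, then aggregate the fractions with a
    power mean of exponent p.
    """
    lo = np.minimum(x, y)
    hi = np.maximum(x, y)
    inside = (data >= lo) & (data <= hi)  # shape (n_points, n_dims)
    mass = inside.mean(axis=0)            # fraction of points per dimension
    return float(np.mean(mass ** p) ** (1.0 / p))

# One-dimensional toy data: a dense cluster near 0 and a sparse tail near 10.
data = np.array([[0.0], [0.1], [0.2], [0.3], [0.4], [0.5], [10.0], [11.0]])

dense_pair = (np.array([0.0]), np.array([0.5]))
sparse_pair = (np.array([10.0]), np.array([11.0]))

geo_dense = minkowski(*dense_pair)
geo_sparse = minkowski(*sparse_pair)
mass_dense = mass_dissimilarity(*dense_pair, data)
mass_sparse = mass_dissimilarity(*sparse_pair, data)
```

Here the sparse pair is geometrically farther apart than the dense pair, yet the mass-based measure rates it as more similar because few data points lie between its two instances.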
A Rotation invariant HOG descriptor for tire pattern image classification
- Authors: Liu, Ying , Ge, Yuxiang , Wang, Fuping , Liu, Qiqi , Lei, Yanbo , Zhang, Dengsheng , Lu, Guojun
- Date: 2019
- Type: Text , Conference proceedings
- Relation: ICASSP 2019 - 2019 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP); Brighton, UK, 12-17 May 2019. p. 2412-2416
- Full Text: false
- Reviewed:
- Description: Texture features are important in describing tire pattern images, which provide useful clues in solving crime cases and traffic accidents. In this paper, we propose a novel texture feature extraction method for tire pattern images based on HOG (Histogram of Oriented Gradient) and dominant gradient (DG), named HOG-DG. The proposed HOG-DG is not only robust to illumination and scale changes but also rotation-invariant. In the proposed HOG-DG, HOG features are first computed from circular local cells; the HOG features from an image are then concatenated and normalized using the DG to construct the HOG-DG feature. HOG-DG is used to train a support-vector-machine (SVM) classifier for tire pattern classification. Experimental results demonstrate its outstanding performance for tire pattern description.
Distortion robust image classification using deep convolutional neural network with discrete cosine transform
- Authors: Hossain, Md Tahmid , Teng, Shyh Wei , Zhang, Dengsheng , Lim, Suryani , Lu, Guojun
- Date: 2019
- Type: Text , Conference proceedings
- Relation: 2019 IEEE International Conference on Image Processing (ICIP); Taipei, Taiwan; 22-25 Sept, 2019 p. 659-663
- Full Text: false
- Reviewed:
- Description: Convolutional Neural Networks (CNNs) are highly effective for image classification. However, they are still vulnerable to image distortion. Even a small amount of noise or blur can severely hamper the performance of these CNNs. Most work in the literature strives to mitigate this problem simply by fine-tuning a pre-trained CNN on mutually exclusive or a union set of distorted training data. This iterative fine-tuning process with all known types of distortion is exhaustive, and the network struggles to handle unseen distortions. In this work, we propose the distortion-robust DCT-Net, a Discrete Cosine Transform based module integrated into a deep network built on top of VGG16 [1]. Unlike other works in the literature, DCT-Net is "blind" to the distortion type and level in an image during both training and testing. DCT-Net is trained only once and applied in a more generic situation without further retraining. We also extend the idea of dropout and present a training-adaptive version of it. We evaluate the proposed DCT-Net on a number of benchmark datasets. Our experimental results show that, once trained, DCT-Net not only generalizes well to a variety of unseen distortions but also outperforms other comparable networks in the literature.