A new perceptual dissimilarity measure for image retrieval and clustering
- Authors: Shojanazeri, Hamid
- Date: 2018
- Type: Text , Thesis , PhD
- Full Text:
- Description: Image retrieval and clustering are two important tools for analysing and organising images. A dissimilarity measure is central to both, and the performance of image retrieval and clustering algorithms depends on its effectiveness. Minkowski distance, or more specifically Euclidean distance, is the most widely used dissimilarity measure in image retrieval and clustering. Euclidean distance depends only on the geometric position of two data instances in the feature space and completely ignores the data distribution. However, data distribution has an effect on human perception. Psychologists have argued that two data instances in a dense area are perceptually more dissimilar than the same two instances in a sparser area. Based on this idea, a dissimilarity measure called mp has been proposed to address Euclidean distance’s limitation of ignoring the data distribution. mp relies on the data distribution to calculate the dissimilarity between two instances: higher data mass between two data instances implies higher dissimilarity, and vice versa. However, mp relies only on the data distribution and completely ignores the geometric distance. In addition, when aggregating the dissimilarities between two instances over all the dimensions of the feature space, both Euclidean distance and mp give the same priority to all the dimensions. As a result, the final dissimilarity between two data instances may be determined by a few dimensions of the feature vectors with relatively much higher values, and the derived dissimilarity may not align well with human perception. This thesis is motivated by the need to address the limitations of Minkowski distance measures and by the importance of a dissimilarity measure that considers both geometric distance and the perceptual effect of data distribution when measuring dissimilarity between images.
The thesis studies the performance of mp for image retrieval, investigates a new dissimilarity measure that combines Euclidean distance and data distribution, and studies the performance of that measure for image retrieval and clustering. Our performance study of mp for image retrieval shows that relying only on data distribution to measure dissimilarity leads to situations where mp’s measurement is contrary to human perception. This thesis introduces a new dissimilarity measure called the perceptual dissimilarity measure (PDM), which considers the perceptual effect of data distribution in combination with Euclidean distance. PDM has two variants, PDM1 and PDM2. PDM1 focuses on improving mp by weighting it using Euclidean distance in situations where mp may not retrieve accurate results. PDM2 considers the effect of data distribution on the perceived dissimilarity measured by Euclidean distance, and proposes a weighting system for Euclidean distance using a logarithmic transform of data mass. The proposed PDM variants have been used as alternatives to Euclidean distance and mp to improve the accuracy of image retrieval. Our results show that PDM2 consistently performed the best compared to Euclidean distance, mp and PDM1. PDM1’s performance was not consistent: it performed better than mp in all the experiments, but it could not outperform Euclidean distance in some cases. Following the promising results of PDM2 in image retrieval, we studied its performance for image clustering. k-means is the most widely used clustering algorithm in scientific and industrial applications, and k-medoids is the clustering algorithm closest to it. Unlike k-means, which works only with Euclidean distance, k-medoids allows an arbitrary dissimilarity measure to be chosen. We used Euclidean distance, mp and PDM2 as the dissimilarity measure in k-medoids and compared the results with k-means.
Our clustering results show that PDM2 performed the best overall. This confirms our retrieval results and identifies PDM2 as a suitable dissimilarity measure for image retrieval and clustering.
- Description: Doctor of Philosophy
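The abstract above notes that k-medoids, unlike k-means, can work with an arbitrary dissimilarity measure. The following is a minimal sketch of that idea, not the thesis's implementation: it assumes a simple alternating (assign, then update) variant of k-medoids that takes a precomputed dissimilarity matrix. The function name `k_medoids` and its parameters are hypothetical and chosen only for illustration.

```python
import numpy as np

def k_medoids(D, k, max_iter=100, seed=0):
    """Alternating k-medoids on a precomputed n x n dissimilarity matrix D.

    Illustrative sketch only: D may hold Euclidean distances or any other
    pairwise dissimilarity, because the algorithm never looks at raw features.
    """
    rng = np.random.default_rng(seed)
    n = D.shape[0]
    # Start from k distinct, randomly chosen instances as medoids.
    medoids = rng.choice(n, size=k, replace=False)
    for _ in range(max_iter):
        # Assignment step: each instance joins its least dissimilar medoid.
        labels = np.argmin(D[:, medoids], axis=1)
        new_medoids = medoids.copy()
        for c in range(k):
            members = np.flatnonzero(labels == c)
            if members.size == 0:
                continue  # keep the old medoid if a cluster becomes empty
            # Update step: the new medoid is the member with the smallest
            # total dissimilarity to the other members of its cluster.
            costs = D[np.ix_(members, members)].sum(axis=1)
            new_medoids[c] = members[np.argmin(costs)]
        if np.array_equal(new_medoids, medoids):
            break  # medoids stopped changing
        medoids = new_medoids
    labels = np.argmin(D[:, medoids], axis=1)
    return medoids, labels

# Toy usage: two well-separated 1-D groups, absolute difference as dissimilarity.
X = np.array([[0.0], [1.0], [2.0], [10.0], [11.0], [12.0]])
D = np.abs(X - X.T)
medoids, labels = k_medoids(D, k=2)
print(medoids, labels)
```

Because only `D` is passed in, swapping Euclidean distance for a data-dependent measure amounts to filling the matrix with different values; the clustering loop itself stays unchanged.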
A novel perceptual dissimilarity measure for image retrieval
- Authors: Shojanazeri, Hamid , Zhang, Dengsheng , Teng, Shyh , Aryal, Sunil , Lu, Guojun
- Date: 2018
- Type: Text , Conference proceedings , Conference paper
- Relation: 2018 International Conference on Image and Vision Computing New Zealand, IVCNZ 2018; Auckland, New Zealand; 19th-21st November 2018 Vol. 2018-November, p. 1-6
- Full Text: false
- Reviewed:
- Description: Similarity measurement is an important research topic in image classification and retrieval. Given a type of image feature, a good similarity measure should be able to retrieve similar images from the database while discarding irrelevant images from the retrieval. Similarity measures in the literature are typically distance based: they measure the spatial distance between two feature vectors in a high-dimensional feature space. However, this type of similarity measure does not have any perceptual meaning and ignores the neighborhood influence in the similarity decision-making process. In this paper, we propose a novel dissimilarity measure that captures both the distance and the perceptual similarity of two image features in feature space. Results show that the proposed measure achieves a significant improvement over the traditional distance-based similarity measures commonly used in the literature.
- Description: International Conference Image and Vision Computing New Zealand
A Hybrid data dependent dissimilarity measure for image retrieval
- Authors: Shojanazeri, Hamid , Teng, Shyh , Lu, Guojun
- Date: 2017
- Type: Text , Unpublished work
- Full Text:
- Description: In image retrieval, an effective dissimilarity measure is required to retrieve perceptually similar images. Minkowski-type (lp) distance is widely used for image retrieval; however, it has its limitations. It focuses on the distance between image features and ignores the data distribution of the image features, which can play an important role in measuring the perceptual similarity of images. lp also favours the most dominant components in calculating the total dissimilarity. A data dependent measure named mp-dissimilarity, which estimates the dissimilarity using the data distribution, has been proposed recently. Rather than relying on geometric distance, it measures the dissimilarity between two instances in each dimension as the probability mass in a region that encloses the two instances. It considers two instances in a sparse region to be more similar than two instances in a dense region. Using the probability of data mass enables all the dimensions of the feature vectors to contribute to the final estimate of dissimilarity, so the estimate is not heavily biased towards the most dominant components. However, relying only on data distribution and completely ignoring the geometric distance raises another limitation: two instances can be found similar only because they lie in a sparse region, yet if the geometric distance between them is large they are not perceptually similar. To address this limitation, we propose a new hybrid data dependent dissimilarity (HDDD) measure that considers both data distribution and geometric distance. Our experimental results using the Corel database and Caltech 101 show that HDDD leads to higher image retrieval performance than lp distance (lpD) and mp.
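The per-dimension, mass-based idea described in this abstract can be sketched as follows. This is a simplified illustration, not the published mp definition: it assumes the "region that encloses the two instances" is the closed interval between their values in each dimension, that the probability mass is the fraction of dataset values falling in that interval, and that the per-dimension masses are combined with a power mean of order `p`. The function name `mass_dissimilarity` is hypothetical.

```python
import numpy as np

def mass_dissimilarity(x, y, data, p=2.0):
    """Data-dependent dissimilarity between x and y (illustrative sketch).

    Assumption: in each dimension the enclosing region is the closed interval
    [min(x_i, y_i), max(x_i, y_i)], and its mass is the fraction of rows of
    `data` whose value in that dimension lies inside the interval.
    """
    data = np.asarray(data, dtype=float)   # shape (n, d)
    x = np.asarray(x, dtype=float)         # shape (d,)
    y = np.asarray(y, dtype=float)         # shape (d,)
    lo = np.minimum(x, y)
    hi = np.maximum(x, y)
    inside = (data >= lo) & (data <= hi)   # (n, d) membership per dimension
    mass = inside.mean(axis=0)             # per-dimension probability mass
    # Power mean over dimensions, so every dimension contributes a value in [0, 1].
    return float(np.mean(mass ** p) ** (1.0 / p))

# Toy 1-D dataset: a dense group near 0 and two isolated values.
data = np.array([[0.0], [0.1], [0.2], [0.3], [0.4], [5.0], [9.0]])
dense_pair = mass_dissimilarity([0.0], [0.4], data)    # geometrically close
sparse_pair = mass_dissimilarity([5.0], [9.0], data)   # geometrically far
print(dense_pair, sparse_pair)
```

In this toy dataset the pair in the dense group scores as more dissimilar than the pair in the sparse region, even though the sparse pair is ten times farther apart geometrically. That is the behaviour the abstract attributes to a purely mass-based measure, and it also shows why ignoring geometric distance altogether can be a limitation.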
A new image dissimilarity measure incorporating human perception
- Authors: Shojanazeri, Hamid , Teng, Shyh , Aryal, Sunil , Zhang, Dengsheng , Lu, Guojun
- Date: 2018
- Type: Text , Unpublished work
- Full Text:
- Description: Pairwise (dis)similarity measurement of data objects is central to many applications of image analytics, such as image retrieval and classification. Geometric distance, particularly Euclidean distance ((
Image clustering using a similarity measure incorporating human perception
- Authors: Shojanazeri, Hamid , Aryal, Sunil , Teng, Shyh , Zhang, Dengsheng , Lu, Guojun
- Date: 2018
- Type: Text , Conference proceedings , Conference paper
- Relation: 2018 International Conference on Image and Vision Computing New Zealand, IVCNZ 2018; Auckland, New Zealand; 19th-21st November 2018 p. 1-6
- Full Text: false
- Reviewed:
- Description: Clustering similar images is an important task in image processing and computer vision. It requires a measure to quantify pairwise similarities of images, and the performance of a clustering algorithm depends on the choice of similarity measure. In this paper, we investigate the effectiveness of data-independent (distance-based), data-dependent (mass-based) and hybrid (dis)similarity measures in the image clustering task using three benchmark image collections with different sets of features. Our K-Medoids clustering results show that using the hybrid Perceptual Dissimilarity Measure (PDM) produces better clustering results than the distance-based lp-norm and the mass-based mp-dissimilarity.
A hybrid data dependent dissimilarity measure for image retrieval
- Authors: Shojanazeri, Hamid , Teng, Shyh , Zhang, Dengsheng , Lu, Guojun
- Date: 2017
- Type: Text , Conference proceedings
- Relation: 2017 International Conference on Digital Image Computing - Techniques and Applications (DICTA); Sydney, Australia; 29th November-1st December 2017 p. 141-148
- Full Text: false
- Reviewed:
- Description: In image retrieval, an effective dissimilarity (or similarity) measure is required to retrieve perceptually similar images. Minkowski-type distance is widely used for image retrieval; however, it has a limitation. It focuses on the distance between image features and ignores the data distribution of the image features, which can play an important role in measuring the perceptual similarity of images. To address this limitation, a data dependent measure named m-p, which calculates the dissimilarity using the data distribution rather than geometric distance, has been proposed recently. It considers two instances in a sparse region to be more similar than two instances in a dense region. Relying only on data distribution and completely ignoring the geometric distance raises other limitations: two perceptually dissimilar instances may be found similar because they are located in a sparse region, or vice versa. We propose a new hybrid dissimilarity measure, and experimental results show that it addresses these limitations.