A new image dissimilarity measure incorporating human perception
- Authors: Shojanazeri, Hamid , Teng, Shyh , Aryal, Sunil , Zhang, Dengsheng , Lu, Guojun
- Date: 2018
- Type: Text , Unpublished work
- Full Text:
- Description: Pairwise (dis)similarity measure of data objects is central to many applications of image analytics, such as image retrieval and classification. Geometric distance, particularly Euclidean distance ((
Enhancing image registration performance by incorporating distribution and spatial distance of local descriptors
- Authors: Lv, Guohua , Teng, Shyh , Lu, Guojun
- Date: 2018
- Type: Text , Journal article
- Relation: Pattern Recognition Letters Vol. 103, no. (2018), p. 46-52
- Full Text: false
- Reviewed:
- Description: A data dependency similarity measure called mp-dissimilarity has been recently proposed. Unlike ℓp-norm distance which is widely used in calculating the similarity between vectors, mp-dissimilarity takes into account the relative positions of the two vectors with respect to the rest of the data. This paper investigates the potential of mp-dissimilarity in matching local image descriptors. Moreover, three new matching strategies are proposed by considering both ℓp-norm distance and mp-dissimilarity. Our proposed matching strategies are extensively evaluated against ℓp-norm distance and mp-dissimilarity on a few benchmark datasets. Experimental results show that mp-dissimilarity is a promising alternative to ℓp-norm distance in matching local descriptors. The proposed matching strategies outperform both ℓp-norm distance and mp-dissimilarity in matching accuracy. One of our proposed matching strategies is comparable to ℓp-norm distance in terms of recall vs 1-precision. © 2018 Elsevier B.V.
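The contrast drawn in this abstract between ℓp-norm distance and a data-dependent measure can be illustrated with a minimal sketch. It assumes the commonly cited form of mp-dissimilarity, in which each dimension contributes the fraction of data points lying between the two vectors; the function names, the inclusive interval, and the toy data are assumptions for illustration, not the authors' implementation or their matching strategies.

```python
def mp_dissimilarity(x, y, data, p=2.0):
    """Data-dependent dissimilarity between vectors x and y (assumed form).

    For each dimension, take the fraction of rows in `data` whose value lies
    between x and y (inclusive), then combine the fractions with a p-mean.
    """
    n = len(data)
    d = len(x)
    total = 0.0
    for i in range(d):
        lo, hi = min(x[i], y[i]), max(x[i], y[i])
        mass = sum(1 for row in data if lo <= row[i] <= hi) / n
        total += mass ** p
    return (total / d) ** (1.0 / p)


def lp_distance(x, y, p=2.0):
    """Ordinary l_p-norm distance, which ignores the rest of the data."""
    return sum(abs(a - b) ** p for a, b in zip(x, y)) ** (1.0 / p)


# Toy 1-D data: a dense cluster near 0 and two isolated points.
data = [[0.0], [0.1], [0.2], [0.3], [5.0], [9.0]]
dense_pair = ([0.0], [0.3])
sparse_pair = ([5.0], [9.0])
print(lp_distance(*dense_pair), lp_distance(*sparse_pair))
print(mp_dissimilarity(*dense_pair, data), mp_dissimilarity(*sparse_pair, data))
```

In this toy data the pair inside the dense cluster is geometrically closer but has more data mass between its members, so the data-dependent measure ranks it as more dissimilar than the pair in the sparse region.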
Learning sparse kernel classifiers in the primal
- Authors: Fu, Zhouyu , Lu, Guojun , Ting, Kaiming , Zhang, Dengsheng
- Date: 2012
- Type: Text , Conference paper
- Relation: Joint IAPR International Workshop, SSPR&SPR 2012; Hiroshima, Japan; 7th-9th November 2012; published in Structural, Syntactic, and Statistical Pattern Recognition (part of the Lecture Notes in Computer Science) Vol. 7626, p. 60-69
- Full Text: false
- Reviewed:
- Description: The increasing number of classification applications in large data sets demands that efficient classifiers be designed not only in training but also for prediction. In this paper, we address the problem of learning kernel classifiers with reduced complexity and improved efficiency for prediction in comparison to those trained by standard methods. A single optimisation problem is formulated for classifier learning which optimises both classifier weights and eXpansion Vectors (XVs) that define the classification function in a joint fashion. Unlike the existing approach of Wu et al., which performs optimisation in the dual formulation, our approach solves the primal problem directly. The primal problem is much more efficient to solve, as it can be converted to the training of a linear classifier in each iteration, which scales linearly with the size of the data set and the number of expansions. This makes our primal approach highly desirable for large-scale applications, where the dual approach is inadequate and prohibitively slow due to the solution of cubic-time kernel SVM involved in each iteration. Experimental results have demonstrated the efficiency and effectiveness of the proposed primal approach for learning sparse kernel classifiers that clearly outperform the alternatives.
COREG : A corner based registration technique for multimodal images
- Authors: Lv, Guohua , Teng, Shyh , Lu, Guojun
- Date: 2018
- Type: Text , Journal article
- Relation: Multimedia Tools and Applications Vol. 77, no. 10 (2018), p. 12607-12634
- Full Text: false
- Reviewed:
- Description: This paper presents a COrner based REGistration technique for multimodal images (referred to as COREG). The proposed technique focuses on addressing large content and scale differences in multimodal images. Unlike traditional multimodal image registration techniques that rely on intensities or gradients for feature representation, we propose to use contour-based corners. First, curvature similarity between corners is explored for the first time for the purpose of multimodal image registration. Second, a novel local descriptor called Distribution of Edge Pixels Along Contour (DEPAC) is proposed to represent the edges in the neighborhood of corners. Third, a simple yet effective way of estimating scale difference is proposed by making use of geometric relationships between corner triplets from the reference and target images. Using a set of benchmark multimodal images and multimodal microscopic images, we demonstrate that our proposed technique outperforms a state-of-the-art multimodal image registration technique. © 2017, Springer Science+Business Media, LLC.
Enhancing the effectiveness of local descriptor based image matching
- Authors: Hossain, Md Tahmid , Teng, Shyh , Zhang, Dengsheng , Lim, Suryani , Lu, Guojun
- Date: 2018
- Type: Text , Conference proceedings , Conference paper
- Relation: 2018 International Conference on Digital Image Computing: Techniques and Applications, DICTA 2018; Canberra, Australia; 10th-13th December 2018 p. 1-8
- Full Text: false
- Reviewed:
- Description: Image registration has received great attention from researchers over the last few decades. SIFT (Scale Invariant Feature Transform), a local descriptor-based technique, is widely used for registering and matching images. To establish correspondences between images, SIFT uses a Euclidean distance ratio metric. However, this approach leads to many incorrect matches, and eliminating these inaccurate matches has been a challenge. Various methods have been proposed to mitigate this problem. In this paper, we propose a scale and orientation harmony-based pruning method that improves the image matching process by successfully eliminating incorrect SIFT descriptor matches. Moreover, our technique can predict the image transformation parameters based on a novel adaptive clustering method with much higher matching accuracy. Our experimental results have shown that the proposed method achieves on average approximately 16% and 10% higher matching accuracy than the traditional SIFT and a contemporary method, respectively.
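The Euclidean distance ratio metric mentioned in this abstract can be sketched as below. This is only the baseline matching step that the paper sets out to improve, not the proposed pruning method; the function names, the toy 2-D descriptors, and the 0.8 threshold are assumptions for illustration.

```python
import math


def euclidean(a, b):
    """Euclidean distance between two descriptors of equal length."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def ratio_test_matches(desc_a, desc_b, max_ratio=0.8):
    """Keep a match only if the nearest neighbour is clearly closer than the second nearest.

    Returns (index_in_a, index_in_b) pairs. `max_ratio` is an assumed threshold.
    """
    matches = []
    for i, d in enumerate(desc_a):
        dists = sorted((euclidean(d, e), j) for j, e in enumerate(desc_b))
        if len(dists) < 2:
            continue  # the ratio is undefined with fewer than two candidates
        (best, j), (second, _) = dists[0], dists[1]
        if best < max_ratio * second:
            matches.append((i, j))
    return matches


# Toy descriptors: the first query has one clear neighbour, the second is ambiguous.
desc_a = [[0.0, 0.0], [5.0, 5.0]]
desc_b = [[0.1, 0.0], [10.0, 10.0], [5.0, 5.2], [5.2, 5.0]]
print(ratio_test_matches(desc_a, desc_b))
```

The second query is rejected because its two nearest candidates are almost equally close. Incorrect matches that still pass such a test are the cases the abstract's pruning step targets.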
Cuboid segmentation for effective image retrieval
- Authors: Murshed, Manzur , Teng, Shyh , Lu, Guojun
- Date: 2017
- Type: Text , Conference proceedings
- Relation: 2017 International Conference on Digital Image Computing : Techniques and Applications (DICTA); Sydney, Australia; 29th November-1st December 2017 p. 884-891
- Full Text: false
- Reviewed:
- Description: Region-based image retrieval has been proven to be effective in finding relevant images. In this paper, we propose a cuboid image segmentation method which results in rectangle image partitions. Rectangle partitions are more suitable for image compression, retrieval and other image operations. We apply partitions in image retrieval in this paper. Our experimental results have shown that (1) the proposed partitioning method is effective in segmenting images into meaningful rectangles; (2) using colour partitions for image retrieval is more effective than using whole images; and (3) the partitioned approach has the additional advantage of letting users select certain objects/colours as queries to find more relevant images/objects. These three advantages could be important in crime scene investigation image indexing and retrieval. Moreover, the proposed technique is amenable to compressed-domain applications.
Multimodal image registration technique based on improved local feature descriptors
- Authors: Teng, Shyh , Hossain, Tanvir , Lu, Guojun
- Date: 2015
- Type: Text , Journal article
- Relation: Journal of Electronic Imaging Vol. 24, no. 1 (2015), p.
- Full Text:
- Reviewed:
- Description: Multimodal image registration has received significant research attention over the past decade, and the majority of the techniques are global in nature. Although local techniques are widely used for general image registration, there are only limited studies on them for multimodal image registration. Scale invariant feature transform (SIFT) is a well-known general image registration technique. However, SIFT descriptors are not invariant to multimodality. We propose a SIFT-based technique that is modality invariant and still retains the strengths of local techniques. Moreover, our proposed histogram weighting strategies also improve the accuracy of descriptor matching, which is an important image registration step. As a result, our proposed strategies can not only improve the multimodal registration accuracy but also have the potential to improve the performance of all SIFT-based applications, e.g., general image registration and object recognition.
An enhancement to the spatial pyramid matching for image classification and retrieval
- Authors: Karmakar, Priyabrata , Teng, Shyh , Lu, Guojun , Zhang, Dengsheng
- Date: 2020
- Type: Text , Journal article
- Relation: IEEE Access Vol. 8, no. (2020), p. 22463-22472
- Full Text:
- Reviewed:
- Description: Spatial pyramid matching (SPM) is one of the widely used methods to incorporate spatial information into the image representation. Despite its effectiveness, the traditional SPM is not rotation invariant. A rotation invariant SPM has been proposed in the literature but it has many limitations regarding the effectiveness. In this paper, we investigate how to make SPM robust to rotation by addressing those limitations. In an SPM framework, an image is divided into an increasing number of partitions at different pyramid levels. In this paper, our main focus is on how to partition images in such a way that the resulting structure can deal with image-level rotations. To do that, we investigate three concentric ring partitioning schemes. Apart from image partitioning, another important component of the SPM framework is a weight function. To apportion the contribution of each pyramid level to the final matching between two images, the weight function is needed. In this paper, we propose a new weight function which is suitable for the rotation-invariant SPM structure. Experiments based on image classification and retrieval are performed on five image databases. The detailed result analysis shows that we are successful in enhancing the effectiveness of SPM for image classification and retrieval. © 2013 IEEE.
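The idea of partitioning an image into concentric rings so that the representation tolerates image-level rotation can be sketched as follows. The equal-width ring rule, the function names, and the toy visual-word grid are assumptions for illustration; the abstract does not specify the three partitioning schemes or the weight function, and neither is reproduced here.

```python
import math


def ring_index(row, col, size, num_rings):
    """Ring (0 = innermost) of a pixel in a size x size image, using equal-width rings."""
    # Doubled offsets from the image centre keep the arithmetic in integers.
    dr = 2 * row - (size - 1)
    dc = 2 * col - (size - 1)
    max_sq = 2 * (size - 1) ** 2  # squared doubled distance of a corner pixel
    if max_sq == 0:
        return 0
    ratio = math.sqrt((dr * dr + dc * dc) / max_sq)
    return min(int(ratio * num_rings), num_rings - 1)


def ring_histograms(labels, num_rings, num_words):
    """One visual-word histogram per ring for a square grid of word ids."""
    size = len(labels)
    hists = [[0] * num_words for _ in range(num_rings)]
    for r in range(size):
        for c in range(size):
            hists[ring_index(r, c, size, num_rings)][labels[r][c]] += 1
    return hists


def rotate90(grid):
    """Rotate a square grid by 90 degrees clockwise."""
    return [list(row) for row in zip(*grid[::-1])]


labels = [[0, 1, 2, 0], [1, 1, 0, 2], [2, 0, 1, 1], [0, 2, 2, 1]]
print(ring_histograms(labels, 2, 3))
print(ring_histograms(rotate90(labels), 2, 3))
```

Because a pixel's ring depends only on its distance from the image centre, the per-ring histograms are unchanged by the 90-degree rotation, whereas the grid cells of a conventional spatial pyramid would be permuted.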
A robust gradient based method for building extraction from LiDAR and photogrammetric imagery
- Authors: Siddiqui, Fasahat , Teng, Shyh , Awrangjeb, Mohammad , Lu, Guojun
- Date: 2016
- Type: Text , Journal article
- Relation: Sensors (Switzerland) Vol. 16, no. 7 (2016), p. 1-24
- Full Text:
- Reviewed:
- Description: Existing automatic building extraction methods are not effective in extracting buildings which are small in size and have transparent roofs. The application of a large area threshold prohibits detection of small buildings, and the use of ground points in generating the building mask prevents detection of transparent buildings. In addition, the existing methods use numerous parameters to extract buildings in complex environments, e.g., hilly areas and high vegetation. However, the empirical tuning of a large number of parameters reduces the robustness of building extraction methods. This paper proposes a novel Gradient-based Building Extraction (GBE) method to address these limitations. The proposed method transforms the Light Detection And Ranging (LiDAR) height information into an intensity image without interpolation of point heights and then analyses the gradient information in the image. Generally, building roof planes have a constant height change along the slope of a roof plane whereas trees have a random height change. With such an analysis, buildings of a greater range of sizes with a transparent or opaque roof can be extracted. In addition, a local colour matching approach is introduced as a post-processing stage to eliminate trees. This stage of our proposed method does not require any manual setting and all parameters are set automatically from the data. The other post-processing stages, including variance, point density and shadow elimination, are also applied to verify the extracted buildings, where comparatively fewer empirically set parameters are used. The performance of the proposed GBE method is evaluated on two benchmark data sets by using the object and pixel based metrics (completeness, correctness and quality). Our experimental results show the effectiveness of the proposed method in eliminating trees, extracting buildings of all sizes, and extracting buildings with and without transparent roofs. When compared with current state-of-the-art building extraction methods, the proposed method outperforms the existing methods in various evaluation metrics. © 2016 by the authors; licensee MDPI, Basel, Switzerland.
Improving SIFT's performance by incorporating appropriate gradient information
- Authors: Lv, Guohua , Hossain, Md. Tanvir , Teng, Shyh , Lu, Guojun , Lackmann, Martin
- Date: 2011
- Type: Text , Conference paper
- Relation: 26th Image and Vision Computing New Zealand Conference (IVCNZ 2011) p. 381 - 386
- Full Text: false
- Reviewed:
- Description: Scale Invariant Feature Transform (SIFT) has been applied in numerous applications especially in the domain of computer vision. In these applications, image information used for building the SIFT descriptor can have a significant impact on its performance. When building orientation histograms for descriptors, a critical step is how to increment the values in the orientation bins. The original scheme for this step in SIFT was improved in [6]. Two different types of gradient information are used for building orientation histograms. The limitations of the two schemes are identified in this paper and we then propose three new schemes which use both types of gradient information in the feature description and matching stages. Our experimental results show that the proposed schemes can achieve better registration performances than the schemes proposed in SIFT and [6].
High quality region-of-interest coding for video conferencing based remote general practitioner training
- Authors: Murshed, Manzur , Siddique, Md Atiur Rahman , Islam, Saikat , Ali, Mortuza , Lu, Guojun , Villanueva, Elmer , Brown, James
- Date: 2013
- Type: Text , Conference paper
- Relation: Proceedings of the International Conference on eHealth, Telemedicine, and Social Medicine (eTELEMED 2013), Wilmington, DE, 1st October 2013 p. 240-245
- Full Text: false
- Reviewed:
Region-based image retrieval with high-level semantics using decision tree learning
- Authors: Liu, Ying , Zhang, Dengsheng , Lu, Guojun
- Date: 2008
- Type: Text , Journal article
- Relation: Pattern Recognition Vol. 41, no. 8 (2008), p. 2554-2570
- Full Text: false
- Reviewed:
- Description: Semantic-based image retrieval has attracted great interest in recent years. This paper proposes a region-based image retrieval system with high-level semantic learning. The key features of the system are: (1) it supports both query by keyword and query by region of interest. The system segments an image into different regions and extracts low-level features of each region. From these features, high-level concepts are obtained using a proposed decision tree-based learning algorithm named DT-ST. During retrieval, a set of images whose semantic concept matches the query is returned. Experiments on a standard real-world image database confirm that the proposed system significantly improves the retrieval performance, compared with a conventional content-based image retrieval system. (2) The proposed decision tree induction method DT-ST for image semantic learning is different from other decision tree induction algorithms in that it makes use of the semantic templates to discretize continuous-valued region features and avoids the difficult image feature discretization problem. Furthermore, it introduces a hybrid tree simplification method to handle the noise and tree fragmentation problems, thereby improving the classification performance of the tree. Experimental results indicate that DT-ST outperforms two well-established decision tree induction algorithms ID3 and C4.5 in image semantic learning.
Music emotion annotation by machine learning
- Authors: Cheung, Wai , Lu, Guojun
- Date: 2008
- Type: Text , Conference paper
- Relation: Proceedings of the 2008 IEEE 10th Workshop on Multimedia Signal Processing p. 580-585
- Full Text: false
- Reviewed:
- Description: Music emotion annotation is the task of attaching emotional terms to musical works. As the volume of online musical content has expanded rapidly in recent years, demands for retrieval by emotion are emerging. Currently, literature on music retrieval using emotional terms is rare. Emotion annotated data are scarce in existing music databases because annotation is still a manual task. Automating music emotion annotation is an essential prerequisite to research in music retrieval by emotion, without which even sophisticated retrieval methods may not be very useful in a data deficient environment. This paper describes a machine learning approach to annotate music using a large number of emotional terms. We also estimate the training data size requirements for a workable annotation system. Our empirical results show that 1) the task of music emotion annotation could be modelled using machine learning techniques to support a large number of emotional terms, 2) the combination of sampling method and data-driven detection threshold is highly effective in optimizing the use of existing annotated data in training machine learning models, 3) synonymous relationships enhance the annotation performance and 4) the training data size requirement is within reach for a workable annotation system. Essentially, automatic music emotion annotation enables music retrieval by emotion to be performed as a text retrieval task.
Efficient and effective transformed image identification
- Authors: Awrangjeb, Mohammad , Lu, Guojun
- Date: 2008
- Type: Text , Conference proceedings
- Full Text: false
- Description: The SIFT (scale invariant feature transform) has demonstrated its superior performance in identifying transformed images over many other approaches. However, both its detection and matching stages are expensive, because a large number of keypoints are detected in the scale-space and each keypoint is described using a 128-dimensional vector. We present two possible solutions for feature-point reduction. The first is to downscale the image before the SIFT keypoint detection, and the second is to use corners (instead of SIFT keypoints), which are visually significant, more robust, and much smaller in number than the SIFT keypoints. Either the curvature descriptor or the highly distinctive SIFT descriptors at corner locations can be used to represent corners. We then describe a new feature-point matching technique, which can be used for matching both the down-scaled SIFT keypoints and corners. Experimental results show that the two feature-point reduction solutions combined with the SIFT descriptors and the proposed feature-point matching technique not only improve the computational efficiency and decrease the storage requirement, but also improve the transformed image identification accuracy (robustness).
Reversible data hiding in encrypted images based on image partition and spatial correlation
- Authors: Song, Chang , Zhang, Yifeng , Lu, Guojun
- Date: 2019
- Type: Text , Conference proceedings , Conference paper
- Relation: 17th International Workshop on Digital Forensics and Watermarking, IWDW 2018; Jeju Island, South Korea; 22nd-24th October 2018; Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) Vol. 11378 LNCS, p. 180-194
- Full Text: false
- Reviewed:
- Description: Recently, more and more attention has been paid to reversible data hiding (RDH) in encrypted images because of its better protection of privacy compared with traditional RDH methods operating directly on original images. In several RDH algorithms, prediction-error expansion (PEE) has proved superior to other methods in terms of embedding capacity and distortion of the marked image, and multiple histograms modification (MHM) can realize adaptive selection of expansion bins, which depends on image content, in the modification of a sequence of histograms. Therefore, in this paper, we propose an efficient RDH method in encrypted images by combining PEE and MHM, and design a corresponding mode of image partition. We first divide the image into three parts: W (for embedding secret data), B (for embedding the least significant bit (LSB) of W) and G (for generating prediction-error histograms). Then, we apply PEE and MHM to embed the LSB of W to reserve space for secret data. Next, we encrypt the image and change the LSB of W to realize the embedding of secret data. In the process of extraction, the reversibility of image and secret data can be guaranteed. The utilization of correlation between neighbor pixels and the embedding order decided by the smoothness of pixels in part W contribute to the performance of our method. Compared to the existing algorithms, experimental results show that the proposed method can reduce distortion to the image at a given embedding capacity, especially at low embedding capacity.
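The reversibility that prediction-error expansion provides can be shown with a minimal single-pixel sketch. It covers only the textbook expansion step: the predictor, the function names, and the absence of overflow handling are assumptions for illustration, and the W/B/G partition, the histogram modification, and the encryption described in the abstract are not reproduced.

```python
def pee_embed(pixel, predicted, bit):
    """Embed one bit by expanding the prediction error: e' = 2e + bit."""
    error = pixel - predicted
    return predicted + 2 * error + bit


def pee_extract(marked, predicted):
    """Recover the original pixel and the embedded bit from a marked pixel."""
    expanded = marked - predicted
    bit = expanded % 2      # parity of the expanded error is the hidden bit
    error = expanded // 2   # floor division undoes the expansion, also for negatives
    return predicted + error, bit


# Hypothetical values: the prediction would come from neighbouring pixels.
pixel, predicted = 118, 120
marked = pee_embed(pixel, predicted, 1)
print(marked, pee_extract(marked, predicted))
```

A small prediction error means a small change to the pixel, which is why schemes of this kind prefer to embed in smooth regions. A practical scheme also has to guard against marked values leaving the valid pixel range, which this sketch ignores.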
Automatic Extraction of Buildings in an Urban Region
- Authors: Siddiqui, Fasahat , Teng, Shyh , Lu, Guojun , Awrangjeb, Mohammad
- Date: 2014
- Type: Text , Conference proceedings
- Relation: 29th International Conference on Image and Vision Computing New Zealand, IVCNZ 2014; Hamilton; New Zealand; 19th-21st November 2014; published in ACM International Conference Proceeding Series p. 178-183
- Full Text:
- Reviewed:
- Description: There are currently several automatic building extraction methods introduced in the literature, but none of them is capable of completely extracting portions of a building that are below a pre-defined building minimum height threshold. This paper proposes a systematic method which analyzes the height differences between the extracted adjacent planes above and below the height threshold as well as the planes' connectivity, thereby extracting all portions belonging to buildings more completely. In general, the height difference between the edges of the adjacent planes above and below the height threshold that belong to the same building is more uniform. In addition, the extracted planes below the height threshold that belong to a building and their adjacent ground planes also have a clear height difference. The proposed method incorporates such information to achieve better performance in building extraction. We have compared our proposed method to a current state-of-the-art building extraction method qualitatively and quantitatively. Our experimental results show that our proposed method successfully recovers portions of a building below the height threshold, thereby achieving relatively higher average completeness (an improvement of 1.14%) and quality (an improvement of 0.93%).
Texture classification using multimodal invariant local binary pattern
- Authors: Sadat, Rafi , Teng, Shyh , Lu, Guojun , Hasan, Sheikh
- Date: 2011
- Type: Text , Conference paper
- Relation: IEEE Workshop on Applications of Computer Vision (WACV) p. 315-320
- Full Text: false
- Reviewed:
- Description: As texture information among pixels can be effectively represented using local binary patterns (LBPs), image descriptors built using LBPs or their variants have been frequently used for various image analysis applications, e.g. medical image and texture image classification and retrieval. However, neither LBP nor any of its existing variants can be used to build descriptors for classifying multimodal images effectively. This is because the same object, when captured in different modalities, may result in opposite pixel intensity in some corresponding parts of the images, which in turn will cause their descriptors to be very different. To solve this problem, we propose a novel modality invariant texture descriptor which is built by modifying the standard procedure for building LBP. In this paper, we explain how the proposed descriptor can be built efficiently. We also demonstrate empirically that, compared to all the state-of-the-art LBP-based descriptors, the proposed descriptor achieves better accuracy for classifying multimodal images.
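The problem this abstract describes, that opposite pixel intensities across modalities yield very different LBP descriptors, can be reproduced with a short sketch of the standard 8-neighbour LBP code. The neighbour ordering, the function names, and the toy patch are assumptions for illustration; the modality invariant descriptor proposed in the paper is not shown.

```python
def lbp_code(patch):
    """Standard 8-neighbour LBP code of the centre pixel of a 3x3 patch."""
    center = patch[1][1]
    # Neighbours in clockwise order, starting at the top-left pixel.
    offsets = [(0, 0), (0, 1), (0, 2), (1, 2), (2, 2), (2, 1), (2, 0), (1, 0)]
    code = 0
    for bit, (r, c) in enumerate(offsets):
        if patch[r][c] >= center:
            code |= 1 << bit
    return code


def invert(patch, max_value=255):
    """Simulate an opposite-intensity modality by inverting every pixel."""
    return [[max_value - v for v in row] for row in patch]


patch = [[10, 200, 30], [40, 100, 180], [90, 250, 20]]
print(lbp_code(patch))          # 42
print(lbp_code(invert(patch)))  # 213
```

When no neighbour equals the centre value, inversion flips every comparison, so the two codes are bitwise complements (42 + 213 = 255) and land in different histogram bins even though the local structure is the same.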
Classifier-free extraction of power line wires from point cloud data
- Authors: Awrangjeb, Mohammad , Gao, Yongsheng , Lu, Guojun
- Date: 2018
- Type: Text , Conference proceedings
- Relation: 2018 International Conference on Digital Image Computing: Techniques and Applications, DICTA 2018; Canberra, Australia; 10th-13th December 2018
- Full Text: false
- Reviewed:
- Description: This paper proposes a classifier-free method for extraction of power line wires from aerial point cloud data. It combines the advantages of both grid- and point-based processing of the input data. In addition to the non-ground point cloud data, the input to the proposed method includes the pylon locations, which are automatically extracted by a previous method. The proposed method first counts the number of wires in a span between two successive pylons using two masks: vertical and horizontal. Then, the initial wire segments are obtained and refined iteratively. Finally, the initial segments are extended on both ends and the points of each individual wire are modelled as a 3D polynomial curve. Experimental results show that both the object-based completeness and correctness are 97%, while the point-based completeness and correctness are 99% and 88%, respectively.
An enhancement to SIFT-based techniques for image registration
- Authors: Hossain, Tanvir , Teng, Shyh , Lu, Guojun , Lackmann, Martin
- Date: 2010
- Type: Text , Conference paper
- Relation: Proceedings of the 2010 Digital Image Computing: Techniques and Applications p. 166-171
- Full Text: false
- Reviewed:
- Description: Symmetric-SIFT is a recently proposed local technique used for registering multimodal images. It is based on a well-known general image registration technique named Scale Invariant Feature Transform (SIFT). Symmetric-SIFT makes use of the gradient magnitude information at the image's key regions to build the descriptors. In this paper, we highlight an issue with how the magnitude information is used in this process. This issue may result in similar descriptors being built to represent regions in images that are visually different. To address this issue, we propose two new strategies for weighting the descriptors. Our experimental results show that Symmetric-SIFT descriptors built using our proposed strategies can lead to better registration accuracy than descriptors built using the original Symmetric-SIFT technique. The issue highlighted and the two strategies proposed are also applicable to the general SIFT technique.
Rotation invariant curvelet features for region based image retrieval
- Authors: Zhang, Dengsheng , Islam, Md , Lu, Guojun , Sumana, Ishrat
- Date: 2011
- Type: Text , Journal article
- Relation: International Journal of Computer Vision Vol. 98, no. 2 (2011), p. 187-201
- Full Text: false
- Reviewed:
- Description: There has been much interest in and a large amount of research on content based image retrieval (CBIR) in recent years due to the ever increasing number of digital images. Texture features play a key role in CBIR. Many texture features exist in the literature; however, most of them are neither rotation invariant nor robust to scale and other variations. Texture features based on Gabor filters have been shown to have significant advantages over other methods, and they are adopted by MPEG-7 as one of the texture descriptors for image retrieval. In this paper, we propose rotation invariant curvelet features for texture representation. With systematic analysis and rigorous experiments, we show that the proposed curvelet texture features significantly outperform the widely used Gabor texture features. A novel region padding method is also proposed to apply the curvelet transform to region based image retrieval. Retrieval results from standard image databases show that curvelet features are promising for both texture and region representation.