Anti-aliasing deep image classifiers using novel depth adaptive blurring and activation function
- Authors: Hossain, Md Tahmid , Teng, Shyh , Lu, Guojun , Rahman, Mohammad Arifur , Sohel, Ferdous
- Date: 2023
- Type: Text , Journal article
- Relation: Neurocomputing Vol. 536, no. (2023), p. 164-174
- Full Text: false
- Reviewed:
- Description: Deep convolutional networks are vulnerable to image translation or shift, partly due to common down-sampling layers, e.g., max-pooling and strided convolution. These operations violate the Nyquist sampling rate and cause aliasing. The textbook solution is low-pass filtering (blurring) before down-sampling, which can benefit deep networks as well. Even so, non-linearity units, such as ReLU, often re-introduce the problem, suggesting that blurring alone may not suffice. In this work, first, we analyse deep features with the Fourier transform and show that Depth Adaptive Blurring is more effective than monotonic blurring. To this end, we propose a novel Depth Adaptive Blur-pool (DAB-pool) module to replace existing down-sampling methods. Second, we introduce a novel activation function with a built-in low-pass filter as an additional measure to keep the problem from reappearing. From experiments, we observe generalisation to other forms of transformations and corruptions as well, e.g., rotation, scale, and noise. We evaluate our method under three challenging settings: (1) a variety of image translations; (2) adversarial attacks – both
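The textbook fix the abstract describes, low-pass filtering before down-sampling, can be sketched in one dimension. This is only the generic blur-pool idea with an illustrative binomial kernel, not the paper's depth-adaptive DAB-pool variant:

```python
def blur_pool_1d(x, stride=2):
    """Anti-aliased down-sampling sketch: blur with a binomial [1, 2, 1]/4
    kernel, then subsample. Blurring first attenuates frequencies above the
    Nyquist rate of the coarser grid, reducing aliasing."""
    padded = [x[0]] + list(x) + [x[-1]]  # replicate-style edge padding
    blurred = [(padded[i - 1] + 2 * padded[i] + padded[i + 1]) / 4
               for i in range(1, len(padded) - 1)]
    return blurred[::stride]
```

Compared with naive strided subsampling (`x[::2]`), the blurred output changes far less when the input is shifted by one sample, which is the shift-robustness the abstract targets.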
Bidirectional mapping coupled GAN for generalized zero-shot learning
- Authors: Shermin, Tasfia , Teng, Shyh , Sohel, Ferdous , Murshed, Manzur , Lu, Guojun
- Date: 2022
- Type: Text , Journal article
- Relation: IEEE Transactions on Image Processing Vol. 31, no. (2022), p. 721-733
- Full Text:
- Reviewed:
- Description: Bidirectional mapping-based generalized zero-shot learning (GZSL) methods rely on the quality of synthesized features to recognize seen and unseen data. Therefore, learning a joint distribution of seen and unseen classes and preserving the distinction between them are crucial for GZSL methods. However, existing methods only learn the underlying distribution of seen data, although unseen class semantics are available in the GZSL problem setting. Most methods neglect to retain the seen-unseen class distinction and use the learned distribution to recognize seen and unseen data; consequently, they do not perform well. In this work, we utilize the available unseen class semantics alongside seen class semantics and learn a joint distribution through strong visual-semantic coupling. We propose a bidirectional mapping coupled generative adversarial network (BMCoGAN) by extending the concept of the coupled generative adversarial network into a bidirectional mapping model. We further integrate Wasserstein generative adversarial optimization to supervise the joint distribution learning. We design a loss optimization for retaining distinctive information of seen and unseen classes in the synthesized features and reducing bias towards seen classes, which pushes synthesized seen features towards real seen features and pulls synthesized unseen features away from real seen features. We evaluate BMCoGAN on benchmark datasets and demonstrate its superior performance against contemporary methods. © 1992-2012 IEEE.
Few-shot image classification : current status and research trends
- Authors: Liu, Ying , Zhang, Hengchang , Zhang, Weidong , Lu, Guojun , Tian, Qi , Ling, Nam
- Date: 2022
- Type: Text , Journal article , Review
- Relation: Electronics (Switzerland) Vol. 11, no. 11 (2022), p.
- Full Text:
- Reviewed:
- Description: Conventional image classification methods usually require a large number of training samples for the training model. However, in practical scenarios, the amount of available sample data is often insufficient, which easily leads to overfitting in network construction. Few-shot learning provides an effective solution to this problem and has been a hot research topic. This paper provides an intensive survey on the state-of-the-art techniques in image classification based on few-shot learning. According to the different deep learning mechanisms, the existing algorithms are divided into four categories: transfer learning based, meta-learning based, data augmentation based, and multimodal based methods. Transfer learning based methods transfer useful prior knowledge from the source domain to the target domain. Meta-learning based methods employ past prior knowledge to guide the learning of new tasks. Data augmentation based methods expand the amount of sample data with auxiliary information. Multimodal based methods use the information of the auxiliary modal to facilitate the implementation of image classification tasks. This paper also summarizes the few-shot image datasets available in the literature, and experimental results tested by some representative algorithms are provided to compare their performance and analyze their pros and cons. In addition, applications of existing research outcomes on few-shot image classification in different practical fields are discussed. Finally, a few future research directions are identified. © 2022 by the authors. Licensee MDPI, Basel, Switzerland.
Integrated generalized zero-shot learning for fine-grained classification
- Authors: Shermin, Tasfia , Teng, Shyh , Sohel, Ferdous , Murshed, Manzur , Lu, Guojun
- Date: 2022
- Type: Text , Journal article
- Relation: Pattern Recognition Vol. 122, no. (2022), p.
- Full Text:
- Reviewed:
- Description: Embedding learning (EL) and feature synthesizing (FS) are two of the popular categories of fine-grained GZSL methods. EL or FS using global features cannot discriminate fine details in the absence of local features. On the other hand, EL or FS methods exploiting local features neglect either direct attribute guidance or global information. Consequently, neither method performs well. In this paper, we propose to explore global and direct attribute-supervised local visual features for both EL and FS categories in an integrated manner for fine-grained GZSL. The proposed integrated network has an EL sub-network and an FS sub-network, and can therefore be tested in two ways. We propose a novel two-step dense attention mechanism to discover attribute-guided local visual features. We introduce new mutual learning between the sub-networks to exploit mutually beneficial information for optimization. Moreover, we propose to compute source-target class similarity based on mutual information and transfer-learn the target classes to reduce bias towards the source domain during testing. We demonstrate that our proposed method outperforms contemporary methods on benchmark datasets. © 2021 Elsevier Ltd
A novel fusion approach in the extraction of kernel descriptor with improved effectiveness and efficiency
- Authors: Karmakar, Priyabrata , Teng, Shyh , Lu, Guojun , Zhang, Dengsheng
- Date: 2021
- Type: Text , Journal article
- Relation: Multimedia Tools and Applications Vol. 80, no. 10 (Apr 2021), p. 14545-14564
- Full Text:
- Reviewed:
- Description: Image representation using feature descriptors is crucial. A number of histogram-based descriptors are widely used for this purpose. However, histogram-based descriptors have certain limitations, and kernel descriptors (KDES) have been shown to overcome them. Moreover, a combination of more than one KDES performs better than an individual KDES. Conventionally, KDES fusion is performed by concatenating the gradient, colour and shape descriptors after they have been extracted. This approach has limitations in regard to efficiency as well as effectiveness. In this paper, we propose a novel approach that fuses different image features before descriptor extraction, resulting in a compact descriptor which is both efficient and effective. In addition, we have investigated the effect on the proposed descriptor when texture-based features are fused along with the conventionally used features. Our proposed descriptor is examined on two publicly available image databases and shown to provide outstanding performance.
Adversarial network with multiple classifiers for open set domain adaptation
- Authors: Shermin, Tasfia , Lu, Guojun , Teng, Shyh , Murshed, Manzur , Sohel, Ferdous
- Date: 2021
- Type: Text , Journal article
- Relation: IEEE Transactions on Multimedia Vol. 23, no. (2021), p. 2732-2744
- Full Text:
- Reviewed:
- Description: Domain adaptation aims to transfer knowledge from a domain with adequate labeled samples to a domain with scarce labeled samples. Prior research has introduced various open set domain adaptation settings in the literature to extend the applications of domain adaptation methods in real-world scenarios. This paper focuses on the type of open set domain adaptation setting where the target domain has both a private ('unknown classes') label space and the shared ('known classes') label space, while the source domain only has the 'known classes' label space. Prevalent distribution-matching domain adaptation methods are inadequate in such a setting, which demands adaptation from a smaller source domain to a larger and more diverse target domain with more classes. For this specific setting, prior research introduced a domain adversarial model that uses a fixed threshold for distinguishing known from unknown target samples and handles negative transfer poorly. We extend their adversarial model and propose a novel adversarial domain adaptation model with multiple auxiliary classifiers. The proposed multi-classifier structure introduces a weighting module that evaluates distinctive domain characteristics to assign target samples weights that better reflect whether they are likely to belong to the known or unknown classes, encouraging positive transfer during adversarial training while simultaneously reducing the domain gap between the shared classes of the source and target domains. A thorough experimental investigation shows that our proposed method outperforms existing domain adaptation methods on a number of domain adaptation datasets. © 1999-2012 IEEE.
Detection of Malleefowl Mounds from Point Cloud Data
- Authors: Parvin, Nahida , Awrangjeb, Mohammad , Irvin, Marc , Florentine, Singarayer , Murshed, Manzur , Lu, Guojun
- Date: 2021
- Type: Text , Conference paper
- Relation: 2021 International Conference on Digital Image Computing: Techniques and Applications, DICTA 2021, Gold Coast, 29 November to 1 December 2021
- Full Text: false
- Reviewed:
- Description: Airborne light detection and ranging (LiDAR) data have become a cost- and time-efficient means of estimating the size of timid fauna populations through the identification of artefacts that evidence their occurrence in large, hostile geographic areas. This unobtrusive detection method helps conservation managers assess the stability of a population and design appropriate conservation programs. Here we propose a mound (nest) detection method for Australia's iconic native bird, the Malleefowl, from point cloud data, which can act as a surrogate for population data. Existing detection methods rely largely on manual observation and are therefore inefficient for covering large and remote areas. The proposed method identifies mound features based on the height and intensity values in the point cloud data. Candidate mound points are first selected by applying a height threshold that utilises the classified ground points and their corresponding digital elevation model (DEM). A second threshold, based on an intensity range derived from ground-truth mound-area analysis, is then applied to the initial mound points to find the final candidates. These extracted points are used to generate a binary mask in which the potential mound points are sparse; a morphological filter is applied to the mask to connect them. A morphological cleaning operation and a connected component analysis then separate mounds from the remaining non-mound objects, which are removed from the mask using the mound-area property derived from empirical analysis of ground-truth observations. Finally, the effectiveness of the proposed technique is evaluated against ground truth. Although mound shapes and structures are highly variable in nature, our height- and intensity-based mound point extraction method detected 55% of the ground-truthed mounds. © 2021 IEEE.
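A toy 2-D sketch of the thresholding and connected-component stages described above (the function name, 4-connectivity, and all threshold values are illustrative stand-ins for the paper's DEM-based height threshold, intensity range, and area property; the morphological cleaning step is reduced to an area test):

```python
from collections import deque

def detect_mounds(height, intensity, h_min, i_range, area_range):
    """Toy pipeline: threshold by height and intensity, then keep
    connected components whose area lies in the expected mound range."""
    rows, cols = len(height), len(height[0])
    # Steps 1-2: candidate mask from height and intensity thresholds.
    mask = [[height[r][c] >= h_min and i_range[0] <= intensity[r][c] <= i_range[1]
             for c in range(cols)] for r in range(rows)]
    # Step 3: 4-connected component analysis with an area filter.
    seen = [[False] * cols for _ in range(rows)]
    mounds = []
    for r in range(rows):
        for c in range(cols):
            if mask[r][c] and not seen[r][c]:
                comp, q = [], deque([(r, c)])
                seen[r][c] = True
                while q:
                    y, x = q.popleft()
                    comp.append((y, x))
                    for dy, dx in ((1, 0), (-1, 0), (0, 1), (0, -1)):
                        ny, nx = y + dy, x + dx
                        if 0 <= ny < rows and 0 <= nx < cols \
                                and mask[ny][nx] and not seen[ny][nx]:
                            seen[ny][nx] = True
                            q.append((ny, nx))
                if area_range[0] <= len(comp) <= area_range[1]:
                    mounds.append(comp)
    return mounds
```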
Integrating line weber local descriptor and deep feature for tire indentation mark image classification
- Authors: Liu, Ying , Che, Xin , Dong, Haitao , Li, Daxiang , Teng, Shyh , Lu, Guojun
- Date: 2021
- Type: Text , Conference paper
- Relation: 4th International Conference on Artificial Intelligence and Pattern Recognition, 4th International Conference on Artificial Intelligence and Pattern Recognition, AIPR 2021,Virtual, Online,17-19 September 2021, 2021, ACM International Conference Proceeding Series p. 56-61
- Full Text: false
- Reviewed:
- Description: Tire indentation mark matching is an essential tool in the investigation of criminal cases and traffic incidents. As such images are unique and uncommon, there is a lack of dedicated databases and relevant research on this topic. This paper presents a feature extraction algorithm effective for tire indentation mark image description. The main contributions include: (1) the Line feature Weber local descriptor (LWLD) is proposed, which uses Gabor orientations instead of the original gradient orientation; this feature describes the texture information of tire indentation mark images more efficiently. (2) An attention model is constructed to produce an attention feature map of the tire indentation mark image; this attention feature map is then fused with LWLD, resulting in a feature with more powerful representation capability. Experimental results show that the combined use of LWLD and the attention model greatly enhances the performance of tire indentation mark image matching tasks. © 2021 ACM.
Online dual dictionary learning for visual object tracking
- Authors: Cheng, Xu , Zhang, Yifeng , Zhou, Lin , Lu, Guojun
- Date: 2021
- Type: Text , Journal article
- Relation: Journal of Ambient Intelligence and Humanized Computing Vol. 12, no. 12 (2021), p. 10881-10896
- Full Text: false
- Reviewed:
- Description: Sparse representation methods have been widely applied to visual tracking. Most existing tracking algorithms based on sparse representation exploit the l0- or l1-norm for solving the sparse coefficients; however, this makes the solution very time-consuming. In this paper, we propose an effective dual dictionary learning model for visual tracking. The model is composed of a discriminative dictionary and an analytic dictionary, which work together to perform representation and discrimination simultaneously. First, we exploit the object states of the first ten frames of a video to initialize the dual dictionary; in the tracking phase, the dual dictionary model is updated alternately. Second, the local and global information of the object are integrated into the dual dictionary learning model. Sparse coefficients of each patch encode the local structural information of the object, while all the sparse coefficients within one object state form a global object representation. We develop a likelihood function that takes an adaptive threshold into consideration to de-noise the global representation. In addition, the object template is updated via an online scheme to adapt to object appearance changes. Experiments on a number of common benchmark test sets show that our approach is more effective than existing methods. © 2021, The Author(s), under exclusive licence to Springer-Verlag GmbH, DE part of Springer Nature.
Robust image classification using a low-pass activation function and DCT augmentation
- Authors: Hossain, Md Tahmid , Teng, Shyh , Sohel, Ferdous , Lu, Guojun
- Date: 2021
- Type: Text , Journal article
- Relation: IEEE Access Vol. 9, no. (2021), p. 86460-86474
- Full Text:
- Reviewed:
- Description: The performance disparity of Convolutional Neural Networks (CNNs) on clean and corrupted datasets has recently come under scrutiny. In this work, we analyse common corruptions in the frequency domain, i.e., High Frequency corruptions (HFc, e.g., noise) and Low Frequency corruptions (LFc, e.g., blur). Although a simple solution to HFc is low-pass filtering, ReLU, a widely used Activation Function (AF), does not have any filtering mechanism. In this work, we instill low-pass filtering into the AF (LP-ReLU) to improve robustness against HFc. To deal with LFc, we complement LP-ReLU with Discrete Cosine Transform based augmentation. LP-ReLU, coupled with DCT augmentation, enables a deep network to tackle the entire spectrum of corruption. We use CIFAR-10-C and Tiny ImageNet-C for evaluation and demonstrate accuracy improvements of 5% and 7.3% respectively, compared to the State-Of-The-Art (SOTA). We further evaluate our method's stability on a variety of perturbations in CIFAR-10-P and Tiny ImageNet-P, achieving new SOTA in these experiments as well. To further strengthen our understanding of CNNs' lack of robustness, a decision space visualisation process is proposed and presented in this work. © 2013 IEEE.
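The DCT-based augmentation idea, synthesising blur-like, low-frequency-dominated views of the training data, can be sketched in one dimension with an orthonormal DCT pair. The paper operates on 2-D images and its exact augmentation policy is not reproduced; the cutoff parameter below is illustrative:

```python
import math

def dct(x):
    """Orthonormal DCT-II of a 1-D sequence."""
    N = len(x)
    out = []
    for k in range(N):
        s = sum(x[n] * math.cos(math.pi * (n + 0.5) * k / N) for n in range(N))
        out.append((math.sqrt(1 / N) if k == 0 else math.sqrt(2 / N)) * s)
    return out

def idct(X):
    """Inverse of the orthonormal DCT-II (i.e. a DCT-III)."""
    N = len(X)
    out = []
    for n in range(N):
        s = X[0] / math.sqrt(N)
        s += sum(math.sqrt(2 / N) * X[k] * math.cos(math.pi * (n + 0.5) * k / N)
                 for k in range(1, N))
        out.append(s)
    return out

def dct_lowpass_augment(x, keep):
    """Zero out the highest-frequency coefficients, mimicking a blur-like
    (low-frequency) corruption for augmentation."""
    X = dct(x)
    X[keep:] = [0.0] * (len(X) - keep)
    return idct(X)
```

With `keep=1` only the DC term survives, so the output collapses to the signal mean, the extreme end of the blur spectrum such augmentation samples from.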
Siamese network for object tracking with multi-granularity appearance representations
- Authors: Zhang, Zhuoyi , Zhang, Yifeng , Cheng, Xu , Lu, Guojun
- Date: 2021
- Type: Text , Journal article
- Relation: Pattern Recognition Vol. 118, no. (2021), p.
- Full Text: false
- Reviewed:
- Description: A reliable tracker has the ability to adapt to changes in objects over time, and is robust and accurate. We build such a tracker by extracting semantic features using robust Siamese networks together with multi-granularity color features. It incorporates a semantic model that can capture high-quality semantic features and an appearance model that can describe objects at the pixel, local, and global levels effectively. Furthermore, we propose a novel selective traverse algorithm to allocate weights to the semantic and appearance models dynamically for better tracking performance. During tracking, our tracker updates appearance representations for objects based on recent tracking results. The proposed tracker operates at speeds that exceed the real-time requirement, and outperforms nearly all other state-of-the-art trackers on the OTB-2013/2015 and VOT-2016/2017 benchmarks. © 2021 Elsevier Ltd
A pupil-positioning method based on the starburst model
- Authors: Yu, Pingping , Duan, Wenjie , Sun, Yi , Cao, Ning , Wang, Zhenzhou , Lu, Guojun
- Date: 2020
- Type: Text , Journal article
- Relation: Cmc-Computers Materials & Continua Vol. 64, no. 2 (2020), p. 1199-1217
- Full Text:
- Reviewed:
- Description: Human eye detection has become an area of interest in the field of computer vision, with an extensive range of applications in human-computer interaction, disease diagnosis, and psychological and physiological studies. Gaze-tracking systems are an important research topic in the human-computer interaction field, and as one of the core modules of a head-mounted gaze-tracking system, pupil positioning affects the accuracy and stability of the whole system. To better locate the center of the pupil by tracking eye movements, this paper proposes a pupil-positioning method based on the starburst model. The method uses vertical and horizontal coordinate integral projections in the rectangular region of the human eye for accurate positioning, and applies a linear interpolation method based on a circular model to the reflections in the human eye. We detect feature points on the pupil edge with the starburst model, cluster them, and use the RANdom SAmple Consensus (RANSAC) algorithm to perform ellipse fitting of the pupil edge to accurately locate the pupil center. Our experimental results show that the algorithm has higher precision, higher efficiency and more robustness than other algorithms, and retains excellent accuracy even when the image of the pupil is incomplete.
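The consensus step of the method, RANSAC fitting over clustered edge points, can be illustrated with a circle model. This is a minimal sketch: the paper fits an ellipse, which needs five points and a conic solver, so a three-point circle keeps the example short, and all parameter values are illustrative:

```python
import math, random

def circle_from_3pts(p1, p2, p3):
    """Circumcircle (cx, cy, r) of three non-collinear points."""
    (x1, y1), (x2, y2), (x3, y3) = p1, p2, p3
    d = 2 * (x1 * (y2 - y3) + x2 * (y3 - y1) + x3 * (y1 - y2))
    if abs(d) < 1e-12:
        return None  # collinear sample, no circle
    ux = ((x1**2 + y1**2) * (y2 - y3) + (x2**2 + y2**2) * (y3 - y1)
          + (x3**2 + y3**2) * (y1 - y2)) / d
    uy = ((x1**2 + y1**2) * (x3 - x2) + (x2**2 + y2**2) * (x1 - x3)
          + (x3**2 + y3**2) * (x2 - x1)) / d
    return ux, uy, math.hypot(x1 - ux, y1 - uy)

def ransac_circle(points, iters=200, tol=1.0, seed=0):
    """RANSAC consensus: repeatedly fit a circle to 3 random edge points
    and keep the model supported by the most inliers, so reflections and
    eyelash edges (outliers) do not corrupt the fit."""
    rng = random.Random(seed)
    best, best_inliers = None, -1
    for _ in range(iters):
        model = circle_from_3pts(*rng.sample(points, 3))
        if model is None:
            continue
        cx, cy, r = model
        inliers = sum(abs(math.hypot(x - cx, y - cy) - r) <= tol
                      for x, y in points)
        if inliers > best_inliers:
            best, best_inliers = model, inliers
    return best
```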
- Description: Science and Technology Support Plan Project of Hebei Province (grant numbers 17210803D and 19273703D); Science and Technology Spark Project of the Hebei Seismological Bureau (grant number DZ20180402056); Education Department of Hebei Province (grant number QN2018095); Polytechnic College of Hebei University of Science and Technology.
An enhancement to the spatial pyramid matching for image classification and retrieval
- Authors: Karmakar, Priyabrata , Teng, Shyh , Lu, Guojun , Zhang, Dengsheng
- Date: 2020
- Type: Text , Journal article
- Relation: IEEE Access Vol. 8, no. (2020), p. 22463-22472
- Full Text:
- Reviewed:
- Description: Spatial pyramid matching (SPM) is one of the widely used methods to incorporate spatial information into the image representation. Despite its effectiveness, the traditional SPM is not rotation invariant. A rotation-invariant SPM has been proposed in the literature, but it has several limitations regarding effectiveness. In this paper, we investigate how to make SPM robust to rotation by addressing those limitations. In an SPM framework, an image is divided into an increasing number of partitions at successive pyramid levels. Our main focus is on how to partition images in such a way that the resulting structure can deal with image-level rotations; to that end, we investigate three concentric ring partitioning schemes. Apart from image partitioning, another important component of the SPM framework is the weight function, which apportions the contribution of each pyramid level to the final matching between two images. We propose a new weight function suitable for the rotation-invariant SPM structure. Experiments based on image classification and retrieval are performed on five image databases. The detailed result analysis shows that we are successful in enhancing the effectiveness of SPM for image classification and retrieval. © 2013 IEEE.
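The concentric-ring idea can be sketched as follows: because each ring is defined purely by distance from the image centre, an image-level rotation moves pixels within a ring but never across rings, so per-ring statistics are unchanged. A minimal sketch, with raw intensity histograms standing in for the paper's descriptor quantisation:

```python
import math

def ring_partition_histograms(image, n_rings, n_bins, max_val=256):
    """Split an image into concentric rings around its centre and build
    one intensity histogram per ring. Rotating the image about its centre
    preserves each pixel's distance to the centre, hence each histogram."""
    rows, cols = len(image), len(image[0])
    cy, cx = (rows - 1) / 2, (cols - 1) / 2
    r_max = math.hypot(cy, cx) + 1e-9  # farthest pixel (a corner)
    hists = [[0] * n_bins for _ in range(n_rings)]
    for y in range(rows):
        for x in range(cols):
            ring = min(int(math.hypot(y - cy, x - cx) / r_max * n_rings),
                       n_rings - 1)
            bin_ = image[y][x] * n_bins // max_val
            hists[ring][bin_] += 1
    return hists
```

A grid-partitioned SPM has no such property: a 90° rotation shuffles values between grid cells, which is exactly the failure mode the ring scheme avoids.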
A Rotation invariant HOG descriptor for tire pattern image classification
- Authors: Liu, Ying , Ge, Yuxiang , Wang, Fuping , Liu, Qiqi , Lei, Yanbo , Zhang, Dengsheng , Lu, Guojun
- Date: 2019
- Type: Text , Conference proceedings
- Relation: ICASSP 2019 - 2019 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP); Brighton, UK, 12-17 May 2019. p. 2412-2416
- Full Text: false
- Reviewed:
- Description: Texture features are important for describing tire pattern images, which provide useful clues in solving crime cases and traffic accidents. In this paper, we propose a novel texture feature extraction method based on the Histogram of Oriented Gradients (HOG) and the dominant gradient (DG) in tire pattern images, named HOG-DG. The proposed HOG-DG is not only robust to illumination and scale changes but also rotation-invariant. HOG features are first computed from circular local cells, then concatenated and normalized using the DG to construct the HOG-DG feature. HOG-DG is used to train a support-vector-machine (SVM) classifier for tire pattern classification. Experimental results demonstrate its outstanding performance for tire pattern description.
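The dominant-gradient normalisation step can be sketched independently of the full HOG pipeline: circularly shifting an orientation histogram so its dominant bin leads makes two histograms that differ only by a global rotation (a circular shift of bins) compare equal. A minimal sketch; the paper's circular-cell HOG computation is not reproduced:

```python
def rotation_normalise(orientation_hist):
    """Circularly shift an orientation histogram so the dominant-gradient
    bin comes first, cancelling any global image rotation that merely
    rotated all gradient orientations by the same amount."""
    dg = orientation_hist.index(max(orientation_hist))
    return orientation_hist[dg:] + orientation_hist[:dg]
```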
BackNet: An Enhanced backbone network for accurate detection of objects with large scale variations
- Authors: Hossain, Md Tahmid , Teng, Shyh , Lu, Guojun
- Date: 2019
- Type: Text , Book chapter
- Relation: Image and Video Technology. PSIVT 2019 p. 52-64
- Full Text: false
- Reviewed:
- Description: Deep Convolutional Neural Networks (CNNs) have induced significant progress in the field of computer vision, including object detection and classification. Two-stage detectors like Faster RCNN and its variants are found to be more accurate than their one-stage counterparts. Faster RCNN combines an ImageNet pretrained backbone network (e.g., VGG16) and a Region Proposal Network (RPN) for object detection. Although Faster RCNN performs well on medium and large scale objects, detecting smaller ones with high accuracy while maintaining stable performance on larger objects still remains a challenging task. In this work, we focus on designing a robust backbone network for Faster RCNN that is capable of detecting objects with large variations in scale. Considering the difficulties posed by small objects, our aim is to design a backbone network that allows signals extracted from small objects to be propagated right through to the deepest layers of the network. With this motivation, we propose a robust network, BackNet, which can be integrated as a backbone into any two-stage detector. We evaluate the performance of BackNet-Faster RCNN on the MS COCO dataset and show that the proposed method outperforms five contemporary methods.
Distortion robust image classification using deep convolutional neural network with discrete cosine transform
- Authors: Hossain, Md Tahmid , Teng, Shyh , Zhang, Dengsheng , Lim, Suryani , Lu, Guojun
- Date: 2019
- Type: Text , Conference proceedings
- Relation: 2019 IEEE International Conference on Image Processing (ICIP);Taipei, Taiwan; 22-25 Sept, 2019 p. 659-663
- Full Text: false
- Reviewed:
- Description: Convolutional Neural Networks are highly effective for image classification. However, they are still vulnerable to image distortion: even a small amount of noise or blur can severely hamper their performance. Most work in the literature strives to mitigate this problem simply by fine-tuning a pre-trained CNN on mutually exclusive sets, or a union set, of distorted training data. This iterative fine-tuning process with all known types of distortion is exhaustive, and the network struggles to handle unseen distortions. In this work, we propose the distortion-robust DCT-Net, a Discrete Cosine Transform based module integrated into a deep network built on top of VGG16 [1]. Unlike other works in the literature, DCT-Net is "blind" to the distortion type and level in an image during both training and testing. DCT-Net is trained only once and applied in more generic situations without further retraining. We also extend the idea of dropout and present a training-adaptive version of it. We evaluate the proposed DCT-Net on a number of benchmark datasets. Our experimental results show that once trained, DCT-Net not only generalizes well to a variety of unseen distortions but also outperforms other comparable networks in the literature.
Enhanced transfer learning with ImageNet trained classification layer
- Authors: Shermin, Tasfia , Teng, Shyh , Murshed, Manzur , Lu, Guojun , Sohel, Ferdous , Paul, Manoranjan
- Date: 2019
- Type: Text , Book chapter
- Relation: Image and Video Technology Chapter 12 p. 142-155
- Full Text: false
- Reviewed:
- Description: Parameter fine-tuning is a transfer learning approach whereby learned parameters from a pre-trained source network are transferred to the target network, followed by fine-tuning. Prior research has shown that this approach is capable of improving task performance. However, the impact of the ImageNet pre-trained classification layer in parameter fine-tuning is mostly unexplored in the literature. In this paper, we propose a fine-tuning approach that retains the pre-trained classification layer. We employ layer-wise fine-tuning to determine which layers should be frozen for optimal performance. Our empirical analysis demonstrates that the proposed fine-tuning performs better than traditional fine-tuning. This finding indicates that the pre-trained classification layer holds less category-specific, and more global, information than previously believed. Thus, we hypothesize that the presence of this layer is crucial for growing network depth to adapt better to a new task. Our study shows that careful normalization and scaling are essential for creating harmony between the pre-trained and new layers for target domain adaptation. We evaluate the proposed depth-augmented networks for fine-tuning on several challenging benchmark datasets and show that they can achieve higher classification accuracy than contemporary transfer learning approaches.
Reversible data hiding in encrypted images based on image partition and spatial correlation
- Authors: Song, Chang , Zhang, Yifeng , Lu, Guojun
- Date: 2019
- Type: Text , Conference proceedings , Conference paper
- Relation: 17th International Workshop on Digital Forensics and Watermarking, IWDW 2018; Jeju Island, South Korea; 22nd-24th October 2018; Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) Vol. 11378 LNCS, p. 180-194
- Full Text: false
- Reviewed:
- Description: Recently, more and more attention has been paid to reversible data hiding (RDH) in encrypted images, because it protects privacy better than traditional RDH methods that operate directly on original images. Among RDH algorithms, prediction-error expansion (PEE) has proved superior to other methods in terms of embedding capacity and distortion of the marked image, and multiple histograms modification (MHM) can realize adaptive, content-dependent selection of expansion bins when modifying a sequence of histograms. In this paper, we therefore propose an efficient RDH method for encrypted images that combines PEE and MHM, and design a corresponding image partition scheme. We first divide the image into three parts: W (for embedding secret data), B (for embedding the least significant bit (LSB) of W) and G (for generating prediction-error histograms). We then apply PEE and MHM to embed the LSB of W, reserving space for the secret data. Next, we encrypt the image and change the LSB of W to realize the embedding of the secret data. In the extraction process, the reversibility of both the image and the secret data is guaranteed. The use of the correlation between neighbouring pixels, and an embedding order decided by pixel smoothness in part W, contribute to the performance of our method. Experimental results show that, compared to existing algorithms, the proposed method reduces distortion to the image at a given embedding capacity, especially at low embedding capacities.
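Prediction-error expansion itself can be sketched on a 1-D pixel sequence with a left-neighbour predictor. This is illustrative only: it embeds into every position and omits the MHM bin selection, the image partition, and the encryption stages the abstract describes:

```python
def pee_embed(pixels, bits):
    """PEE embedding: predict each pixel from its left neighbour and
    expand the prediction error e -> 2e + b to carry one hidden bit."""
    out, it = [pixels[0]], iter(bits)
    for i in range(1, len(pixels)):
        e = pixels[i] - pixels[i - 1]          # prediction error
        b = next(it, None)
        out.append(pixels[i - 1] + (2 * e + b if b is not None else e))
    return out

def pee_extract(marked, n_bits):
    """Recover the hidden bits AND the original pixels (reversibility)."""
    pixels, bits = [marked[0]], []
    for i in range(1, len(marked)):
        e2 = marked[i] - pixels[-1]
        if len(bits) < n_bits:
            bits.append(e2 & 1)                # b is the parity of 2e + b
            pixels.append(pixels[-1] + (e2 >> 1))  # floor-shift recovers e
        else:
            pixels.append(pixels[-1] + e2)
    return bits, pixels
```

The round trip restores the cover sequence exactly, which is the "reversible" property the method depends on; smooth regions (small e) give small expansions and hence low distortion.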
Voxel-based extraction of individual pylons and wires from lidar point cloud data
- Authors: Munir, Nosheen , Awrangjeb, Mohammad , Stantic, Bela , Lu, Guojun , Islam, Syed
- Date: 2019
- Type: Text , Journal article
- Relation: ISPRS annals of the photogrammetry, remote sensing and spatial information sciences Vol. IV-4/W8, no. (2019), p. 91-98
- Full Text:
- Reviewed:
- Description: Extraction of individual pylons and wires is important for modelling 3D objects in a power line corridor (PLC) map. However, existing methods mostly classify points into distinct classes like pylons and wires, but hardly into individual pylons or wires. The proposed method extracts standalone pylons, vegetation and wires from LiDAR data; the extraction of individual objects is needed for detailed PLC mapping. The approach starts with the separation of ground and non-ground points. The non-ground points are then classified into vertical (e.g., pylons and vegetation) and non-vertical (e.g., wires) object points using the vertical profile feature (VPF) through a binary support vector machine (SVM) classifier. Individual pylons and vegetation are then separated using their shape and area properties. The locations of pylons are further used to extract the span points between two successive pylons. Finally, the span points are voxelised, and the alignment properties of wires in the voxel grid are used to extract individual wire points. The results are evaluated on a dataset that has multiple spans with bundled wires in each span. The evaluation shows that the proposed method and features are very effective for the extraction of individual wires, pylons and vegetation, with 99% correctness and 98% completeness.
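The voxelisation step can be sketched as a sparse grid keyed by integer indices (a minimal sketch; the wire-alignment analysis carried out over the grid is the paper's contribution and is not reproduced here):

```python
def voxelise(points, voxel_size):
    """Group 3-D points into a sparse voxel grid keyed by integer
    (i, j, k) indices; per-voxel point lists can then be scanned for the
    linear alignment patterns individual wires leave along a span."""
    grid = {}
    for x, y, z in points:
        key = (int(x // voxel_size), int(y // voxel_size), int(z // voxel_size))
        grid.setdefault(key, []).append((x, y, z))
    return grid
```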
A detector of structural similarity for multi-modal microscopic image registration
- Authors: Lv, Guohua , Teng, Shyh , Lu, Guojun
- Date: 2018
- Type: Text , Journal article
- Relation: Multimedia Tools and Applications Vol. 77, no. 6 (2018), p. 7675-7701
- Full Text: false
- Reviewed:
- Description: This paper presents a Detector of Structural Similarity (DSS) to minimize the visual differences between brightfield and confocal microscopic images. The context of this work is that it is very challenging to effectively register such images due to a low structural similarity in image contents. To address this issue, DSS aims to maximize the structural similarity by utilizing the intensity relationships among red-green-blue (RGB) channels in images. Technically, DSS can be combined with any multi-modal image registration technique in registering brightfield and confocal microscopic images. Our experimental results show that DSS significantly increases the visual similarity in such images, thereby improving the registration performance of an existing state-of-the-art multi-modal image registration technique by up to approximately 27%. © 2017, Springer Science+Business Media New York.