A survey on image classification of lightweight convolutional neural network
- Authors: Liu, Ying , Xiao, Peng , Fang, Jie , Zhang, Dengsheng
- Date: 2023
- Type: Text , Conference paper
- Relation: 19th International Conference on Natural Computation, Fuzzy Systems and Knowledge Discovery, ICNC-FSKD 2023, Harbin, China, 29-31 July 2023, 2023 19th International Conference on Natural Computation, Fuzzy Systems and Knowledge Discovery (ICNC-FSKD)
- Full Text: false
- Reviewed:
- Description: In recent years, deep neural networks have achieved tremendous success in image classification in both academic and industrial settings. However, the high hardware requirements imposed by their intensive and complex computations pose a challenge for deployment on low-storage devices. To address this challenge, lightweight networks provide a viable solution. This paper provides a detailed review of recent lightweight image classification algorithms, which can be categorized into low-redundancy network model design and neural network compression algorithms. The former reduces network computations by replacing traditional convolution with efficient lightweight convolution, while the latter reduces redundancy in the network by employing methods such as network pruning, knowledge distillation, and parameter quantization. We summarize the experimental results of some classical models and algorithms on ImageNet2012 and CIFAR-10 datasets, and analyze the characteristics, advantages and disadvantages of these models respectively. Finally, future research directions for lightweight algorithms in the field of image classification are identified. © 2023 IEEE.
Fine-grained image classification based on knowledge distillation
- Authors: Liu, Ying , Feng, Hao , Zhang, Weidong , Fang, Jie , Xiao, Peng , Zhang, Dengsheng
- Date: 2023
- Type: Text , Conference paper
- Relation: 19th International Conference on Natural Computation, Fuzzy Systems and Knowledge Discovery, ICNC-FSKD 2023, Harbin, China, 29-31 July 2023, 2023 19th International Conference on Natural Computation, Fuzzy Systems and Knowledge Discovery (ICNC-FSKD)
- Full Text: false
- Reviewed:
- Description: Despite the outstanding performance of deep learning-based fine-grained image classification methods, the commonly used models still suffer from high computation and memory costs. Therefore, this paper proposes a mobile-based CNN network that focuses on discriminative features of fine-grained images by embedding a hybrid-domain attention module to achieve higher accuracy in recognition. Specifically, under the premise of reducing network parameters, this paper presents a classification method that combines transfer learning and knowledge distillation to enhance the model's generalization performance and resistance to overfitting. Different knowledge transfer strategies are validated through experiments in the knowledge distillation process. Mobile models such as SqueezeNet, MobileNetV2, and CBAM MobileNetV2 all demonstrate enhanced performance with the knowledge distillation optimization. The proposed method can be used to develop a lightweight mobile-based CNN model with performance comparable to complex models, making it more advantageous in real-life scenarios with limited storage resources and low hardware computation levels. Additionally, the model compression process utilizes only the intermediate features of the original dataset, meeting the confidentiality requirements of the original data in the field of public security. © 2023 IEEE.
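As a rough illustration of knowledge distillation, the sketch below implements the classic logit-based loss (a hard-label cross-entropy term plus a temperature-softened teacher/student KL term). This is an assumption for illustration only: the abstract does not state which loss or transfer strategy the paper uses, and it mentions intermediate features, so the authors' formulation may differ. The names `softmax`, `distillation_loss`, `temperature` and `alpha` are hypothetical.

```python
import math

def softmax(logits, temperature=1.0):
    """Temperature-scaled softmax over a list of logits."""
    scaled = [z / temperature for z in logits]
    peak = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(z - peak) for z in scaled]
    total = sum(exps)
    return [e / total for e in exps]

def distillation_loss(student_logits, teacher_logits, label,
                      temperature=4.0, alpha=0.5):
    """alpha * CE(student, label) + (1 - alpha) * T^2 * KL(teacher_T || student_T).

    Assumed classic logit-based formulation, not taken from the paper.
    """
    # Hard-label term: cross-entropy of the student at temperature 1.
    hard = -math.log(softmax(student_logits)[label])
    # Soft-label term: KL divergence between softened distributions.
    q_teacher = softmax(teacher_logits, temperature)
    q_student = softmax(student_logits, temperature)
    soft = sum(t * math.log(t / s)
               for t, s in zip(q_teacher, q_student) if t > 0)
    return alpha * hard + (1 - alpha) * temperature ** 2 * soft
```

With `alpha=0.0` the loss depends only on how closely the student's softened outputs follow the teacher's, and it vanishes when the two sets of logits are identical.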
A novel fusion approach in the extraction of kernel descriptor with improved effectiveness and efficiency
- Authors: Karmakar, Priyabrata , Teng, Shyh , Lu, Guojun , Zhang, Dengsheng
- Date: 2021
- Type: Text , Journal article
- Relation: Multimedia Tools and Applications Vol. 80, no. 10 (Apr 2021), p. 14545-14564
- Full Text:
- Reviewed:
- Description: Image representation using feature descriptors is crucial. A number of histogram-based descriptors are widely used for this purpose. However, histogram-based descriptors have certain limitations and kernel descriptors (KDES) are proven to overcome them. Moreover, the combination of more than one KDES performs better than an individual KDES. Conventionally, KDES fusion is performed by concatenating them after the gradient, colour and shape descriptors have been extracted. This approach has limitations in regard to the efficiency as well as the effectiveness. In this paper, we propose a novel approach to fuse different image features before the descriptor extraction, resulting in a compact descriptor which is efficient and effective. In addition, we have investigated the effect on the proposed descriptor when texture-based features are fused along with the conventionally used features. Our proposed descriptor is examined on two publicly available image databases and shown to provide outstanding performances.
An enhancement to the spatial pyramid matching for image classification and retrieval
- Authors: Karmakar, Priyabrata , Teng, Shyh , Lu, Guojun , Zhang, Dengsheng
- Date: 2020
- Type: Text , Journal article
- Relation: IEEE Access Vol. 8, no. (2020), p. 22463-22472
- Full Text:
- Reviewed:
- Description: Spatial pyramid matching (SPM) is one of the widely used methods to incorporate spatial information into the image representation. Despite its effectiveness, the traditional SPM is not rotation invariant. A rotation invariant SPM has been proposed in the literature but it has many limitations regarding the effectiveness. In this paper, we investigate how to make SPM robust to rotation by addressing those limitations. In an SPM framework, an image is divided into an increasing number of partitions at different pyramid levels. In this paper, our main focus is on how to partition images in such a way that the resulting structure can deal with image-level rotations. To do that, we investigate three concentric ring partitioning schemes. Apart from image partitioning, another important component of the SPM framework is a weight function. To apportion the contribution of each pyramid level to the final matching between two images, the weight function is needed. In this paper, we propose a new weight function which is suitable for the rotation-invariant SPM structure. Experiments based on image classification and retrieval are performed on five image databases. The detailed result analysis shows that we are successful in enhancing the effectiveness of SPM for image classification and retrieval. © 2013 IEEE.
A Rotation invariant HOG descriptor for tire pattern image classification
- Authors: Liu, Ying , Ge, Yuxiang , Wang, Fuping , Liu, Qiqi , Lei, Yanbo , Zhang, Dengsheng , Lu, Guojun
- Date: 2019
- Type: Text , Conference proceedings
- Relation: ICASSP 2019 - 2019 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP); Brighton, UK, 12-17 May 2019. p. 2412-2416
- Full Text: false
- Reviewed:
- Description: Texture features are important in describing tire pattern images, which provide useful clues in solving crime cases and traffic accidents. In this paper, we propose a novel texture feature extraction method based on HOG (Histogram of Oriented Gradient) and dominant gradient (DG) in tire pattern images, named HOG-DG. The proposed HOG-DG is not only robust to illumination and scale changes but is also rotation-invariant. In the proposed HOG-DG, HOG features are first computed from circular local cells, and HOG features from an image are concatenated and normalized using the DG to construct the HOG-DG feature. HOG-DG is used to train a support-vector-machine (SVM) classifier for tire pattern classification. Experimental results demonstrate its outstanding performance for tire pattern description.
Distortion robust image classification using deep convolutional neural network with discrete cosine transform
- Authors: Hossain, Md Tahmid , Teng, Shyh Wei , Zhang, Dengsheng , Lim, Suryani , Lu, Guojun
- Date: 2019
- Type: Text , Conference proceedings
- Relation: 2019 IEEE International Conference on Image Processing (ICIP); Taipei, Taiwan; 22-25 Sept, 2019 p. 659-663
- Full Text: false
- Reviewed:
- Description: Convolutional Neural Networks are highly effective for image classification. However, they are still vulnerable to image distortion. Even a small amount of noise or blur can severely hamper the performance of these CNNs. Most work in the literature strives to mitigate this problem simply by fine-tuning a pre-trained CNN on mutually exclusive or a union set of distorted training data. This iterative fine-tuning process with all known types of distortion is exhaustive and the network struggles to handle unseen distortions. In this work, we propose distortion robust DCT-Net, a Discrete Cosine Transform based module integrated into a deep network which is built on top of VGG16 [1]. Unlike other works in the literature, DCT-Net is "blind" to the distortion type and level in an image both during training and testing. The DCT-Net is trained only once and applied in a more generic situation without further retraining. We also extend the idea of dropout and present a training adaptive version of the same. We evaluate our proposed DCT-Net on a number of benchmark datasets. Our experimental results show that once trained, DCT-Net not only generalizes well to a variety of unseen distortions but also outperforms other comparable networks in the literature.
A kernel-based approach for content-based image retrieval
- Authors: Karmakar, Priyabrata , Teng, Shyh , Lu, Guojun , Zhang, Dengsheng
- Date: 2018
- Type: Text , Conference proceedings , Conference paper
- Relation: 2018 International Conference on Image and Vision Computing New Zealand; Auckland, New Zealand; 19th-21st November 2018 p. 1-6
- Full Text: false
- Reviewed:
- Description: Content-based image retrieval (CBIR) is a popular approach to retrieve images based on a query. In CBIR, retrieval is executed based on the properties of image contents (e.g. gradient, shape, color, texture) which are generally encoded into image descriptors. Among the various image descriptors, histogram-based descriptors are very popular. However, they suffer from the limitation of coarse quantization. In contrast, the use of kernel descriptors (KDES) is proven to be more effective than histogram-based descriptors in other applications, e.g. image classification. This is because, in the KDES framework, instead of the quantization of pixel attributes, each pixel equally takes part in the similarity measurement between two images. In this paper, we propose an approach for how the conventional KDES and its improved version can be used for CBIR. In addition, we provide a detailed insight into the effectiveness of improved kernel descriptors. Finally, our experimental results show that kernel descriptors are significantly more effective than histogram-based descriptors in CBIR.
A new image dissimilarity measure incorporating human perception
- Authors: Shojanazeri, Hamid , Teng, Shyh , Aryal, Sunil , Zhang, Dengsheng , Lu, Guojun
- Date: 2018
- Type: Text , Unpublished work
- Full Text:
- Description: Pairwise (dis)similarity measure of data objects is central to many applications of image analytics, such as image retrieval and classification. Geometric distance, particularly Euclidean distance ((
A novel perceptual dissimilarity measure for image retrieval
- Authors: Shojanazeri, Hamid , Zhang, Dengsheng , Teng, Shyh , Aryal, Sunil , Lu, Guojun
- Date: 2018
- Type: Text , Conference proceedings , Conference paper
- Relation: 2018 International Conference on Image and Vision Computing New Zealand, IVCNZ 2018; Auckland, New Zealand; 19th-21st November 2018 Vol. 2018-November, p. 1-6
- Full Text: false
- Reviewed:
- Description: Similarity measure is an important research topic in image classification and retrieval. Given a type of image features, a good similarity measure should be able to retrieve similar images from the database while discarding irrelevant images from the retrieval. Similarity measures in the literature are typically distance based, measuring the spatial distance between two feature vectors in a high-dimensional feature space. However, this type of similarity measure does not have any perceptual meaning and ignores the neighborhood influence in the similarity decision making process. In this paper, we propose a novel dissimilarity measure, which can measure both the distance and perceptual similarity of two image features in feature space. Results show the proposed similarity measure has a significant improvement over the traditional distance based similarity measure commonly used in the literature.
Enhancing the effectiveness of local descriptor based image matching
- Authors: Hossain, Md Tahmid , Teng, Shyh , Zhang, Dengsheng , Lim, Suryani , Lu, Guojun
- Date: 2018
- Type: Text , Conference proceedings , Conference paper
- Relation: 2018 International Conference on Digital Image Computing: Techniques and Applications, DICTA 2018; Canberra, Australia; 10th-13th December 2018 p. 1-8
- Full Text: false
- Reviewed:
- Description: Image registration has received great attention from researchers over the last few decades. SIFT (Scale Invariant Feature Transform), a local descriptor-based technique, is widely used for registering and matching images. To establish correspondences between images, SIFT uses a Euclidean Distance ratio metric. However, this approach leads to a lot of incorrect matches and eliminating these inaccurate matches has been a challenge. Various methods have been proposed attempting to mitigate this problem. In this paper, we propose a scale and orientation harmony-based pruning method that improves the image matching process by successfully eliminating incorrect SIFT descriptor matches. Moreover, our technique can predict the image transformation parameters based on a novel adaptive clustering method with much higher matching accuracy. Our experimental results have shown that the proposed method has achieved averages of approximately 16% and 10% higher matching accuracy compared to the traditional SIFT and a contemporary method respectively.
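The Euclidean distance ratio metric mentioned in the abstract above can be sketched as follows. This shows only the baseline ratio test that the paper sets out to improve, not the proposed scale and orientation harmony-based pruning. The function name `ratio_test_matches` and the default `ratio=0.8` (a commonly used threshold) are assumptions for illustration, and descriptors are plain tuples of floats rather than real 128-dimensional SIFT vectors.

```python
import math

def ratio_test_matches(desc_a, desc_b, ratio=0.8):
    """Return (i, j) pairs where desc_a[i] matches desc_b[j] under the ratio test.

    A match is kept only if the nearest neighbour is clearly closer than the
    second nearest, i.e. d1 < ratio * d2.
    """
    matches = []
    for i, a in enumerate(desc_a):
        # Distances from descriptor a to every descriptor in desc_b, ascending.
        dists = sorted((math.dist(a, b), j) for j, b in enumerate(desc_b))
        if len(dists) < 2:
            continue  # the ratio is undefined without a second neighbour
        (d1, j1), (d2, _) = dists[0], dists[1]
        if d1 < ratio * d2:
            matches.append((i, j1))
    return matches
```

A descriptor with two almost equally close candidates is rejected as ambiguous. The test cannot reject a confident but wrong nearest neighbour, which is the kind of incorrect match the abstract describes as a challenge.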
Image clustering using a similarity measure incorporating human perception
- Authors: Shojanazeri, Hamid , Aryal, Sunil , Teng, Shyh , Zhang, Dengsheng , Lu, Guojun
- Date: 2018
- Type: Text , Conference proceedings , Conference paper
- Relation: 2018 International Conference on Image and Vision Computing New Zealand, IVCNZ 2018; Auckland, New Zealand; 19th-21st November 2018 p. 1-6
- Full Text: false
- Reviewed:
- Description: Clustering similar images is an important task in image processing and computer vision. It requires a measure to quantify pairwise similarities of images. The performance of a clustering algorithm depends on the choice of similarity measure. In this paper, we investigate the effectiveness of data-independent (distance-based), data-dependent (mass-based) and hybrid (dis)similarity measures in the image clustering task using three benchmark image collections with different sets of features. Our results of K-Medoids clustering show that using the hybrid Perceptual Dissimilarity Measure (PMD) produces better clustering results than the distance-based l_p-norm and the mass-based m_p-dissimilarity.
A hybrid data dependent dissimilarity measure for image retrieval
- Authors: Shojanazeri, Hamid , Teng, Shyh , Zhang, Dengsheng , Lu, Guojun
- Date: 2017
- Type: Text , Conference proceedings
- Relation: 2017 International Conference on Digital Image Computing - Techniques and Applications (DICTA); Sydney, Australia; 29th November-1st December 2017 p. 141-148
- Full Text: false
- Reviewed:
- Description: In image retrieval, an effective dissimilarity (or similarity) measure is required to retrieve the perceptually similar images. Minkowski-type distance is widely used for image retrieval; however, it has limitations. It focuses on the distance between image features and ignores the data distribution of the image features, which can play an important role in measuring perceptual similarity of images. To address this limitation, a data-dependent measure named m_p, which calculates the dissimilarity using the data distribution rather than geometric distance, has been proposed recently. It considers two instances in a sparse region to be more similar than in a dense region. Relying only on data distribution and completely ignoring the geometric distance raises other limitations. This may result in finding two perceptually dissimilar instances similar due to being located in a sparse region, or vice versa. We propose a new hybrid dissimilarity measure and experimental results show that it addresses these limitations.
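The contrast between a geometric distance and a data-dependent measure can be sketched as below. `minkowski` is the standard Minkowski distance. `mass_dissimilarity` is a simplified mass-based measure written for illustration: per dimension, it counts the fraction of data points falling between the two instances. It is an assumption in the spirit of the measure the abstract describes, not the published definition, and the paper's hybrid measure is not reproduced here.

```python
def minkowski(x, y, p=2.0):
    """Geometric (Minkowski) distance between two feature vectors."""
    return sum(abs(a - b) ** p for a, b in zip(x, y)) ** (1.0 / p)

def mass_dissimilarity(x, y, data, p=2.0):
    """Simplified data-dependent dissimilarity (illustrative assumption).

    For each dimension, measure the fraction of points in `data` whose value
    lies between x and y, then combine the fractions with a p-mean.
    """
    n = len(data)
    dims = len(x)
    total = 0.0
    for i in range(dims):
        lo, hi = min(x[i], y[i]), max(x[i], y[i])
        mass = sum(1 for point in data if lo <= point[i] <= hi) / n
        total += mass ** p
    return (total / dims) ** (1.0 / p)
```

For one-dimensional data with six points packed between 0.0 and 0.5 and two isolated points at 5.0 and 6.0, the isolated pair is farther apart geometrically yet receives the smaller mass-based dissimilarity. This matches the abstract's remark that two instances in a sparse region are considered more similar than two in a dense region.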
Improved kernel descriptors for effective and efficient image classification
- Authors: Karmakar, Priyabrata , Teng, Shyh , Zhang, Dengsheng , Liu, Ying , Lu, Guojun
- Date: 2017
- Type: Text , Conference proceedings
- Relation: 2017 International Conference on Digital Image Computing: Techniques and Applications (DICTA); Sydney, Australia; 29th November-1st December 2017 p. 195-202
- Full Text: false
- Reviewed:
- Description: Kernel descriptors have been proven to outperform existing histogram based local descriptors as such descriptors are extracted from the match kernels which measure similarities between image patches using different pixel attributes (gradient, colour or LBP pattern). The extraction of kernel descriptors does not require coarse quantization of pixel attributes. Instead, each pixel equally participates in matching between two image patches. In this paper, by leveraging the kernel properties, we propose a unique approach which simultaneously increases the effectiveness and efficiency of the existing kernel descriptors. Specifically, this is done by improving the similarity measure between two different patches in terms of any pixel attribute. The proposed kernel descriptors are more discriminant, take less time to be extracted and have much lower dimensions. Our experiments on Scene Categories and Caltech 101 databases show that our proposed approach outperforms the existing kernel descriptors.
Improved Tamura features for image classification using kernel based descriptors
- Authors: Karmakar, Priyabrata , Teng, Shyh , Zhang, Dengsheng , Liu, Ying , Lu, Guojun
- Date: 2017
- Type: Text , Conference proceedings
- Relation: 2017 International Conference on Digital Image Computing: Techniques and Applications (DICTA); Sydney, Australia; 29th November-1st December 2017 p. 461-467
- Full Text: false
- Reviewed:
- Description: Tamura features are based on human visual perception and have huge potential in image representation. Conventional Tamura features only work on homogeneous texture images and perform poorly on generic images. Therefore, many researchers attempt to improve Tamura features and most of the improvements are based on histogram based representation. Kernel descriptors have been shown to outperform existing histogram based local features as such descriptors do not require coarse quantization of pixel attributes. Instead, in the kernel descriptor framework, each pixel equally participates in matching between two image patches. In this paper, we propose a set of kernel descriptors that are based on Tamura features. Additionally, the proposed descriptors are invariant to local rotations. Experimental results show that our proposed approach outperforms the conventional Tamura features significantly.
Integrating object ontology and region semantic template for crime scene investigation image retrieval
- Authors: Liu, Ying , Huang, Yuan , Zhang, Shuai , Zhang, Dengsheng , Ling, Nam
- Date: 2017
- Type: Text , Conference proceedings
- Relation: 2017 12th IEEE Conference on Industrial Electronics and Applications (ICIEA); Siem Reap, Cambodia; 18th-20th June 2017 p. 149-153
- Full Text: false
- Reviewed:
- Description: Crime Scene Investigation (CSI) image retrieval plays an important role in solving crimes by providing useful clues for the police force. However, there has been little work done in this area due to limited public data access by researchers. Tests on real-world CSI images show that existing content-based image retrieval (CBIR) methods do not necessarily retrieve as effectively on CSI image databases as on general image databases. Hence, it is important to design a CBIR algorithm tuned to CSI image databases. This paper proposes a region-based semantic learning method based on object ontology which associates image categories with 'objects' in CSI images. Each object corresponds to a pre-defined semantic template (ST) which is defined as the average color and texture feature of a set of sample regions. In this way, low-level features of each region in a CSI image can be converted to an 'object' by comparing the region features with the set of pre-defined STs. The 'objects' in an image categorize the image based on the object ontology. The above process is referred to as 'On-Set'. To further improve the retrieval performance of On-Set, a weighting strategy named object-frequency-based weighting (OFW) is designed, inspired by the idea of term frequency-inverse document frequency (TF-IDF). In OFW, a heavier weight is assigned to regions that appear more often in one class and less often in other classes. Experimental results on real-world image data proved the effectiveness of the proposed method for CSI image database retrieval.
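The TF-IDF idea behind OFW can be sketched as below. The abstract does not give the OFW formula, so this is a plain TF-IDF analogue (objects play the role of terms, image classes the role of documents) and should not be read as the authors' definition. The function name `object_frequency_weights` and the example labels are hypothetical.

```python
import math
from collections import Counter

def object_frequency_weights(class_to_objects):
    """Map (class, object) to a TF-IDF-style weight (illustrative assumption).

    class_to_objects: dict of class name -> list of object labels observed in
    the regions of that class's images.
    """
    num_classes = len(class_to_objects)
    # In how many classes does each object occur at least once?
    class_count = Counter()
    for objects in class_to_objects.values():
        class_count.update(set(objects))
    weights = {}
    for cls, objects in class_to_objects.items():
        counts = Counter(objects)
        total = len(objects)
        for obj, count in counts.items():
            frequency = count / total  # how often the object occurs in this class
            rarity = math.log(num_classes / class_count[obj])  # rarer across classes -> larger
            weights[(cls, obj)] = frequency * rarity
    return weights
```

Under this analogue, an object label that occurs in every class gets weight zero, while a label concentrated in a single class gets a positive weight, mirroring the abstract's description of weighting regions by how class-specific they are.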
Multi-feature fusion for Crime Scene Investigation image retrieval
- Authors: Liu, Ying , Hu, Dan , Fan, Jiulun , Wang, Fuping , Zhang, Dengsheng
- Date: 2017
- Type: Text , Conference proceedings
- Relation: 2017 International Conference on Digital Image Computing: Techniques and Applications (DICTA); Sydney, Australia; 29th November-1st December 2017 p. 865-871
- Full Text: false
- Reviewed:
- Description: Based on a large scale crime scene investigation (CSI) image database, an effective and efficient CSI image retrieval system has been proposed to empower the investigative work of the police force. The main contributions of this paper are: (1) a DCT domain texture feature extraction algorithm is proposed for CSI images, which is shown to be simple and effective; (2) the GIST descriptor is used on CSI images for the first time and combined with a color histogram and the DCT domain texture feature into a fused feature, which describes CSI images from different aspects including color, texture, and scene content. Experimental results prove that the proposed method is effective for CSI image retrieval.
Combining pyramid match kernel and spatial pyramid for image classification
- Authors: Karmakar, Priyabrata , Teng, Shyh , Zhang, Dengsheng , Lu, Guojun , Liu, Ying
- Date: 2016
- Type: Text
- Relation: 2016 International Conference on Digital Image Computing: Techniques and Applications (DICTA); Gold Coast, Australia; 30th November-2nd December 2016 p. 486-493
- Full Text: false
- Reviewed:
- Description: This paper proposes a new approach for image classification by combining pyramid match kernel (PMK) with spatial pyramid. Unlike the conventional spatial pyramid matching (SPM) approach which only uses a single-resolution feature vector to represent an image, we use a multi-resolution feature vector to represent an image for SPM. We then calculate the match scores at each resolution of SPM representation and finally compute the matching between two images by applying the concept of PMK using the match scores obtained from the multiple resolutions. Our experimental results show that the proposed combined pyramid matching achieves a significant improvement on classification performance.
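The pyramid match kernel (PMK) concept referred to in the abstract above can be sketched with histogram intersection. The code follows the commonly described form of PMK, in which matches newly found at each coarser level are discounted by a factor of 1/2^level. It does not reproduce the paper's combination with the spatial pyramid, and the input layout (a list of histograms ordered from finest to coarsest) is an assumption for illustration.

```python
def histogram_intersection(h1, h2):
    """Number of matches between two histograms with the same bins."""
    return sum(min(a, b) for a, b in zip(h1, h2))

def pyramid_match(pyramid_a, pyramid_b):
    """Pyramid match score between two multi-resolution histograms.

    pyramid_a / pyramid_b: lists of histograms, level 0 = finest bins.
    Matches first found at level i are weighted by 1 / 2**i.
    """
    score = 0.0
    previous = 0.0
    for level, (h1, h2) in enumerate(zip(pyramid_a, pyramid_b)):
        current = histogram_intersection(h1, h2)
        score += (current - previous) / (2 ** level)  # count only new matches
        previous = current
    return score
```

For example, with `a = [[1, 0, 0, 1], [1, 1], [2]]` and `b = [[0, 1, 0, 1], [1, 1], [2]]`, one match is found at the finest level and one more at the next level, giving a score of 1 + 0.5 = 1.5.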
Extracting road centrelines from binary road images by optimizing geodesic lines
- Authors: Zhou, Shaoguang , Lu, Guojun , Teng, Shyh , Zhang, Dengsheng
- Date: 2016
- Type: Text , Conference proceedings , Conference paper
- Relation: 2015 International Conference on Image and Vision Computing New Zealand, IVCNZ 2015; Auckland, New Zealand; 23rd-24th November 2015 Vol. 2016-November, p. 1-6
- Full Text: false
- Reviewed:
- Description: Binary road images can be obtained from remotely sensed images with the aid of classification and segmentation techniques. Extracting road centrelines from these binary images is crucial to updating a Geographic Information System (GIS) database. A current state-of-the-art method of centreline extraction needs to remove road junctions and depends on the accuracy of the endpoints, leading to three main limitations: (1) causing small gaps in the roads, (2) wrongly treating short non-road segments as roads, and (3) producing centrelines of low accuracy around the road end regions. To overcome these limitations, we propose to use an iterative search scheme to obtain the longest geodesic line in the preprocessed road skeleton images. Several image pixels at each end of the geodesic lines were removed to avoid noise, and the remaining parts were optimized using a dynamic programming snake model. The proposed method is applied to three types of binary road images and compared with the state-of-the-art method. It shows that the proposed method is less affected by the end regions of the roads, and is effective in filling the gaps in the roads. It also has an advantage in processing short non-road segments. © 2015 IEEE.
Rotation invariant spatial pyramid matching for image classification
- Authors: Karmakar, Priyabrata , Teng, Shyh , Lu, Guojun , Zhang, Dengsheng
- Date: 2015
- Type: Text , Conference proceedings
- Full Text: false
- Description: This paper proposes a new Spatial Pyramid representation approach for image classification. Unlike the conventional Spatial Pyramid, the proposed method is invariant to rotation changes in the images. This method works by partitioning an image into concentric rectangles and organizing them into a pyramid. Each pyramidal region is then represented using a histogram of visual words. Our experimental results show that our proposed method significantly outperforms the conventional method. © 2015 IEEE.
Efficient nonlinear classification via low-rank regularised least squares
- Authors: Fu, Zhouyu , Lu, Guojun , Ting, Kaiming , Zhang, Dengsheng
- Date: 2013
- Type: Text , Journal article
- Relation: Neural Computing and Applications Vol. 22, no. 7-8(2013), p. 1279-1289
- Full Text: false
- Reviewed:
- Description: We revisit the classical technique of regularised least squares (RLS) for nonlinear classification in this paper. Specifically, we focus on a low-rank formulation of the RLS, which has linear time complexity in the size of data set only, independent of both the number of classes and number of features. This makes low-rank RLS particularly suitable for problems with large data and moderate feature dimensions. Moreover, we have proposed a general theorem for obtaining the closed-form estimation of prediction values on a holdout validation set given the low-rank RLS classifier trained on the whole training data. It is thus possible to obtain an error estimate for each parameter setting without retraining and greatly accelerate the process of cross-validation for parameter selection. Experimental results on several large-scale benchmark data sets have shown that low-rank RLS achieves comparable classification performance while being much more efficient than standard kernel SVM for nonlinear classification. The improvement in efficiency is more evident for data sets with higher dimensions.