Bezier curve-based generic shape encoder
- Authors: Sohel, Ferdous, Karmakar, Gour, Dooley, Laurence, Bennamoun, M.
- Date: 2010
- Type: Text, Journal article
- Relation: IET Image Processing Vol. 4, no. 2 (2010), p. 92-102
- Full Text: false
- Reviewed:
- Description: Existing Bezier curve-based shape description techniques primarily focus upon determining a set of pertinent control points (CP) to represent a particular shape contour. While many different approaches have been proposed, none adequately considers domain-specific information about the shape contour, such as its gradualness and sharpness, in the CP generation process, which can potentially result in large distortions in the object's shape representation. This study introduces a novel Bezier curve-based generic shape encoder (BCGSE) that partitions an object contour into contiguous segments based upon its cornerity, before generating the CP for each segment using relevant shape curvature information. In addition, although CP encoding has generally been ignored, BCGSE embeds an efficient vertex-based encoding strategy exploiting the latent equidistance between consecutive CP. A non-linear optimisation technique is also presented to enable the encoder to automatically adapt to bit-rate constraints. The performance of the BCGSE framework has been rigorously tested on a variety of diverse arbitrary shapes from both a distortion and requisite bit-rate perspective, with qualitative and quantitative results corroborating its superiority over existing shape descriptors.
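The abstract does not give the cornerity measure itself; as a purely illustrative sketch, assuming cornerity is approximated by the turning angle at each contour point over a small support region (the paper's actual measure may differ):

```python
import numpy as np

def cornerity(contour, k=5):
    """Turning angle at each point of a closed (n, 2) contour, measured
    over a support of k points on either side; larger = sharper corner."""
    n = len(contour)
    angles = np.empty(n)
    for i in range(n):
        v1 = contour[i] - contour[(i - k) % n]
        v2 = contour[(i + k) % n] - contour[i]
        cos = np.dot(v1, v2) / (np.linalg.norm(v1) * np.linalg.norm(v2) + 1e-12)
        angles[i] = np.arccos(np.clip(cos, -1.0, 1.0))
    return angles

def partition_contour(contour, threshold=0.8, k=5):
    """Split the contour into contiguous segments at high-cornerity points,
    so each segment can then be fitted with its own Bezier CP set."""
    corners = np.where(cornerity(contour, k) > threshold)[0]
    if len(corners) < 2:
        return [contour]
    return [contour[corners[i]:corners[i + 1] + 1]
            for i in range(len(corners) - 1)]
```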
Image-dependent spatial shape-error concealment
- Authors: Sohel, Ferdous, Karmakar, Gour, Dooley, Laurence
- Date: 2008
- Type: Text, Conference paper
- Relation: Signal Processing, 2008. ICSP 2008. 9th International Conference
- Full Text: false
- Reviewed:
- Description: Existing spatial shape-error concealment techniques are broadly based upon either parametric curves that exploit geometric information concerning a shape's contour, or object shape statistics using a combination of Markov random fields and maximum a posteriori estimation. Both categories are, to some extent, able to mask errors caused by information loss, provided the shape is considered independently of the image/video. They palpably, however, do not afford the best solution in applications where shape is used as metadata to describe image and video content. This paper presents a novel image-dependent spatial shape-error concealment (ISEC) algorithm that uses both image and shape information by employing the established rubber-band contour detecting function, with the novel enhancement of automatically determining the optimal width of the band to achieve superior error concealment. Experimental results qualitatively and numerically corroborate the enhanced performance of the new ISEC strategy compared with established shape-based concealment techniques.
Video coding for mobile communications
- Authors: Sohel, Ferdous, Karmakar, Gour, Dooley, Laurence
- Date: 2008
- Type: Text, Book chapter
- Relation: Mobile Multimedia Communications: Concepts, Applications, and Challenges p. 109-150
- Full Text: false
- Reviewed:
Geometric distortion measurement for shape coding: a contemporary review
- Authors: Sohel, Ferdous, Karmakar, Gour, Dooley, Laurence, Bennamoun, M.
- Date: 2011
- Type: Text, Journal article
- Relation: ACM Computing Surveys Vol. 43, no. 4 (2011), p. 1-22
- Full Text: false
- Reviewed:
- Description: Geometric distortion measurement and the associated metrics involved are integral to the Rate Distortion (RD) shape coding framework, with the efficacy of the metrics being strongly influenced by the underlying measurement strategy. This has been the catalyst for many different techniques, with this article presenting a comprehensive review of geometric distortion measurement, the diverse metrics applied, and their impact on shape coding. The respective performance of these measuring strategies is analyzed from both an RD and complexity perspective, with a recent distortion measurement technique based on arc-length parameterization being comparatively evaluated. Some contemporary research challenges are also investigated, including schemes to effectively quantify shape deformation.
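As a concrete instance of the class of measures the survey covers, the widely used shortest-absolute-distance criterion takes, for each original contour point, its Euclidean distance to the approximating polygon edge; a minimal sketch (names are illustrative, not the survey's notation):

```python
import numpy as np

def point_segment_distance(p, a, b):
    """Euclidean distance from point p to the line segment ab."""
    ab, ap = b - a, p - a
    t = np.clip(np.dot(ap, ab) / (np.dot(ab, ab) + 1e-12), 0.0, 1.0)
    return np.linalg.norm(p - (a + t * ab))

def peak_segment_distortion(contour, i, j):
    """Peak distortion when contour points i..j are approximated by the
    straight edge joining contour[i] and contour[j]."""
    a, b = contour[i], contour[j]
    return max(point_segment_distance(contour[k], a, b)
               for k in range(i, j + 1))
```

Arc-length-based measurement, comparatively evaluated in the article, instead pairs each original point with the point at the same normalized arc length on the approximation before measuring the distance.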
Quasi-Bezier curves integrating localised information
- Authors: Sohel, Ferdous, Karmakar, Gour, Dooley, Laurence, Arkinstall, John
- Date: 2008
- Type: Text, Journal article
- Relation: Pattern Recognition Vol. 41, no. 2 (2008), p. 531-542
- Full Text: false
- Reviewed:
- Description: Bezier curves (BC) have become fundamental tools in many challenging and varied applications, ranging from computer-aided geometric design to generic object shape descriptors. A major limitation of the classical Bezier curve, however, is that only global information about its control points (CP) is considered, so there can often be a large gap between the curve and its control polygon, leading to large distortion in shape representation. While strategies such as degree elevation, composite BC, refinement and subdivision reduce this gap, they also increase the number of CP, and hence the bit-rate and computational complexity. This paper presents novel contributions to BC theory, with the introduction of quasi-Bezier curves (QBC), which seamlessly integrate localised CP information into the inherent global Bezier framework, with no increase in either the number of CP or the order of computational complexity. QBC crucially retains the core properties of the classical BC, such as geometric continuity and affine invariance, and can be embedded into the vertex-based shape coding and shape descriptor frameworks to enhance rate-distortion performance. The performance of QBC has been empirically tested upon a number of natural and synthetically shaped objects, with both qualitative and quantitative results confirming its consistently superior approximation performance in comparison with both the classical BC and other established BC-based shape descriptor methods.
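For reference, the classical BC that QBC generalises blends its CP with purely global Bernstein weights; de Casteljau's algorithm evaluates it by repeated linear interpolation, and a one-line example shows the curve/control-polygon gap the paper targets:

```python
import numpy as np

def de_casteljau(control_points, t):
    """Evaluate a classical Bezier curve at parameter t in [0, 1] by
    repeatedly interpolating between adjacent control points."""
    pts = np.asarray(control_points, dtype=float)
    while len(pts) > 1:
        pts = (1.0 - t) * pts[:-1] + t * pts[1:]
    return pts[0]

# A 'spiky' control polygon: the curve apex (1.0, 1.0) sits a full unit
# below the middle control point (1, 2), i.e., the gap QBC aims to close.
print(de_casteljau([(0.0, 0.0), (1.0, 2.0), (2.0, 0.0)], 0.5))
```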
Sliding-window designs for vertex-based shape coding
- Authors: Sohel, Ferdous, Karmakar, Gour, Dooley, Laurence, Bennamoun, M.
- Date: 2012
- Type: Text, Journal article
- Relation: IEEE Transactions on Multimedia Vol. 14, no. 3 (June 2012), p. 683-692
- Full Text: false
- Reviewed:
- Description: Traditionally, the sliding window (SW) has been employed in vertex-based operational rate distortion (ORD) optimal shape coding algorithms to ensure consistent distortion (quality) measurement and improve computational efficiency. It also regulates the memory requirements of an encoder design, enabling regular, symmetrical hardware implementations. This paper presents a series of new enhancements to existing techniques for determining the best SW-length within a rate-distortion (RD) framework, and analyses the nexus between SW-length and storage for ORD hardware realizations. In addition, it presents an efficient bit-allocation strategy for managing multiple shapes, together with a generalized adaptive SW scheme which integrates localized curvature information (cornerity) on contour points with a bi-directional spatial distance, to afford a superior and more pragmatic SW design compared with existing adaptive SW solutions, which are based solely on cornerity values. Experimental results consistently corroborate the effectiveness of these new strategies.
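A minimal sketch of the fixed-length SW baseline the adaptive scheme builds on: restricting how far an ORD polygon edge may reach bounds both the per-edge distortion computation and the encoder's memory (the paper's cornerity-plus-distance adaptation is not reproduced here):

```python
def admissible_successors(i, num_points, sw_length):
    """Contour points that may terminate a polygon edge starting at point i
    when the sliding window limits the edge span to sw_length points."""
    return range(i + 1, min(i + sw_length + 1, num_points))

# In the ORD shortest-path search, each edge (i, j) with
# j in admissible_successors(i, ...) is weighted by its encoding bit cost
# plus the distortion of approximating contour points i..j by that edge.
```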
Dynamic Bezier curves for variable rate-distortion
- Authors: Sohel, Ferdous, Karmakar, Gour, Dooley, Laurence
- Date: 2008
- Type: Text, Journal article
- Relation: Pattern Recognition Vol. 41, no. 10 (2008), p. 3153-3165
- Full Text: false
- Reviewed:
- Description: Bezier curves (BC) are important tools in a wide range of diverse and challenging applications, from computer-aided design to generic object shape descriptors. A major constraint of the classical BC is that only global information concerning control points (CP) is considered; consequently, there may be a sizeable gap between the BC and its control polygon (CtrlPoly), leading to a large distortion in shape representation. While BC variants like degree elevation, composite BC, and refinement and subdivision narrow this gap, they increase the number of CP and thereby both the required bit-rate and the computational complexity. In addition, while quasi-Bezier curves (QBC) close the gap without increasing the number of CP, they reduce the underlying distortion by only a fixed amount. This paper presents a novel contribution to BC theory, with the introduction of a dynamic Bezier curve (DBC) model, which embeds variable localised CP information into the inherently global Bezier framework by strategically moving BC points towards the CtrlPoly. A shifting parameter (SP) is defined that enables curves lying within the region between the BC and CtrlPoly to be generated, with no commensurate increase in CP. DBC provides a flexible rate-distortion (RD) criterion for shape coding applications, with a theoretical model for determining the optimal SP value for any admissible distortion being formulated. Crucially, DBC retains core properties of the classical BC, including the convex hull and affine invariance properties, and can be seamlessly integrated into both the vertex-based shape coding and shape descriptor frameworks to improve their RD performance. DBC has been empirically tested upon a number of natural and synthetically shaped objects, with qualitative and quantitative results confirming its consistently superior shape approximation performance, compared with the classical BC, QBC and other established BC-based shape descriptor techniques.
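The abstract suggests DBC points are blends of the classical BC point and the corresponding CtrlPoly point, controlled by the SP; a minimal sketch under that assumption (the paper's exact formulation, and its optimal-SP model, may differ):

```python
import numpy as np
from math import comb

def bezier_point(cp, t):
    """Classical Bezier point via the Bernstein form."""
    cp = np.asarray(cp, dtype=float)
    n = len(cp) - 1
    w = np.array([comb(n, k) * t**k * (1 - t)**(n - k) for k in range(n + 1)])
    return w @ cp

def polygon_point(cp, t):
    """Point at parameter t on the control polygon, traversed uniformly."""
    cp = np.asarray(cp, dtype=float)
    segs = len(cp) - 1
    u = min(t * segs, segs - 1e-9)
    i = int(u)
    return (1.0 - (u - i)) * cp[i] + (u - i) * cp[i + 1]

def dbc_point(cp, t, sp):
    """Shift the Bezier point towards the control polygon: sp = 0 recovers
    the classical BC, sp = 1 reaches the CtrlPoly itself."""
    return (1.0 - sp) * bezier_point(cp, t) + sp * polygon_point(cp, t)
```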
Integrated generalized zero-shot learning for fine-grained classification
- Authors: Shermin, Tasfia, Teng, Shyh, Sohel, Ferdous, Murshed, Manzur, Lu, Guojun
- Date: 2022
- Type: Text, Journal article
- Relation: Pattern Recognition Vol. 122, no. (2022), p.
- Full Text:
- Reviewed:
- Description: Embedding learning (EL) and feature synthesizing (FS) are two popular categories of fine-grained generalized zero-shot learning (GZSL) methods. EL or FS using global features cannot discriminate fine details in the absence of local features. On the other hand, EL or FS methods exploiting local features neglect either direct attribute guidance or global information. Consequently, neither approach performs well. In this paper, we propose to explore global and direct attribute-supervised local visual features for both the EL and FS categories in an integrated manner for fine-grained GZSL. The proposed integrated network has an EL sub-network and an FS sub-network; consequently, it can be tested in two ways. We propose a novel two-step dense attention mechanism to discover attribute-guided local visual features. We introduce new mutual learning between the sub-networks to exploit mutually beneficial information for optimization. Moreover, we propose to compute source-target class similarity based on mutual information and transfer-learn the target classes to reduce bias towards the source domain during testing. We demonstrate that our proposed method outperforms contemporary methods on benchmark datasets.
Enhanced transfer learning with ImageNet trained classification layer
- Authors: Shermin, Tasfia, Teng, Shyh Wei, Murshed, Manzur, Lu, Guojun, Sohel, Ferdous, Paul, Manoranjan
- Date: 2019
- Type: Text, Book chapter
- Relation: Image and Video Technology, Chapter 12, p. 142-155
- Full Text: false
- Reviewed:
- Description: Parameter fine-tuning is a transfer learning approach whereby learned parameters from a pre-trained source network are transferred to the target network and then fine-tuned. Prior research has shown that this approach is capable of improving task performance. However, the impact of the ImageNet pre-trained classification layer in parameter fine-tuning is largely unexplored in the literature. In this paper, we propose a fine-tuning approach that retains the pre-trained classification layer. We employ layer-wise fine-tuning to determine which layers should be frozen for optimal performance. Our empirical analysis demonstrates that the proposed fine-tuning performs better than traditional fine-tuning. This finding indicates that the pre-trained classification layer holds less category-specific, or more global, information than previously believed. Thus, we hypothesize that the presence of this layer is crucial for growing network depth to adapt better to a new task. Our study shows that careful normalization and scaling are essential for creating harmony between the pre-trained and new layers for target domain adaptation. We evaluate the proposed depth-augmented networks for fine-tuning on several challenging benchmark datasets and show that they can achieve higher classification accuracy than contemporary transfer learning approaches.
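A minimal PyTorch sketch of the general recipe studied here: keep the ImageNet-trained classification layer, grow the network with a normalised target-domain head, and freeze the earliest layers. The backbone, split point, and head below are illustrative choices, not the authors' exact configuration:

```python
import torch.nn as nn
from torchvision import models

def build_depth_augmented_net(num_target_classes, freeze_up_to=6):
    base = models.resnet50(weights=models.ResNet50_Weights.IMAGENET1K_V1)
    # Layer-wise fine-tuning: freeze the earliest children, leave the rest
    # (including the pre-trained 1000-way classification layer) trainable.
    for child in list(base.children())[:freeze_up_to]:
        for p in child.parameters():
            p.requires_grad = False
    # New head on top of the retained ImageNet classifier output, with
    # normalisation to harmonise the pre-trained and new layers.
    head = nn.Sequential(
        nn.BatchNorm1d(1000),
        nn.ReLU(inplace=True),
        nn.Linear(1000, num_target_classes),
    )
    return nn.Sequential(base, head)
```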
Robust image classification using a low-pass activation function and DCT augmentation
- Authors: Hossain, Md Tahmid, Teng, Shyh, Sohel, Ferdous, Lu, Guojun
- Date: 2021
- Type: Text, Journal article
- Relation: IEEE Access Vol. 9, no. (2021), p. 86460-86474
- Full Text:
- Reviewed:
- Description: The performance disparity of Convolutional Neural Networks (CNNs) on clean and corrupted datasets has recently come under scrutiny. In this work, we analyse common corruptions in the frequency domain, i.e., High Frequency corruptions (HFc, e.g., noise) and Low Frequency corruptions (LFc, e.g., blur). Although a simple solution to HFc is low-pass filtering, ReLU, a widely used Activation Function (AF), does not have any filtering mechanism. In this work, we instill low-pass filtering into the AF (LP-ReLU) to improve robustness against HFc. To deal with LFc, we complement LP-ReLU with Discrete Cosine Transform (DCT) based augmentation. LP-ReLU, coupled with DCT augmentation, enables a deep network to tackle the entire spectrum of corruption. We use CIFAR-10-C and Tiny ImageNet-C for evaluation and demonstrate improvements of 5% and 7.3% in accuracy, respectively, compared to the State-Of-The-Art (SOTA). We further evaluate our method's stability on a variety of perturbations in CIFAR-10-P and Tiny ImageNet-P, achieving new SOTA in these experiments as well. To further strengthen our understanding of CNNs' lack of robustness, a decision space visualisation process is proposed and presented in this work.
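The LP-ReLU design itself is not specified in the abstract, but the complementary DCT augmentation can be sketched: attenuate the high-frequency DCT coefficients of a training image so the network also sees low-frequency-degraded (blur-like) inputs. The cutoff below is an assumed free parameter:

```python
import numpy as np
from scipy.fft import dctn, idctn

def dct_lowfreq_augment(image, keep_frac=0.25):
    """Zero the high-frequency DCT coefficients of a 2-D image, keeping only
    the top-left (low-frequency) keep_frac block, then invert the transform."""
    coeffs = dctn(image, norm='ortho')
    h, w = coeffs.shape
    mask = np.zeros_like(coeffs)
    mask[:int(h * keep_frac), :int(w * keep_frac)] = 1.0
    return idctn(coeffs * mask, norm='ortho')
```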
Adversarial network with multiple classifiers for open set domain adaptation
- Authors: Shermin, Tasfia, Lu, Guojun, Teng, Shyh, Murshed, Manzur, Sohel, Ferdous
- Date: 2021
- Type: Text, Journal article
- Relation: IEEE Transactions on Multimedia Vol. 23, no. (2021), p. 2732-2744
- Full Text:
- Reviewed:
- Description: Domain adaptation aims to transfer knowledge from a domain with adequate labeled samples to a domain with scarce labeled samples. Prior research has introduced various open set domain adaptation settings in the literature to extend the applications of domain adaptation methods in real-world scenarios. This paper focuses on the type of open set domain adaptation setting where the target domain has both private ('unknown classes') label space and the shared ('known classes') label space, whereas the source domain only has the 'known classes' label space. Prevalent distribution-matching domain adaptation methods are inadequate in such a setting, which demands adaptation from a smaller source domain to a larger and more diverse target domain with more classes. For this specific open set domain adaptation setting, prior research introduces a domain adversarial model that uses a fixed threshold for distinguishing known from unknown target samples and handles negative transfer poorly. We extend that adversarial model and propose a novel adversarial domain adaptation model with multiple auxiliary classifiers. The proposed multi-classifier structure introduces a weighting module that evaluates distinctive domain characteristics to assign the target samples weights that better represent how likely they are to belong to the known or unknown classes, encouraging positive transfer during adversarial training while simultaneously reducing the domain gap between the shared classes of the source and target domains. A thorough experimental investigation shows that our proposed method outperforms existing domain adaptation methods on a number of domain adaptation datasets.
Bidirectional mapping coupled GAN for generalized zero-shot learning
- Authors: Shermin, Tasfia, Teng, Shyh, Sohel, Ferdous, Murshed, Manzur, Lu, Guojun
- Date: 2022
- Type: Text, Journal article
- Relation: IEEE Transactions on Image Processing Vol. 31, no. (2022), p. 721-733
- Full Text:
- Reviewed:
- Description: Bidirectional mapping-based generalized zero-shot learning (GZSL) methods rely on the quality of synthesized features to recognize seen and unseen data. Therefore, learning a joint distribution of seen and unseen classes, and preserving the distinction between them, is crucial for GZSL methods. However, existing methods only learn the underlying distribution of seen data, although unseen class semantics are available in the GZSL problem setting. Most methods also neglect to retain the seen-unseen class distinction and use the learned distribution to recognize seen and unseen data; consequently, they do not perform well. In this work, we utilize the available unseen class semantics alongside seen class semantics and learn a joint distribution through a strong visual-semantic coupling. We propose a bidirectional mapping coupled generative adversarial network (BMCoGAN) by extending the concept of the coupled generative adversarial network into a bidirectional mapping model. We further integrate Wasserstein generative adversarial optimization to supervise the joint distribution learning. We design a loss optimization for retaining the distinctive information of seen and unseen classes in the synthesized features and for reducing bias towards seen classes, which pushes synthesized seen features towards real seen features and pulls synthesized unseen features away from real seen features. We evaluate BMCoGAN on benchmark datasets and demonstrate its superior performance against contemporary methods.
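The push-pull idea in the loss design can be illustrated with a simple distance-based objective: pull synthesized seen features towards real seen features and push synthesized unseen features at least a margin away. BMCoGAN's actual loss is more elaborate, so treat this purely as a sketch:

```python
import torch
import torch.nn.functional as F

def push_pull_loss(syn_seen, real_seen, syn_unseen, margin=1.0):
    """Pull synthesized seen features towards real seen ones; hinge-push
    synthesized unseen features away from the real-seen batch mean."""
    pull = F.mse_loss(syn_seen, real_seen)
    dist = torch.norm(syn_unseen - real_seen.mean(0, keepdim=True), dim=1)
    push = F.relu(margin - dist).mean()  # penalise only when too close
    return pull + push
```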
Renewable energy-based energy-efficient off-grid base stations for heterogeneous network
- Authors: Islam, Khondoker, Hossain, Md Sanwar, Amin, B.M. Ruhul, Shafiullah, G., Sohel, Ferdous
- Date: 2023
- Type: Text, Journal article
- Relation: Energies Vol. 16, no. 1 (2023), p.
- Full Text:
- Reviewed:
- Description: The heterogeneous network (HetNet) is a cellular platform designed to tackle the rapidly growing anticipated data traffic. From a communications perspective, data loads can be mapped to energy loads that are generally placed on the operator networks. Meanwhile, renewable energy-aided networks can curtail fossil fuel consumption and thereby reduce environmental pollution. This paper proposes a renewable energy-based power supply architecture for the off-grid HetNet using a novel energy sharing model. Solar photovoltaics (PV) along with sufficient energy storage devices are used for each macro, micro, pico, or femto base station (BS). Additionally, a biomass generator (BG) is used for macro and micro BSs. The collocated macro and micro BSs are connected through end-to-end resistive lines. A novel weighted proportional-fair resource-scheduling algorithm with sleep mechanisms is proposed for non-real-time (NRT) applications by trading off power consumption against communication delay. Furthermore, the proposed algorithm, with extended discontinuous reception (eDRX) and power saving mode (PSM) for narrowband Internet of Things (IoT) applications, extends the battery lifetime of IoT devices. HOMER optimization software is used to perform optimal system architecture, economic, and carbon footprint analyses, while a Monte Carlo simulation tool is used for evaluating the throughput and energy efficiency performance. The proposed algorithms are validated with practical data from rural areas of Bangladesh, from which it is evident that the proposed power supply architecture is energy-efficient, cost-effective, reliable, and eco-friendly.
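The weighted proportional-fair rule the scheduler builds on picks, per slot, the user maximising weighted instantaneous rate over average throughput; a minimal sketch of that baseline (the paper's algorithm adds sleep mechanisms and delay trade-offs on top):

```python
def pf_schedule(inst_rates, avg_rates, weights, beta=0.1):
    """One slot of weighted proportional-fair scheduling; avg_rates is the
    exponentially averaged past throughput and is updated in place."""
    metrics = [w * r / max(a, 1e-9)
               for w, r, a in zip(weights, inst_rates, avg_rates)]
    chosen = max(range(len(metrics)), key=metrics.__getitem__)
    for i in range(len(avg_rates)):   # EWMA update: winner gains, rest decay
        served = inst_rates[i] if i == chosen else 0.0
        avg_rates[i] = (1 - beta) * avg_rates[i] + beta * served
    return chosen
```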
Anti-aliasing deep image classifiers using novel depth adaptive blurring and activation function
- Authors: Hossain, Md Tahmid, Teng, Shyh, Lu, Guojun, Rahman, Mohammad Arifur, Sohel, Ferdous
- Date: 2023
- Type: Text, Journal article
- Relation: Neurocomputing Vol. 536, no. (2023), p. 164-174
- Full Text: false
- Reviewed:
- Description: Deep convolutional networks are vulnerable to image translation or shift, partly due to common down-sampling layers, e.g., max-pooling and strided convolution. These operations violate the Nyquist sampling rate and cause aliasing. The textbook solution is low-pass filtering (blurring) before down-sampling, which can benefit deep networks as well. Even so, non-linearity units, such as ReLU, often re-introduce the problem, suggesting that blurring alone may not suffice. In this work, first, we analyse deep features with the Fourier transform and show that Depth Adaptive Blurring is more effective than monotonic blurring. To this end, we propose a novel Depth Adaptive Blur-pool (DAB-pool) module to replace existing down-sampling methods. Second, as an additional measure, we introduce a novel activation function with a built-in low-pass filter to keep the problem from reappearing. From experiments, we observe generalisation to other forms of transformations and corruptions as well, e.g., rotation, scale, and noise. We evaluate our method under three challenging settings: (1) a variety of image translations; (2) adversarial attacks – both
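The textbook fix referenced above, blurring before down-sampling, looks like the following in PyTorch. DAB-pool's depth-adaptive filter selection is the paper's contribution and is not reproduced here, so the fixed binomial kernel is only the anti-aliasing baseline:

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class BlurPool2d(nn.Module):
    """Depthwise low-pass (3x3 binomial) filter followed by strided
    subsampling, replacing a plain stride-2 down-sampling step."""
    def __init__(self, channels, stride=2):
        super().__init__()
        k = torch.tensor([1.0, 2.0, 1.0])
        kernel = (k[:, None] * k[None, :]) / 16.0
        self.register_buffer(
            'kernel', kernel.expand(channels, 1, 3, 3).contiguous())
        self.stride, self.channels = stride, channels

    def forward(self, x):
        x = F.pad(x, (1, 1, 1, 1), mode='reflect')
        return F.conv2d(x, self.kernel, stride=self.stride,
                        groups=self.channels)

# Example: BlurPool2d(64)(torch.randn(1, 64, 32, 32)) -> shape (1, 64, 16, 16)
```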