Pervasive blood pressure monitoring using Photoplethysmogram (PPG) sensor
- Authors: Riaz, Farhan , Azad, Muhammad , Arshad, Junaid , Imran, Muhammad , Hassan, Ali , Rehman, Saad
- Date: 2019
- Type: Text , Journal article
- Relation: Future Generation Computer Systems Vol. 98 (2019), p. 120-130
- Full Text:
- Reviewed:
- Description: Preventive healthcare requires continuous monitoring of the blood pressure (BP) of patients, which is not feasible using conventional methods. Photoplethysmogram (PPG) signals can be used effectively for this purpose, as there is a physiological relation between pulse width and BP, and the signals can be easily acquired using a wearable PPG sensor. However, developing real-time algorithms for wearable technology is a significant challenge due to conflicting requirements such as high accuracy, computationally constrained devices, and limited power supply. In this paper, we propose a novel feature set for continuous, real-time identification of abnormal BP. This feature set is obtained by identifying the peaks and valleys in a PPG signal (using a peak detection algorithm), followed by the calculation of rising time, falling time and peak-to-peak distance. The histograms of these times form a feature set that can be used to classify PPG signals into one of two classes: normal or abnormal BP. No public dataset is available for such a study, so a prototype was developed to collect PPG signals alongside BP measurements. The proposed feature set shows very good performance, with an overall accuracy of approximately 95%. Although the feature set is effective, the significance of individual features varies greatly (validated using significance testing), which led us to perform weighted voting of features for classification via autoregressive modeling. Our experiments show that even the simplest linear classifiers produce very good results, indicating the strength of the proposed feature set. Weighted voting improves the results significantly, producing an overall accuracy of about 98%. In conclusion, PPG signals can be used effectively to identify abnormal BP, and the proposed feature set is efficient and computationally feasible for implementation on standalone devices. © 2019 Elsevier B.V.
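The timing features described in the abstract above can be sketched in a few lines. This is an illustrative reconstruction, not the authors' implementation: the function names, the naive extremum detector and the toy sampling rate are all assumptions.

```python
# Hypothetical sketch of PPG timing features: locate peaks and valleys in a
# sampled signal, then derive rising time (valley -> next peak), falling time
# (peak -> next valley) and peak-to-peak distance. Histograms of these values
# would then form the feature vector described in the abstract.

def find_extrema(signal):
    """Return indices of local peaks and valleys of a 1-D sequence."""
    peaks, valleys = [], []
    for i in range(1, len(signal) - 1):
        if signal[i] > signal[i - 1] and signal[i] > signal[i + 1]:
            peaks.append(i)
        elif signal[i] < signal[i - 1] and signal[i] < signal[i + 1]:
            valleys.append(i)
    return peaks, valleys

def timing_features(signal, fs=100.0):
    """Rising/falling times and peak-to-peak distances, in seconds at rate fs."""
    peaks, valleys = find_extrema(signal)
    rising = [(min(p for p in peaks if p > v) - v) / fs
              for v in valleys if any(p > v for p in peaks)]
    falling = [(min(v for v in valleys if v > p) - p) / fs
               for p in peaks if any(v > p for v in valleys)]
    p2p = [(b - a) / fs for a, b in zip(peaks, peaks[1:])]
    return rising, falling, p2p
```

A real PPG pipeline would need filtering and a more robust peak detector than this strict three-point comparison.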
A machine learning approach for prediction of pregnancy outcome following IVF treatment
- Authors: Hassan, Md Rafiul , Al-Insaif, Sadiq , Hossain, Muhammad , Kamruzzaman, Joarder
- Date: 2020
- Type: Text , Journal article
- Relation: Neural Computing and Applications Vol. 32, no. 7 (2020), p. 2283-2297
- Full Text: false
- Reviewed:
- Description: Infertility affects one out of seven couples around the world. Therefore, the best possible management of the in vitro fertilization (IVF) treatment and patient advice is crucial for both patients and medical practitioners. The ultimate concern of the patients is the success of an IVF procedure, which depends on a number of influencing attributes. Without any automated tool, it is hard for practitioners to assess any influencing trend of the attributes and factors that might lead to a successful IVF pregnancy. This paper proposes a hill climbing feature (attribute) selection algorithm coupled with automated classification using machine learning techniques, with the aim of analyzing and predicting IVF pregnancy with greater accuracy. Using 25 attributes, we assessed the ability to predict IVF pregnancy success for five different machine learning models, namely multilayer perceptron (MLP), support vector machines (SVM), C4.5, classification and regression trees (CART) and random forest (RF). The prediction ability was measured in terms of widely used performance metrics, namely accuracy rate, F-measure and AUC. The feature selection algorithm reduced the number of most influential attributes to nineteen for MLP, sixteen for RF, seventeen for SVM, twelve for C4.5 and eight for CART. Overall, the most influential attributes identified are: ‘age’, ‘indication’ of fertility factor, ‘Antral Follicle Counts (AFC)’, ‘NbreM2’, ‘method of sperm collection’, ‘Chamotte’, ‘Fertilization rate in vitro’, ‘Follicles on day 14’ and ‘Embryo transfer day.’ The machine learning models trained with the selected set of features significantly improved the prediction accuracy of IVF pregnancy success to a level considerably higher than those reported in the current literature. © 2018, The Natural Computing Applications Forum.
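A greedy hill-climbing feature-selection loop of the kind named above can be sketched as follows. The scoring function here is a stand-in (the paper scores subsets by classifier performance such as accuracy or AUC); the function name and stopping rule are illustrative assumptions.

```python
# Greedy hill climbing over feature subsets: repeatedly add the single feature
# that most improves score(subset); stop when no addition improves the score.

def hill_climb_select(features, score):
    selected = []
    best = score(selected)
    while True:
        candidate, cand_score = None, best
        for f in features:
            if f in selected:
                continue
            s = score(selected + [f])
            if s > cand_score:
                candidate, cand_score = f, s
        if candidate is None:          # local optimum reached
            return selected, best
        selected.append(candidate)
        best = cand_score
```

In practice the score would be cross-validated model performance, making each step expensive; that cost is the usual trade-off of wrapper-style selection.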
Action research to implement an Indigenous health curriculum framework
- Authors: Wilson, Cath , Heinrich, Liesl , Heidari, Parvaneh , Adams, Karen
- Date: 2020
- Type: Text , Journal article
- Relation: Nurse Education Today Vol. 91 (2020), p. 104464
- Full Text: false
- Reviewed:
- Description: In recent decades, Indigenous health curriculum frameworks have been developed; however, few studies about their implementation exist. This study aimed to employ critical theory and action research to understand how an Indigenous health curriculum framework could be applied, and how the associated learning and teaching could be iteratively improved. Three action research cycles were conducted from 2017 to 2019. Student reaction (satisfaction and engagement) was collected via survey from 2017 to 2019. Student learning was collated in 2018–2019 via a self-perception survey (knowledge, attitude, confidence, commitment), multiple-choice questions (knowledge) and content analysis of apply-and-analyse activities (skill). The teaching team met annually to reflect on findings and plan enhancements to learning and teaching. Over 2017–2019 there was a pattern of improved student reaction and learning. Connecting this research to Faculty-level committees led to wider success and improved sustainability of the practice. The online unit and workshop delivery were scalable, overcame a barrier of educator skill and confidence to teach this area, allowed for quality control of content and provided data for analysis. Interestingly, learning gained from this unit matched that described as occurring from student placements in health settings with high numbers of Indigenous people. Student learning occurred across the Framework's three levels (novice, intermediate and entry to practice), suggesting that the taxonomy of the Framework does not necessarily align with the reality of learning and teaching. Vertical implementation of the five learning domains would benefit from alignment with training evaluation models and validated assessment, to understand the learning that has occurred rather than the teaching that has been delivered. In this study, health profession accreditation bodies had driven the imperative for an Indigenous health program and curriculum. Research on Indigenous health learning and teaching relating to behaviour and results in workplaces is needed.
Enhancing linear time complexity time series classification with hybrid bag-of-patterns
- Authors: Liang, Shen , Zhang, Yanchun , Ma, Jiangang
- Date: 2020
- Type: Text , Conference paper
- Relation: 25th International Conference on Database Systems for Advanced Applications, DASFAA 2020 Vol. 12112 LNCS, p. 717-735
- Full Text: false
- Reviewed:
- Description: In time series classification, one of the most popular models is Bag-Of-Patterns (BOP). Most BOP methods run in super-linear time. A recent work proposed a linear time BOP model, yet it has limited accuracy. In this work, we present Hybrid Bag-Of-Patterns (HBOP), which can greatly enhance accuracy while maintaining linear complexity. Concretely, we first propose a novel time series discretization method called SLA, which can retain more information than the classic SAX. We use a hybrid of SLA and SAX to expressively and compactly represent subsequences, which is our most important design feature. Moreover, we develop an efficient time series transformation method that is key to achieving linear complexity. We also propose a novel X-means clustering subroutine to handle subclasses. Extensive experiments on over 100 datasets demonstrate the effectiveness and efficiency of our method. © 2020, Springer Nature Switzerland AG.
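For reference, the classic SAX discretization that the abstract above contrasts with SLA can be sketched in minimal form: z-normalise the series, average it into equal-width segments (piecewise aggregate approximation), then map each segment mean to a symbol via Gaussian breakpoints. The breakpoint table and function names are illustrative; the paper's SLA variant and hybrid scheme are not reproduced here.

```python
import math

# Gaussian breakpoints for small alphabets (standard SAX tables, truncated).
BREAKPOINTS = {3: [-0.43, 0.43], 4: [-0.67, 0.0, 0.67]}

def sax(series, n_segments, alphabet_size=4):
    """Discretize a numeric series into a SAX word like 'ad'."""
    mean = sum(series) / len(series)
    std = math.sqrt(sum((x - mean) ** 2 for x in series) / len(series)) or 1.0
    z = [(x - mean) / std for x in series]          # z-normalisation
    seg = len(z) // n_segments
    paa = [sum(z[i * seg:(i + 1) * seg]) / seg      # segment means (PAA)
           for i in range(n_segments)]
    cuts = BREAKPOINTS[alphabet_size]
    # Each mean falls into one of alphabet_size bins delimited by the cuts.
    return "".join("abcd"[sum(1 for c in cuts if v > c)] for v in paa)
```

A BOP model would slide a window over the series, convert each subsequence to such a word, and classify on the word histogram.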
Fraud detection for online banking for scalable and distributed data
- Authors: Haq, Ikram
- Date: 2020
- Type: Text , Thesis , PhD
- Full Text:
- Description: Online fraud causes billions of dollars in losses for banks. Therefore, online banking fraud detection is an important field of study. However, there are many challenges in conducting research in fraud detection. One constraint is the unavailability of bank datasets for research, or that the required characteristics of the data attributes are not available. Numeric data usually provides better performance for machine learning algorithms; most transaction data, however, have categorical or nominal features as well. Moreover, some platforms such as Apache Spark only recognize numeric data, so techniques such as one-hot encoding (OHE) are needed to transform categorical features to numerical features. OHE has its own challenges, however, including the sparseness of the transformed data and the fact that the distinct values of an attribute are not always known in advance. Efficient feature engineering can improve an algorithm's performance but usually requires detailed domain knowledge to identify the correct features. Techniques like Ripple Down Rules (RDR) are suitable for fraud detection because of their low maintenance and incremental learning features. However, achieving high classification accuracy on mixed datasets, especially at scale, is challenging, and evaluating RDR on distributed platforms is also difficult as it is not available on those platforms. The thesis proposes the following solutions to these challenges:
• A technique, Highly Correlated Rule Based Uniformly Distribution (HCRUD), to generate highly correlated, rule-based, uniformly distributed synthetic data.
• A technique, One-hot Encoded Extended Compact (OHE-EC), to transform categorical features to numeric features by compacting sparse data even if not all distinct values are known.
• A technique, Feature Engineering and Compact Unified Expressions (FECUE), to improve model efficiency through feature engineering where the domain of the data is not known in advance.
• A Unified Expression RDR fraud detection technique (UE-RDR) for Big Data, proposed and evaluated on the Spark platform.
Empirical tests were executed on a multi-node Hadoop cluster using well-known classifiers on bank data, synthetic bank datasets and publicly available datasets from the UCI repository. These evaluations demonstrated substantial improvements in classification accuracy, ruleset compactness and execution speed.
- Description: Doctor of Philosophy
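The one-hot encoding problem the abstract above describes (distinct values not all known in advance) can be illustrated with a tiny encoder that reserves a column for unseen values. This is only a sketch of the underlying problem; the thesis's OHE-EC technique, which additionally compacts the sparse result, is not reproduced.

```python
# Plain one-hot encoding with a reserved "other" column for values that were
# never seen during fitting, so the output width stays fixed at scoring time.

class OneHotEncoder:
    def __init__(self, known_values):
        self.index = {v: i for i, v in enumerate(known_values)}
        self.width = len(known_values) + 1   # last slot = unknown values

    def encode(self, value):
        vec = [0] * self.width
        vec[self.index.get(value, self.width - 1)] = 1
        return vec
```

The sparseness issue is visible even here: one attribute with hundreds of distinct values yields vectors that are almost entirely zeros, which is what motivates a compacted representation.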
Imbalanced data classification and its application in cyber security
- Authors: Moniruzzaman, Md
- Date: 2020
- Type: Text , Thesis , PhD
- Full Text:
- Description: Cyber security, also known as information technology security or simply information security, aims to protect government organizations, companies and individuals by defending their computers, servers, electronic systems, networks, and data from malicious attacks. With the advancement of client-side, on-the-fly web content generation techniques, it has become easier for attackers to modify the content of a website dynamically and gain access to valuable information. The impact of cybercrime on the global economy is greater than ever and growing day by day. Among the various types of cybercrime, financial attacks are widespread and the financial sector is among the most targeted; both corporations and individuals lose huge amounts of money each year. The majority of financial attacks are carried out via banking malware and web-based attacks. End users are not always skilled enough to differentiate between injected content and the actual content of a webpage. Designing a real-time security system that ensures a safe browsing experience is a challenging task. Some existing solutions are designed for the client side and require every user to install them, which is very difficult to achieve in practice; moreover, since organizations and individuals use a variety of platforms and tools, different solutions need to be designed. Existing server-side solutions often focus on sanitizing and filtering inputs, and fail to detect obfuscated and hidden scripts. Because this is a real-time security system, any significant delay will hamper the user experience, so finding the most optimized and efficient solution is very important. Easy installation and integration of a solution with existing systems is also a critical factor: if a solution is efficient but difficult to integrate, it may not be feasible for practical use.
Unsupervised and supervised data classification techniques have been widely applied to design algorithms for solving cyber security problems. The performance of these algorithms varies depending on the type of cyber security problem and the size of the dataset. To date, existing algorithms do not achieve high accuracy in detecting malware activities. Datasets in cyber security, especially those from financial sectors, are predominantly imbalanced, as the number of malware activities is significantly smaller than the number of normal activities. This means that classifiers for imbalanced datasets can be used to develop supervised data classification algorithms to detect malware activities. The development of classifiers for imbalanced datasets has been a subject of research over the last decade. Most of these classifiers are based on oversampling and undersampling techniques and are not efficient in many situations, as such techniques are applied globally. In this thesis, we develop two new algorithms for solving supervised data classification problems on imbalanced datasets and then apply them to malware detection problems. The first algorithm is designed using piecewise linear classifiers, by formulating the problem as an optimization problem and applying the penalty function method; more specifically, we add a larger penalty to the objective function for misclassified points from minority classes. The second method is based on a combination of supervised and unsupervised (clustering) algorithms. Such an approach allows one to identify areas of the input space where minority classes are located and to apply local oversampling or undersampling, leading to more efficient and accurate classifiers. The proposed algorithms are tested on real-world datasets, and the results clearly demonstrate the superiority of the newly introduced algorithms. We then apply these algorithms to design classifiers to detect malware.
- Description: Doctor of Philosophy
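The core idea of the thesis's first algorithm, penalising minority-class errors more heavily, can be illustrated with a toy classifier. This is only an analogy: the thesis formulates a penalty-function optimisation over piecewise linear classifiers, whereas the sketch below folds a per-class penalty weight into a simple perceptron update.

```python
# Toy imbalance-aware linear classifier: errors on the minority class are
# scaled by `weight`, pushing the separating hyperplane to respect them.

def weighted_perceptron(X, y, minority_label, weight=5.0, epochs=100):
    """Labels y are in {-1, +1}; minority-class mistakes get a larger step."""
    w = [0.0] * len(X[0])
    b = 0.0
    for _ in range(epochs):
        for xi, yi in zip(X, y):
            margin = yi * (sum(wj * xj for wj, xj in zip(w, xi)) + b)
            if margin <= 0:  # misclassified (or on the boundary)
                step = weight if yi == minority_label else 1.0
                w = [wj + step * yi * xj for wj, xj in zip(w, xi)]
                b += step * yi
    return w, b

def predict(w, b, xi):
    return 1 if sum(wj * xj for wj, xj in zip(w, xi)) + b > 0 else -1
```

With a large weight, the minority class contributes disproportionately to the updates, which is the same qualitative effect as a larger penalty term in the objective function.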
New gene selection algorithm using hyperboxes to improve performance of classifiers
- Authors: Bagirov, Adil , Mardaneh, Karim
- Date: 2020
- Type: Text , Journal article
- Relation: International Journal of Bioinformatics Research and Applications Vol. 16, no. 3 (2020), p. 269-289
- Full Text: false
- Reviewed:
- Description: DNA microarray technology makes it possible to measure the expression levels of thousands of genes in a single experiment, which allows classification techniques to be applied to classify tumours. However, the large number of genes and the relatively small number of tumours in gene expression datasets may (and in some cases significantly) diminish the accuracy of many classifiers. Efficient gene selection algorithms are therefore required to identify the most informative genes, or groups of genes, to improve the performance of classifiers. In this paper, a new gene selection algorithm is developed using marginal hyperboxes of genes or groups of genes for each tumour type. Informative genes are defined using overlaps between hyperboxes. The results on six gene expression datasets demonstrate that the proposed algorithm is able to considerably reduce the number of genes and significantly improve the performance of classifiers. © 2020 Inderscience Enterprises Ltd.
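A one-dimensional sketch of the hyperbox idea above: for a single gene, build one interval (a 1-D hyperbox) per tumour type from that class's expression values, and rank genes by how little the class intervals overlap. The paper works with marginal hyperboxes over groups of genes; the overlap measure below is only illustrative.

```python
# Per-class 1-D hyperboxes for one gene: a gene whose class intervals barely
# overlap separates the classes well and is therefore informative.

def interval(values):
    return (min(values), max(values))

def overlap(a, b):
    """Length of the intersection of two intervals (0 if disjoint)."""
    return max(0.0, min(a[1], b[1]) - max(a[0], b[0]))

def gene_informativeness(class_a_values, class_b_values):
    """Lower overlap => more informative gene for separating the two classes."""
    return overlap(interval(class_a_values), interval(class_b_values))
```

In higher dimensions the intervals become boxes and the overlap test is applied coordinate-wise, which keeps the selection criterion cheap even for thousands of genes.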
Towards robust convolutional neural networks in challenging environments
- Authors: Hossain, Md Tahmid
- Date: 2021
- Type: Text , Thesis , PhD
- Full Text:
- Description: Image classification is one of the fundamental tasks in the field of computer vision. Although Artificial Neural Networks (ANNs) showed a lot of promise in this field, the lack of efficient computer hardware subdued their potential to a great extent. In the early 2000s, advances in hardware coupled with better network design saw the dramatic rise of the Convolutional Neural Network (CNN). Deep CNNs pushed the State-of-The-Art (SOTA) in a number of vision tasks, including image classification, object detection, and segmentation, and presently dominate these tasks. Although CNNs exhibit impressive classification performance on clean images, they are vulnerable to distortions such as noise and blur. Fine-tuning a pre-trained CNN on mutually exclusive or a union set of distortions is a brute-force solution; this iterative fine-tuning process over all known types of distortion is exhaustive, and the network still struggles to handle unseen distortions. CNNs are also vulnerable to image translation or shift, partly due to common Down-Sampling (DS) layers, e.g., max-pooling and strided convolution. These operations violate the Nyquist sampling criterion and cause aliasing. The textbook solution is low-pass filtering (blurring) before down-sampling, which can benefit deep networks as well. Even so, non-linearity units such as ReLU often re-introduce the problem, suggesting that blurring alone may not suffice. Another important but under-explored issue for CNNs is unknown or Open Set Recognition (OSR). CNNs are commonly designed for closed-set arrangements, where test instances only belong to the ‘Known Known’ (KK) classes used in training. As such, they predict a class label for a test sample based on the distribution of the KK classes. However, when used under the OSR setup (where an input may belong to an ‘Unknown Unknown’ or UU class), such a network will always classify a test instance as one of the KK classes even if it is from a UU class. 
Historically, CNNs have struggled with detecting objects in images with large differences in scale, especially small objects, because the DS layers inside a CNN progressively wipe out the signal from small objects; the final layers are left with no signature from these objects, leading to degraded performance. In this work, we propose solutions to the above four problems. First, we improve CNN robustness against distortion by proposing DCT-based augmentation, adaptive regularisation, and noise-suppressing Activation Functions (AF). Second, to ensure further performance gains and robustness to image transformations, we introduce anti-aliasing properties inside the AF and propose a novel DS method called blurpool. Third, to address the OSR problem, we propose a novel training paradigm that ensures detection of UU classes and accurate classification of the KK classes. Finally, we introduce a novel CNN that enables a deep detector to identify small objects with high precision and recall. We evaluate our methods on a number of benchmark datasets and demonstrate that they outperform contemporary methods in the respective problem set-ups.
- Description: Doctor of Philosophy
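The anti-aliased down-sampling ("blurpool") idea referred to above can be shown in one dimension: low-pass filter with a small binomial kernel before subsampling, so that shifting the input by one sample perturbs the output far less than naive strided subsampling would. This is a minimal sketch under those assumptions, not the thesis's layer.

```python
# 1-D blur-then-subsample: a [1, 2, 1]/4 binomial kernel acts as the low-pass
# filter, then every `stride`-th sample is kept. Edge samples are replicated.

def blur_downsample(x, stride=2):
    kernel = (0.25, 0.5, 0.25)
    padded = [x[0]] + list(x) + [x[-1]]          # replicate-pad the borders
    blurred = [sum(k * padded[i + j] for j, k in enumerate(kernel))
               for i in range(len(x))]
    return blurred[::stride]
```

Plain strided subsampling of [0, 4, 0, 4, ...] returns either all zeros or all fours depending on a one-sample shift; the blurred version changes only mildly, which is the shift-robustness the abstract describes.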
- Authors: Nejati, Maryam , Amjady, Nima
- Date: 2022
- Type: Text , Journal article
- Relation: IEEE transactions on sustainable energy Vol. 13, no. 2 (2022), p. 1188-1198
- Full Text: false
- Reviewed:
- Description: Solar generation systems are expanding globally in both scale and number, which highlights the increasing importance of solar power forecasting. In this paper, a day-ahead solar power prediction method is proposed, including 1) a novel feature selecting/clustering approach based on relevancy and redundancy criteria and 2) an innovative hybrid classification-regression forecasting engine. The proposed feature selecting/clustering approach filters out irrelevant features and partitions the relevant features into two separate subsets to decrease feature redundancy. Each of these two subsets is used to separately train one forecasting engine, and the final solar power prediction of the proposed method is obtained by a relevancy-based combination of the two forecasts. The proposed forecasting engine classifies the historical data based on the learnability of its constituent regression models and assigns each class of training samples to one regression model. Each regression model predicts the outputs of the test samples that belong to its class. The effectiveness of the proposed solar power prediction method is illustrated by testing on two real-world solar farms.
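The relevancy/redundancy idea in the description above can be illustrated with a simple correlation-based sketch. Both choices here are hypothetical stand-ins (Pearson correlation as the relevancy measure, a greedy least-redundant assignment for the two-subset partition), not the paper's exact criteria.

```python
import numpy as np

def select_and_partition(X, y, rel_thresh=0.2):
    """Illustrative relevancy/redundancy split.

    Keeps features whose absolute Pearson correlation with the target
    exceeds rel_thresh (relevancy), then greedily assigns each kept
    feature, strongest first, to the subset where it is least correlated
    with already-placed members (reducing within-subset redundancy).
    """
    n_feat = X.shape[1]
    rel = np.array([abs(np.corrcoef(X[:, j], y)[0, 1]) for j in range(n_feat)])
    kept = [j for j in range(n_feat) if rel[j] > rel_thresh]
    subsets = ([], [])
    for j in sorted(kept, key=lambda j: -rel[j]):
        def redundancy(s):
            if not s:
                return 0.0
            return max(abs(np.corrcoef(X[:, j], X[:, k])[0, 1]) for k in s)
        target = 0 if redundancy(subsets[0]) <= redundancy(subsets[1]) else 1
        subsets[target].append(j)
    return subsets
```

Two nearly identical relevant features thus end up in different subsets, so each forecasting engine sees a less redundant view of the data.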
Mineral texture identification using local binary patterns equipped with a Classification and Recognition Updating System (CARUS)
- Aligholi, Saeed, Khajavi, Reza, Khandelwal, Manoj, Armaghani, Danial
- Authors: Aligholi, Saeed , Khajavi, Reza , Khandelwal, Manoj , Armaghani, Danial
- Date: 2022
- Type: Text , Journal article
- Relation: Sustainability (Switzerland) Vol. 14, no. 18 (2022), p.
- Full Text:
- Reviewed:
- Description: In this paper, a rotation-invariant local binary pattern operator equipped with a local contrast measure (riLBPc) is employed to characterize the type of mineral twinning by inspecting the texture properties of crystals. The proposed method uses photomicrographs of minerals and produces LBP histograms, which can be compared with those included in a predefined database using a Kullback–Leibler divergence-based metric. The paper proposes a new LBP-based scheme for concurrent classification and recognition tasks, followed by a novel online updating routine to enhance the locally developed mineral LBP database. The discriminatory power of the proposed Classification and Recognition Updating System (CARUS) for texture identification is verified for plagioclase, orthoclase, microcline, and quartz minerals, with sensitivity (TPR) near 99.9%, 87%, 99.9%, and 96%, and accuracy (ACC) of about 99%, 97%, 99%, and 99%, respectively. According to the results, the introduced CARUS system is a promising approach that can be applied in a variety of fields dealing with classification and feature recognition tasks. © 2022 by the authors.
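The pipeline described above (extract an LBP histogram, then compare it against database entries via Kullback–Leibler divergence) can be sketched roughly as below. This uses the basic 8-neighbour LBP for simplicity; the paper's riLBPc operator additionally provides rotation invariance and a local contrast measure.

```python
import numpy as np

def lbp_histogram(img):
    """Basic 8-neighbour LBP histogram, normalised to a distribution."""
    h, w = img.shape
    offsets = [(-1, -1), (-1, 0), (-1, 1), (0, 1),
               (1, 1), (1, 0), (1, -1), (0, -1)]
    hist = np.zeros(256)
    for i in range(1, h - 1):
        for j in range(1, w - 1):
            c = img[i, j]
            code = 0
            for bit, (di, dj) in enumerate(offsets):
                # set bit if neighbour >= centre pixel
                code |= int(img[i + di, j + dj] >= c) << bit
            hist[code] += 1
    return hist / hist.sum()

def kl_divergence(p, q, eps=1e-10):
    """KL divergence between two histograms (eps avoids log of zero)."""
    p = p + eps
    q = q + eps
    return float(np.sum(p * np.log(p / q)))

def best_match(query_hist, database):
    """Recognition step: database key with smallest KL from the query."""
    return min(database, key=lambda name: kl_divergence(query_hist, database[name]))
```

A query photomicrograph is thus assigned the mineral class whose stored histogram it diverges from least.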
Subgraph adaptive structure-aware graph contrastive learning
- Chen, Zhikui, Peng, Yin, Yu, Shuo, Cao, Chen, Xia, Feng
- Authors: Chen, Zhikui , Peng, Yin , Yu, Shuo , Cao, Chen , Xia, Feng
- Date: 2022
- Type: Text , Journal article
- Relation: Mathematics (Basel) Vol. 10, no. 17 (2022), p. 3047
- Full Text:
- Reviewed:
- Description: Graph contrastive learning (GCL) has attracted increasing attention and been widely applied to numerous graph learning tasks such as node classification and link prediction. Although it has achieved great success, and even outperforms supervised methods in some tasks, most GCL methods depend on node-level comparison while ignoring the rich semantic information contained in graph topology, especially for social networks. However, a higher-level comparison requires subgraph construction and encoding, which remain unsolved. To address this problem, we propose a subgraph adaptive structure-aware graph contrastive learning method (PASCAL) in this work, which is a subgraph-level GCL method. In PASCAL, we construct subgraphs by merging all motifs that contain the target node. We then encode them on the basis of the motif number distribution to capture the rich information hidden in subgraphs. By incorporating motif information, PASCAL can capture richer semantic information hidden in local structures than other GCL methods. Extensive experiments on six benchmark datasets show that PASCAL outperforms state-of-the-art graph contrastive learning and supervised methods in most cases.
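As a rough illustration of the motif-merging step described above, the sketch below collects the triangle motifs containing a target node and merges their nodes into one subgraph, returning the triangle count as one entry of a motif-count feature vector. This is a simplified, hypothetical reading; PASCAL handles a broader family of motifs and a learned encoder.

```python
import numpy as np
from itertools import combinations

def triangle_subgraph(adj, node):
    """Merge all triangle motifs containing `node` into one subgraph.

    adj: symmetric 0/1 adjacency matrix. Returns the merged node set
    and the number of triangles found.
    """
    adj = np.asarray(adj)
    nbrs = np.flatnonzero(adj[node])          # neighbours of the target node
    nodes, count = {node}, 0
    for u, v in combinations(nbrs, 2):
        if adj[u, v]:                         # u-v edge closes a triangle
            count += 1
            nodes.update((int(u), int(v)))
    return sorted(nodes), count
```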
An optimized hybrid deep intrusion detection model (HD-IDM) for enhancing network security
- Ahmad, Iftikhar, Imran, Muhammad, Qayyum, Abdul, Ramzan, Muhammad, Alassafi, Madini
- Authors: Ahmad, Iftikhar , Imran, Muhammad , Qayyum, Abdul , Ramzan, Muhammad , Alassafi, Madini
- Date: 2023
- Type: Text , Journal article
- Relation: Mathematics Vol. 11, no. 21 (2023), p.
- Full Text:
- Reviewed:
- Description: Detecting cyber intrusions in network traffic is a tough task for cybersecurity. Current methods struggle with the complexity of understanding patterns in network data. To solve this, we present the Hybrid Deep Learning Intrusion Detection Model (HD-IDM), a new approach that combines GRU and LSTM classifiers. The GRU is good at catching short-term patterns, while the LSTM handles long-term ones. HD-IDM blends these models using weighted averaging, boosting accuracy, especially on complex patterns. We tested HD-IDM on four datasets: CSE-CIC-IDS2017, CSE-CIC-IDS2018, NSL-KDD, and CIC-DDoS2019. The HD-IDM classifier achieved remarkable performance metrics on all datasets. It attains an outstanding accuracy of 99.91%, showcasing its consistent precision across the dataset. With an impressive precision of 99.62%, it excels in accurately categorizing positive cases, which is crucial for minimizing false positives. It also maintains a high recall of 99.43%, effectively identifying the majority of actual positive cases while minimizing false negatives. The F1-score of 99.52% emphasizes its robustness, making it a strong choice for classification tasks requiring precision and reliability. It performs particularly well on ROC and precision/recall curves, discriminating between normal and harmful network activities. While HD-IDM is promising, it has limits: it needs labeled data and may struggle with new intrusion methods. Future work should find ways to handle unlabeled data and adapt to emerging threats. Making HD-IDM fast enough for real-time use and addressing scalability challenges is also key to its broader use in changing network environments. © 2023 by the authors.
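The weighted-averaging fusion described above can be sketched in a few lines. The weight value here is an illustrative placeholder; the paper blends the two classifiers with tuned weights rather than an arbitrary fixed split.

```python
import numpy as np

def weighted_ensemble(p_gru, p_lstm, w_gru=0.5):
    """Blend two classifiers' class-probability vectors by weighted averaging.

    p_gru, p_lstm: per-class probability vectors from the two models.
    Returns the blended distribution and the predicted class index.
    """
    p_gru = np.asarray(p_gru, dtype=float)
    p_lstm = np.asarray(p_lstm, dtype=float)
    p = w_gru * p_gru + (1.0 - w_gru) * p_lstm   # convex combination
    return p, int(np.argmax(p))
```

Because the combination is convex, the blended vector remains a valid probability distribution whenever both inputs are.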