D-ECG: A Dynamic framework for cardiac Arrhythmia detection from IoT-Based ECGs
- Authors: He, Jinyuan , Rong, Jia , Sun, Le , Wang, Hua , Zhang, Yanchun , Ma, Jiangang
- Date: 2018
- Type: Text , Book chapter
- Relation: Web Information Systems Engineering – WISE 2018 Chapter 6 p. 85-99
- Full Text: false
- Reviewed:
- Description: Cardiac arrhythmia has been identified as a type of cardiovascular diseases (CVDs) that causes approximately 12% of all deaths globally. Current progress on arrhythmia detection based on ECG recordings is facing a bottleneck caused by the adoption of single classifiers and static ensemble methods. Besides, most existing work tends to use a static feature set for characterizing all types of heartbeats, which may limit the classification performance. To fill in the gap, a novel framework called D-ECG is proposed, which introduces the dynamic ensemble selection (DES) technique to provide accurate detection of cardiac arrhythmia. In addition, the proposed D-ECG develops a result regulator that uses different features to refine the classification result from the DES technique. The results reported in this paper show a visible improvement in the overall heartbeat classification accuracy as well as in the sensitivity for disease heartbeats.
A framework for cardiac arrhythmia detection from IoT-based ECGs
- Authors: He, Jinyuan , Rong, Jia , Sun, Le , Wang, Hua , Zhang, Yanchun , Ma, Jiangang
- Date: 2020
- Type: Text , Journal article
- Relation: World Wide Web Vol. 23, no. 5 (2020), p. 2835-2850
- Full Text:
- Reviewed:
- Description: Cardiac arrhythmia has been identified as a type of cardiovascular diseases (CVDs) that causes approximately 12% of all deaths globally. The development of Internet-of-Things has spawned novel ways for heart monitoring but also presented new challenges for manual arrhythmia detection. An automated method is highly demanded to provide support for physicians. Current attempts for automatic arrhythmia detection can roughly be divided into feature-engineering based and deep-learning based methods. Most of the feature-engineering based methods suffer from adopting a single classifier and using fixed features for classifying all five types of heartbeats. This introduces difficulties in identification of the problematic heartbeats and limits the overall classification performance. The deep-learning based methods are usually not evaluated in a realistic manner and report overoptimistic results, which may hide potential limitations of the models. Moreover, the lack of consideration of frequency patterns and heart rhythms can also limit the model performance. To fill in the gaps, we propose a framework for arrhythmia detection from IoT-based ECGs. The framework consists of two modules: a data cleaning module and a heartbeat classification module. Specifically, we propose two solutions for the heartbeat classification task, namely Dynamic Heartbeat Classification with Adjusted Features (DHCAF) and Multi-channel Heartbeat Convolution Neural Network (MCHCNN). DHCAF is a feature-engineering based approach, in which we introduce the dynamic ensemble selection (DES) technique and develop a result regulator to improve classification performance. MCHCNN is a deep-learning based solution that performs multi-channel convolutions to capture both temporal and frequency patterns from heartbeats to assist the classification. We evaluate the proposed framework with DHCAF and with MCHCNN on the well-known MIT-BIH-AR database, respectively. The results reported in this paper have proven the effectiveness of our framework. © 2020, Springer Science+Business Media, LLC, part of Springer Nature.
PU-shapelets : Towards pattern-based positive unlabeled classification of time series
- Authors: Liang, Shen , Zhang, Yanchun , Ma, Jiangang
- Date: 2019
- Type: Text , Conference proceedings , Conference paper
- Relation: 24th International Conference on Database Systems for Advanced Applications, DASFAA 2019; Chiang Mai, Thailand; 22nd-25th April 2019; part of the Lecture Notes in Computer Science book series, also part of the Information Systems and Applications, incl. Internet/Web and HCI sub series Vol. 11446 LNCS, p. 87-103
- Full Text:
- Reviewed:
- Description: Real-world time series classification applications often involve positive unlabeled (PU) training data, where there are only a small set PL of positive labeled examples and a large set U of unlabeled ones. Most existing time series PU classification methods utilize all readings in the time series, making them sensitive to non-characteristic readings. Characteristic patterns named shapelets present a promising solution to this problem, yet discovering shapelets under PU settings is not easy. In this paper, we take on the challenging task of shapelet discovery with PU data. We propose a novel pattern ensemble technique utilizing both characteristic and non-characteristic patterns to rank U examples by their possibilities of being positive. We also present a novel stopping criterion to estimate the number of positive examples in U. These enable us to effectively label all U training examples and conduct supervised shapelet discovery. The shapelets are then used to build a one-nearest-neighbor classifier for online classification. Extensive experiments demonstrate the effectiveness of our method.
- Description: Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
Enhancing linear time complexity time series classification with hybrid bag-of-patterns
- Authors: Liang, Shen , Zhang, Yanchun , Ma, Jiangang
- Date: 2020
- Type: Text , Conference paper
- Relation: 25th International Conference on Database Systems for Advanced Applications, DASFAA 2020 Vol. 12112 LNCS, p. 717-735
- Full Text: false
- Reviewed:
- Description: In time series classification, one of the most popular models is Bag-Of-Patterns (BOP). Most BOP methods run in super-linear time. A recent work proposed a linear time BOP model, yet it has limited accuracy. In this work, we present Hybrid Bag-Of-Patterns (HBOP), which can greatly enhance accuracy while maintaining linear complexity. Concretely, we first propose a novel time series discretization method called SLA, which can retain more information than the classic SAX. We use a hybrid of SLA and SAX to expressively and compactly represent subsequences, which is our most important design feature. Moreover, we develop an efficient time series transformation method that is key to achieving linear complexity. We also propose a novel X-means clustering subroutine to handle subclasses. Extensive experiments on over 100 datasets demonstrate the effectiveness and efficiency of our method. © 2020, Springer Nature Switzerland AG.
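For readers unfamiliar with the classic SAX discretization that the record above names as a baseline, the following is a minimal sketch in Python. It shows SAX only, not the SLA or HBOP methods of the paper, whose details are not given in this record. The function name `sax`, the default four-letter alphabet, and the assumption that the series length is divisible by the number of segments are illustrative choices, not taken from the source.

```python
from bisect import bisect_right
from statistics import NormalDist, mean, pstdev


def sax(series, n_segments, alphabet="abcd"):
    """Classic SAX sketch: z-normalize, average per segment (PAA), map to letters.

    Assumes len(series) is divisible by n_segments (a simplification).
    """
    mu, sigma = mean(series), pstdev(series)
    # z-normalize; a constant series is mapped to all zeros
    z = [(x - mu) / sigma if sigma > 0 else 0.0 for x in series]

    # Piecewise Aggregate Approximation: mean of each equal-width segment
    seg_len = len(z) // n_segments
    paa = [mean(z[i * seg_len:(i + 1) * seg_len]) for i in range(n_segments)]

    # Breakpoints split the standard normal distribution into equiprobable bins
    k = len(alphabet)
    breakpoints = [NormalDist().inv_cdf(j / k) for j in range(1, k)]

    # Each PAA value becomes the letter of the bin it falls into
    return "".join(alphabet[bisect_right(breakpoints, v)] for v in paa)


if __name__ == "__main__":
    print(sax([1, 2, 3, 4, 5, 6, 7, 8], n_segments=4))
```

A bag-of-patterns model would apply such a discretization to sliding-window subsequences and count the resulting words; that step is omitted here.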
Active model selection for positive unlabeled time series classification
- Authors: Liang, Shen , Zhang, Yanchun , Ma, Jiangang
- Date: 2020
- Type: Text , Conference proceedings , Conference paper
- Relation: 36th IEEE International Conference on Data Engineering, ICDE 2020 Vol. 2020-April, p. 361-372
- Full Text: false
- Reviewed:
- Description: Positive unlabeled time series classification (PUTSC) refers to classifying time series with a set PL of positive labeled examples and a set U of unlabeled ones. Model selection for PUTSC is a largely untouched topic. In this paper, we look into PUTSC model selection, which as far as we know is the first systematic study in this topic. Focusing on the widely adopted self-training one-nearest-neighbor (ST-1NN) paradigm, we propose a model selection framework based on active learning (AL). We present the novel concepts of self-training label propagation, pseudo label calibration principles and ultimately influence to fully exploit the mechanism of ST-1NN. Based on them, we develop an effective model performance evaluation strategy and three AL sampling strategies. Experiments on over 120 datasets and a case study in arrhythmia detection show that our methods can yield top performance in interactive environments, and can achieve near optimal results by querying very limited numbers of labels from the AL oracle. © 2020 IEEE.
- Description: E1
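The self-training one-nearest-neighbor (ST-1NN) paradigm mentioned in the record above can be sketched roughly as follows. This is a generic illustration of the paradigm only, not the model selection framework or sampling strategies of the paper. The function names, the use of Euclidean distance, and the fixed `n_rounds` stopping rule are assumptions made for the example; real ST-1NN variants differ mainly in the distance measure and the stopping criterion.

```python
def euclidean(a, b):
    """Euclidean distance between two equal-length sequences."""
    return sum((x - y) ** 2 for x, y in zip(a, b)) ** 0.5


def self_train_1nn(positives, unlabeled, n_rounds):
    """Generic ST-1NN sketch.

    In each round, the unlabeled example closest to any currently labeled
    positive is moved into the positive set. Stops after n_rounds
    (a hypothetical, fixed stopping rule used only for illustration).
    Returns (examples in the order they were labeled positive, remaining pool).
    """
    pl = list(positives)
    pool = list(unlabeled)
    order = []
    for _ in range(min(n_rounds, len(pool))):
        # distance of an unlabeled example to its nearest labeled positive
        best = min(pool, key=lambda u: min(euclidean(u, p) for p in pl))
        pool.remove(best)
        pl.append(best)
        order.append(best)
    return order, pool


if __name__ == "__main__":
    labeled, rest = self_train_1nn(
        positives=[[0.0, 0.0]],
        unlabeled=[[5.0, 5.0], [1.0, 0.0], [2.0, 0.0], [9.0, 9.0]],
        n_rounds=2,
    )
    print(labeled, rest)
```

The examples left in the pool when self-training stops are treated as negative, which is why the choice of stopping rule and distance measure matters for model selection.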
Supervised anomaly detection in uncertain pseudoperiodic data streams
- Authors: Ma, Jiangang , Sun, Le , Wang, Hua , Zhang, Yanchun , Aickelin, Uwe
- Date: 2016
- Type: Text , Journal article
- Relation: ACM transactions on Internet technology Vol. 16, no. 1 (2016), p. 1-20
- Full Text: false
- Reviewed:
- Description: Uncertain data streams have been widely generated in many Web applications. The uncertainty in data streams makes anomaly detection from sensor data streams far more challenging. In this article, we present a novel framework that supports anomaly detection in uncertain data streams. The proposed framework adopts the wavelet soft-thresholding method to remove the noises or errors in data streams. Based on the refined data streams, we develop effective period pattern recognition and feature extraction techniques to improve the computational efficiency. We use classification methods for anomaly detection in the corrected data stream. We also empirically show that the proposed approach shows a high accuracy of anomaly detection on several real datasets.
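As a rough illustration of the wavelet soft-thresholding idea named in the record above, the sketch below applies a one-level Haar transform, shrinks the detail coefficients, and reconstructs the signal. The record does not state which wavelet, decomposition depth, or threshold rule the paper uses, so the Haar wavelet, the single level, and the caller-supplied threshold `t` are assumptions for the example, as are all function names.

```python
import math


def soft_threshold(coeffs, t):
    """Shrink each coefficient toward zero by t; values in [-t, t] become 0."""
    return [math.copysign(max(abs(c) - t, 0.0), c) for c in coeffs]


def haar_forward(signal):
    """One-level orthonormal Haar transform of an even-length sequence."""
    s = math.sqrt(2.0)
    approx = [(signal[i] + signal[i + 1]) / s for i in range(0, len(signal), 2)]
    detail = [(signal[i] - signal[i + 1]) / s for i in range(0, len(signal), 2)]
    return approx, detail


def haar_inverse(approx, detail):
    """Inverse of haar_forward."""
    s = math.sqrt(2.0)
    out = []
    for a, d in zip(approx, detail):
        out.append((a + d) / s)
        out.append((a - d) / s)
    return out


def denoise(signal, t):
    """Soft-threshold the detail coefficients and reconstruct the signal."""
    approx, detail = haar_forward(signal)
    return haar_inverse(approx, soft_threshold(detail, t))


if __name__ == "__main__":
    noisy = [1.0, 1.2, 0.9, 1.1, 5.0, 5.3, 4.8, 5.1]
    print(denoise(noisy, t=0.5))
```

With `t = 0` the signal is reconstructed unchanged; as `t` grows, small pairwise differences are suppressed first, which is the noise-removal effect.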
Bilateral insider threat detection : harnessing standalone and sequential activities with recurrent neural networks
- Authors: Manoharan, Phavithra , Hong, Wei , Yin, Jiao , Zhang, Yanchun , Ye, Wenjie , Ma, Jiangang
- Date: 2023
- Type: Text , Conference paper
- Relation: 24th International Conference on Web Information Systems Engineering, WISE 2023, Melbourne, 25-27 October 2023, Web Information Systems Engineering – WISE 2023, 24th International Conference, Melbourne, VIC, Australia, October 25–27, 2023, Proceedings Vol. 14306 LNCS, p. 179-188
- Full Text: false
- Reviewed:
- Description: Insider threats involving authorised individuals exploiting their access privileges within an organisation can yield substantial damage compared to external threats. Conventional detection approaches analyse user behaviours from logs, using binary classifiers to distinguish between malicious and non-malicious users. However, existing methods focus solely on standalone or sequential activities. To enhance the detection of malicious insiders, we propose a novel approach: bilateral insider threat detection combining RNNs to incorporate standalone and sequential activities. Initially, we extract behavioural traits from log files representing standalone activities. Subsequently, RNN models capture features of sequential activities. Concatenating these features, we employ binary classification to detect insider threats effectively. Experiments on the CERT 4.2 dataset showcase the approach’s superiority, significantly enhancing insider threat detection using features from both standalone and sequential activities. © 2023, The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd.
Enhancing dynamic ECG heartbeat classification with lightweight transformer model
- Authors: Meng, Lingxiao , Tan, Wenjun , Ma, Jiangang , Wang, Ruofei , Yin, Xiaoxia , Zhang, Yanchun
- Date: 2022
- Type: Text , Journal article
- Relation: Artificial Intelligence in Medicine Vol. 124, no. (2022), p.
- Full Text: false
- Reviewed:
- Description: Arrhythmia is a common class of cardiovascular disease, which is the cause of over 31% of all deaths worldwide according to the WHO's report. Automatic detection and classification of arrhythmia, as an effective tool for early warning, has recently received more and more attention, especially in applications of wearable devices for data capture. However, unlike traditional application scenarios, wearable electrocardiogram (ECG) devices have some drawbacks, such as being subject to multiple abnormal interferences, which makes accurate detection of premature ventricular contraction (PVC) and supraventricular premature beat (SPB) more challenging. Traditional models for heartbeat classification suffer from the problem of large-scale parameters, and their performance in dynamic ECG heartbeat classification is not satisfactory. In this paper, we propose a novel light model, Lightweight Fussing Transformer, to address these problems. We developed a more lightweight structure named LightConv Attention (LCA) to replace the self-attention of Fussing Transformer. LCA reaches a performance level equal to or higher than self-attention with fewer parameters. In particular, we designed a stronger embedding structure (a Convolutional Neural Network with an attention mechanism) to enhance the weight of features of the internal morphology of the heartbeat. Furthermore, we have implemented the proposed methods on real datasets, and experimental results have demonstrated outstanding accuracy in detecting PVC and SPB. © 2022 Elsevier B.V.
Discovering regularities from traditional chinese medicine prescriptions via bipartite embedding model
- Authors: Ruan, Chunyang , Ma, Jiangang , Wang, Ye , Zhang, Yanchun , Yang, Yun
- Date: 2019
- Type: Text , Conference proceedings
- Relation: International Joint Conferences on Artificial Intelligence (IJCAI-49); Macao, China; 10th-16th August 2019; published in Proceedings of the Twenty-Eighth International Joint Conference on Artificial Intelligence (IJCAI-19) p. 3346-3352
- Full Text: false
- Reviewed:
- Description: Regularities analysis for prescriptions is a significant task for traditional Chinese medicine (TCM), both in inheritance of clinical experience and in improvement of clinical quality. Recently, many methods have been proposed for regularities discovery, but this task is challenging due to the quantity, sparsity and free-style of prescriptions. In this paper, we address the specific problem of regularities discovery and propose a graph embedding based framework for regularities discovery for massive prescriptions. We model this task as a relation prediction in which the correlation of two herbs or of a herb and a symptom is incorporated to characterize the different relationships. Specifically, we first establish a heterogeneous network with herbs and symptoms as its nodes. We develop a bipartite embedding model termed HS2Vec to detect regularities, which explores multiple herb-herb and herb-symptom relations based on the heterogeneous network. Experiments on four real-world datasets demonstrate that the proposed framework is very effective for regularities discovery.
Adversarial heterogeneous network embedding with metapath attention mechanism
- Authors: Ruan, Chunyang , Wang, Ye , Ma, Jiangang , Zhang, Yanchun , Chen, Xintian
- Date: 2019
- Type: Text , Journal article
- Relation: Journal of Computer Science and Technology Vol. 34, no. 6 (2019), p. 1217-1229
- Full Text: false
- Reviewed:
- Description: Heterogeneous information network (HIN)-structured data provide an effective model for practical purposes in the real world. Network embedding is fundamental for supporting network-based analysis and prediction tasks. Currently popular network embedding methods normally fail to effectively preserve the semantics of HIN. In this study, we propose AGA2Vec, a generative adversarial model for HIN embedding that uses attention mechanisms and meta-paths. To capture the semantic information from multi-typed entities and relations in HIN, we develop a weighted meta-path strategy to preserve the proximity of HIN. We then use an autoencoder and a generative adversarial model to obtain robust representations of HIN. The results of experiments on several real-world datasets show that the proposed approach outperforms state-of-the-art approaches for HIN embedding. © 2019, Springer Science+Business Media, LLC & Science Press, China.
THCluster: herb supplements categorization for precision traditional Chinese medicine
- Authors: Ruan, Chunyang , Wang, Ye , Zhang, Yanchun , Ma, Jiangang , Chen, Huijuan , Aickelin, Uwe , Zhu, Shanfeng , Zhang, Ting
- Date: 2020
- Type: Text , Conference proceedings
- Relation: 2017 IEEE International Conference on Bioinformatics and Biomedicine (BIBM);Kansas City, MO, USA; 13-16 Nov. 2017 p. 417-424
- Full Text: false
- Reviewed:
- Description: There has been a continuing demand for traditional and complementary medicine worldwide. A fundamental and important topic in Traditional Chinese Medicine (TCM) is to optimize the prescription and to detect herb regularities from TCM data. In this paper, we propose a novel clustering model to solve this general problem of herb categorization, a pivotal task of prescription optimization and herb regularities. The model utilizes the Random Walks method, Bayesian rules and Expectation Maximization (EM) models to complete a clustering analysis effectively on a heterogeneous information network. We performed extensive experiments on real-world datasets and compared our method with other algorithms and experts. Experimental results have demonstrated the effectiveness of the proposed model for discovering useful categorization of herbs and its potential clinical manifestations.
Image preprocessing in classification and identification of diabetic eye diseases
- Authors: Sarki, Rubina , Ahmed, Khandakar , Wang, Hua , Zhang, Yanchun , Ma, Jiangang , Wang, Kate
- Date: 2021
- Type: Text , Journal article
- Relation: Data Science and Engineering Vol. 6, no. 4 (2021), p. 455-471
- Full Text:
- Reviewed:
- Description: Diabetic eye disease (DED) is a cluster of eye problems that affect diabetic patients. Identifying DED in retinal fundus images is a crucial activity because early diagnosis and treatment can eventually minimize the risk of visual impairment. The retinal fundus image plays a significant role in early DED classification and identification. The development of an accurate diagnostic model using retinal fundus images depends highly on image quality and quantity. This paper presents a methodical study on the significance of image processing for DED classification. The proposed automated classification framework for DED was achieved in several steps: image quality enhancement, image segmentation (region of interest), image augmentation (geometric transformation), and classification. The optimal results were obtained using traditional image processing methods with a newly built convolutional neural network (CNN) architecture. The newly built CNN combined with the traditional image processing approach presented the best accuracy for DED classification problems. The results of the experiments conducted showed adequate accuracy, specificity, and sensitivity. © 2021, The Author(s).
Cloud service description model : an extension of USDL for cloud services
- Authors: Sun, Le , Ma, Jiangang , Wang, Hua , Zhang, Yanchun , Yong, Jianming
- Date: 2018
- Type: Text , Journal article
- Relation: IEEE Transactions on Services Computing Vol. 11, no. 2 (2018), p. 354-368
- Full Text: false
- Reviewed:
- Description: There are a variety of well-designed specification modelling languages serving Internet services; however, none of them is capable of describing the special features of cloud services from both technical and business points of view. The Unified Service Description Language (USDL) provides a new way to describe Internet services from business, operational, and technical perspectives. Nevertheless, there are various issues with USDL: it lacks a comprehensive specification model, particularly for cloud services; it lacks a user-centric specification modeling paradigm; it lacks a mechanism to measure cloud service attributes and to present the association relationship and re-usability of the attributes; and it lacks semantic representation of cloud services. To address these issues, we propose a unified semantic Cloud Service Description Model (CSDM) in this paper. The proposed model will be extended from the basic structure of USDL by defining cloud-service-specific attributes. Furthermore, an additional module, named the transaction module, will be defined, which models the rating system of cloud services from several aspects, such as risk, trust, and reputation. The transaction module facilitates the capability of CSDM with regard to service ranking, and enhances its flexibility and extensibility by providing an extensible sub-module. In addition, we design an OWL-based annotation system to enrich the semantic expressivity of this model. Finally, a case study is provided to explain the application of this model in actual cloud services. © 2008-2012 IEEE.
Exploring data mining techniques in medical data streams
- Authors: Sun, Le , Ma, Jiangang , Zhang, Yanchun , Wang, Hua
- Date: 2016
- Type: Text , Book chapter
- Relation: Databases Theory and Applications Chapter 25 p. 321-332
- Full Text: false
- Reviewed:
- Description: Data stream mining has been studied in diverse application domains. In recent years, population aging has been stressing national and international health care systems. Anomaly detection is a typical example of a data stream application. It is a dynamic process of finding abnormal behaviours in given data streams. In this paper, we discuss the existing anomaly detection techniques for medical data streams. In addition, we present a process of using the Autoregressive Integrated Moving Average (ARIMA) model to analyse ECG data streams.
A weighted overlook graph representation of eeg data for absence epilepsy detection
- Authors: Wang, Jialin , Liang, Shen , Wang, Ye , Zhang, Yanchun , Ma, Jiangang
- Date: 2020
- Type: Text , Conference proceedings , Conference paper
- Relation: 20th IEEE International Conference on Data Mining, ICDM 2020 Vol. 2020-November, p. 581-590
- Full Text: false
- Reviewed:
- Description: Absence epilepsy is one of the most common types of epilepsy. The diagnosis of absence epilepsy is among the greatest challenges faced by clinical neurologists due to a lack of easily observable symptoms that are present in conventional epilepsy (e.g. spasm and convulsion), and highly relies on the detection of Spike and Slow Waves (SSWs) in Electroencephalogram (EEG) signals. Recently, graph representations called complex networks have been increasingly applied to characterizing 1D EEG signals. However, existing methods often fail to effectively represent SSWs, struggling to capture the differences between SSW waveforms and their non-SSW counterparts, such as minute differences and distinct shapes. Addressing this issue, in this work, we propose two simple yet effective complex networks, Overlook Graph (OG) and Weighted Overlook Graph (WOG), which have been customized to expressively represent SSWs. Built upon OG and WOG, we then develop a 2D Convolutional Neural Network (2D-CNN) to further learn latent features from the graph representations and accomplish the detection task. Extensive experiments on a real-world absence epilepsy EEG dataset show that the proposed OG/WOG-2D-CNN method can accurately detect SSWs. Additional experiments on the well-known Bonn dataset further show that our method can generalize to the conventional epilepsy seizure detection task with highly competitive performances. © 2020 IEEE. Please note that there are multiple authors for this article; therefore only the names of the first 5, including Federation University Australia affiliate "Jiangang Ma", are provided in this record.