A comparison of machine learning algorithms for multilabel classification of CAN
- Authors: Kelarev, Andrei , Stranieri, Andrew , Yearwood, John , Jelinek, Herbert
- Date: 2012
- Type: Text , Journal article
- Relation: Advances in Computer Science and Engineering Vol. 9, no. 1 (2012), p. 1-4
- Full Text:
- Reviewed:
- Description: This article is devoted to the investigation and comparison of several important machine learning algorithms in their ability to obtain multilabel classifications of the stages of cardiac autonomic neuropathy (CAN). Data was collected by the Diabetes Complications Screening Research Initiative at Charles Sturt University. Our experiments have achieved better results than those published previously in the literature for similar CAN identification tasks.
A count data model for heart rate variability forecasting and premature ventricular contraction detection
- Authors: Allami, Ragheed , Stranieri, Andrew , Balasubramanian, Venki , Jelinek, Herbert
- Date: 2017
- Type: Text , Journal article
- Relation: Signal Image and Video Processing Vol. 11, no. 8 (2017), p. 1427-1435
- Full Text:
- Reviewed:
- Description: Heart rate variability (HRV) measures including the standard deviation of inter-beat variations (SDNN) require at least 5 min of ECG recordings to accurately measure HRV. In this paper, we predict, using counts data derived from a 3-min ECG recording, the 5-min SDNN and also detect premature ventricular contraction (PVC) beats with a high degree of accuracy. The approach uses counts data combined with a Poisson-generated function that requires minimal computational resources and is well suited to remote patient monitoring with wearable sensors that have limited power, storage and processing capacity. The ease of use and accuracy of the algorithm provide opportunity for accurate assessment of HRV and reduce the time taken to review patients in real time. The PVC beat detection is implemented using the same count data model together with knowledge-based rules derived from clinical knowledge.
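For readers unfamiliar with the SDNN measure named in the abstract above, it is the standard deviation of the inter-beat (NN) intervals over a recording. The following is a minimal sketch of that definition only; it is not the authors' count data model or their Poisson-based forecasting step. The function name `sdnn` and the sample intervals are assumptions made for illustration, and the sample (n - 1) standard deviation is used here, although some tools use the population form instead.

```python
import statistics


def sdnn(rr_intervals_ms):
    """Return SDNN: the standard deviation of inter-beat intervals (ms).

    Uses the sample standard deviation (n - 1 denominator); this choice
    is an assumption, since conventions differ between HRV tools.
    """
    if len(rr_intervals_ms) < 2:
        raise ValueError("SDNN needs at least two inter-beat intervals")
    return statistics.stdev(rr_intervals_ms)


# Hypothetical inter-beat intervals in milliseconds, for illustration only.
example_intervals = [800, 810, 790, 800]
example_sdnn = sdnn(example_intervals)
```

In practice the intervals would come from beat detection over the full recording window, which is where the 5-min requirement mentioned in the abstract arises.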
Addressing the complexities of big data analytics in healthcare : The diabetes screening case
- Authors: De Silva, Daswin , Burstein, Frada , Jelinek, Herbert , Stranieri, Andrew
- Date: 2015
- Type: Text , Journal article
- Relation: Australasian Journal of Information Systems Vol. 19, no. (2015), p. S99-S115
- Full Text:
- Reviewed:
- Description: The healthcare industry generates a high throughput of medical, clinical and omics data of varying complexity and features. Clinical decision-support is gaining widespread attention as medical institutions and governing bodies turn towards better management of this data for effective and efficient healthcare delivery and quality assured outcomes. A mass of data across all stages, from disease diagnosis to palliative care, is further indication of the opportunities and challenges to effective data management, analysis, prediction and optimization techniques as parts of knowledge management in clinical environments. Big Data analytics (BDA) presents the potential to advance this industry with reforms in clinical decision-support and translational research. However, adoption of big data analytics has been slow due to complexities posed by the nature of healthcare data. The success of these systems is hard to predict, so further research is needed to provide a robust framework to ensure investment in BDA is justified. In this paper we investigate these complexities from the perspective of updated Information Systems (IS) participation theory. We present a case study on a large diabetes screening project to integrate, converge and derive expedient insights from such an accumulation of data and make recommendations for a successful BDA implementation grounded in a participatory framework and the specificities of big data in the healthcare context. © 2015 De Silva, Burstein, Jelinek, Stranieri.
An approach for Ewing test selection to support the clinical assessment of cardiac autonomic neuropathy
- Authors: Stranieri, Andrew , Abawajy, Jemal , Kelarev, Andrei , Huda, Shamsul , Chowdhury, Morshed , Jelinek, Herbert
- Date: 2013
- Type: Text , Journal article
- Relation: Artificial Intelligence in Medicine Vol. 58, no. 3 (2013), p. 185-193
- Full Text:
- Reviewed:
- Description: Objective: This article addresses the problem of determining optimal sequences of tests for the clinical assessment of cardiac autonomic neuropathy (CAN). We investigate the accuracy of using only one of the recommended Ewing tests to classify CAN and the additional accuracy obtained by adding the remaining tests of the Ewing battery. This is important as not all five Ewing tests can always be applied in each situation in practice. Methods and material: We used the new and unique database of the diabetes screening research initiative project, which is more than ten times larger than the data set used by Ewing in his original investigation of CAN. We utilized decision trees and the optimal decision path finder (ODPF) procedure for identifying optimal sequences of tests. Results: We present experimental results on the accuracy of using each one of the recommended Ewing tests to classify CAN and the additional accuracy that can be achieved by adding the remaining tests of the Ewing battery. We found the best sequences of tests for cost-function equal to the number of tests. The accuracies achieved by the initial segments of the optimal sequences for 2, 3 and 4 categories of CAN are, respectively, 80.80, 91.33, 93.97 and 94.14; 79.86, 89.29, 91.16 and 91.76; and 78.90, 86.21, 88.15 and 88.93. They show significant improvement compared to the sequence considered previously in the literature and the mathematical expectations of the accuracies of a random sequence of tests. The complete outcomes obtained for all subsets of the Ewing features are required for determining optimal sequences of tests for any cost-function with the use of the ODPF procedure. We have also found the two most significant additional features that can increase the accuracy when some of the Ewing attributes cannot be obtained. Conclusions: The outcomes obtained can be used to determine the optimal sequences of tests for each individual cost-function by following the ODPF procedure. The results show that the best single Ewing test for diagnosing CAN is the deep breathing heart rate variation test. Optimal sequences found for the cost-function equal to the number of tests guarantee that the best accuracy is achieved after any number of tests and provide an improvement in comparison with the previous ordering of tests or a random sequence. © 2013 Elsevier B.V.
AWSum - applying data mining in a health care scenario
- Authors: Quinn, Anthony , Jelinek, Herbert , Stranieri, Andrew , Yearwood, John
- Date: 2008
- Type: Text , Conference paper
- Relation: Paper presented at International Conference on Intelligent Sensors, Sensor Networks and Information Processing, ISSNIP 2008, Sydney, New South Wales : 15th-18th December 2008 p. 291-296
- Full Text:
- Description: This paper investigates the application of a new data mining algorithm called Automated Weighted Sum (AWSum) to diabetes screening data to explore its use in providing researchers with new insight into the disease and secondarily to explore the potential the algorithm has for the generation of prognostic models for clinical use. There are many data mining classifiers that produce high levels of predictive accuracy, but their application to health research and clinical applications is limited because they are complex, produce results that are difficult to interpret and are difficult to integrate with current knowledge and practices. This is because most focus on accuracy at the expense of informing the user as to the influences that lead to their classification results. By providing this information on influences, a researcher can be pointed to new, potentially interesting avenues for investigation. AWSum measures influence by calculating a weight for each feature value that represents its influence on a class value relative to other class values. The results produced, although on limited data, indicated the approach has potential uses for research and has some characteristics that may be useful in the future development of prognostic models.
Data-analytically derived flexible HbA1c thresholds for type 2 diabetes mellitus diagnostic
- Authors: Stranieri, Andrew , Yatsko, Andrew , Jelinek, Herbert , Venkatraman, Sitalakshmi
- Date: 2015
- Type: Text , Journal article
- Relation: Artificial Intelligence Research Vol. 5, no. 1 (2015), p. 111-134
- Full Text:
- Reviewed:
- Description: Glycated haemoglobin (HbA1c) is now more commonly used as an alternative test to the fasting plasma glucose and oral glucose tolerance tests for the identification of Type 2 Diabetes Mellitus (T2DM) because it is easily obtained using point-of-care technology and represents long-term blood sugar levels. According to WHO guidelines, HbA1c values of 6.5% or above are required for a diagnosis of T2DM. However, outcomes of a large number of trials with HbA1c have been inconsistent across the clinical spectrum and further research is required to determine the efficacy of HbA1c testing in identification of T2DM. Medical records from a diabetes screening program in Australia illustrate that many patients could be classified as diabetics if other clinical indicators are included, even though the HbA1c result does not exceed 6.5%. This suggests that a cutoff for the general population of 6.5% may be too simple and miss individuals at risk or with already overt, undiagnosed diabetes. In this study, data mining algorithms have been applied to identify markers that can be used with HbA1c. The results indicate that T2DM is best classified by HbA1c at 6.2%, a cutoff level lower than the currently recommended one. If the threshold is allowed to be flexible, the cutoff can be lower still when, in addition to HbA1c being high, the rule is conditioned on oxidative stress or inflammation being present, atherogenicity or adiposity being high, or hypertension being diagnosed, among other indicators.
Diagnostic with incomplete nominal/discrete data
- Authors: Jelinek, Herbert , Yatsko, Andrew , Stranieri, Andrew , Venkatraman, Sitalakshmi , Bagirov, Adil
- Date: 2015
- Type: Text , Journal article
- Relation: Artificial Intelligence Research Vol. 4, no. 1 (2015), p. 22-35
- Full Text:
- Reviewed:
- Description: Missing values may be present in data without undermining its use for diagnostic / classification purposes but compromise application of readily available software. Surrogate entries can remedy the situation, although the outcome is generally unknown. Discretization of continuous attributes renders all data nominal and is helpful in dealing with missing values; particularly, no special handling is required for different attribute types. A number of classifiers exist or can be reformulated for this representation. Some classifiers can be reinvented as data completion methods. In this work the Decision Tree, Nearest Neighbour, and Naive Bayesian methods are demonstrated to have the required aptness. An approach is implemented whereby the entered missing values are not necessarily a close match of the true data; however, they are intended to cause the least hindrance for classification. The proposed techniques find their application particularly in medical diagnostics. Where clinical data represents a number of related conditions, taking the Cartesian product of class values of the underlying sub-problems allows narrowing down of the selection of missing value substitutes. Real-world data examples, some publicly available, are enlisted for testing. The proposed and benchmark methods are compared by classifying the data before and after missing value imputation, indicating a significant improvement.
ECG reduction for wearable sensor
- Authors: Allami, Ragheed , Stranieri, Andrew , Balasubramanian, Venki , Jelinek, Herbert
- Date: 2016
- Type: Text , Conference proceedings
- Relation: 2016 12th International Conference on Signal-Image Technology & Internet-Based Systems (SITIS); Naples, Italy; 28th November-1st December 2016 p. 520-525
- Full Text:
- Reviewed:
- Description: The transmission, storage and analysis of electrocardiogram (ECG) data in real-time is essential for remote patient monitoring with wearable ECG devices and mobile ECG contexts. However, this remains a challenge to achieve within the processing power and the storage capacity of mobile devices. ECG reduction algorithms have an important role to play in reducing the processing requirements for mobile devices; however, many existing ECG reduction and compression algorithms are computationally expensive to execute in mobile devices and have not been designed for real-time computation and incremental data arrival. In this paper, we describe a computationally naive, yet effective, algorithm that achieves high ECG reduction rates while maintaining key diagnostic features including PR, QRS, ST, QT and RR intervals. While reduction does not enable ECG waves to be reproduced, the ability to transmit key indicators (diagnostic features) using minimal computational resources is particularly useful in mobile health contexts involving power constrained sensors and devices. Results indicate that the proposed algorithm outperforms other ECG reduction algorithms at a reduction/compression ratio (CR) of 5:1. If power or processing capacity is low, the algorithm can readily switch to a compression ratio of up to 10:1 while still maintaining an error rate below 10%.
Emerging point of care devices and artificial intelligence : prospects and challenges for public health
- Authors: Stranieri, Andrew , Venkatraman, Sitalakshmi , Minicz, John , Zarnegar, Armita , Firmin, Sally , Balasubramanian, Venki , Jelinek, Herbert
- Date: 2022
- Type: Text , Journal article
- Relation: Smart Health Vol. 24, no. (2022), p.
- Full Text:
- Reviewed:
- Description: Risk assessments for numerous conditions can now be performed cost-effectively and accurately using emerging point of care devices coupled with machine learning algorithms. In this article, the case is advanced that point of care testing in combination with risk assessments generated with artificial intelligence algorithms, applied to the universal screening of the general public for multiple conditions at one session, represents a new kind of inexpensive screening that can lead to the early detection of disease and other public health benefits. A case study of a diabetes screening clinic in a rural area of Australia is presented to illustrate its benefits. Universal, poly-aetiological screening is shown to meet the ten World Health Organisation criteria for screening programmes. © Elsevier Inc.
Empirical investigation of multi-tier ensembles for the detection of cardiac autonomic neuropathy using subsets of the Ewing features
- Authors: Abawajy, Jemal , Kelarev, Andrei , Stranieri, Andrew , Jelinek, Herbert
- Date: 2012
- Type: Text , Conference proceedings
- Full Text:
- Description: This article is devoted to an empirical investigation of the performance of several new large multi-tier ensembles for the detection of cardiac autonomic neuropathy (CAN) in diabetes patients using subsets of the Ewing features. We used new data collected by the diabetes screening research initiative (DiScRi) project, which is more than ten times larger than the data set originally used by Ewing in the investigation of CAN. The results show that new multi-tier ensembles achieved better performance compared with the outcomes published in the literature previously. The best accuracy for the detection of CAN, 97.74%, was achieved by the novel multi-tier combination of AdaBoost and Bagging, where AdaBoost is used at the top tier and Bagging is used at the middle tier, for the set consisting of the following four Ewing features: the deep breathing heart rate change, the Valsalva manoeuvre heart rate change, the hand grip blood pressure change and the lying to standing blood pressure change.
Exploring novel features and decision rules to identify cardiovascular autonomic neuropathy using a hybrid of wrapper-filter based feature selection
- Authors: Huda, Shamsul , Jelinek, Herbert , Ray, Biplob , Stranieri, Andrew , Yearwood, John
- Date: 2010
- Type: Text , Conference paper
- Relation: Paper presented at the 2010 6th International Conference on Intelligent Sensors, Sensor Networks and Information Processing, ISSNIP 2010 p. 297-302
- Full Text:
- Reviewed:
- Description: Cardiovascular autonomic neuropathy (CAN) is one of the important causes of mortality among diabetes patients. Statistics show that more than 22% of people with type 2 diabetes mellitus suffer from CAN, which in turn leads to cardiovascular disease (heart attack, stroke). Early detection of CAN could therefore reduce mortality. The traditional method for detecting CAN uses Ewing's algorithm, in which five noninvasive cardiovascular tests are used. It is often difficult for clinicians to collect Ewing battery data from patients due to onerous test conditions. In this paper, we propose a hybrid wrapper-filter approach to find novel features from patients' ECG records and then generate decision rules for the new features for easier detection of CAN. The proposed feature selection uses a hybrid (MR-ANNIGMA) of filter (Maximum Relevance, MR) and wrapper (Artificial Neural Net Input Gain Measurement Approximation, ANNIGMA) approaches. The combined heuristics in the hybrid MR-ANNIGMA take advantage of the complementary properties of both the filter and wrapper heuristics and can find significant features. The selected feature set is used to generate a new set of rules for detection of CAN. Experiments on real patient records show that the proposed method finds a smaller set of features for detection of CAN than the traditional method; these features are clinically significant and could lead to an easier way to diagnose CAN. © 2010 IEEE.
Multivariate data-driven decision guidance for clinical scientists
- Authors: Burstein, Frada , De Silva, Daswin , Jelinek, Herbert , Stranieri, Andrew
- Date: 2013
- Type: Text , Conference paper
- Relation: 29th International Conference on Data Engineering Workshops, ICDEW 2013; Proceedings - International Conference on Data Engineering p. 193-199
- Full Text:
- Reviewed:
- Description: Clinical decision-support is gaining widespread attention as medical institutions and governing bodies turn towards utilising better information management for effective and efficient healthcare delivery and quality assured outcomes. A mass of data across all stages, from disease diagnosis to palliative care, is further indication of the opportunities and challenges created for effective data management, analysis, prediction and optimization techniques as parts of knowledge management in clinical environments. A Data-driven Decision Guidance Management System (DD-DGMS) architecture can encompass solutions into a single closed-loop integrated platform to empower clinical scientists to seamlessly explore a multivariate data space in search of novel patterns and correlations to inform their research and practice. The paper describes the components of such an architecture, which includes a robust data warehouse as an infrastructure for comprehensive clinical knowledge management. The proposed DD-DGMS architecture incorporates the dynamic dimensional data model as its elemental core. Given the heterogeneous nature of clinical contexts and corresponding data, the dimensional data model presents itself as an adaptive model that facilitates knowledge discovery, distribution and application, which is essential for clinical decision support. The paper reports on a trial of the DD-DGMS system prototype conducted on diabetes screening data which further establishes the relevance of the proposed architecture to a clinical context.
Novel data mining techniques for incompleted clinical data in diabetes management
- Authors: Jelinek, Herbert , Yatsko, Andrew , Stranieri, Andrew , Venkatraman, Sitalakshmi
- Date: 2014
- Type: Text , Journal article
- Relation: British Journal of Applied Science & Technology Vol. 4, no. 33 (2014), p. 4591-4606
- Relation: https://doi.org/10.9734/BJAST/2014/11744
- Full Text:
- Reviewed:
- Description: An important part of health care involves upkeep and interpretation of medical databases containing patient records for clinical decision making, diagnosis and follow-up treatment. Missing clinical entries make it difficult to apply data mining algorithms for clinical decision support. This study demonstrates that higher predictive accuracy is possible using conventional data mining algorithms if missing values are dealt with appropriately. We propose a novel algorithm using a convolution of sub-problems to stage a super problem, where classes are defined by Cartesian Product of class values of the underlying problems, and Incomplete Information Dismissal and Data Completion techniques are applied for reducing features and imputing missing values. Predictive accuracies using Decision Branch, Nearest Neighborhood and Naïve Bayesian classifiers were compared to predict diabetes, cardiovascular disease and hypertension. Data is derived from Diabetes Screening Complications Research Initiative (DiScRi) conducted at a regional Australian university involving more than 2400 patient records with more than one hundred clinical risk factors (attributes). The results show substantial improvements in the accuracy achieved with each classifier for an effective diagnosis of diabetes, cardiovascular disease and hypertension as compared to those achieved without substituting missing values. The gain in improvement is 7% for diabetes, 21% for cardiovascular disease and 24% for hypertension, and our integrated novel approach has resulted in more than 90% accuracy for the diagnosis of any of the three conditions. This work advances data mining research towards achieving an integrated and holistic management of diabetes.
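The "super problem" described in the abstract above defines its classes as the Cartesian product of the class values of the underlying sub-problems. The sketch below illustrates only that construction; it does not reproduce the authors' Incomplete Information Dismissal or Data Completion steps. The sub-problem names follow the three conditions named in the abstract, but the binary "no"/"yes" class values and the helper names are assumptions made for illustration.

```python
from itertools import product

# Hypothetical sub-problems and class values, chosen for illustration only.
SUBPROBLEMS = {
    "diabetes": ("no", "yes"),
    "cardiovascular_disease": ("no", "yes"),
    "hypertension": ("no", "yes"),
}


def super_problem_classes(subproblems):
    """Return every combined class as a tuple, one value per sub-problem."""
    return list(product(*subproblems.values()))


def combined_label(record, subproblems):
    """Map a record's per-condition labels to its single combined class."""
    return tuple(record[name] for name in subproblems)


classes = super_problem_classes(SUBPROBLEMS)
```

With three binary sub-problems the combined problem has eight classes, so a single classifier can be trained on the joint outcome instead of three separate ones.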
Personalised measures of obesity using waist to height ratios from an Australian health screening program
- Authors: Jelinek, Herbert , Stranieri, Andrew , Yatsko, Andrew , Venkatraman, Sitalakshmi
- Date: 2019
- Type: Text , Journal article
- Relation: Digital Health Vol. 5, no. (2019), p. 1-8
- Full Text:
- Reviewed:
- Description: Objectives: The aim of the current study is to generate waist circumference to height ratio cut-off values for obesity categories from a model of the relationship between body mass index and waist circumference to height ratio. We compare the waist circumference to height ratio discovered in this way with cut-off values currently prevalent in practice that were originally derived using pragmatic criteria. Method: Personalized data including age, gender, height, weight, waist circumference and presence of diabetes, hypertension and cardiovascular disease for 847 participants over eight years were assembled from participants attending a rural Australian health review clinic (DiabHealth). Obesity was classified based on the conventional body mass index measure (weight/height²) and compared to the waist circumference to height ratio. Correlations between the measures were evaluated on the screening data, and independently on data from the National Health and Nutrition Examination Survey that included age categories. Results: This article recommends waist circumference to height ratio cut-off values based on an Australian rural sample and verified using the National Health and Nutrition Examination Survey database that facilitates the classification of obesity in clinical practice. Gender independent cut-off values are provided for waist circumference to height ratio that identify healthy (waist circumference to height ratio >= 0.45), overweight (0.53) and the three obese (0.60, 0.68, 0.75) categories verified on the National Health and Nutrition Examination Survey dataset. A strong linearity between the waist circumference to height ratio and the body mass index measure is demonstrated. Conclusion: The recommended waist circumference to height ratio cut-off values provided a useful index for assessing stages of obesity and risk of chronic disease for improved healthcare in clinical practice.
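The cut-off values reported in the abstract above can be read as lower bounds of successive categories. The sketch below applies them under that reading, alongside the conventional body mass index formula. The label for ratios below 0.45 and the names "obese I" to "obese III" are assumptions, since the abstract does not name them, and the function names are hypothetical.

```python
from bisect import bisect_right

# Lower bounds taken from the abstract: healthy, overweight, three obese levels.
WHTR_CUTOFFS = [0.45, 0.53, 0.60, 0.68, 0.75]

# Category names are assumptions; the abstract gives only the cut-off values.
WHTR_LABELS = [
    "below healthy range",
    "healthy",
    "overweight",
    "obese I",
    "obese II",
    "obese III",
]


def bmi(weight_kg, height_m):
    """Conventional body mass index: weight divided by height squared."""
    return weight_kg / height_m ** 2


def whtr_category(waist_cm, height_cm):
    """Classify a waist circumference to height ratio against the cut-offs."""
    ratio = waist_cm / height_cm
    # bisect_right counts how many lower bounds the ratio has reached.
    return WHTR_LABELS[bisect_right(WHTR_CUTOFFS, ratio)]
```

For instance, a waist of 100 cm at a height of 170 cm gives a ratio of about 0.59, which falls in the overweight band under this reading of the cut-offs.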
Predicting cardiac autonomic neuropathy category for diabetic data with missing values
- Authors: Abawajy, Jemal , Kelarev, Andrei , Chowdhury, Morshed , Stranieri, Andrew , Jelinek, Herbert
- Date: 2013
- Type: Text , Journal article
- Relation: Computers in Biology and Medicine Vol. 43, no. 10 (2013), p. 1328-1333
- Full Text:
- Reviewed:
- Description: Cardiovascular autonomic neuropathy (CAN) is a serious and well known complication of diabetes. Previous articles circumvented the problem of missing values in CAN data by deleting all records and fields with missing values and applying classifiers trained on different sets of features that were complete. Most of them also added alternative features to compensate for the deleted ones. Here we introduce and investigate a new method for classifying CAN data with missing values. In contrast to all previous papers, our new method does not delete attributes with missing values, does not use classifiers, and does not add features. Instead it is based on regression and meta-regression combined with the Ewing formula for identifying the classes of CAN. This is the first article using the Ewing formula and regression to classify CAN. We carried out extensive experiments to determine the best combination of regression and meta-regression techniques for classifying CAN data with missing values. The best outcomes have been obtained by the additive regression meta-learner based on M5Rules and combined with the Ewing formula. It has achieved the best accuracy of 99.78% for two classes of CAN, and 98.98% for three classes of CAN. These outcomes are substantially better than previous results obtained in the literature by deleting all missing attributes and applying traditional classifiers to different sets of features without regression. Another advantage of our method is that it does not require practitioners to perform more tests collecting additional alternative features. © 2013 Elsevier Ltd.
Rule-based classifiers and meta classifiers for identification of cardiac autonomic neuropathy progression
- Authors: Jelinek, Herbert , Kelarev, Andrei , Stranieri, Andrew , Yearwood, John
- Date: 2012
- Type: Text , Journal article
- Relation: International Journal of Information Science and Computer Mathematics Vol. 5, no. 2 (2012), p. 49-53
- Full Text:
- Reviewed:
- Description: We investigate and compare several rule-based classifiers and meta classifiers in their ability to obtain multi-class classifications of cardiac autonomic neuropathy (CAN) and its progression. The best results obtained in our experiments are significantly better than the outcomes published previously in the literature for analogous CAN identification tasks or simpler binary classification tasks.