Using meta-regression data mining to improve predictions of performance based on heart rate dynamics for Australian football
- Authors: Jelinek, Herbert , Kelarev, Andrei , Robinson, Dean , Stranieri, Andrew , Cornforth, David
- Date: 2014
- Type: Text , Journal article
- Relation: Applied Soft Computing Vol. 14, no. PART A (2014), p. 81-87
- Full Text: false
- Description: This work investigates the effectiveness of machine learning regression algorithms and meta-regression methods for predicting the match performance of Australian football players from parameters collected during daily physiological tests. Three experiments are described. The first uses all available data with a variety of regression techniques. The second uses a subset of features selected from the available data using the Random Forest method. The third uses meta-regression with the selected feature subset. Our experiments demonstrate that, compared with regression methods alone, feature selection and meta-regression improve the accuracy of match performance predictions based on daily medical test data. Meta-regression methods combined with feature selection produced performance predictions with statistically significant correlation coefficients. The best results were obtained by additive regression based on isotonic regression, applied to the set of most influential features selected by Random Forest. This model predicted athlete performance data with a correlation coefficient of 0.86 (p < 0.05). © 2013 Published by Elsevier B.V. All rights reserved.
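The best-performing model named in the abstract above, additive regression with an isotonic base learner, can be illustrated with a minimal sketch. This is a generic textbook formulation (stagewise fitting of residuals with shrinkage, using a single-feature monotone step function at each stage), not the authors' implementation. The function names, the `n_stages` and `shrinkage` parameters, and their defaults are assumptions made for illustration, and the Random Forest feature selection step is omitted: the input columns are assumed to be already selected.

```python
import bisect

def fit_isotonic(x, y, increasing=True):
    """Least-squares monotone step function of one feature (pool-adjacent-violators)."""
    groups = {}                       # pool targets that share the same x value
    for xi, yi in zip(x, y):
        g = groups.setdefault(xi, [0.0, 0])
        g[0] += yi
        g[1] += 1
    xs = sorted(groups)
    sign = 1.0 if increasing else -1.0  # a decreasing fit is an increasing fit of -y
    blocks = []                         # each block: [sum, weight, number of knots]
    for xi in xs:
        s, c = groups[xi]
        blocks.append([sign * s, c, 1])
        # merge neighbouring blocks while their means violate monotonicity
        while len(blocks) > 1 and (blocks[-2][0] / blocks[-2][1]
                                   > blocks[-1][0] / blocks[-1][1]):
            s2, c2, k2 = blocks.pop()
            blocks[-1][0] += s2
            blocks[-1][1] += c2
            blocks[-1][2] += k2
    values = []
    for s, c, k in blocks:
        values.extend([sign * s / c] * k)
    return xs, values

def predict_isotonic(model, x):
    """Evaluate the step function; queries below the first knot use the first value."""
    xs, values = model
    i = bisect.bisect_right(xs, x) - 1
    return values[max(i, 0)]

def fit_additive(X, y, n_stages=10, shrinkage=0.5):
    """Stagewise additive regression: each stage fits the current residuals."""
    base = sum(y) / len(y)
    resid = [yi - base for yi in y]
    stages = []
    for _ in range(n_stages):
        best = None
        for j in range(len(X[0])):            # try every feature, both directions
            col = [row[j] for row in X]
            for inc in (True, False):
                m = fit_isotonic(col, resid, inc)
                sse = sum((r - predict_isotonic(m, c)) ** 2
                          for r, c in zip(resid, col))
                if best is None or sse < best[0]:
                    best = (sse, j, m)
        _, j, m = best
        stages.append((j, m))
        resid = [r - shrinkage * predict_isotonic(m, row[j])
                 for r, row in zip(resid, X)]
    return base, shrinkage, stages

def predict_additive(model, row):
    """Sum the base value and the shrunken contribution of every stage."""
    base, shrinkage, stages = model
    return base + sum(shrinkage * predict_isotonic(m, row[j]) for j, m in stages)
```

A call such as `fit_additive(X, y)` takes `X` as a list of feature rows and `y` as the matching performance scores. Shrinkage below 1 makes each stage correct only part of the remaining error, which is the usual way to trade training fit for smoother predictions.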
Diagnostic with incomplete nominal/discrete data
- Authors: Jelinek, Herbert , Yatsko, Andrew , Stranieri, Andrew , Venkatraman, Sitalakshmi , Bagirov, Adil
- Date: 2015
- Type: Text , Journal article
- Relation: Artificial Intelligence Research Vol. 4, no. 1 (2015), p. 22-35
- Description: Missing values may be present in data without undermining its use for diagnostic or classification purposes, but they compromise the application of readily available software. Surrogate entries can remedy the situation, although their effect on the outcome is generally unknown. Discretization of continuous attributes renders all data nominal and is helpful in dealing with missing values; in particular, no special handling is required for different attribute types. A number of classifiers exist or can be reformulated for this representation, and some can be recast as data completion methods. In this work the Decision Tree, Nearest Neighbour, and Naive Bayesian methods are shown to be suitable for this purpose. An approach is implemented whereby the imputed values are not necessarily a close match to the true data; rather, they are intended to cause the least hindrance to classification. The proposed techniques find their application particularly in medical diagnostics. Where clinical data represents a number of related conditions, taking the Cartesian product of the class values of the underlying sub-problems allows the selection of missing value substitutes to be narrowed down. Real-world data examples, some publicly available, are used for testing. The proposed and benchmark methods are compared by classifying the data before and after missing value imputation, indicating a significant improvement.
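As a rough illustration of how a nearest-neighbour classifier can double as a data completion method on nominal data, the sketch below copies each missing value from the most similar record that has that attribute observed. It is plain nearest-neighbour imputation with a Hamming-style mismatch rate, not the classification-oriented procedure proposed in the paper. The function names and the use of `None` as the missing-value marker are assumptions.

```python
def mismatch_rate(a, b):
    """Fraction of differing values among attributes observed in both records."""
    shared = [(u, v) for u, v in zip(a, b) if u is not None and v is not None]
    if not shared:
        return 1.0  # nothing to compare: treat the records as maximally distant
    return sum(u != v for u, v in shared) / len(shared)

def impute_nearest(records):
    """Fill each None with the value held by the nearest record that has one."""
    completed = [list(r) for r in records]
    for i, rec in enumerate(records):
        for j, val in enumerate(rec):
            if val is not None:
                continue
            # candidate donors: other records with attribute j observed
            donors = [r for k, r in enumerate(records)
                      if k != i and r[j] is not None]
            if donors:
                nearest = min(donors, key=lambda r: mismatch_rate(rec, r))
                completed[i][j] = nearest[j]
    return completed
```

Distances are computed on the original records rather than on partially completed ones, so the result does not depend on the order in which gaps are filled. A value stays `None` when no other record has that attribute observed.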
A count data model for heart rate variability forecasting and premature ventricular contraction detection
- Authors: Allami, Ragheed , Stranieri, Andrew , Balasubramanian, Venki , Jelinek, Herbert
- Date: 2017
- Type: Text , Journal article
- Relation: Signal Image and Video Processing Vol. 11, no. 8 (2017), p. 1427-1435
- Description: Heart rate variability (HRV) measures, including the standard deviation of normal-to-normal inter-beat intervals (SDNN), require at least 5 min of ECG recording to be measured accurately. In this paper, we use count data derived from a 3-min ECG recording to predict the 5-min SDNN and to detect premature ventricular contraction (PVC) beats with a high degree of accuracy. The approach combines count data with a Poisson-generated function that requires minimal computational resources and is well suited to remote patient monitoring with wearable sensors that have limited power, storage and processing capacity. The ease of use and accuracy of the algorithm provide an opportunity for accurate assessment of HRV and reduce the time taken to review patients in real time. PVC beat detection is implemented using the same count data model together with rules derived from clinical knowledge.
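For readers unfamiliar with the target quantity, the sketch below computes SDNN from a list of beat times. It shows only the conventional definition (the standard deviation of successive inter-beat intervals, in milliseconds) and does not attempt the count data forecasting model described in the abstract. The function names are hypothetical, the input is assumed to contain only normal beats, and the sample (n - 1) denominator is one common choice.

```python
import math

def beats_to_intervals_ms(beat_times_s):
    """Convert beat timestamps in seconds to successive intervals in milliseconds."""
    return [(b - a) * 1000.0 for a, b in zip(beat_times_s, beat_times_s[1:])]

def sdnn(intervals_ms):
    """Sample standard deviation of inter-beat intervals (assumes normal beats only)."""
    n = len(intervals_ms)
    if n < 2:
        raise ValueError("need at least two intervals")
    mean = sum(intervals_ms) / n
    variance = sum((x - mean) ** 2 for x in intervals_ms) / (n - 1)
    return math.sqrt(variance)
```

In practice ectopic beats such as PVCs are usually excluded before the statistic is computed, which is why the sketch assumes a pre-filtered input.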
Data-analytically derived flexible HbA1c thresholds for type 2 diabetes mellitus diagnostic
- Authors: Stranieri, Andrew , Yatsko, Andrew , Jelinek, Herbert , Venkatraman, Sitalakshmi
- Date: 2015
- Type: Text , Journal article
- Relation: Artificial Intelligence Research Vol. 5, no. 1 (2015), p. 111-134
- Description: Glycated haemoglobin (HbA1c) is now more commonly used as an alternative to the fasting plasma glucose and oral glucose tolerance tests for the identification of Type 2 Diabetes Mellitus (T2DM) because it is easily obtained using point-of-care technology and represents long-term blood sugar levels. According to WHO guidelines, HbA1c values of 6.5% or above are required for a diagnosis of T2DM. However, outcomes of a large number of trials with HbA1c have been inconsistent across the clinical spectrum, and further research is required to determine the efficacy of HbA1c testing in the identification of T2DM. Medical records from a diabetes screening program in Australia illustrate that many patients could be classified as diabetic if other clinical indicators are included, even though their HbA1c result does not exceed 6.5%. This suggests that a cutoff of 6.5% for the general population may be too simple and may miss individuals who are at risk or who already have overt, undiagnosed diabetes. In this study, data mining algorithms have been applied to identify markers that can be used with HbA1c. The results indicate that T2DM is best classified by an HbA1c cutoff of 6.2%, lower than the currently recommended level. If the threshold is allowed to be flexible, the cutoff can be lower still when the rule is additionally conditioned on the presence of oxidative stress or inflammation, high atherogenicity or adiposity, or diagnosed hypertension, among other markers.
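The idea of a flexible threshold can be sketched as a simple rule: apply the 6.2% cutoff reported in the abstract by default, and apply a lower cutoff when at least one of the listed co-markers is present. The abstract does not state the lower values, so `reduced_cutoff` is a hypothetical parameter with no default that the caller must supply. The `Patient` fields, the function name, and the use of "at or above" for the comparison are likewise assumptions.

```python
from dataclasses import dataclass

@dataclass
class Patient:
    hba1c_percent: float
    oxidative_stress: bool = False
    inflammation: bool = False
    high_atherogenicity: bool = False
    high_adiposity: bool = False
    hypertension: bool = False

def flag_t2dm(patient, base_cutoff=6.2, reduced_cutoff=None):
    """Flag a record using a cutoff that is lowered when a co-marker is present."""
    has_comarker = any([
        patient.oxidative_stress,
        patient.inflammation,
        patient.high_atherogenicity,
        patient.high_adiposity,
        patient.hypertension,
    ])
    # reduced_cutoff is illustrative only; the abstract gives no numeric value
    if has_comarker and reduced_cutoff is not None:
        cutoff = reduced_cutoff
    else:
        cutoff = base_cutoff
    return patient.hba1c_percent >= cutoff
```

For example, `flag_t2dm(Patient(6.1, hypertension=True), reduced_cutoff=6.0)` applies the lower cutoff, where 6.0 is an arbitrary placeholder rather than a figure from the study.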