An Agile group aware process beyond CRISP-DM: A hospital data mining case study
- Authors: Sharma, Vishakha , Stranieri, Andrew , Ugon, Julien , Martin, Laura
- Date: 2017
- Type: Text , Conference proceedings
- Relation: ICCDA '17: Proceedings of the International Conference on Computer and Data Analysis May 2017 p. 109-113
- Full Text: false
- Reviewed:
- Description: The CRISP-DM methodology is commonly used in data analytics exercises within an organisation to provide system and structure to data mining processes. However, in providing a rigorous framework, CRISP-DM overlooks two facets of data analytics in organisational contexts. First, data mining exercises are far more agile and subject to change than CRISP-DM presumes. Second, central decisions regarding the interpretation of discovered patterns and the direction of analytics exercises are typically made not by individuals but by committees or groups within an organisation. The current study presents a case study of data mining in a hospital setting and suggests how the agile nature of an analytics exercise and the group reasoning inherent in key decisions can be accommodated within a CRISP-DM methodology.
Pixel N-grams for mammographic lesion classification
- Authors: Kulkarni, Pradnya , Stranieri, Andrew , Ugon, Julien , Mittal, Manish , Kulkarni, Siddhivinayak
- Date: 2017
- Type: Text , Conference proceedings
- Relation: 2017 2nd International Conference on Communication Systems, Computing and IT Applications, CSCITA , Mumbai; 7th-8th April, 2017; published in CSCITA 2017 - Proceedings p. 107-111
- Full Text: false
- Reviewed:
- Description: Automated classification algorithms have been applied to breast cancer diagnosis in order to improve diagnostic accuracy and turnaround time. However, classification accuracy, sensitivity and specificity could still be improved further. Moreover, reducing computational cost is another challenge, as the number of images to be analyzed is typically large. In this paper, a novel Pixel N-gram approach, inspired by character N-grams in the text retrieval context, has been applied to mammographic lesion classification. Experiments on a real-world database demonstrate that Pixel N-grams outperform the existing histogram and Haralick features with respect to both classification accuracy and sensitivity. The effect of varying N and of using various classifiers is also analyzed in this paper. Results show that the optimum value of N is 3 and that the MLP classifier performs better than the SVM and KNN classifiers when using 3-gram features.
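The Pixel N-gram idea summarised in this abstract can be illustrated with a minimal sketch. The abstract does not give the extraction procedure, so everything below is an assumption made for illustration: intensities are quantized into a small number of grey levels, N-grams are read along image rows only, and the names `quantize` and `pixel_ngrams` are hypothetical rather than taken from the paper.

```python
from collections import Counter


def quantize(image, levels=8, max_value=255):
    """Map each pixel intensity to one of `levels` bins (assumed preprocessing step)."""
    return [
        [min(p * levels // (max_value + 1), levels - 1) for p in row]
        for row in image
    ]


def pixel_ngrams(image, n=3, levels=8):
    """Count every horizontal run of n consecutive quantized pixel values."""
    counts = Counter()
    for row in quantize(image, levels):
        for i in range(len(row) - n + 1):
            counts[tuple(row[i:i + n])] += 1
    return counts


# Tiny 2x4 "image" with a dark left half and a bright right half.
image = [
    [0, 0, 255, 255],
    [0, 0, 255, 255],
]
features = pixel_ngrams(image, n=3, levels=2)
```

The resulting counts form a histogram over N-grams that could be passed to a classifier as a feature vector. With N = 3, the value the abstract reports as optimal, each feature describes three neighbouring pixels, which is how a local pattern can be captured at low computational cost.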
Texture image classification using pixel N-grams
- Authors: Kulkarni, Pradnya , Stranieri, Andrew , Ugon, Julien
- Date: 2016
- Type: Text , Conference proceedings
- Relation: 2016 IEEE International Conference on Signal and Image Processing (ICSIP); Beijing, China; 13-15 Aug, 2016 p. 137-141
- Full Text: false
- Reviewed:
- Description: Various statistical methods, such as co-occurrence matrices and local binary patterns, and spectral approaches, such as Gabor filters, have been used for generating global features for image classification. However, global image features fail to distinguish between local variations within an image. Bag-of-visual-words (BoVW) models do capture local variations in an image, but typically do not consider spatial relationships between the visual words. Here, a novel image representation, ‘Pixel N-grams’, inspired by the character N-gram concept in text retrieval, has been applied to texture classification. Texture is an important property for image classification. Experiments on the benchmark texture database (UIUC) demonstrate that the overall classification accuracy resulting from the Pixel N-gram approach (89.5%) is comparable with that achieved using the BoVW approach (84.4%), with the added advantage of simplicity and reduced computational cost.
Analysis and comparison of co-occurrence matrix and pixel n-gram features for mammographic images
- Authors: Kulkarni, Pradnya , Stranieri, Andrew , Kulkarni, Sid , Ugon, Julien , Mittal, Manish
- Date: 2015
- Type: Text , Conference paper
- Relation: International Conference on Communication and Computing p. 7-14
- Full Text: false
- Reviewed:
- Description: Mammography is a proven way of detecting breast cancer at an early stage. Various feature extraction techniques, such as histograms, co-occurrence matrices, local binary patterns, Gabor filters and wavelet transforms, are used for analysing mammograms. The novel pixel N-gram feature extraction technique has been inspired by the character N-gram concept of text retrieval. In this paper, we compare the novel N-gram feature extraction technique with the co-occurrence matrix feature extraction technique. The experiments were conducted on the benchmark miniMIAS mammography database. Classification of mammograms into normal and abnormal categories using N-gram features showed promising results, with greater classification accuracy, sensitivity and specificity than classification using co-occurrence matrix features. Moreover, N-gram feature computation is found to be considerably faster than co-occurrence matrix feature computation.
Patient admission prediction using a pruned fuzzy min-max neural network with rule extraction
- Authors: Wang, Jin , Lim, Cheepeng , Creighton, Douglas , Khosravi, Abbas , Nahavandi, Saeid , Ugon, Julien , Vamplew, Peter , Stranieri, Andrew , Martin, Laura , Freischmidt, Anton
- Date: 2015
- Type: Text , Journal article
- Relation: Neural Computing and Applications Vol. 26, no. 2 (2015), p. 277-289
- Full Text: false
- Reviewed:
- Description: A useful patient admission prediction model that helps the emergency department of a hospital admit patients efficiently is of great importance. It not only improves the quality of care provided by the emergency department but also reduces patient waiting times. This paper proposes an automatic prediction method for patient admission based on a fuzzy min–max neural network (FMM) with rule extraction. The FMM neural network forms a set of hyperboxes by learning from data samples, and the learned knowledge is used for prediction. In addition to providing predictions, decision rules are extracted from the FMM hyperboxes to provide an explanation for each prediction. In order to simplify the structure of the FMM and the decision rules, an optimization method that simultaneously maximizes prediction accuracy and minimizes the number of FMM hyperboxes is proposed. Specifically, a genetic algorithm is formulated to find the optimal configuration of the decision rules. Experimental results using a large data set of 450,740 real patient records reveal that the proposed method achieves comparable or even better prediction accuracy than state-of-the-art classifiers, with the additional ability to extract a set of explanatory rules to justify its predictions.
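How hyperboxes yield a prediction can be sketched as follows. This is a minimal illustration, not the paper's pruned network: it assumes the membership function of the classic fuzzy min-max classifier, with inputs scaled to [0, 1], and the hyperbox corners, the labels "admit" and "discharge", and the sensitivity value `gamma` are hypothetical.

```python
def membership(x, v, w, gamma=4.0):
    """Degree to which point x belongs to the hyperbox with min corner v and max corner w.

    Returns 1.0 inside the box and decays with distance outside it;
    gamma controls how fast the membership falls off.
    """
    total = 0.0
    for xi, vi, wi in zip(x, v, w):
        # Penalty for exceeding the max corner in this dimension.
        above = max(0.0, 1.0 - max(0.0, gamma * min(1.0, xi - wi)))
        # Penalty for falling below the min corner in this dimension.
        below = max(0.0, 1.0 - max(0.0, gamma * min(1.0, vi - xi)))
        total += above + below
    return total / (2 * len(x))


def predict(x, hyperboxes, gamma=4.0):
    """Return the label of the hyperbox with the highest membership for x."""
    best = max(hyperboxes, key=lambda h: membership(x, h["v"], h["w"], gamma))
    return best["label"]


# Two hypothetical hyperboxes over two normalized features.
hyperboxes = [
    {"v": (0.2, 0.2), "w": (0.4, 0.4), "label": "admit"},
    {"v": (0.6, 0.6), "w": (0.9, 0.9), "label": "discharge"},
]
label = predict((0.3, 0.3), hyperboxes)
```

Each hyperbox also reads directly as a rule of the form "IF every feature lies between its min and max corner THEN label", which is why reducing the number of hyperboxes also shortens the rule set.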
Visual character N-grams for classification and retrieval of radiological images
- Authors: Kulkarni, Pradnya , Stranieri, Andrew , Kulkarni, Siddhivinayak , Ugon, Julien , Mittal, Manish
- Date: 2014
- Type: Text , Journal article
- Relation: International Journal of Multimedia & Its Applications Vol. 6, no. 2 (April 2014), p. 35-49
- Full Text:
- Reviewed:
- Description: Diagnostic radiology struggles to maintain high interpretation accuracy. Retrieval of similar past cases would help the inexperienced radiologist in the interpretation process. The character n-gram model has been effective for text retrieval in languages such as Chinese, where there are no clear word boundaries. We propose the use of a visual character n-gram model to represent images for classification and retrieval purposes. Regions of interest in mammographic images are represented with character n-gram features. These features are then used as input to a back-propagation neural network that classifies the regions into normal and abnormal categories. Experiments on the miniMIAS database show that character n-gram features are useful for this classification. A promising classification accuracy (83.33%) is observed for fatty background tissue, warranting further investigation. We argue that classifying regions of interest would reduce the number of comparisons necessary for finding similar images in the database and hence would reduce the time required to retrieve similar past cases.
Automatic sleep stage identification: difficulties and possible solutions
- Authors: Sukhorukova, Nadezda , Stranieri, Andrew , Ofoghi, Bahadorreza , Vamplew, Peter , Saleem, Muhammad Saad , Ma, Liping , Ugon, Adrien , Ugon, Julien , Muecke, Nial , Amiel, Hélène , Philippe, Carole , Bani-Mustafa, Ahmed , Huda, Shamsul , Bertoli, Marcello , Levy, P , Ganascia, J.G
- Date: 2010
- Type: Text , Conference proceedings
- Full Text:
- Description: The diagnosis of many sleep disorders is a labour-intensive task that involves the specialised interpretation of numerous signals, including brain wave, breath and heart rate, captured in overnight polysomnogram sessions. The automation of diagnoses is challenging for data mining algorithms because the data sets are extremely large and noisy, the signals are complex and specialists' analyses vary. This work reports on the adaptation of approaches from four fields (neural networks, mathematical optimisation, financial forecasting and frequency domain analysis) to the problem of automatically determining a patient's stage of sleep. Results, though preliminary, are promising and indicate that combined approaches may prove more fruitful than reliance on a single approach.