Pixel N-grams for mammographic lesion classification
- Kulkarni, Pradnya, Stranieri, Andrew, Ugon, Julien, Mittal, Manish, Kulkarni, Siddhivinayak
- Authors: Kulkarni, Pradnya , Stranieri, Andrew , Ugon, Julien , Mittal, Manish , Kulkarni, Siddhivinayak
- Date: 2017
- Type: Text , Conference proceedings
- Relation: 2017 2nd International Conference on Communication Systems, Computing and IT Applications, CSCITA , Mumbai; 7th-8th April, 2017; published in CSCITA 2017 - Proceedings p. 107-111
- Full Text: false
- Reviewed:
- Description: Automated classification algorithms have been applied to breast cancer diagnosis in order to improve diagnostic accuracy and turnover time. However, classification accuracy, sensitivity and specificity could still be improved further. Moreover, reducing computational cost is another challenge, as the number of images to be analyzed is typically large. In this paper, a novel Pixel N-gram approach, inspired by character N-grams in the text retrieval context, is applied to mammographic lesion classification. Experiments on a real-world database demonstrate that Pixel N-grams outperform the existing histogram and Haralick features with respect to both classification accuracy and sensitivity. The effect of varying N and of using various classifiers is also analyzed. Results show that the optimum value of N is 3 and that the MLP classifier performs better than the SVM and KNN classifiers when using 3-gram features.
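As a rough illustration of the idea in the description above, the following is a minimal sketch of counting pixel N-grams, by analogy with character N-grams in text. It is not the authors' implementation: the horizontal scan direction, the grey-level quantization, and the names `pixel_ngrams`, `levels` and `max_value` are all assumptions made for this example.

```python
from collections import Counter


def pixel_ngrams(image, n=3, levels=8, max_value=255):
    """Count runs of n consecutive quantized grey levels along each row.

    `image` is a list of rows of integer grey values in [0, max_value].
    Scanning rows only and quantizing into `levels` bins are assumptions
    of this sketch, not details taken from the paper.
    """
    counts = Counter()
    for row in image:
        # Map each pixel to one of `levels` bins so the N-gram vocabulary stays small.
        quantized = [min(p * levels // (max_value + 1), levels - 1) for p in row]
        # Slide a window of length n along the row, as with character N-grams.
        for i in range(len(quantized) - n + 1):
            counts[tuple(quantized[i:i + n])] += 1
    return counts


# Toy 2x4 "image" with two grey levels (hypothetical data).
toy = [[0, 0, 255, 255],
       [0, 0, 255, 255]]
features = pixel_ngrams(toy, n=3, levels=2)
```

The resulting counts could then be turned into a fixed-length feature vector and passed to a classifier such as an MLP, SVM or KNN, which are the classifiers the description compares.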
Diagnostic with incomplete nominal/discrete data
- Jelinek, Herbert, Yatsko, Andrew, Stranieri, Andrew, Venkatraman, Sitalakshmi, Bagirov, Adil
- Authors: Jelinek, Herbert , Yatsko, Andrew , Stranieri, Andrew , Venkatraman, Sitalakshmi , Bagirov, Adil
- Date: 2015
- Type: Text , Journal article
- Relation: Artificial Intelligence Research Vol. 4, no. 1 (2015), p. 22-35
- Full Text:
- Reviewed:
- Description: Missing values may be present in data without undermining its use for diagnostic or classification purposes, but they can compromise the application of readily available software. Surrogate entries can remedy the situation, although the outcome is generally unknown. Discretization of continuous attributes renders all data nominal and is helpful in dealing with missing values; in particular, no special handling is required for different attribute types. A number of classifiers exist or can be reformulated for this representation. Some classifiers can be reinvented as data completion methods. In this work the Decision Tree, Nearest Neighbour, and Naive Bayesian methods are demonstrated to have the required aptness. An approach is implemented whereby the entered missing values are not necessarily a close match of the true data; however, they are intended to cause the least hindrance for classification. The proposed techniques find their application particularly in medical diagnostics. Where clinical data represents a number of related conditions, taking the Cartesian product of class values of the underlying sub-problems allows narrowing down of the selection of missing value substitutes. Real-world data examples, some publicly available, are used for testing. The proposed and benchmark methods are compared by classifying the data before and after missing value imputation, indicating a significant improvement.
Visual character N-grams for classification and retrieval of radiological images
- Kulkarni, Pradnya, Stranieri, Andrew, Kulkarni, Siddhivinayak, Ugon, Julien, Mittal, Manish
- Authors: Kulkarni, Pradnya , Stranieri, Andrew , Kulkarni, Siddhivinayak , Ugon, Julien , Mittal, Manish
- Date: 2014
- Type: Text , Journal article
- Relation: International Journal of Multimedia & Its Applications Vol. 6, no. 2 (April 2014), p. 35-49
- Full Text:
- Reviewed:
- Description: Diagnostic radiology struggles to maintain high interpretation accuracy. Retrieval of past similar cases would help the inexperienced radiologist in the interpretation process. The character n-gram model has been effective in the text retrieval context in languages such as Chinese, where there are no clear word boundaries. We propose the use of a visual character n-gram model for the representation of images for classification and retrieval purposes. Regions of interest in mammographic images are represented with the character n-gram features. These features are then used as input to a back-propagation neural network for classification of regions into normal and abnormal categories. Experiments on the miniMIAS database show that character n-gram features are useful in classifying the regions into normal and abnormal categories. Promising classification accuracies are observed (83.33%) for fatty background tissue, warranting further investigation. We argue that classifying regions of interest would reduce the number of comparisons necessary for finding similar images in the database and hence would reduce the time required for retrieval of past similar cases.
Rule-based classifiers and meta classifiers for identification of cardiac autonomic neuropathy progression
- Jelinek, Herbert, Kelarev, Andrei, Stranieri, Andrew, Yearwood, John
- Authors: Jelinek, Herbert , Kelarev, Andrei , Stranieri, Andrew , Yearwood, John
- Date: 2012
- Type: Text , Journal article
- Relation: International Journal of Information Science and Computer Mathematics Vol. 5, no. 2 (2012), p. 49-53
- Full Text:
- Reviewed:
- Description: We investigate and compare several rule-based classifiers and meta classifiers in their ability to obtain multi-class classifications of cardiac autonomic neuropathy (CAN) and its progression. The best results obtained in our experiments are significantly better than the outcomes published previously in the literature for analogous CAN identification tasks or simpler binary classification tasks.
Feature selection using misclassification counts
- Bagirov, Adil, Yatsko, Andrew, Stranieri, Andrew
- Authors: Bagirov, Adil , Yatsko, Andrew , Stranieri, Andrew
- Date: 2011
- Type: Conference proceedings , Unpublished work
- Relation: Proceedings of the 9th Australasian Data Mining Conference (AusDM 2011), 51-62. Conferences in Research and Practice in Information Technology (CRPIT), Vol. 121.
- Full Text:
- Description: Dimensionality reduction of the problem space, through detection and removal of variables that contribute little or nothing to classification, can relieve the computational load and instance acquisition effort, considering that all the data attributes are accessed each time around. The approach to feature selection in this paper is based on the concept of coherent accumulation of data about class centers with respect to coordinates of informative features. Ranking is done on the degree to which different variables exhibit random characteristics. The results are verified using the Nearest Neighbor classifier. This also helps to address feature irrelevance and redundancy, which ranking does not immediately decide. Additionally, feature ranking methods from different independent sources are brought in for direct comparison.
A classification algorithm that derives weighted sum scores for insight into disease
- Quinn, Anthony, Stranieri, Andrew, Yearwood, John, Hafen, Gaudenz
- Authors: Quinn, Anthony , Stranieri, Andrew , Yearwood, John , Hafen, Gaudenz
- Date: 2009
- Type: Text , Conference paper
- Relation: Paper presented at Third Australasian Workshop on Health Informatics and Knowledge Management (HIKM 2009), Wellington, New Zealand : Vol. 97, p. 13-17
- Full Text:
- Description: Data mining is often performed with datasets associated with diseases in order to increase insights that can ultimately lead to improved prevention or treatment. Classification algorithms can achieve high levels of predictive accuracy but have limited application for facilitating the insight that leads to deeper understanding of aspects of the disease. This is because the representation of knowledge that arises from classification algorithms is too opaque, too complex or too sparse to facilitate insight. Clustering, association and visualisation approaches enable greater scope for clinicians to be engaged in a way that leads to insight; however, predictive accuracy is compromised or non-existent. This research investigates the practical applications of Automated Weighted Sum (AWSum), a classification algorithm that provides accuracy comparable to other techniques whilst providing some insight into the data. This is achieved by calculating a weight for each feature value that represents its influence on the class value. Clinicians are very familiar with weighted sum scoring scales, so the internal representation is intuitive and easily understood. This paper presents results from the use of the AWSum approach with data from patients suffering from Cystic Fibrosis.
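To make the notion of a per-feature-value weight concrete, here is a minimal sketch of a weighted sum scorer for a two-class problem. It is not the published AWSum formulation: the weight definition used here (the difference between the two conditional class probabilities given the feature value), the function names `fit_weights` and `score`, and the toy records are all assumptions for illustration only.

```python
from collections import defaultdict


def fit_weights(rows, labels):
    """Return {(feature, value): weight} with weights in [-1, 1].

    rows   : list of dicts mapping feature name -> nominal value
    labels : list of 0/1 class labels
    Assumed weight: P(class 1 | value) - P(class 0 | value).
    """
    positives = defaultdict(int)
    totals = defaultdict(int)
    for row, label in zip(rows, labels):
        for item in row.items():
            totals[item] += 1
            positives[item] += label
    # P(1|v) - P(0|v) simplifies to 2 * P(1|v) - 1.
    return {item: 2 * positives[item] / totals[item] - 1 for item in totals}


def score(weights, row):
    """Sum the weights of the feature values present; unseen values add 0."""
    return sum(weights.get(item, 0.0) for item in row.items())


# Hypothetical toy records; feature names and values are made up.
rows = [{"a": "x", "b": "p"}, {"a": "x", "b": "q"},
        {"a": "y", "b": "p"}, {"a": "y", "b": "q"}]
labels = [1, 1, 0, 0]
weights = fit_weights(rows, labels)
```

A positive total score would point towards class 1 and a negative one towards class 0, and each individual weight can be read directly as the direction and strength of that feature value's influence, which is the kind of transparency the description attributes to weighted sum scoring scales.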
AWSum - Combining classification with knowledge acquisition
- Quinn, Anthony, Stranieri, Andrew, Yearwood, John, Hafen, Gaudenz, Jelinek, Herbert
- Authors: Quinn, Anthony , Stranieri, Andrew , Yearwood, John , Hafen, Gaudenz , Jelinek, Herbert
- Date: 2008
- Type: Text , Journal article
- Relation: International Journal of Software and Informatics Vol. 2, no. 2 (2008), p. 199-214
- Full Text: false
- Reviewed:
- Description: Many classifiers achieve high levels of accuracy but have limited applicability in real-world situations because they do not lead to a greater understanding of, or insight into, the way features influence the classification. In areas such as health informatics, a classifier that clearly identifies the influences on classification can be used to direct research and formulate interventions. This research investigates the practical applications of Automated Weighted Sum (AWSum), a classifier that provides accuracy comparable to other techniques whilst providing insight into the data. This is achieved by calculating a weight for each feature value that represents its influence on the class value. The merits of this approach in classification and insight are evaluated on Cystic Fibrosis and diabetes datasets with positive results.