Optimization based clustering and classification algorithms in analysis of microarray gene expression data sets
- Authors: Mardaneh, Karim
- Date: 2007
- Type: Text , Thesis , PhD
- Description: Doctor of Philosophy
- Description: Bioinformatics and computational biology are relatively new areas that draw on techniques from computer science, informatics, biochemistry, applied mathematics and other fields to solve biological problems. In recent years the development of new molecular genetics technologies, such as DNA microarrays, has led to the simultaneous measurement of the expression levels of thousands and even tens of thousands of genes. Microarray gene expression technology has facilitated the study of genomic structure and the investigation of biological systems. The numerical output of this technology takes the form of microarray gene expression data sets. These data sets contain a very large number of genes and a relatively small number of samples, and their precise analysis requires robust and suitable computer software. Because of this, only a few existing algorithms are applicable to them, so more efficient methods for solving clustering, gene selection and classification problems on gene expression data sets are required, and those methods need to be computationally feasible and less expensive. The aim of this thesis is to develop new algorithms for solving clustering, gene selection and data classification problems on gene expression data sets. Clustering in gene expression data sets is a challenging problem. The increasing use of DNA microarray-based tumour gene expression profiles for cancer diagnosis requires more efficient methods to solve clustering problems on these profiles. Different algorithms for clustering of genes have been proposed; however, few algorithms can be applied to the clustering of samples. The k-means algorithm is among the very few clustering algorithms applicable to microarray gene expression data sets; however, it is not efficient for solving clustering problems when the number of genes is in the thousands, and it is very sensitive to the choice of a starting point. Additionally, when the number of clusters is relatively large, this algorithm gives local minima which can differ significantly from the global solution. Over the last several years different approaches have been proposed to improve the global search properties of the k-means algorithm. One of them is the global k-means algorithm; however, this algorithm is not efficient when data are sparse. In this thesis we developed a new version of the global k-means algorithm, the modified global k-means algorithm, which is effective for solving clustering problems in gene expression data sets. In a microarray gene expression data set, in many cases only a small fraction of genes are informative, whereas most of them are non-informative and contribute noise. Therefore the development of gene selection algorithms that allow us to remove as many non-informative genes as possible is very important. In this thesis we developed a new overlapping gene selection algorithm. This algorithm is based on calculating overlaps of different genes. It considerably reduces the number of genes and is efficient in finding a subset of informative genes. Over the last decade different approaches have been proposed to solve supervised data classification problems in gene expression data sets. In this thesis we developed a new approach which is based on so-called max-min separability and compared it with the other approaches. Max-min separability is equivalent to piecewise linear separability. An incremental algorithm is presented to compute piecewise linear functions separating two sets. This algorithm is applied along with a special gene selection algorithm. In this thesis, all new algorithms have been tested on 10 publicly available gene expression data sets, and our numerical results demonstrate the efficiency of the new algorithms that were developed in the framework of this research.
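The thesis abstract above describes max-min separability as equivalent to piecewise linear separability. As a rough illustration only, the sketch below evaluates a max-min of linear functions, f(x) = max over groups of (min over functions in the group of <w, x> - b), on two small point sets that no single hyperplane can separate. The hyperplanes here are hand-picked for the toy data and are not computed by the incremental algorithm from the thesis; the function name `max_min`, the `groups` layout, and the sign convention (negative on one set, positive on the other) are assumptions made for this example.

```python
import numpy as np

def max_min(x, groups):
    """Evaluate f(x) = max_i min_j (<w_ij, x> - b_ij).

    groups is a list of (W, b) pairs (a hypothetical layout): each row of W
    together with the matching entry of b defines one linear function.
    """
    return max(float(np.min(W @ x - b)) for W, b in groups)

# Hand-picked hyperplanes for an XOR-like toy problem (assumption, not from the thesis).
groups = [
    # min(x1 - 0.5, 0.5 - x2)
    (np.array([[1.0, 0.0], [0.0, -1.0]]), np.array([0.5, -0.5])),
    # min(x2 - 0.5, 0.5 - x1)
    (np.array([[0.0, 1.0], [-1.0, 0.0]]), np.array([0.5, -0.5])),
]

set_a = np.array([[0.0, 0.0], [1.0, 1.0]])  # expected on the negative side
set_b = np.array([[0.0, 1.0], [1.0, 0.0]])  # expected on the positive side

values_a = [max_min(x, groups) for x in set_a]
values_b = [max_min(x, groups) for x in set_b]
print("set A:", values_a)
print("set B:", values_b)
```

The two sets are not linearly separable, yet the piecewise linear function takes opposite signs on them, which is the property the abstract relies on.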
Modified global k-means algorithm for clustering in gene expression data sets
- Bagirov, Adil, Mardaneh, Karim
- Authors: Bagirov, Adil , Mardaneh, Karim
- Date: 2006
- Type: Text , Conference paper
- Relation: Paper presented at Intelligent Systems for Bioinformatics 2006, proceedings of the AI 2006 Workshop on Intelligent Systems of Bioinformatics, Hobart, Tasmania : 4th December, 2006
- Description: Clustering in gene expression data sets is a challenging problem. Different algorithms for clustering of genes have been proposed. However, due to the large number of genes, only a few algorithms can be applied to the clustering of samples. The k-means algorithm and its variations are among those algorithms. But these algorithms in general can converge only to local minima, and these local minima differ significantly from global solutions as the number of clusters increases. Over the last several years different approaches have been proposed to improve the global search properties of the k-means algorithm and its performance on large data sets. One of them is the global k-means algorithm. In this paper we develop a new version of the global k-means algorithm: the modified global k-means algorithm, which is effective for solving clustering problems in gene expression data sets. We present preliminary computational results using gene expression data sets which demonstrate that the modified global k-means algorithm improves, sometimes significantly, on the results obtained by the k-means and global k-means algorithms.
- Description: E1
- Description: 2003001713
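To make the incremental idea behind the global k-means algorithm mentioned in the abstract above more concrete, here is a minimal sketch of the plain global k-means scheme as it is commonly described: start from the one-cluster solution, then add one center at a time, trying every data point as the starting position of the new center and keeping the best result. This is not the modified algorithm from the paper, whose starting-point selection is not given in the abstract; the function names `kmeans` and `global_kmeans` and the toy data are assumptions for illustration.

```python
import numpy as np

def kmeans(X, centers, n_iter=100):
    """Plain Lloyd iterations from the given starting centers.

    Returns the final centers and the sum of squared distances
    of each sample to its nearest center.
    """
    centers = centers.copy()
    for _ in range(n_iter):
        # Squared distance from every sample to every center.
        d = ((X[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
        labels = d.argmin(axis=1)
        new_centers = centers.copy()
        for j in range(len(centers)):
            members = X[labels == j]
            if len(members) > 0:  # keep an empty cluster's center unchanged
                new_centers[j] = members.mean(axis=0)
        converged = np.allclose(new_centers, centers)
        centers = new_centers
        if converged:
            break
    d = ((X[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
    return centers, float(d.min(axis=1).sum())

def global_kmeans(X, k):
    """Incrementally build a k-cluster solution, one center at a time."""
    centers = X.mean(axis=0, keepdims=True)  # optimal one-cluster solution
    for _ in range(1, k):
        best = None
        for x in X:  # try each sample as the start of the new center
            cand, err = kmeans(X, np.vstack([centers, x]))
            if best is None or err < best[1]:
                best = (cand, err)
        centers = best[0]
    d = ((X[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
    return centers, float(d.min(axis=1).sum())

# Toy data: rows are samples, columns are features (hypothetical values).
X = np.array([[0, 0], [0, 1], [1, 0], [10, 10], [10, 11], [11, 10]], dtype=float)
centers, error = global_kmeans(X, 2)
print(centers, error)
```

Because every sample is tried as a starting point at every step, this scheme runs k-means many times, which is one reason the abstracts stress efficiency on data sets with thousands of genes.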
Small-to-medium enterprises and economic growth : A comparative study of clustering techniques
- Authors: Mardaneh, Karim
- Date: 2012
- Type: Text , Journal article
- Relation: Journal of Modern Applied Statistical Methods Vol. 11, no. 2 (2012), p. 469-478
- Description: Small-to-medium enterprises (SMEs) in regional (non-metropolitan) areas are considered when economic planning may require large data sets and sophisticated clustering techniques. The economic growth of regional areas was investigated using four clustering algorithms. Empirical analysis demonstrated that the modified global k-means algorithm outperformed other algorithms. © 2012 JMASM, Inc.
- Description: 2003010429
Industry type and business size on economic growth: Comparing Australia's Regional and Metropolitan areas
- Authors: Mardaneh, Karim
- Date: 2011
- Type: Text , Conference proceedings
- Relation: 56th Annual ICSB World Conference; Back to the Future - Changes in Perspectives of Global Entrepreneurship and Innovation, Stockholm, Sweden, 15-18 June, 2011
- Description: While the main body of literature regarding small-to-medium enterprises is focused on formation and growth, there is insufficient research about the role of both (a) firm size and (b) location in economic growth. The role of firm size and industrial structure in economic growth has been examined by some researchers. Pagano (2003) and Pagano and Schivardi (2000) identified a positive association between average firm size and growth, and Carree and Thurik (1999) found evidence that a low number of large firms in an industry could lead to higher value-added growth. The current study attempts to investigate the impact of industry structure and of businesses operating within these industries on economic growth. This paper uses the k-means clustering algorithm to cluster Statistical Local Areas. Regression analysis is utilised to identify drivers of economic growth. Preliminary results suggest that size of business may act as a driver of economic growth but that the impact could vary based on location.
Small businesses, Institutions, and Regional Incomes
- Mardaneh, Karim, O'Malley, Tony
- Authors: Mardaneh, Karim , O'Malley, Tony
- Date: 2014
- Type: Text , Conference proceedings
- Relation: 59th ICSB World Conference, Entrepreneurship and sustainability, Dublin, 11th June, 2014
- Description: Regional small businesses may rely on customers who earn income in local and global markets. Small business must transact with suppliers of knowledge and resources, transform those resources into innovative and saleable products or services, and transact with customers. Transformation, transaction and social activities, and the institutions which support them, are necessary for successful small businesses. Regional income and small businesses depend on innovation and trade provided by social and transaction institutions. In this paper we demonstrate this proposition empirically using a model and by investigating the relationship between regional income, transaction institutions, transformation institutions, and social institutions for 140 functional economic regions (FERs) in Australia. The model suggests that social institutions create a macro-environment in which transaction institutions and the transformation and trading activities of businesses can thrive, and help to generate regional income and prosperity. We follow others (Cooke et al., 2007) in arguing that strong transaction institutions are a necessary condition for regional innovation. Social institutions complement transaction institutions by providing education and training, arts and recreation, health care and social services. In the studies reported in this paper the capacity for search and intermediation of exchanges of all kinds (goods, services, knowledge etc.) is measured by the share of transaction institutions in regional employment. The capacity of social institutions is measured by the share of employment in social institutions. We argue that the market failures which cause regional failures to thrive may be made solvable by mobilising market making services to extend and provide governance for regional transactions with faraway markets.
Economic resilience of regions under crises : A study of the Australian economy
- Courvisanos, Jerry, Jain, Ameeta, Mardaneh, Karim
- Authors: Courvisanos, Jerry , Jain, Ameeta , Mardaneh, Karim
- Date: 2016
- Type: Text , Journal article
- Relation: Regional Studies Vol. 50, no. 4 (2016), p. 629-643
- Description: Economic resilience of regions under crises: a study of the Australian economy, Regional Studies. Identifying patterns of economic resilience in regions by industry categories is the focus of this paper. Patterns emerge from adaptive capacity in four distinct functional groups of local government regions in Australia, in respect of their resilience from shocks on specific industries. A model of regional adaptive cycles around four sequential phases - reorganization, exploitation, conservation and release - is adopted as the framework for recognizing such patterns. A data-mining method utilizes a k-means algorithm to evaluate the impact of two major shocks - a 13-year drought and the Global Financial Crisis - on four functional groups of regions, using census data from 2001, 2006 and 2011. © 2015 Regional Studies Association.