Application of optimisation-based data mining techniques to tobacco control dataset
- Authors: Dzalilov, Zari , Zhang, J , Bagirov, Adil , Mammadov, Musa
- Date: 2010
- Type: Text , Journal article
- Relation: International Journal of Lean Thinking Vol. 1, no. 1 (2010), p. 27-41
- Full Text: false
- Reviewed:
- Description: Tobacco smoking is one of the leading causes of death around the world. Consequently, control of tobacco use is an important global public health issue. Tobacco control may be aided by development of theoretical and methodological frameworks for describing and understanding complex tobacco control systems. Linear regression and logistic regression are currently very popular statistical techniques for modeling and analyzing complex data in tobacco control systems. However, in tobacco markets, numerous interrelated factors nontrivially interact with tobacco control policies, such that policies and control outcomes are nonlinearly related.
A heuristic algorithm for solving the minimum sum-of-squares clustering problems
- Authors: Ordin, Burak , Bagirov, Adil
- Date: 2015
- Type: Text , Journal article
- Relation: Journal of Global Optimization Vol. 61, no. 2 (2015), p. 341-361
- Relation: http://purl.org/au-research/grants/arc/DP140103213
- Full Text: false
- Reviewed:
- Description: Clustering is an important task in data mining. It can be formulated as a global optimization problem which is challenging for existing global optimization techniques even in medium size data sets. Various heuristics were developed to solve the clustering problem. The global k-means and modified global k-means are among most efficient heuristics for solving the minimum sum-of-squares clustering problem. However, these algorithms are not always accurate in finding global or near global solutions to the clustering problem. In this paper, we introduce a new algorithm to improve the accuracy of the modified global k-means algorithm in finding global solutions. We use an auxiliary cluster problem to generate a set of initial points and apply the k-means algorithm starting from these points to find the global solution to the clustering problems. Numerical results on 16 real-world data sets clearly demonstrate the superiority of the proposed algorithm over the global and modified global k-means algorithms in finding global solutions to clustering problems.