- Title
- Consensus clustering and supervised classification for profiling phishing emails in internet commerce security
- Creator
- Dazeley, Richard; Yearwood, John; Kang, Byeongho; Kelarev, Andrei
- Date
- 2010
- Type
- Text; Conference paper
- Identifier
- http://researchonline.federation.edu.au/vital/access/HandleResolver/1959.17/64665
- Identifier
- vital:3805
- Identifier
- ISSN:0302-9743
- Abstract
- This article investigates internet commerce security applications of a novel combined method, which uses unsupervised consensus clustering algorithms in combination with supervised classification methods. First, a variety of independent clustering algorithms are applied to a randomized sample of data. Second, several consensus functions and sophisticated algorithms are used to combine these independent clusterings into one final consensus clustering. Third, the consensus clustering of the randomized sample is used as a training set to train several fast supervised classification algorithms. Finally, these fast classification algorithms are used to classify the whole large data set. One of the advantages of this approach is in its ability to facilitate the inclusion of contributions from domain experts in order to adjust the training set created by consensus clustering. We apply this approach to profiling phishing emails selected from a very large data set supplied by the industry partners of the Centre for Informatics and Applied Optimization. Our experiments compare the performance of several classification algorithms incorporated in this scheme. © 2010 Springer-Verlag Berlin Heidelberg.
- Publisher
- Daegu Springer-Verlag
- Relation
- Paper presented at 11th International Workshop on Knowledge Management and Acquisition for Smart Systems and Services, PKAW 2010 Vol. 6232 LNAI, p. 235-246
- Rights
- Copyright Springer
- Rights
- Open Access
- Rights
- This metadata is freely available under a CC0 license
- Subject
- Classification algorithm; Clusterings; Combined method; Consensus clustering; Consensus functions; Domain experts; Fast classification; Informatics; Large data; Phishing; Security application; Supervised classification; Training sets; Very large datum; Classification (of information); Electronic commerce; Electronic mail; Internet; Knowledge management; Management science; Clustering algorithms
- Full Text
- Reviewed
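
The four-step scheme described in the abstract (cluster a random sample several ways, merge the clusterings into a consensus, train a fast classifier on the consensus labels, then classify the whole data set) can be illustrated with a minimal sketch. Everything concrete here is an assumption made for illustration and is not taken from the paper: the 2-D points stand in for email feature vectors, `kmeans` with farthest-first initialisation stands in for the "variety of independent clustering algorithms", a majority-vote co-association rule stands in for the consensus functions, and a nearest-centroid rule stands in for the supervised classifiers.

```python
import random

def dist2(p, q):
    """Squared Euclidean distance between two points."""
    return sum((a - b) ** 2 for a, b in zip(p, q))

def mean(points):
    """Coordinate-wise mean of a non-empty list of points."""
    return tuple(sum(c) / len(points) for c in zip(*points))

def kmeans(points, k, start, iters=10):
    """Lloyd's k-means with farthest-first initialisation from points[start]."""
    centers = [points[start]]
    while len(centers) < k:
        centers.append(max(points, key=lambda p: min(dist2(p, c) for c in centers)))
    for _ in range(iters):
        labels = [min(range(k), key=lambda c: dist2(p, centers[c])) for p in points]
        for c in range(k):
            members = [p for p, l in zip(points, labels) if l == c]
            if members:
                centers[c] = mean(members)
    return labels

def consensus(clusterings, threshold=0.5):
    """Link two items when more than `threshold` of the clusterings co-cluster
    them, then return the connected components as consensus labels."""
    n = len(clusterings[0])
    parent = list(range(n))

    def find(i):
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    for i in range(n):
        for j in range(i + 1, n):
            agree = sum(c[i] == c[j] for c in clusterings) / len(clusterings)
            if agree > threshold:
                parent[find(i)] = find(j)
    roots = sorted({find(i) for i in range(n)})
    return [roots.index(find(i)) for i in range(n)]

def train_nearest_centroid(points, labels):
    """Step 3 stand-in: one centroid per consensus label."""
    return {l: mean([p for p, m in zip(points, labels) if m == l]) for l in set(labels)}

def classify(model, p):
    """Assign p to the label of the closest centroid."""
    return min(model, key=lambda l: dist2(p, model[l]))

# Hypothetical data: two well-separated groups of 100 points each.
rng = random.Random(0)

def blob(cx, cy):
    return [(cx + rng.uniform(-0.5, 0.5), cy + rng.uniform(-0.5, 0.5)) for _ in range(100)]

data = blob(0, 0) + blob(10, 10)

sample = rng.sample(data, 40)                        # step 1: randomized sample
runs = [(2, 0), (2, 7), (2, 13), (3, 3), (3, 21)]    # (k, start index) per clustering
clusterings = [kmeans(sample, k, s) for k, s in runs]
training_labels = consensus(clusterings)             # step 2: consensus clustering
# A domain expert could correct entries of training_labels at this point.
model = train_nearest_centroid(sample, training_labels)   # step 3: train classifier
full_labels = [classify(model, p) for p in data]          # step 4: classify everything
```

The co-association rule only asks whether two items share a cluster in each run, so it does not matter that the individual clusterings number their clusters differently or disagree on the number of clusters. Only the small sample is clustered repeatedly; the full data set is touched once, by the cheap classifier.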