- Title
- Effective and efficient itemset pattern summarization: regression-based approaches
- Creator
- Jin, Ruoming; Abu-ata, Muad; Xiang, Yang; Ruan, Ning
- Date
- 2008
- Type
- Text; Conference paper
- Identifier
- http://researchonline.federation.edu.au/vital/access/HandleResolver/1959.17/76161
- Identifier
- vital:7493
- Identifier
- https://doi.org/10.1145/1401890.1401941
- Identifier
- ISBN:978-1-60558-193-4
- Abstract
- In this paper, we propose a set of novel regression-based approaches to effectively and efficiently summarize frequent itemset patterns. Specifically, we show that the problem of minimizing the restoration error for a set of itemsets based on a probabilistic model corresponds to a non-linear regression problem. We show that under certain conditions, we can transform the non-linear regression problem to a linear regression problem. We propose two new methods, k-regression and tree-regression, to partition the entire collection of frequent itemsets in order to minimize the restoration error. The k-regression approach, employing a K-means type clustering method, guarantees that the total restoration error achieves a local minimum. The tree-regression approach employs a decision-tree type of top-down partition process. In addition, we discuss alternatives to estimate the frequency for the collection of itemsets being covered by the k representative itemsets. The experimental evaluation on both real and synthetic datasets demonstrates that our approaches significantly improve the summarization performance in terms of both accuracy (restoration error) and computational cost.
- Relation
- KDD '08 Proceedings of the 14th ACM SIGKDD international conference on Knowledge discovery and data mining
- Rights
- This metadata is freely available under a CC0 license