SEED Miner - a scalable data mining framework

Date

2010-09

Publisher

Computer Science & Engineering Society c/o Department of Computer Science and Engineering, University of Moratuwa.

Abstract

In this paper we consider how the representation, access and organization of data drastically affect the performance of data mining techniques. The framework we propose uses vertical data representation, an emerging technique in which data is stored column by column rather than in the conventional horizontal, row-based layout, combined with a couple of compression schemes to support efficient data mining that scales to large datasets. The compression schemes in SEED Miner depend on this vertical representation. We also provide the results of empirical simulations to validate our analysis that WAH compression applied on top of vertical data provides scalability and efficiency for the applications and algorithms embedded in SEED Miner.
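
The idea of a vertical layout with word-aligned compression can be illustrated with a minimal Python sketch. It is not SEED Miner's implementation: the function names (`to_vertical`, `support`, `wah_encode`, `wah_decode`) are hypothetical, and the encoder is a simplified WAH-style (Word-Aligned Hybrid) scheme that emits tuples instead of packed 32-bit words. It assumes 31 payload bits per word, as in 32-bit WAH, and stores a trailing partial group as a shorter literal.

```python
GROUP_BITS = 31  # payload bits per word in 32-bit WAH (assumed word size)


def to_vertical(rows, columns):
    """Turn row-oriented records into one bit list per (column, value) pair."""
    bitmaps = {}
    for i, row in enumerate(rows):
        for col, value in zip(columns, row):
            bits = bitmaps.setdefault((col, value), [0] * len(rows))
            bits[i] = 1
    return bitmaps


def support(bitmap_a, bitmap_b):
    """Count rows where both bitmaps are set (bitwise AND, then popcount)."""
    return sum(a & b for a, b in zip(bitmap_a, bitmap_b))


def wah_encode(bits):
    """Encode bits as ('fill', bit, n_groups) or ('literal', group) words."""
    words = []
    for start in range(0, len(bits), GROUP_BITS):
        group = tuple(bits[start:start + GROUP_BITS])
        is_full = len(group) == GROUP_BITS
        if is_full and len(set(group)) == 1:
            bit = group[0]
            # Merge with the previous fill word when it has the same bit value.
            if words and words[-1][0] == "fill" and words[-1][1] == bit:
                words[-1] = ("fill", bit, words[-1][2] + 1)
            else:
                words.append(("fill", bit, 1))
        else:
            # Mixed groups (and a trailing partial group) are kept verbatim.
            words.append(("literal", group))
    return words


def wah_decode(words):
    """Expand the encoded words back into the original bit list."""
    bits = []
    for word in words:
        if word[0] == "fill":
            bits.extend([word[1]] * (word[2] * GROUP_BITS))
        else:
            bits.extend(word[1])
    return bits


rows = [("sunny", "yes"), ("rain", "no"), ("sunny", "no")]
bitmaps = to_vertical(rows, ("outlook", "play"))
both = support(bitmaps[("outlook", "sunny")], bitmaps[("play", "no")])

sparse = [0] * 62 + [1, 0, 1]
encoded = wah_encode(sparse)
```

In this sketch each attribute value becomes its own column of bits, so counting the rows that match two conditions reduces to an AND over two bitmaps. Long runs of identical bits, which are common in sparse columns, collapse into a single fill word.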
