DSpace


Please use this identifier to cite or link to this item: http://hdl.handle.net/123456789/15264

Title: An efficient stream mining technique
Authors: Hatim A. Aboalsamh
Alaaeldin M. Hafez
Ghazy M. R. Assassa
Issue Date: 2008
Abstract: Stream analysis is considered a crucial component of strategic control across a broad variety of disciplines in business, science and engineering. Stream data is a sequence of observations collected over intervals of time. Each data stream describes a phenomenon. Analysis of stream data includes discovering trends (or patterns) in a stream sequence. In the last few years, data mining has emerged and been recognized as a new technology for data analysis. Data mining is the process of discovering potentially valuable patterns, associations, trends, sequences and dependencies in data. Data mining techniques can discover information that many traditional business analysis and statistical techniques fail to deliver. In our study, we emphasize the use of data mining techniques on data streams, where mining techniques and tools are used in an attempt to recognize, anticipate and learn the stream behavior in relation to different factors, whether directly related or seemingly unrelated. Targeted data are sequences of observations collected over intervals of time. Each sequence describes a phenomenon or a factor. Such factors could have either a direct or an indirect impact on the stream data under study. Examples of factors with direct impact include yearly budgets and expenditures, taxation, local stock prices, unemployment rates, inflation rates, fallen angels, and rising odds for upgrades. Indirect factors could include any phenomena in the local or global environments, such as global stock prices, education expenditures, weather conditions, employment strategies, and medical services. Analysis of the data includes discovering trends (or patterns) and associations between sequences in order to generate non-trivial knowledge. In this paper, we propose a data mining technique to predict the dependency between factors that affect performance.
The proposed technique consists of three phases: (a) for each data sequence that represents a chosen phenomenon, generate its trend sequences; (b) discover maximal frequent trend patterns and generate pattern vectors that keep information about those patterns; and (c) use the trend pattern vectors to predict future factor sequences.
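
The three phases can be illustrated with a minimal Python sketch. It is not the authors' implementation: the abstract does not specify how trends are encoded, how patterns are counted, or what a pattern vector contains. The sketch assumes a three-symbol encoding (U for up, D for down, S for steady), contiguous patterns counted by number of occurrences, and a plain dictionary of pattern counts standing in for the pattern vectors. All function names and parameters (trend_sequence, frequent_patterns, maximal_patterns, predict_next, tolerance, min_support, max_length) are hypothetical.

```python
from collections import Counter


def trend_sequence(values, tolerance=0.0):
    """Phase (a), assumed encoding: one symbol per consecutive pair of values."""
    symbols = []
    for previous, current in zip(values, values[1:]):
        change = current - previous
        if change > tolerance:
            symbols.append("U")   # value went up
        elif change < -tolerance:
            symbols.append("D")   # value went down
        else:
            symbols.append("S")   # change within tolerance: steady
    return "".join(symbols)


def frequent_patterns(trends, min_support, max_length):
    """Phase (b), assumed definition: contiguous patterns seen at least min_support times."""
    counts = Counter()
    for length in range(1, max_length + 1):
        for start in range(len(trends) - length + 1):
            counts[trends[start:start + length]] += 1
    # The returned dict (pattern -> count) stands in for the pattern vectors.
    return {p: c for p, c in counts.items() if c >= min_support}


def maximal_patterns(frequent):
    """Keep frequent patterns that are not contained in a longer frequent pattern."""
    return {p: c for p, c in frequent.items()
            if not any(p != q and p in q for q in frequent)}


def predict_next(trends, frequent):
    """Phase (c), assumed rule: extend the longest suffix that starts a frequent pattern."""
    longest = max((len(p) for p in frequent), default=0)
    for k in range(min(longest - 1, len(trends)), -1, -1):
        suffix = trends[len(trends) - k:]
        candidates = {p: c for p, c in frequent.items()
                      if len(p) == k + 1 and p.startswith(suffix)}
        if candidates:
            # Choose the most frequently observed continuation of the suffix.
            return max(candidates, key=candidates.get)[-1]
    return None  # no frequent pattern applies


if __name__ == "__main__":
    observations = [1, 2, 3, 2, 3, 4, 3, 4, 5]   # made-up illustrative values
    trends = trend_sequence(observations)
    frequent = frequent_patterns(trends, min_support=2, max_length=3)
    print(trends)
    print(sorted(maximal_patterns(frequent)))
    print(predict_next(trends, frequent))
```

Run on the made-up series above, the sketch encodes the observations as a trend string, lists the maximal frequent patterns, and predicts the next trend symbol. Relating one factor's trend sequence to another's, which the abstract describes as discovering associations between sequences, is outside the scope of this sketch.
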
URI: http://hdl.handle.net/123456789/15264
Appears in Collections: College of Computer and Information Sciences

Files in This Item:

File: DrHatim_Journ_1.docx
Size: 13.02 kB
Format: Microsoft Word XML

Items in DSpace are protected by copyright, with all rights reserved, unless otherwise indicated.

 
