Multi-Classifier Systems for Effective Mining From Noisy Data Streams
|
Dynamic
Classifier Selection for Effective Mining from Noisy Data Streams Recently, mining from data streams has become an important and challenging task for many real-world applications such as credit card fraud protection and sensor networking. One popular solution is to separate stream data into chunks, learn a base classifier from each chunk, and then integrate all base classifiers for effective classification. In this paper, we propose a new dynamic classifier selection (DCS) mechanism to integrate base classifiers for effective mining from data streams. The proposed algorithm dynamically selects a single "best" classifier to classify each test instance at run time. Our scheme uses statistical information from attribute values, and uses each attribute to partition the evaluation set into disjoint subsets, followed by a procedure that evaluates the classification accuracy of each base classifier on these subsets. Given a test instance, its attribute values determine the subsets that the similar instances in the evaluation set have constructed, and the classifier with the highest classification accuracy on those subsets is selected to classify the test instance. Experimental results and comparative studies demonstrate the efficiency and efficacy of our method. Such a DCS scheme appears to be promising in mining data streams with dramatic concept drifting or with a significant amount of noise, where the base classifiers are likely conflictive or have low confidence. In this section, we present a new dynamic classifier selection scheme,
called AO-DCS (Attribute-Oriented Dynamic Classifier Selection). We use
attribute values of instances to partition the evaluation set into subsets
for evaluation purpose. If the instances in a dataset have only one attribute,
and we use this attribute to partition the evaluation set into disjoint
subsets with each subset corresponding to one value of this attribute,
the classification accuracy of each base classifier with these subsets
is the performance of the classifier with the instances characterized
by each attribute value. Then each base classifier's performance will
reflect its domain of expertise. Our AO-DCS takes the following three
steps.
Fig.2. Evaluating each base classifier on constructed subsets
Fig. 3. Dynamic classifier selection of AO-DCS
In many applications, data is not static but arrives in data streams,
and the stream data is also characterized by drifting concepts. In other
words, the underlying data generation models, or the concepts that we
try to learn from the data, are constantly evolving, even dramatically.
Meanwhile, real-world stream data is never perfect and can often suffer
from corruptions (noise) that may impact interpretations of the data,
models created from the data and decisions made based on the data. Due
to the inherent huge volume of the size, it is actually hard to apply
general data cleansing mechanisms to stream data for better data quality.
Therefore, the above two facts (concept drifting and noise) imply that
the classifiers learned from a small portion of the stream may vary significantly
in performance, which makes DCS a promising solution for effective mining
from a real-world data stream. In this section, we propose a framework
to apply AO-DCS to mine noisy data streams. This framework is general
enough to allow any exiting learning algorithms to be incorporated to
handle any real-world data streams. As shown in Fig. 4, we first partition streaming data into a series of
chunks, S1, S2, .. Si,.., each of which is small enough to be processed
by an induction algorithm at one time. Then we learn a base classifier
Ci from each chunk Si. To evaluate all base classifiers (in the case that
the number of base classifiers is too large, we can keep only the most
recent K classifiers) and determine the "best" one for each
test instance, we will dynamically construct an evaluation set Z (using
the most recent instances, because they are likely consistent with the
current test instances). When classifying a test instance, Ik, we will
employ AO-DCS (and the evaluation set Z) to integrate existing base classifiers
and select the "best" classifier to classify Ik. In real-world situations, many data streams contain a certain level of
noise. There are two common types of noise: attribute noise and class
noise. In this paper, we assume the data stream suffers from a certain
level of class noise (which means the errors are introduced in the class
labels), and will extensively evaluate the performance of different DCS
schemes in handling noisy data, where various levels of class noise are
manually introduced (before the data partitioning) to simulate real-world
scenarios. This should provide interested readers with valuable knowledge
about mining from real-world data streams.
Figure 4. Applying DCS in noisy data streams
|