Unsupervised classifier selection based on two-sample test
Research output: Contribution to journal › Article › Scientific › peer-review
Journal: Lecture Notes in Computer Science
Publication status: Published - 2008
Publication type: A1 Journal article-refereed
We propose a well-founded method for ranking a pool of m trained classifiers by their suitability for the current input of n instances. It can be used both for dynamically selecting a single classifier and for weighting the base classifiers in an ensemble. No classifiers are executed during the process; consequently, the n instances on which the selection is based may be unlabeled, which is rare in previous work. The method works by comparing the training distributions of the classifiers with the input distribution. Hence, the feasibility for unsupervised classification comes at the price of maintaining a small sample of the training data for each classifier in the pool.
In the general case our method takes time O(m(t + n)²) and space O(mt + n), where t is the size of the stored sample from the training distribution for each classifier. However, for the commonly used Gaussian and polynomial kernel functions we can execute the method more efficiently. In our experiments the proposed method was found to be accurate.
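The abstract does not spell out which two-sample statistic is used, so the following is only a minimal sketch of the idea it describes: for each classifier, compare its stored training sample with the current input sample via a kernel two-sample statistic (here the biased squared MMD estimator with a Gaussian kernel, an assumption), and rank classifiers by closeness. The function names `mmd2` and `rank_classifiers` are illustrative, not from the paper. The pairwise kernel evaluations match the quadratic O((t + n)²) per-classifier term in the stated complexity.

```python
import numpy as np

def gaussian_kernel(x, y, sigma=1.0):
    """Gaussian (RBF) kernel between two vectors."""
    return np.exp(-np.sum((x - y) ** 2) / (2 * sigma ** 2))

def mmd2(sample_a, sample_b, kernel=gaussian_kernel):
    """Biased estimator of the squared maximum mean discrepancy between
    two samples; quadratic in the combined sample size."""
    a = np.asarray(sample_a, dtype=float)
    b = np.asarray(sample_b, dtype=float)
    k_aa = np.mean([[kernel(x, y) for y in a] for x in a])
    k_bb = np.mean([[kernel(x, y) for y in b] for x in b])
    k_ab = np.mean([[kernel(x, y) for y in b] for x in a])
    return k_aa + k_bb - 2.0 * k_ab

def rank_classifiers(train_samples, inputs):
    """Rank classifier indices by how close each classifier's stored
    training sample is to the input sample (smaller MMD^2 ranks first).
    No classifier is executed, so the inputs may be unlabeled."""
    scores = [mmd2(sample, inputs) for sample in train_samples]
    return sorted(range(len(scores)), key=lambda i: scores[i])
```

For example, given unlabeled inputs drawn near the training distribution of classifier 0 but far from that of classifier 1, `rank_classifiers` would place classifier 0 first. The same scores could serve as (inverse) weights in an ensemble, as the abstract suggests.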