TUTCRIS - Tampere University of Technology


An Optimized k-NN Approach for Classification on Imbalanced Datasets with Missing Data

Research output: peer-reviewed

Details

Original language: English
Title: Advances in Intelligent Data Analysis XV
Subtitle: 15th International Symposium, IDA 2016, Stockholm, Sweden, October 13-15, 2016, Proceedings
Publisher: Springer
Pages: 387-392
ISBN (electronic): 978-3-319-46349-0
ISBN (print): 978-3-319-46348-3
DOI - permanent links
Status: Published - 2016
OKM publication type: A4 Article in conference proceedings
Event: INTERNATIONAL SYMPOSIUM ON INTELLIGENT DATA ANALYSIS
Duration: 1 January 1900 → …

Publication series

Name: Lecture Notes in Computer Science
Volume: 9897
ISSN (print): 0302-9743

Conference

Conference: INTERNATIONAL SYMPOSIUM ON INTELLIGENT DATA ANALYSIS
Period: 1/01/00 → …

Abstract

In this paper, we describe our solution for the machine learning prediction challenge at IDA 2016. For the given problem of 2-class classification on an imbalanced dataset with missing data, we first develop an imputation method based on k-NN to estimate the missing values. Then we define a tailored representation for the given problem as an optimization scheme, which consists of learned distance and voting weights for k-NN classification. The proposed solution outperforms traditional classification methods such as SVM, AdaBoost, and Random Forests on the given challenge metric.
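
The abstract does not spell out the imputation rule or how the weights are learned, so the following is only a generic sketch of the two ideas it names: k-NN imputation of missing values, then k-NN classification with per-feature distance weights and per-class voting weights. All function names (`masked_distance`, `knn_impute`, `knn_predict`) and the hand-picked weights are hypothetical illustrations, not the authors' method; in the paper the weights are obtained by optimization rather than set by hand.

```python
import math

def masked_distance(a, b, feature_weights=None):
    """Weighted Euclidean distance over features observed (not None) in both rows."""
    total, used = 0.0, 0
    for j, (x, y) in enumerate(zip(a, b)):
        if x is None or y is None:
            continue
        w = 1.0 if feature_weights is None else feature_weights[j]
        total += w * (x - y) ** 2
        used += 1
    if used == 0:
        return math.inf  # no shared observed feature
    # Rescale so that rows sharing few features are not favoured.
    return math.sqrt(total * len(a) / used)

def knn_impute(rows, k=3):
    """Replace each None with the mean of that feature over the k nearest rows observing it."""
    filled = [list(r) for r in rows]
    for i, row in enumerate(rows):
        for j, value in enumerate(row):
            if value is not None:
                continue
            donors = sorted(
                (masked_distance(row, other), other[j])
                for m, other in enumerate(rows)
                if m != i and other[j] is not None
            )[:k]
            if donors:
                filled[i][j] = sum(v for _, v in donors) / len(donors)
    return filled

def knn_predict(train_x, train_y, query, k=3,
                feature_weights=None, class_weights=None):
    """k-NN vote in which each neighbour's vote is scaled by its class weight."""
    neighbours = sorted(
        (masked_distance(query, x, feature_weights), y)
        for x, y in zip(train_x, train_y)
    )[:k]
    votes = {}
    for _, label in neighbours:
        w = 1.0 if class_weights is None else class_weights.get(label, 1.0)
        votes[label] = votes.get(label, 0.0) + w
    return max(votes, key=votes.get)

# Toy data (made up for illustration): class 1 is the minority class.
rows = [[1.0, 2.0], [1.1, None], [5.0, 6.0], [0.9, 2.2]]
labels = [0, 0, 1, 0]
filled = knn_impute(rows, k=2)
plain = knn_predict(filled, labels, [4.8, 5.9], k=3)
weighted = knn_predict(filled, labels, [4.8, 5.9], k=3, class_weights={1: 3.0})
```

On this toy data the unweighted vote is dominated by the majority class even though the query lies next to the single minority sample; giving the minority class a larger voting weight reverses the decision. This is one simple way voting weights can counter class imbalance.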

Publication Forum level