TY - CHAP
T1 - Combine vector quantization and support vector machine for imbalanced datasets
AU - Yu, Ting
AU - Debenham, John
AU - Jan, Tony
AU - Simoff, Simeon
PY - 2006
Y1 - 2006
N2 - For extremely imbalanced, high-dimensional datasets, standard machine learning techniques tend to be overwhelmed by the large classes. This paper rebalances skewed datasets by compressing the majority class. The approach combines Vector Quantization and Support Vector Machines into a new method, VQ-SVM, that rebalances datasets without significant information loss. Issues such as distortion and support vectors are discussed to address the trade-off between information loss and undersampling. Experiments compare VQ-SVM with a standard SVM on imbalanced datasets with varied imbalance ratios, and the results show that VQ-SVM outperforms the standard SVM, especially on extremely imbalanced large datasets.
AB - For extremely imbalanced, high-dimensional datasets, standard machine learning techniques tend to be overwhelmed by the large classes. This paper rebalances skewed datasets by compressing the majority class. The approach combines Vector Quantization and Support Vector Machines into a new method, VQ-SVM, that rebalances datasets without significant information loss. Issues such as distortion and support vectors are discussed to address the trade-off between information loss and undersampling. Experiments compare VQ-SVM with a standard SVM on imbalanced datasets with varied imbalance ratios, and the results show that VQ-SVM outperforms the standard SVM, especially on extremely imbalanced large datasets.
UR - http://www.scopus.com/inward/record.url?scp=33845530145&partnerID=8YFLogxK
U2 - 10.1007/978-0-387-34747-9_9
DO - 10.1007/978-0-387-34747-9_9
M3 - Chapter
AN - SCOPUS:33845530145
SN - 0387346554
SN - 9780387346557
T3 - IFIP International Federation for Information Processing
SP - 81
EP - 88
BT - Artificial Intelligence in Theory and Practice
A2 - Bramer, Max
ER -