A hierarchical VQSVM for imbalanced data sets

Ting Yu, Tony Jan, Simeon Simoff, John Debenham

Research output: Chapter in Book/Report/Conference proceeding > Conference contribution > peer-review

11 Citations (Scopus)

Abstract

This paper first introduces VQSVM, a hierarchical modelling method, and discusses some of its properties. It then applies VQSVM to a nonstandard learning environment: imbalanced data sets. On extremely imbalanced, high-dimensional data sets, standard machine learning techniques tend to be overwhelmed by the large classes. The hierarchical VQSVM combines a set of local models, i.e. codevectors produced by Vector Quantization, with a global model, i.e. a Support Vector Machine, to rebalance data sets without significant information loss. Issues such as distortion and support vectors are discussed to address the trade-off between information loss and undersampling rate. Experiments compare VQSVM with random resampling techniques on imbalanced data sets with varied imbalance ratios. The results show that VQSVM performs as well as or better than random resampling, especially on extremely imbalanced large data sets.
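The rebalancing step described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not say which Vector Quantization algorithm is used, so Lloyd's algorithm (k-means) is assumed here, and the function names `vector_quantize`, `distortion` and `rebalance` are hypothetical. The majority class is replaced by k codevectors, the minority class is kept unchanged, and the distortion is reported as a rough measure of the information lost by undersampling. Training the Support Vector Machine on the rebalanced set is left out.

```python
import random


def squared_distance(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def vector_quantize(points, k, iterations=20, seed=0):
    """Return k codevectors for `points` using Lloyd's algorithm (k-means).

    Assumption: k-means stands in for the unspecified VQ step of the paper.
    """
    rng = random.Random(seed)
    # Initialise the codebook with k distinct points drawn from the data.
    codevectors = [list(p) for p in rng.sample(points, k)]
    for _ in range(iterations):
        # Assign every point to the cell of its nearest codevector.
        cells = [[] for _ in range(k)]
        for p in points:
            nearest = min(range(k),
                          key=lambda j: squared_distance(p, codevectors[j]))
            cells[nearest].append(p)
        # Move each codevector to the mean of its cell; empty cells keep theirs.
        for j, cell in enumerate(cells):
            if cell:
                codevectors[j] = [sum(col) / len(cell) for col in zip(*cell)]
    return codevectors


def distortion(points, codevectors):
    """Mean squared distance from each point to its nearest codevector."""
    total = sum(min(squared_distance(p, c) for c in codevectors)
                for p in points)
    return total / len(points)


def rebalance(majority, minority, k, seed=0):
    """Replace the majority class by k codevectors and keep the minority class.

    Returns (X, y, d): the rebalanced samples, their labels
    (0 = majority codevector, 1 = minority sample) and the distortion.
    """
    codevectors = vector_quantize(majority, k, seed=seed)
    X = codevectors + [list(p) for p in minority]
    y = [0] * len(codevectors) + [1] * len(minority)
    return X, y, distortion(majority, codevectors)


# Synthetic 20:1 imbalanced data, used only for illustration.
rng = random.Random(42)
majority = [[rng.gauss(0, 1), rng.gauss(0, 1)] for _ in range(200)]
minority = [[rng.gauss(3, 0.5), rng.gauss(3, 0.5)] for _ in range(10)]

X, y, d = rebalance(majority, minority, k=10)
print(len(X), y.count(0), y.count(1), d)
```

Under these assumptions, `X` and `y` would then be passed to an SVM trainer as the global model. Choosing a smaller `k` undersamples the majority class more aggressively and generally raises the distortion, which is the trade-off the abstract refers to.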

Original language: English
Title of host publication: The 2007 International Joint Conference on Neural Networks, IJCNN 2007 Conference Proceedings
Pages: 518-523
Number of pages: 6
DOIs
Publication status: Published - 2007
Externally published: Yes
Event: 2007 International Joint Conference on Neural Networks, IJCNN 2007 - Orlando, FL, United States
Duration: 12 Aug 2007 to 17 Aug 2007

Publication series

Name: IEEE International Conference on Neural Networks - Conference Proceedings
ISSN (Print): 1098-7576

Conference

Conference: 2007 International Joint Conference on Neural Networks, IJCNN 2007
Country/Territory: United States
City: Orlando, FL
Period: 12/08/07 to 17/08/07
