VQSVM: A case study for incorporating prior domain knowledge into inductive machine learning

Ting Yu, Simeon Simoff, Tony Jan

Research output: Contribution to journal › Article › peer-review

28 Citations (Scopus)


When dealing with real-world problems, there is a considerable amount of prior domain knowledge that can provide insights into various aspects of the problem. However, many machine learning methods rely solely on the data sets for their learning phase and do not take into account any explicitly expressed domain knowledge. This paper proposes a framework that investigates and enables the incorporation of prior domain knowledge with respect to three key characteristics of inductive machine learning algorithms: consistency, generalization and convergence. The framework is used to review, classify and analyse key existing approaches to incorporating domain knowledge into inductive machine learning, as well as to consider the risks of doing so. The paper also demonstrates the design of a novel hierarchical semi-parametric machine learning method capable of incorporating prior domain knowledge. The method, VQSVM, extends the support vector machine (SVM) family of methods with vector quantization (VQ) techniques to address the problem of learning from imbalanced data sets. The paper presents the results of testing the method on a collection of imbalanced data sets with various imbalance ratios and various numbers of subclasses. The learning process of the VQSVM method uses some domain knowledge to solve the problem of fitting imbalanced data. The experiments in the paper demonstrate that enabling the incorporation of prior domain knowledge into the SVM framework is an effective way to overcome the sensitivity of SVM to the imbalance ratio in a data set.
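The abstract does not spell out the algorithm, so the following is only a minimal sketch of one plausible reading of the VQ step: the majority class is compressed into a small codebook of prototypes so that both classes have similar sizes before an SVM is trained. Plain k-means stands in for the vector quantizer here, and the function names (`kmeans_codebook`, `rebalance`), the choice of codebook size, and the toy data are all assumptions made for illustration, not details taken from the paper.

```python
import random


def squared_distance(a, b):
    """Squared Euclidean distance between two points given as tuples."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def mean_point(points):
    """Coordinate-wise mean of a non-empty list of points."""
    n = len(points)
    return tuple(sum(coords) / n for coords in zip(*points))


def kmeans_codebook(points, k, iterations=20, seed=0):
    """Plain k-means (Lloyd's algorithm), used here as the vector quantizer.

    Returns k codewords (prototypes) summarizing `points`.
    """
    rng = random.Random(seed)
    codebook = rng.sample(points, k)  # initialize from k distinct data points
    for _ in range(iterations):
        cells = [[] for _ in range(k)]
        for p in points:
            nearest = min(range(k), key=lambda i: squared_distance(p, codebook[i]))
            cells[nearest].append(p)
        # Move each codeword to the mean of its cell; keep it if the cell is empty.
        codebook = [
            mean_point(cell) if cell else codebook[i]
            for i, cell in enumerate(cells)
        ]
    return codebook


def rebalance(majority, minority):
    """Replace the majority class by as many prototypes as there are
    minority examples (an assumed, illustrative choice of codebook size)."""
    k = min(len(minority), len(majority))
    prototypes = kmeans_codebook(majority, k)
    X = prototypes + list(minority)
    y = [-1] * len(prototypes) + [+1] * len(minority)
    return X, y


# Toy data with a 20:1 imbalance ratio (synthetic, for illustration only).
rng = random.Random(42)
majority = [(rng.gauss(0, 1), rng.gauss(0, 1)) for _ in range(200)]
minority = [(rng.gauss(3, 0.5), rng.gauss(3, 0.5)) for _ in range(10)]

X, y = rebalance(majority, minority)
print(len(X), y.count(-1), y.count(+1))  # 20 10 10
```

The balanced pair `(X, y)` could then be passed to any SVM implementation. The hierarchical refinement that the abstract mentions is not reproduced in this sketch.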

Original language: English
Pages (from-to): 2614-2623
Number of pages: 10
Issue number: 13-15
Publication status: Published - Aug 2010
Externally published: Yes


  • Imbalance data
  • Inductive machine learning
  • Prior domain knowledge
  • Support vector machine

