In multimedia applications such as MPEG-4, an efficient model is required to encode and classify video objects such as humans, cars, and buildings. The Support Vector Machine (SVM) has recently been shown to be a good classifier; however, its large computational requirement has prohibited its use in real-time video processing applications. This paper proposes a model that enables the use of SVM in video applications by merging a multi-scale selective encoding/classification technique with a locality-enhanced SVM. The proposed model allows selected image scales of interest to be encoded and classified more accurately by a complex classifier such as SVM, while image scales of less significance are encoded and classified by a simpler encoder/classifier. The image scales of interest are readily selected within the multi-scale image processing paradigm. Because SVM is used to encode the visual object information of the significant image scales only, its use is efficient. An experiment with MPEG-4 video object encoding and classification shows that the performance of the proposed model is comparable with that of other models, but with significantly reduced computational requirements.
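The selective routing idea described above can be illustrated with a minimal sketch. It is not the paper's implementation: the function names, the significance scores, the threshold, and the two stand-in classifiers (a fixed linear decision function in place of a trained locality-enhanced SVM, and a mean-threshold rule in place of the simpler classifier) are all assumptions made for illustration.

```python
from typing import Callable, Dict, Sequence

Features = Sequence[float]
Classifier = Callable[[Features], str]


def make_linear_classifier(weights: Sequence[float], bias: float) -> Classifier:
    """Hypothetical stand-in for the complex classifier (e.g. an SVM)."""
    def classify(features: Features) -> str:
        score = sum(w * x for w, x in zip(weights, features)) + bias
        return "object" if score >= 0 else "background"
    return classify


def mean_threshold_classifier(features: Features) -> str:
    """Hypothetical stand-in for the simpler, cheaper classifier."""
    return "object" if sum(features) / len(features) >= 0.5 else "background"


def classify_selectively(
    scales: Dict[int, Features],
    significance: Dict[int, float],
    complex_clf: Classifier,
    simple_clf: Classifier,
    threshold: float,
) -> Dict[int, str]:
    """Send significant scales to the complex classifier, the rest to the simple one."""
    labels = {}
    for scale, features in scales.items():
        clf = complex_clf if significance[scale] >= threshold else simple_clf
        labels[scale] = clf(features)
    return labels


# Assumed toy data: one feature vector and one significance score per image scale.
scales = {0: [0.9, 0.8], 1: [0.2, 0.1], 2: [0.7, 0.6]}
significance = {0: 0.9, 1: 0.1, 2: 0.3}

labels = classify_selectively(
    scales,
    significance,
    complex_clf=make_linear_classifier([1.0, 1.0], -1.0),
    simple_clf=mean_threshold_classifier,
    threshold=0.5,
)
print(labels)
```

In this sketch only scale 0 reaches the more expensive classifier, which is the source of the computational saving the abstract describes; how significance is actually measured is left to the multi-scale analysis and is not specified here.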