Bag-of-Visual Words for word-wise video script identification: A study

Nabin Sharma, Ranju Mandal, Rabi Sharma, Umapada Pal, Michael Blumenstein

Research output: Chapter in Book/Report/Conference proceeding > Conference contribution > peer-review

4 Citations (Scopus)


The use of multiple scripts for communicating information through various media is quite common in a multilingual country. Optical character recognition (OCR) of such document images or videos assists in indexing them for effective information retrieval. Because no single OCR system is capable of handling multiple scripts, script identification from multilingual documents and images is a necessary step for selecting the appropriate OCR. Script identification from printed and handwritten documents is a well-researched area, but script identification from video frames has not been explored much. Low resolution, blur and noisy backgrounds, to mention a few, are the major bottlenecks when processing video frames, and they make script identification from video images a challenging task. This paper examines the potential of Bag-of-Visual-Words based techniques for word-wise script identification from video frames. Two approaches, namely Bag-of-Features (BoF) and Spatial Pyramid Matching (SPM), both using patch-based SIFT descriptors, were considered for the current study. An SVM classifier was used for analysing three popular south Indian scripts, namely Tamil, Telugu and Kannada, in combination with English and Hindi. A comparative study of Bag-of-Visual-Words with traditional script identification techniques involving gradient-based features (e.g. HoG) and texture-based features (e.g. LBP) is presented. Experimental results show that patch-based features along with SPM outperformed the traditional techniques, and promising accuracies were achieved on 2534 words from the five scripts. The study reveals that patch-based features can be used for script identification in order to overcome the inherent problems with video frames.
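The two representations named in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes patch descriptors (SIFT in the paper, tiny 2-D toy vectors here) and a visual codebook are already available, assigns each descriptor to its nearest codeword, and builds either a single normalised histogram (BoF) or a concatenation of per-cell histograms over a coarse grid (a simplified, unweighted form of SPM). The function names, the toy data and the choice of two pyramid levels are all assumptions made for illustration. The resulting vector would then be passed to a classifier such as an SVM.

```python
from typing import List, Sequence, Tuple

Vector = Sequence[float]
Patch = Tuple[Tuple[float, float], Vector]  # ((x, y) in [0, 1), descriptor)


def nearest_word(descriptor: Vector, codebook: Sequence[Vector]) -> int:
    """Return the index of the codeword closest to the descriptor
    (squared Euclidean distance); this is the vector-quantization step."""
    best_index, best_dist = 0, float("inf")
    for index, word in enumerate(codebook):
        dist = sum((d - w) ** 2 for d, w in zip(descriptor, word))
        if dist < best_dist:
            best_index, best_dist = index, dist
    return best_index


def bof_histogram(descriptors: Sequence[Vector],
                  codebook: Sequence[Vector]) -> List[float]:
    """Bag-of-Features: normalised histogram of codeword assignments.
    An empty set of descriptors yields an all-zero histogram."""
    counts = [0.0] * len(codebook)
    for descriptor in descriptors:
        counts[nearest_word(descriptor, codebook)] += 1.0
    total = sum(counts)
    return [c / total for c in counts] if total else counts


def spm_histogram(patches: Sequence[Patch],
                  codebook: Sequence[Vector],
                  levels: int = 2) -> List[float]:
    """Simplified spatial pyramid: at level l the image is split into a
    2^l x 2^l grid and one BoF histogram per cell is concatenated.
    Assumption: levels are left unweighted to keep the sketch short."""
    feature: List[float] = []
    for level in range(levels):
        cells = 2 ** level
        for row in range(cells):
            for col in range(cells):
                in_cell = [
                    desc for (x, y), desc in patches
                    if min(int(x * cells), cells - 1) == col
                    and min(int(y * cells), cells - 1) == row
                ]
                feature.extend(bof_histogram(in_cell, codebook))
    return feature


# Toy data (hypothetical): a 2-word codebook and three located patches.
codebook = [(0.0, 0.0), (1.0, 1.0)]
patches = [
    ((0.1, 0.1), (0.1, 0.0)),
    ((0.8, 0.2), (0.9, 1.1)),
    ((0.6, 0.9), (0.2, 0.1)),
]
bof = bof_histogram([desc for _, desc in patches], codebook)
spm = spm_histogram(patches, codebook, levels=2)
```

With a codebook of size K, the BoF vector has K entries, while the two-level pyramid above has 5K entries (one whole-image cell plus four quadrant cells). This extra spatial layout information is what distinguishes SPM from plain BoF.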

Original language: English
Title of host publication: 2015 International Joint Conference on Neural Networks, IJCNN 2015
Publisher: Institute of Electrical and Electronics Engineers Inc.
ISBN (Electronic): 9781479919604
Publication status: Published - 28 Sept 2015
Event: International Joint Conference on Neural Networks, IJCNN 2015 - Killarney, Ireland
Duration: 12 Jul 2015 to 17 Jul 2015

Publication series

Name: Proceedings of the International Joint Conference on Neural Networks


Conference: International Joint Conference on Neural Networks, IJCNN 2015


  • Accuracy
  • Computational modeling
  • Feature extraction
  • Image resolution
  • Optical character recognition software
  • Vector quantization

