TY - GEN
T1 - Multi-lingual text recognition from video frames
AU - Sharma, Nabin
AU - Mandal, Ranju
AU - Sharma, Rabi
AU - Roy, Partha P.
AU - Pal, Umapada
AU - Blumenstein, Michael
N1 - Publisher Copyright:
© 2015 IEEE.
PY - 2015/11/20
Y1 - 2015/11/20
N2 - Text recognition from video frames is a challenging task due to low resolution, blur, complex and coloured backgrounds, noise, to mention a few. Consequently, the traditional ways of text recognition from scanned documents having simple backgrounds fails when applied to video text. Although there are various techniques available for text recognition from handwritten and printed documents with simple backgrounds, text recognition from video frames has not been comprehensively investigated, especially for multi-lingual videos. In this paper, we present a technique for multi-lingual video text recognition which involves script identification in the first stage, followed by word and character recognition, and finally the results are refined using a post-processing technique. Considering the inherent problems in videos, a Spatial Pyramid Matching (SPM) based technique, using patch-based SIFT descriptors and SVM classifier, is employed for script identification. In the next stage, a Hidden Markov Model (HMM) based approach is used for word and character recognition, which utilizes the context information. Finally, a lexicon-based post-processing technique is applied to verify and refine the word recognition results. The proposed method was tested on a dataset comprising of 4800 words from three different scripts, namely, Roman (English), Hindi and Bengali. The script identification results obtained are encouraging. The word and character recognition results are also encouraging considering the complexity and problems associated with video text processing.
AB - Text recognition from video frames is a challenging task due to low resolution, blur, complex and coloured backgrounds, noise, to mention a few. Consequently, the traditional ways of text recognition from scanned documents having simple backgrounds fails when applied to video text. Although there are various techniques available for text recognition from handwritten and printed documents with simple backgrounds, text recognition from video frames has not been comprehensively investigated, especially for multi-lingual videos. In this paper, we present a technique for multi-lingual video text recognition which involves script identification in the first stage, followed by word and character recognition, and finally the results are refined using a post-processing technique. Considering the inherent problems in videos, a Spatial Pyramid Matching (SPM) based technique, using patch-based SIFT descriptors and SVM classifier, is employed for script identification. In the next stage, a Hidden Markov Model (HMM) based approach is used for word and character recognition, which utilizes the context information. Finally, a lexicon-based post-processing technique is applied to verify and refine the word recognition results. The proposed method was tested on a dataset comprising of 4800 words from three different scripts, namely, Roman (English), Hindi and Bengali. The script identification results obtained are encouraging. The word and character recognition results are also encouraging considering the complexity and problems associated with video text processing.
UR - http://www.scopus.com/inward/record.url?scp=84962584193&partnerID=8YFLogxK
U2 - 10.1109/ICDAR.2015.7333902
DO - 10.1109/ICDAR.2015.7333902
M3 - Conference contribution
AN - SCOPUS:84962584193
T3 - Proceedings of the International Conference on Document Analysis and Recognition, ICDAR
SP - 951
EP - 955
BT - 13th IAPR International Conference on Document Analysis and Recognition, ICDAR 2015 - Conference Proceedings
PB - IEEE Computer Society
T2 - 13th International Conference on Document Analysis and Recognition, ICDAR 2015
Y2 - 23 August 2015 through 26 August 2015
ER -