Detecting text in multi-colour images is an important prerequisite for many video and document analysis tasks. The RGB image is first converted to the YUV colour space, and a multidimensional filter is then applied to reduce noise in the YUV image. Canny edge detection is used to measure the continuity of edges in the image. An efficient text detector is proposed using a contour-based stroke width transformation, which effectively removes the interference of non-stroke edges in complex backgrounds and exploits inter-frame features for caption extraction (detection and localization). Horizontal and vertical histograms are used to compute the luminance and chrominance statistics that characterize the background, and a morphological operation removes non-text areas at the boundaries. Since some background pixels can have colours similar to the text, false stroke areas or spurious character pixels may appear in the output image, degrading the recognition rate of optical character recognition (OCR). The method therefore exploits the temporal homogeneity of text-pixel colour to filter out background pixels of similar colour. OCR then extracts the text from the image and converts it into an editable text document. A neural network classifier is used for training and testing; experiments on the training dataset show that the approach yields higher precision and better performance than state-of-the-art methods, and the results demonstrate that the proposed method outperforms existing techniques.
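The abstract names its pipeline steps but not their exact formulas. A minimal sketch of two of those steps, assuming the standard BT.601 YUV coefficients for the colour conversion and a 3x3 structuring element for the morphological clean-up (both are illustrative assumptions, not details confirmed by the paper), might look like:

```python
def rgb_to_yuv(r, g, b):
    """Convert one RGB pixel to YUV using BT.601 luma weights."""
    y = 0.299 * r + 0.587 * g + 0.114 * b  # luminance
    u = 0.492 * (b - y)                    # blue-difference chrominance
    v = 0.877 * (r - y)                    # red-difference chrominance
    return y, u, v


def erode(img):
    """3x3 binary erosion: a pixel survives only if its entire 3x3
    neighbourhood is foreground (out-of-bounds counts as background).
    Small isolated blobs, e.g. noise mistaken for strokes, are removed."""
    h, w = len(img), len(img[0])
    out = [[0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            out[y][x] = int(all(
                0 <= y + dy < h and 0 <= x + dx < w and img[y + dy][x + dx]
                for dy in (-1, 0, 1) for dx in (-1, 0, 1)))
    return out
```

A grey pixel maps to zero chrominance (U = V = 0), and erosion deletes any foreground pixel without a full 3x3 foreground neighbourhood, which is one simple way to suppress non-text speckle before OCR.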
Published in | Machine Learning Research (Volume 1, Issue 1) |
DOI | 10.11648/j.mlr.20160101.13 |
Page(s) | 19-32 |
Creative Commons | This is an Open Access article, distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution and reproduction in any medium or format, provided the original work is properly cited. |
Copyright | Copyright © The Author(s), 2017. Published by Science Publishing Group |
Image Segmentation, Stroke Width Transformation (SWT), Connected Component Analysis (CCA), Histogram of Gradients (HOG), Edge Detection, Neural Network Classifier, Optical Character Recognition
APA Style
S. Kannadhasan, R. Rajesh Baba. (2017). A Novel Approach to Detect Text in Various Dynamic-Colour Images. Machine Learning Research, 1(1), 19-32. https://doi.org/10.11648/j.mlr.20160101.13
ACS Style
S. Kannadhasan; R. Rajesh Baba. A Novel Approach to Detect Text in Various Dynamic-Colour Images. Mach. Learn. Res. 2017, 1(1), 19-32. doi: 10.11648/j.mlr.20160101.13
@article{10.11648/j.mlr.20160101.13, author = {S. Kannadhasan and R. Rajesh Baba}, title = {A Novel Approach to Detect Text in Various Dynamic-Colour Images}, journal = {Machine Learning Research}, volume = {1}, number = {1}, pages = {19-32}, doi = {10.11648/j.mlr.20160101.13}, url = {https://doi.org/10.11648/j.mlr.20160101.13}, eprint = {https://article.sciencepublishinggroup.com/pdf/10.11648.j.mlr.20160101.13}, year = {2017} }
TY - JOUR T1 - A Novel Approach to Detect Text in Various Dynamic-Colour Images AU - S. Kannadhasan AU - R. Rajesh Baba Y1 - 2017/01/19 PY - 2017 N1 - https://doi.org/10.11648/j.mlr.20160101.13 DO - 10.11648/j.mlr.20160101.13 T2 - Machine Learning Research JF - Machine Learning Research JO - Machine Learning Research SP - 19 EP - 32 PB - Science Publishing Group SN - 2637-5680 UR - https://doi.org/10.11648/j.mlr.20160101.13 VL - 1 IS - 1 ER -