
Matrix Method for Distinction between Text and Non-Text Images

P. Karmakar, C. Md. Mizan, S. Jana, S. Dasgupta, S. Paul, R. Das, S. Das

Section: Research Paper, Product Type: Journal Paper
Volume 07, Issue 18, Pages 13-16, May 2019

Online published on May 25, 2019

Copyright © P. Karmakar, C. Md. Mizan, S. Jana, S. Dasgupta, S. Paul, R. Das, S. Das. This is an open access article distributed under the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.


How to Cite this Paper


IEEE Style Citation: P. Karmakar, C. Md. Mizan, S. Jana, S. Dasgupta, S. Paul, R. Das, S. Das, “Matrix Method for Distinction between Text and Non-Text Images,” International Journal of Computer Sciences and Engineering, Vol.07, Issue.18, pp.13-16, 2019.

MLA Style Citation: P. Karmakar, C. Md. Mizan, S. Jana, S. Dasgupta, S. Paul, R. Das, S. Das "Matrix Method for Distinction between Text and Non-Text Images." International Journal of Computer Sciences and Engineering 07.18 (2019): 13-16.

APA Style Citation: P. Karmakar, C. Md. Mizan, S. Jana, S. Dasgupta, S. Paul, R. Das, S. Das, (2019). Matrix Method for Distinction between Text and Non-Text Images. International Journal of Computer Sciences and Engineering, 07(18), 13-16.

BibTeX Style Citation:
@article{Karmakar_2019,
author = {P. Karmakar and C. Md. Mizan and S. Jana and S. Dasgupta and S. Paul and R. Das and S. Das},
title = {Matrix Method for Distinction between Text and Non-Text Images},
journal = {International Journal of Computer Sciences and Engineering},
issue_date = {5 2019},
volume = {07},
Issue = {18},
month = {5},
year = {2019},
issn = {2347-2693},
pages = {13-16},
url = {https://www.ijcseonline.org/full_spl_paper_view.php?paper_id=1326},
publisher = {IJCSE, Indore, INDIA},
}

RIS Style Citation:
TY - JOUR
UR - https://www.ijcseonline.org/full_spl_paper_view.php?paper_id=1326
TI - Matrix Method for Distinction between Text and Non-Text Images
T2 - International Journal of Computer Sciences and Engineering
AU - P. Karmakar, C. Md. Mizan, S. Jana, S. Dasgupta, S. Paul, R. Das, S. Das
PY - 2019
DA - 2019/05/25
PB - IJCSE, Indore, INDIA
SP - 13-16
IS - 18
VL - 07
SN - 2347-2693
ER -

           

Abstract

Distinguishing text images from non-text images is a major challenge in computer vision, because it determines how efficiently text can then be extracted from an image. A text extraction algorithm is more efficient if it is known beforehand whether the image is a text image or a non-text image. For many images, such as old manuscripts, extracting the text is very difficult. In such cases, first separating text images from non-text images makes text detection easier, faster, and more accurate. The method can also be applied to detecting and extracting text from signboards. In our approach, we built a system that takes any kind of image as input. The input image is processed and converted into a binary image. A distance transform is then applied, and the distance values at the various points of the image are calculated. Duplicate values are merged into one, and the resulting values are sorted in ascending order. The total area of the binary image is calculated, along with the area corresponding to each distance transform value. The total area is then divided by the area corresponding to each distance transform value; the resulting quotients are the feature values. Once all feature values have been obtained, their range is divided into small intervals and passed to a classifier. The accuracy of the classifier is then calculated and evaluated for distinguishing text from non-text images. The method is simple and accurate, and it also helps in extracting text from the image.
Experiments were carried out on a simple dataset of text and non-text images to demonstrate the efficiency of the proposed method.
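
The feature extraction steps described above can be illustrated with a minimal pure-Python sketch. It is not the authors' implementation: the abstract does not specify the binarization rule, the distance metric, or how the area of a distance value is measured. The sketch therefore assumes a fixed gray-level threshold with dark pixels as foreground, a city-block distance to the nearest background pixel, the foreground pixel count as the total area, and the number of pixels holding a given distance value as that value's area. All function names and parameters are hypothetical.

```python
from collections import Counter


def binarize(gray, threshold=128):
    """Mark pixels darker than the threshold as foreground (1). Assumed rule."""
    return [[1 if px < threshold else 0 for px in row] for row in gray]


def distance_transform(binary):
    """City-block distance from each foreground pixel to the nearest
    background pixel, computed with a two-pass scan. Assumed metric."""
    h, w = len(binary), len(binary[0])
    far = h + w  # upper bound on any city-block distance in the image
    dist = [[far if binary[y][x] else 0 for x in range(w)] for y in range(h)]
    # Forward pass: propagate distances from the top and left neighbours.
    for y in range(h):
        for x in range(w):
            if dist[y][x]:
                if y > 0:
                    dist[y][x] = min(dist[y][x], dist[y - 1][x] + 1)
                if x > 0:
                    dist[y][x] = min(dist[y][x], dist[y][x - 1] + 1)
    # Backward pass: propagate distances from the bottom and right neighbours.
    for y in reversed(range(h)):
        for x in reversed(range(w)):
            if dist[y][x]:
                if y < h - 1:
                    dist[y][x] = min(dist[y][x], dist[y + 1][x] + 1)
                if x < w - 1:
                    dist[y][x] = min(dist[y][x], dist[y][x + 1] + 1)
    return dist


def feature_values(gray, threshold=128):
    """Return (sorted unique distance values, total area / area per value)."""
    dist = distance_transform(binarize(gray, threshold))
    # Count how many foreground pixels hold each distance value.
    areas = Counter(d for row in dist for d in row if d > 0)
    total_area = sum(areas.values())  # assumed: number of foreground pixels
    values = sorted(areas)  # duplicates merged, ascending order
    return values, [total_area / areas[d] for d in values]


def interval_histogram(features, n_bins=4):
    """Divide the feature range into n_bins equal intervals and count the
    features in each; this vector is what would be passed to a classifier."""
    hist = [0] * n_bins
    if not features:
        return hist
    lo, hi = min(features), max(features)
    width = (hi - lo) / n_bins
    for f in features:
        idx = int((f - lo) / width) if width else 0
        hist[min(idx, n_bins - 1)] += 1  # the maximum falls in the last bin
    return hist
```

For a grayscale image given as a list of rows, `feature_values(image)` returns the merged, sorted distance values together with their feature values, and `interval_histogram` turns the latter into a fixed-length vector. The classifier is left out because the abstract does not name one.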

Keywords / Index Terms

text recognition, distance transform, classifier
