Text detection and script identification in natural scene images using deep learning

  • Jordan University of Science and Technology

Research output: Contribution to journal › Article › peer-review

31 Scopus citations

Abstract

Detecting text in an image and identifying its language are important tasks in optical character recognition, and both are particularly challenging in natural scene images. Previous studies on script identification have focused on convolutional neural networks. Other studies have used fully convolutional networks (FCNs) for model enhancement but not as classifiers. In this study, we use FCNs for both model enhancement and classification. The proposed methodology improves the Efficient and Accurate Scene Text Detector (EAST) by adding new FCN branches for script identification. Moreover, whereas most end-to-end (e2e) methods train the text detection and script identification models separately, we propose two e2e methods that train the models jointly: multi-channel mask (MCM) and multi-channel segmentation (MCS). The results show that MCM performs comparably to other state-of-the-art methods, whereas MCS outperforms existing methods, with recall values of 54.34% and 81.13% on the ICDAR MLT 2017 and MLe2e datasets, respectively.

Original language: English
Article number: 107043
Journal: Computers and Electrical Engineering
Volume: 91
State: Published - May 2021
Externally published: Yes

Keywords

  • Deep learning
  • Fully convolution network
  • Natural scene images
  • Oversampling
  • Script identification
  • Text detection
  • Undersampling
