International Research Journal of Engineering and Technology (IRJET) e-ISSN: 2395-0056
Volume: 07 Issue: 03 | Mar 2020 www.irjet.net p-ISSN: 2395-0072
© 2020, IRJET | Impact Factor value: 7.34 | ISO 9001:2008 Certified Journal | Page 3107
An Enhanced Approach for Extraction of Text from an Image using
Fuzzy Logic
K. Chandra Sekhar1, MD. Mehar Naaz2, K. Haritha3, N. Roshitha Varma4, T. Sivajee5,
P. Sri Ram Naidu6
1Professor, Dept. of Computer Science and Engineering, Anil Neerukonda Institute of Technology and Sciences,
Andhra Pradesh, India
2,3,4,5,6Student, Dept. of Computer Science and Engineering, Anil Neerukonda Institute of Technology and Sciences,
Andhra Pradesh, India
---------------------------------------------------------------------***----------------------------------------------------------------------
Abstract – This paper presents an approach for extracting
text from an image using fuzzy logic. The role of text
recognition is to recognize the text in an image and then
extract it in an editable format, so that people can reuse the
text later. People capture a lot of useful information as images,
which may contain essential textual data that they want to store
and edit. Fuzzy logic determines whether a character or text
present in an image matches the trained data. The image
captured by the user undergoes pre-processing and is then
converted into a binary image. The text in the image is localized,
segmentation extracts each character from the image, and the
characters that match the trained data are then identified using
fuzzy logic.
Key Words: Text Recognition, Text Extraction, Fuzzy
Logic, Pre-processing, Segmentation, Feature extraction.
1. INTRODUCTION
Textual data present in images contains useful
information for indexing and automatic annotation.
Extracting this information involves text detection,
localization of text, classification, and then recognition of
text. Fuzzy logic works with degrees of truth rather than
strict true/false values. This helps to identify and match
characters accurately against trained data.
1.1 Image Processing
Image processing is the analysis and manipulation of a digitized
image, typically to enhance its quality, by applying
mathematical operations in the form of signal
processing where the input is a picture, an image, or a video
frame. The output of image processing is either a picture
or a set of characters or parameters associated with the given
input image. It covers a set of computational techniques for
analyzing, enhancing, compressing, and reconstructing
images. Techniques for processing an image include optical
methods, fuzzy techniques, digital processing, and linear
scaling.
Image processing generally involves three steps:
• Importing and loading the image using image
acquisition tools.
• Analyzing and manipulating the image to extract
information.
• Outputting the result, which may be an image
altered in some way or a report based on
analysis of the image.
1.2 Text Recognition and Extraction Model
The method of text extraction mainly involves two
processes: text detection and text recognition.
Text detection localizes the regions of an image that
contain text. This helps to remove non-text regions,
which behave as noise during extraction of the
desired text. Text recognition, in turn, is the process of
converting pixel-based text, i.e. image text, into readable
character codes. It recognizes the text in an image through
several steps: pre-processing, segmentation, feature
extraction, classification, and post-processing. Finally, the
text is extracted by comparing the segmented characters
with the trained data with the help of fuzzy logic.
2. FUZZY LOGIC
Fuzzy logic is a form of logic used in some expert
systems and in various applications of Artificial Intelligence.
It was introduced in 1965 by Lotfi Zadeh. Fuzzy logic is a
branch of logic that uses degrees of membership in
sets rather than strict true/false membership.
Classical logic relies only on the Boolean values 0 or 1,
and this suits many relationships, but there are
various relationships in which a statement can be
considered partly true and partly false at the same
time. Fuzzy logic therefore determines the degree of
membership, which is a grade between false (0) and
true (1). This lets it handle vagueness and ambiguity
efficiently while still supporting meaningful
reasoning. In the process of text extraction,
this logic plays a vital role: it takes each segmented character
and compares it with the trained
character datasets. The comparison yields a matching
value for the character. If the segmented character matches
a character in the trained dataset closely enough, the
status becomes ‘1’ and the character is extracted and
displayed on the screen; otherwise the status becomes ‘0’
and the process continues until it finds the matching
character.
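The matching step described above can be sketched as follows. This is only an illustration under stated assumptions: the paper does not give its membership function, so the sketch assumes the degree of match is the fraction of agreeing pixels between two equally sized binary matrices. The names `membership`, `best_match`, `TEMPLATES`, and the 0.8 threshold are hypothetical.

```python
def membership(segment, template):
    """Degree (0.0 to 1.0) to which a binary character matrix matches a template.

    Assumption: the degree is the fraction of pixels that agree.
    """
    total = 0
    agree = 0
    for seg_row, tpl_row in zip(segment, template):
        for s, t in zip(seg_row, tpl_row):
            total += 1
            if s == t:
                agree += 1
    return agree / total if total else 0.0


def best_match(segment, templates, threshold=0.8):
    """Return (label, degree) for the best template, or (None, degree) if
    even the best degree stays below the (assumed) acceptance threshold."""
    best_label, best_degree = None, 0.0
    for label, template in templates.items():
        degree = membership(segment, template)
        if degree > best_degree:
            best_label, best_degree = label, degree
    if best_degree >= threshold:
        return best_label, best_degree
    return None, best_degree


# Hypothetical 3x3 "trained" templates (1 = ink, 0 = background).
TEMPLATES = {
    "I": [[0, 1, 0], [0, 1, 0], [0, 1, 0]],
    "L": [[1, 0, 0], [1, 0, 0], [1, 1, 1]],
}

# A noisy "L": one pixel differs from the template.
segment = [[1, 0, 0], [1, 0, 0], [1, 1, 0]]
label, degree = best_match(segment, TEMPLATES)
print(label, round(degree, 2))
```

A returned label corresponds to status ‘1’ in the text, and `None` corresponds to status ‘0’. Unlike a strict equality test, the noisy character is still accepted because its degree of match is close to 1.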
3. DESIGN AND IMPLEMENTATION
Fig -1: Architecture of the proposed system.
The algorithm involves a few steps:
1. Load the original (RGB) image and convert it into
a grayscale image.
2. Pre-process the image to enhance it and remove
noise.
3. Classify and localize the words in the image, then
segment them according to their boundary
values.
4. Compare the segmented characters with the
trained dataset with the help of fuzzy logic to find
the highest matching value.
5. If a segmented character has its highest matching
value with a trained character, i.e. a value near 1,
display that character; otherwise continue the same
process until the highest matching value is found.
6. Display the characters in an editable format.
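Step 1 can be illustrated with a small sketch. The paper does not say how the grayscale conversion is done; the version below assumes the widely used ITU-R BT.601 luma weights (0.299, 0.587, 0.114) and represents the image as nested lists of (R, G, B) tuples. The function name `to_grayscale` is hypothetical.

```python
def to_grayscale(rgb_image):
    """Convert rows of (R, G, B) tuples into rows of 0-255 gray levels.

    Assumption: BT.601 luma weights; the paper does not name a formula.
    """
    return [
        [round(0.299 * r + 0.587 * g + 0.114 * b) for (r, g, b) in row]
        for row in rgb_image
    ]


# Tiny 2x2 example: white, black, pure red, pure blue.
image = [
    [(255, 255, 255), (0, 0, 0)],
    [(255, 0, 0), (0, 0, 255)],
]
gray = to_grayscale(image)
print(gray)
```

White and black keep their extremes, while pure red and pure blue map to different gray levels because the weights reflect perceived brightness.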
3.1 Pre-processing
The main purpose of pre-processing is to improve the visual
quality of the image. People generally take a picture of the
document whose text is to be extracted, but the picture
might contain noise. The image captured by the user might be
blurry, noisy, or of low resolution. This stage is therefore
used to remove noise, stabilize the intensity of the image,
and clear artifacts. To improve the quality of the input
image given by the user, a few operations are performed in
this stage: noise removal, normalization, binarization, etc.
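Binarization, the last operation listed above, might look like the sketch below. The paper does not specify a thresholding method, so this assumes a single global threshold that defaults to the mean gray level, and dark text on a light background (ink is encoded as 1). The name `binarize` is hypothetical.

```python
def binarize(gray_image, threshold=None):
    """Turn a grayscale image (rows of 0-255 values) into a 0/1 image.

    Assumptions: global threshold, defaulting to the mean intensity;
    pixels darker than the threshold are treated as ink (1).
    """
    pixels = [p for row in gray_image for p in row]
    if not pixels:
        return []
    if threshold is None:
        threshold = sum(pixels) / len(pixels)
    return [[1 if p < threshold else 0 for p in row] for row in gray_image]


# A dark vertical stroke on a light background.
gray = [
    [250, 30, 245],
    [240, 20, 235],
    [255, 25, 250],
]
binary = binarize(gray)
print(binary)
```

A fixed global threshold is the simplest choice; unevenly lit photographs would likely need an adaptive method instead.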
3.1.1 Noise Removal
Noise arises in an image during image acquisition
(i.e. digitization) and transmission. There are
generally four types of noise: Gaussian noise, salt-and-pepper
noise, speckle noise, and uniform noise. When
images are sent over different channels, they are prone to
corruption because the channels are noisy. Filters are
therefore required to remove the noise from the images
captured by the user before further processing. There are many
kinds of filters, such as the linear smoothing filter, median
filter, Wiener filter, and fuzzy filter. The three primaries (R, G,
and B) are filtered separately. Each of the three R, G, B
filters is followed by a gain stage to compensate for the
attenuation resulting from the filter. The filtered primaries are
then combined to form a filtered color image. This process is
shown below:
Fig-2: Filtering
A median filter is used to enhance the image:
Filtered_Image = Median_Filter(Original_Image, Filter_Size)
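The median filter expression above can be made concrete with a minimal pure-Python sketch for a single channel. Border handling is not described in the paper; the sketch assumes edge pixels are replicated. The function name `median_filter` mirrors the expression but is otherwise hypothetical.

```python
from statistics import median


def median_filter(image, filter_size=3):
    """Median-filter one channel given as a list of rows.

    Assumption: borders are handled by replicating the nearest edge pixel.
    """
    if filter_size % 2 == 0:
        raise ValueError("filter_size must be odd")
    half = filter_size // 2
    height = len(image)
    width = len(image[0]) if height else 0
    filtered = []
    for y in range(height):
        row = []
        for x in range(width):
            # Collect the neighbourhood, clamping coordinates at the borders.
            window = [
                image[min(max(y + dy, 0), height - 1)][min(max(x + dx, 0), width - 1)]
                for dy in range(-half, half + 1)
                for dx in range(-half, half + 1)
            ]
            row.append(median(window))
        filtered.append(row)
    return filtered


# A single "salt" pixel (255) in an otherwise uniform patch.
noisy = [
    [10, 10, 10],
    [10, 255, 10],
    [10, 10, 10],
]
filtered = median_filter(noisy, 3)
print(filtered)
```

For an RGB image, the same function would be applied to the R, G, and B channels separately, as the text describes. The median is well suited to salt-and-pepper noise because an isolated outlier never becomes the middle value of its neighbourhood.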
3.1.2 Normalization
Normalization changes the range of pixel intensity values.
In general, normalization means bringing something to a
standard condition. One common variant, standardization,
produces an image with mean = 0 and variance = 1; the min-max
scaling given in this section instead maps intensities onto a
fixed range. The range
of values of the normalized image lies between 0 and 255. The
normalization of an image is performed as,
Output_channel = 255 * (Input_channel - min) / (max - min)
where min and max are the smallest and largest intensity
values in the channel. If a grayscale image is used, only one
channel needs to be normalized. For an RGB image (3
channels), each channel should be normalized using the
same criteria.
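As a worked illustration of the formula above, the following sketch rescales one channel to the range 0 to 255. The function name `normalize_channel` and the handling of a constant channel (mapped to all zeros to avoid dividing by zero) are assumptions, not part of the paper.

```python
def normalize_channel(channel):
    """Min-max normalize one channel (list of rows) to the range 0-255."""
    pixels = [p for row in channel for p in row]
    if not pixels:
        return []
    lo, hi = min(pixels), max(pixels)
    if hi == lo:
        # Assumption: a constant channel has no contrast to stretch.
        return [[0 for _ in row] for row in channel]
    return [[round(255 * (p - lo) / (hi - lo)) for p in row] for row in channel]


channel = [
    [50, 100],
    [150, 200],
]
normalized = normalize_channel(channel)
print(normalized)
```

Here min = 50 and max = 200, so 100 maps to 255 * 50 / 150 = 85 and 150 maps to 255 * 100 / 150 = 170. For an RGB image the function would be called once per channel.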
3.2 Segmentation
Segmentation is the process of partitioning a digital image
into multiple segments or image objects; it is sometimes
loosely referred to as object detection. It simplifies or
changes the representation of an image so that the image
becomes more meaningful and easier to analyze. Segmentation
is done at three levels: line segmentation, word
segmentation, and character segmentation. Some deep
learning architectures are used for segmentation; one of
them is the CNN, i.e. Convolutional Neural Network.
Segmenting an image with a CNN involves feeding segments
of the image as input to the network, which labels the
pixels. A CNN does not process the entire image at once.
It scans the image, looking at a small “filter” of several
pixels at a time, until it has mapped the entire image.
Step-by-step process of how this works:
• Take the weight matrix.
• Place the weight matrix on top of the image.
• Perform element-wise multiplication and sum the products
to obtain the output value.
• Move the weight matrix by the chosen stride.
• Convolve until all the pixels of the input are used.
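The steps above can be sketched as a plain sliding-window operation. This is a minimal illustration, not the network used in the paper: it assumes no padding, a single channel, and, as is usual in CNN layers, no flipping of the weight matrix. The names `convolve2d`, `image`, and `kernel` are hypothetical.

```python
def convolve2d(image, kernel, stride=1):
    """Slide a weight matrix over an image and return the feature map.

    Assumptions: no padding, single channel, kernel is not flipped.
    """
    k_h, k_w = len(kernel), len(kernel[0])
    feature_map = []
    for top in range(0, len(image) - k_h + 1, stride):
        row = []
        for left in range(0, len(image[0]) - k_w + 1, stride):
            # Element-wise multiplication of the window with the weights,
            # summed into one output value.
            total = 0
            for i in range(k_h):
                for j in range(k_w):
                    total += image[top + i][left + j] * kernel[i][j]
            row.append(total)
        feature_map.append(row)
    return feature_map


image = [
    [1, 2, 3, 0],
    [4, 5, 6, 1],
    [7, 8, 9, 2],
    [0, 1, 2, 3],
]
kernel = [
    [1, 0],
    [0, 1],
]
print(convolve2d(image, kernel, stride=2))
```

With a stride of 2 the 2x2 weight matrix visits four non-overlapping windows, so the 4x4 input is reduced to a 2x2 feature map.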
Each segmented character is stored in the form of a
matrix. These matrices are compared with the matrices of the
trained dataset to identify the character.
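To show what "each segmented character as a matrix" can look like in practice, here is a sketch that uses a vertical projection profile instead of a CNN. This is a deliberately simpler, swapped-in technique chosen for illustration only. It assumes a binarized text line (1 = ink) in which characters are separated by at least one empty column; `segment_characters` is a hypothetical name.

```python
def segment_characters(binary_line):
    """Split a binarized text line into one matrix per character.

    Assumption: characters are separated by at least one ink-free column.
    """
    height = len(binary_line)
    width = len(binary_line[0]) if height else 0
    # A column "has ink" if any row contains a 1 in that column.
    has_ink = [any(binary_line[y][x] for y in range(height)) for x in range(width)]
    segments = []
    start = None
    # The trailing False closes a character that touches the right edge.
    for x, ink in enumerate(has_ink + [False]):
        if ink and start is None:
            start = x
        elif not ink and start is not None:
            segments.append([row[start:x] for row in binary_line])
            start = None
    return segments


line = [
    [1, 0, 0, 1, 1],
    [1, 0, 0, 1, 0],
    [1, 0, 0, 1, 1],
]
chars = segment_characters(line)
print(len(chars), chars)
```

Each returned matrix could then be resized to the template size and compared with the trained dataset. Touching or overlapping characters would defeat this simple rule.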
3.3 Feature Extraction and Classification
Features are the unique signatures of an image, i.e. the
properties that distinguish it. Feature extraction is
closely related to dimensionality reduction and is used to
extract the characteristics of an input image. It describes
the relevant components of the image efficiently as a compact
feature vector. This approach is useful when the image is
very large. Once pre-processing and segmentation of the
image are done, a feature extraction technique is
applied to the segmented characters to obtain features,
followed by classification and post-processing
techniques. Statistical and geometrical features are among
those used in the feature extraction process.
The process of recognizing and extracting characters is
divided into two stages: feature selection and classification.
The main aim of feature selection is to select a subset of
input variables by removing features with weak or no
predictive information while maintaining or improving
classification accuracy. Image classification accepts the
input image and the extracted characters and assigns each
of them to the correct class or category.
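One example of a statistical feature is zoning, sketched below: the character matrix is divided into a grid of zones and the ink density of each zone becomes one entry of the feature vector. The paper does not state which features it uses, so zoning, the 2x2 grid, and the name `zoning_features` are assumptions made for illustration.

```python
def zoning_features(char_matrix, zones=2):
    """Return the ink density of each cell in a zones x zones grid.

    Assumption: zoning is used as the statistical feature; the paper
    does not specify its feature set.
    """
    height = len(char_matrix)
    width = len(char_matrix[0])
    features = []
    for zy in range(zones):
        y0, y1 = zy * height // zones, (zy + 1) * height // zones
        for zx in range(zones):
            x0, x1 = zx * width // zones, (zx + 1) * width // zones
            cells = [char_matrix[y][x] for y in range(y0, y1) for x in range(x0, x1)]
            features.append(sum(cells) / len(cells) if cells else 0.0)
    return features


# A 4x4 binary "L" (1 = ink).
char = [
    [1, 1, 0, 0],
    [1, 1, 0, 0],
    [1, 1, 0, 0],
    [1, 1, 1, 1],
]
features = zoning_features(char)
print(features)
```

The 16 pixels are reduced to a 4-element vector, which illustrates the dimensionality reduction mentioned above while still capturing where the ink lies.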
4. RESULT
Fig:3(a) Extracted text from input image
Fig:3(b) Extracted text from input image
5. CONCLUSION
Applications need several kinds of images as sources of
information for interpretation and analysis. The characters
that are identified are grouped into meaningful words or
sentences. When an image is transformed from one form to
another by digitizing or scanning, processing is done through
Tesseract, which stores the identified data; this identified
data is then compared with the trained data using fuzzy logic.
The output image therefore has to undergo image
enhancement, a group of methods that seek to improve
the visual appearance of an image. In
this paper, we were able to extract the text from
the image in any font style and size with the help of
fuzzy logic. The fuzzy rules are intended to improve the
quality of edges and to identify the characters in the image
accurately. This paper can serve as a useful reference for
researchers who have begun work in the field of
fuzzy character recognition.
REFERENCES
[1] L. Neumann and J. Matas. A method for text localization
and recognition in real-world images. In ACCV, pages 770–
783, 2010.
[2] Reza Sarshogh and Keegan Hines,” Computer Vision
Methods for Extracting Text from Images”, Capital One
Tech.
[3] T.Som, Sumit Saha,"Handwritten Character Recognition
Using Fuzzy Membership Function", International
Journal of Emerging Technologies in Sciences and
Engineering, Volume 5, December 2011
[4] L. A. Zadeh. Fuzzy sets, Information Control 8 (1965) 338–
353.
[5] Gur, Eran, and ZeevZelavsky, “Retrieval of Rashi Semi-
Cursive Handwriting via Fuzzy Logic,” IEEE
International Conference on Frontiers in Handwriting
Recognition (ICFHR), 2012
[6] Thomas Natsvhlager, “Optical CharacterRecognition”,A
Tutorial for the Course Computational Intelligence.
[7] Andrei Polzounov,ArtsiomAblavatski , Sergio Escalera,
Shijian Lu, JianfeiCai “Wordfence: Text Detection In
Natural Images With Border Awareness”
[8] D.Trier ,A.K.Jain ,T.Taxt , “Feature ExtractionMethodfor
Character Recognition-A Survey” ,Pattern Recognition

More Related Content

What's hot (19)

PDF
B018110915
IOSR Journals
 
PDF
Extended fuzzy c means clustering algorithm in segmentation of noisy images
International Journal of Science and Research (IJSR)
 
PDF
Property based fusion for multifocus images
IAEME Publication
 
PDF
Segmentation and recognition of handwritten digit numeral string using a mult...
ijfcstjournal
 
PDF
Text Extraction and Recognition Using Median Filter
IRJET Journal
 
PDF
A Novel Approach To Detection and Evaluation of Resampled Tampered Images
CSCJournals
 
PDF
Automatic Image Annotation Using CMRM with Scene Information
TELKOMNIKA JOURNAL
 
PDF
Feature Extraction and Feature Selection using Textual Analysis
vivatechijri
 
PDF
­­­­Cursive Handwriting Recognition System using Feature Extraction and Artif...
IRJET Journal
 
PDF
2015.basicsof imageanalysischapter2 (1)
moemi1
 
PDF
10.1.1.432.9149
moemi1
 
PDF
22 29 dec16 8nov16 13272 28268-1-ed(edit)
IAESIJEECS
 
PDF
A novel embedded hybrid thinning algorithm for
prjpublications
 
DOC
Morpho
Subbu Akili
 
PDF
A Review on Image Segmentation using Clustering and Swarm Optimization Techni...
IJSRD
 
PDF
CONTRAST ENHANCEMENT TECHNIQUES USING HISTOGRAM EQUALIZATION METHODS ON COLOR...
IJCSEA Journal
 
PDF
A Simple Segmentation Approach for Unconstrained Cursive Handwritten Words in...
CSCJournals
 
PDF
F0342032038
ijceronline
 
PDF
Fuzzy Region Merging Using Fuzzy Similarity Measurement on Image Segmentation
IJECEIAES
 
B018110915
IOSR Journals
 
Extended fuzzy c means clustering algorithm in segmentation of noisy images
International Journal of Science and Research (IJSR)
 
Property based fusion for multifocus images
IAEME Publication
 
Segmentation and recognition of handwritten digit numeral string using a mult...
ijfcstjournal
 
Text Extraction and Recognition Using Median Filter
IRJET Journal
 
A Novel Approach To Detection and Evaluation of Resampled Tampered Images
CSCJournals
 
Automatic Image Annotation Using CMRM with Scene Information
TELKOMNIKA JOURNAL
 
Feature Extraction and Feature Selection using Textual Analysis
vivatechijri
 
­­­­Cursive Handwriting Recognition System using Feature Extraction and Artif...
IRJET Journal
 
2015.basicsof imageanalysischapter2 (1)
moemi1
 
10.1.1.432.9149
moemi1
 
22 29 dec16 8nov16 13272 28268-1-ed(edit)
IAESIJEECS
 
A novel embedded hybrid thinning algorithm for
prjpublications
 
Morpho
Subbu Akili
 
A Review on Image Segmentation using Clustering and Swarm Optimization Techni...
IJSRD
 
CONTRAST ENHANCEMENT TECHNIQUES USING HISTOGRAM EQUALIZATION METHODS ON COLOR...
IJCSEA Journal
 
A Simple Segmentation Approach for Unconstrained Cursive Handwritten Words in...
CSCJournals
 
F0342032038
ijceronline
 
Fuzzy Region Merging Using Fuzzy Similarity Measurement on Image Segmentation
IJECEIAES
 

Similar to IRJET - An Enhanced Approach for Extraction of Text from an Image using Fuzzy Logic (20)

PDF
AN EMERGING TREND OF FEATURE EXTRACTION METHOD IN VIDEO PROCESSING
cscpconf
 
PDF
IRJET - Simulation of Colour Image Processing Techniques on VHDL
IRJET Journal
 
PDF
Enhancing readability of digital image using image processing - Full Report
Upendra Sachan
 
PDF
IRJET-MText Extraction from Images using Convolutional Neural Network
IRJET Journal
 
PPTX
deona
Deona Noble
 
PDF
IRJET- Proposed Approach for Layout & Handwritten Character Recognization in OCR
IRJET Journal
 
PPT
Digital Image Processing
Azharo7
 
PDF
IRJET- Automatic Data Collection from Forms using Optical Character Recognition
IRJET Journal
 
PDF
De-Noisy Image of Activity Tracking System in Digital Image Processing
IRJET Journal
 
PDF
De-Noisy Image of Activity Tracking System in Digital Image Processing
IRJET Journal
 
PPTX
Texture features based text extraction from images using DWT and K-means clus...
Divya Gera
 
PDF
HANDWRITTEN DIGIT RECOGNITION USING MACHINE LEARNING
IRJET Journal
 
PDF
HANDWRITTEN DIGIT RECOGNITION USING MACHINE LEARNING
IRJET Journal
 
PDF
Text Extraction System by Eliminating Non-Text Regions
IJCSIS Research Publications
 
PDF
Journal Publishers
graphicdesigner79
 
PDF
Edge Detection Using Fuzzy Logic with Varied Inputs
paperpublications3
 
PDF
A novel method for character segmentation of vehicle
eSAT Publishing House
 
PDF
IRJET- Information Retrieval & Text Analytics using Artificial Intelligence
IRJET Journal
 
PDF
Emblematical image based pattern recognition paradigm using Multi-Layer Perce...
iosrjce
 
AN EMERGING TREND OF FEATURE EXTRACTION METHOD IN VIDEO PROCESSING
cscpconf
 
IRJET - Simulation of Colour Image Processing Techniques on VHDL
IRJET Journal
 
Enhancing readability of digital image using image processing - Full Report
Upendra Sachan
 
IRJET-MText Extraction from Images using Convolutional Neural Network
IRJET Journal
 
IRJET- Proposed Approach for Layout & Handwritten Character Recognization in OCR
IRJET Journal
 
Digital Image Processing
Azharo7
 
IRJET- Automatic Data Collection from Forms using Optical Character Recognition
IRJET Journal
 
De-Noisy Image of Activity Tracking System in Digital Image Processing
IRJET Journal
 
De-Noisy Image of Activity Tracking System in Digital Image Processing
IRJET Journal
 
Texture features based text extraction from images using DWT and K-means clus...
Divya Gera
 
HANDWRITTEN DIGIT RECOGNITION USING MACHINE LEARNING
IRJET Journal
 
HANDWRITTEN DIGIT RECOGNITION USING MACHINE LEARNING
IRJET Journal
 
Text Extraction System by Eliminating Non-Text Regions
IJCSIS Research Publications
 
Journal Publishers
graphicdesigner79
 
Edge Detection Using Fuzzy Logic with Varied Inputs
paperpublications3
 
A novel method for character segmentation of vehicle
eSAT Publishing House
 
IRJET- Information Retrieval & Text Analytics using Artificial Intelligence
IRJET Journal
 
Emblematical image based pattern recognition paradigm using Multi-Layer Perce...
iosrjce
 
Ad

More from IRJET Journal (20)

PDF
Enhanced heart disease prediction using SKNDGR ensemble Machine Learning Model
IRJET Journal
 
PDF
Utilizing Biomedical Waste for Sustainable Brick Manufacturing: A Novel Appro...
IRJET Journal
 
PDF
Kiona – A Smart Society Automation Project
IRJET Journal
 
PDF
DESIGN AND DEVELOPMENT OF BATTERY THERMAL MANAGEMENT SYSTEM USING PHASE CHANG...
IRJET Journal
 
PDF
Invest in Innovation: Empowering Ideas through Blockchain Based Crowdfunding
IRJET Journal
 
PDF
SPACE WATCH YOUR REAL-TIME SPACE INFORMATION HUB
IRJET Journal
 
PDF
A Review on Influence of Fluid Viscous Damper on The Behaviour of Multi-store...
IRJET Journal
 
PDF
Wireless Arduino Control via Mobile: Eliminating the Need for a Dedicated Wir...
IRJET Journal
 
PDF
Explainable AI(XAI) using LIME and Disease Detection in Mango Leaf by Transfe...
IRJET Journal
 
PDF
BRAIN TUMOUR DETECTION AND CLASSIFICATION
IRJET Journal
 
PDF
The Project Manager as an ambassador of the contract. The case of NEC4 ECC co...
IRJET Journal
 
PDF
"Enhanced Heat Transfer Performance in Shell and Tube Heat Exchangers: A CFD ...
IRJET Journal
 
PDF
Advancements in CFD Analysis of Shell and Tube Heat Exchangers with Nanofluid...
IRJET Journal
 
PDF
Breast Cancer Detection using Computer Vision
IRJET Journal
 
PDF
Auto-Charging E-Vehicle with its battery Management.
IRJET Journal
 
PDF
Analysis of high energy charge particle in the Heliosphere
IRJET Journal
 
PDF
A Novel System for Recommending Agricultural Crops Using Machine Learning App...
IRJET Journal
 
PDF
Auto-Charging E-Vehicle with its battery Management.
IRJET Journal
 
PDF
Analysis of high energy charge particle in the Heliosphere
IRJET Journal
 
PDF
Wireless Arduino Control via Mobile: Eliminating the Need for a Dedicated Wir...
IRJET Journal
 
Enhanced heart disease prediction using SKNDGR ensemble Machine Learning Model
IRJET Journal
 
Utilizing Biomedical Waste for Sustainable Brick Manufacturing: A Novel Appro...
IRJET Journal
 
Kiona – A Smart Society Automation Project
IRJET Journal
 
DESIGN AND DEVELOPMENT OF BATTERY THERMAL MANAGEMENT SYSTEM USING PHASE CHANG...
IRJET Journal
 
Invest in Innovation: Empowering Ideas through Blockchain Based Crowdfunding
IRJET Journal
 
SPACE WATCH YOUR REAL-TIME SPACE INFORMATION HUB
IRJET Journal
 
A Review on Influence of Fluid Viscous Damper on The Behaviour of Multi-store...
IRJET Journal
 
Wireless Arduino Control via Mobile: Eliminating the Need for a Dedicated Wir...
IRJET Journal
 
Explainable AI(XAI) using LIME and Disease Detection in Mango Leaf by Transfe...
IRJET Journal
 
BRAIN TUMOUR DETECTION AND CLASSIFICATION
IRJET Journal
 
The Project Manager as an ambassador of the contract. The case of NEC4 ECC co...
IRJET Journal
 
"Enhanced Heat Transfer Performance in Shell and Tube Heat Exchangers: A CFD ...
IRJET Journal
 
Advancements in CFD Analysis of Shell and Tube Heat Exchangers with Nanofluid...
IRJET Journal
 
Breast Cancer Detection using Computer Vision
IRJET Journal
 
Auto-Charging E-Vehicle with its battery Management.
IRJET Journal
 
Analysis of high energy charge particle in the Heliosphere
IRJET Journal
 
A Novel System for Recommending Agricultural Crops Using Machine Learning App...
IRJET Journal
 
Auto-Charging E-Vehicle with its battery Management.
IRJET Journal
 
Analysis of high energy charge particle in the Heliosphere
IRJET Journal
 
Wireless Arduino Control via Mobile: Eliminating the Need for a Dedicated Wir...
IRJET Journal
 
Ad

Recently uploaded (20)

PPTX
REINFORCEMENT AS CONSTRUCTION MATERIALS.pptx
mohaiminulhaquesami
 
PPTX
Thermal runway and thermal stability.pptx
godow93766
 
DOCX
8th International Conference on Electrical Engineering (ELEN 2025)
elelijjournal653
 
PDF
POWER PLANT ENGINEERING (R17A0326).pdf..
haneefachosa123
 
PDF
Set Relation Function Practice session 24.05.2025.pdf
DrStephenStrange4
 
PPTX
Benefits_^0_Challigi😙🏡💐8fenges[1].pptx
akghostmaker
 
PPTX
MobileComputingMANET2023 MobileComputingMANET2023.pptx
masterfake98765
 
PDF
ARC--BUILDING-UTILITIES-2-PART-2 (1).pdf
IzzyBaniquedBusto
 
PDF
Unified_Cloud_Comm_Presentation anil singh ppt
anilsingh298751
 
PDF
Zilliz Cloud Demo for performance and scale
Zilliz
 
PPTX
Introduction to Neural Networks and Perceptron Learning Algorithm.pptx
Kayalvizhi A
 
PDF
monopile foundation seminar topic for civil engineering students
Ahina5
 
PPTX
Structural Functiona theory this important for the theorist
cagumaydanny26
 
PPTX
265587293-NFPA 101 Life safety code-PPT-1.pptx
chandermwason
 
PDF
International Journal of Information Technology Convergence and services (IJI...
ijitcsjournal4
 
PPTX
Pharmaceuticals and fine chemicals.pptxx
jaypa242004
 
PDF
UNIT-4-FEEDBACK AMPLIFIERS AND OSCILLATORS (1).pdf
Sridhar191373
 
PDF
PRIZ Academy - Change Flow Thinking Master Change with Confidence.pdf
PRIZ Guru
 
PPTX
Presentation on Foundation Design for Civil Engineers.pptx
KamalKhan563106
 
PDF
Introduction to Productivity and Quality
মোঃ ফুরকান উদ্দিন জুয়েল
 
REINFORCEMENT AS CONSTRUCTION MATERIALS.pptx
mohaiminulhaquesami
 
Thermal runway and thermal stability.pptx
godow93766
 
8th International Conference on Electrical Engineering (ELEN 2025)
elelijjournal653
 
POWER PLANT ENGINEERING (R17A0326).pdf..
haneefachosa123
 
Set Relation Function Practice session 24.05.2025.pdf
DrStephenStrange4
 
Benefits_^0_Challigi😙🏡💐8fenges[1].pptx
akghostmaker
 
MobileComputingMANET2023 MobileComputingMANET2023.pptx
masterfake98765
 
ARC--BUILDING-UTILITIES-2-PART-2 (1).pdf
IzzyBaniquedBusto
 
Unified_Cloud_Comm_Presentation anil singh ppt
anilsingh298751
 
Zilliz Cloud Demo for performance and scale
Zilliz
 
Introduction to Neural Networks and Perceptron Learning Algorithm.pptx
Kayalvizhi A
 
monopile foundation seminar topic for civil engineering students
Ahina5
 
Structural Functiona theory this important for the theorist
cagumaydanny26
 
265587293-NFPA 101 Life safety code-PPT-1.pptx
chandermwason
 
International Journal of Information Technology Convergence and services (IJI...
ijitcsjournal4
 
Pharmaceuticals and fine chemicals.pptxx
jaypa242004
 
UNIT-4-FEEDBACK AMPLIFIERS AND OSCILLATORS (1).pdf
Sridhar191373
 
PRIZ Academy - Change Flow Thinking Master Change with Confidence.pdf
PRIZ Guru
 
Presentation on Foundation Design for Civil Engineers.pptx
KamalKhan563106
 
Introduction to Productivity and Quality
মোঃ ফুরকান উদ্দিন জুয়েল
 

IRJET - An Enhanced Approach for Extraction of Text from an Image using Fuzzy Logic

  • 1. International Research Journal of Engineering and Technology (IRJET) e-ISSN: 2395-0056 Volume: 07 Issue: 03 | Mar 2020 www.irjet.net p-ISSN: 2395-0072 © 2020, IRJET | Impact Factor value: 7.34 | ISO 9001:2008 Certified Journal | Page 3107 An Enhanced Approach for Extraction of Text from an Image using Fuzzy Logic K. Chandra Sekhar1, MD. Mehar Naaz2, K. Haritha3, N. Roshitha Varma4, T. Sivajee5, P. Sri Ram Naidu6 1Professor, Dept. of Computer Science and Engineering, Anil Neerukonda Institute of Technology and Sciences, Andhra Pradesh, India 2,3,4,5,6Student, Dept. of Computer Science and Engineering, Anil Neerukonda Institute of Technology and Sciences, Andhra Pradesh, India ---------------------------------------------------------------------***---------------------------------------------------------------------- Abstract – This paper represents the approach for extracting the text from an image with the method called “Fuzzy Logic. The role of text recognition is to recognize the text from an image and then to extract the text in the editable format, so that people can use the text fortheirfuturepurpose. People capture lot of useful information which may have essential textual data to store and may also edit this data. Fuzzy logic identifies whether thecharacterorthetextpresent in an image matches with the trained data or not. The image captured by the user undergoes pre-processing and then converted into binary image. The text in an image is localized, segmentation process takes place to extract each character from image and then characters are identified which are matching the trained data using Fuzzy Logic. Key Words: Text Recognition, Text Extraction, Fuzzy Logic, Pre-processing,Segmentation,Featureextraction. 1. INTRODUCTION Textual data present in the images contain useful information for indexing and automatic annotations. 
Extraction of this useful information involves text detection, localization of text, classification, and then recognition of text. Fuzzy logic determines the degree of truth values. This logic helps to identify and match the characters accurately with trained data. 1.1 Image Processing Image processing is analysis and manipulation of a digitized image, so as to enhance its quality with the help of mathematical operations by using any kind of signal processing where the input isa picture oranimageoravideo frame. The output of image processingwillbeeitherapicture or set of characters or parameters associated with the given input image. This is a set of computational techniques for analyzing, enhancing, compressing and reconstructing image. There are different techniques for processing an image like optical methods, fuzzy techniques, digital processing, linear scaling. Image processing generally involves three steps:  Importing and Loading the image by using image acquisition tools.  Analyzing and manipulating image to extract the information.  Output the result. The result might be the image ora picture altered in some way or it may be a report based on analysis of the image. 1.2 Text Recognition and Extraction Model The method of text extraction involves mainly two processes, namely- Text Detection and Text Recognition. Text detection is a way of localizing various regions from an image which contains text in it. This helps in removal ofnon- text regions which behave as noise while extraction of desired text. Whereas, Text recognition is a process of converting pixel-based text i.e. image text to readable code. This recognizes the text from an image by undergoing several steps such as, pre-processing, segmentation, feature extraction, classification and post-processing. Later,thetext extraction is done by comparing the segmented characters with the trained data with the help of fuzzy logic. 2. 
FUZZY LOGIC Fuzzy logic is a form of logic that is used in some expert systems and in various applications of Artificial Intelligence. This was originated in 1965 by “Lofti Zadeh”. Fuzzy logic isa branch of logic which uses degrees of membership in the form of sets rather than the strict true/false memberships. The classical logic depends only on the Boolean values 0or1 and this depends upon a lot of relationships, while there are various relationships where the position that it can be considered as partly true or as partly wrong at the same time. Therefore, Fuzzy logic determines the degree of affiliation, which is extent of grades between right and wrong. This can manage the vagueness and ambiguity very efficiently. It has the power to perform reasonable and meaningful operations. In the process of extraction of text this logic plays a vital role by takingthesegmentedcharacter and comparing this segmented character with the trained
  • 2. International Research Journal of Engineering and Technology (IRJET) e-ISSN: 2395-0056 Volume: 07 Issue: 03 | Mar 2020 www.irjet.net p-ISSN: 2395-0072 © 2020, IRJET | Impact Factor value: 7.34 | ISO 9001:2008 Certified Journal | Page 3108 character datasets. After comparing, it finds the accurate matching value of the character. If character matches accurately with trained data then the character is extracted and displayed on the digital screen. If the segmented character matches with the trained dataset character then the status becomes ‘1’ and the character is printed, otherwise status becomes ‘0’ and the processcontinuestill it finds the matching character. 3. DESIGN AND IMPLEMENTATION Fig -1: Architecture of the proposed system. The algorithm involves few steps: 1. Load the original (RGB) image and convert it into grayscale image. 2. The image undergoes pre-processing to enhance and remove the noise from an image. 3. The words in the image are classified and localized and then they are segmented according to the boundary values. 4. The segmented characters are compared with the trained dataset with the help of fuzzy logic to find the highest matching criteria. 5. If the segmented character has highest matching value with trained data then the character is displayed or else the same process continues till it finds the highest matching value i.e. nearer to 1. 6. The characters are displayed in an editable format. 3.1 Pre-processing The main purpose of pre-processing is to enhance the visual look of the image. People generally take the picture of the required document to be extracted as text, but they might contain some noise. The imagecapturedbytheusermightbe blurry, noisy, and may be of low resolution. Thus, this process is used to remove the clamor or noise, stabilize the intensity of the image and to clear the artifacts. So as to improve or enhance the quality of the input image given by the user, few operations are performed in this stage. 
The pre-processing operations include noise removal, normalization, and binarization.
3.1.1 Noise Removal
Noise arises in an image during image acquisition, i.e. digitization, and during transmission. There are generally four types of noise: Gaussian noise, salt-and-pepper noise, speckle noise, and uniform noise. When images are sent over noisy channels, they are prone to corruption. Filters are therefore required to remove the noise from the images captured by the user before further processing. There are many kinds of filters, such as the linear smoothing filter, median filter, Wiener filter, and fuzzy filter. The three primaries (R, G, and B) are filtered separately. Each of the three filters is followed by a gain to compensate for the attenuation resulting from the filter. The filtered primaries are then combined to form a filtered color image. This process is shown in Fig-2.
Fig-2: Filtering
The median filter is used to enhance the image:
Filtered_Image = Median_Filter(Original_Image, Filter_Size)
3.1.2 Normalization
Normalization changes the range of pixel intensity values. In general, normalization means bringing something to a normal condition. In one common form (standardization), the normalized image has mean 0 and variance 1; min-max scaling, by contrast, maps the intensities onto a fixed interval. The range
of values of the image is represented between 0 and 255. The normalization of an image is performed as:
Output_channel = 255 * (Input_channel - min) / (max - min)
Here min and max are the smallest and largest intensity values of the channel. If a grayscale image is used, only one channel needs to be normalized. For an RGB image (3 channels), each channel should be normalized using the same criteria.
3.2 Segmentation
The process of partitioning a digital image into multiple segments or image objects is known as segmentation, and it is sometimes also referred to as object detection. This process simplifies or changes the representation of an image so that it becomes more meaningful and easier to analyze. Segmentation is done at three levels: line segmentation, word segmentation, and character segmentation. Some deep learning architectures are used for segmentation; one of them is the CNN, i.e. Convolutional Neural Network. Segmentation of an image using a CNN involves feeding segments of the image as input to a convolutional neural network, which labels the pixels. A CNN does not process the entire image at once. It scans the image, looking at a small "filter" of several pixels each time, until it has mapped the entire image. The step-by-step process is as follows:
• Take the weight matrix.
• Place the weight matrix on top of the image.
• Perform element-wise multiplication and record the output.
• Move the weight matrix according to the chosen stride.
• Convolve until all the pixels of the input have been used.
Every segmented character is placed in the form of a matrix. These matrices are compared with the trained dataset matrices to identify the character.
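The two operations described above, sliding a weight matrix over the image and comparing a character matrix with trained matrices, can be sketched as follows. This is an illustration under stated assumptions, not the paper's implementation: the 2x2 weight matrix, the toy 3x3 templates, and the membership function (the fraction of agreeing pixels, a value between 0 and 1) are all hypothetical stand-ins for the trained data and fuzzy rules, which the paper does not specify. The products at each position are summed, as is usual for convolution in a CNN.

```python
# Illustrative sketch only: the kernel, the templates and the similarity-based
# membership function are assumptions, not the paper's trained data or rules.

def convolve(image, kernel, stride=1):
    """Slide `kernel` over `image` (both 2-D lists, no padding) and return the
    sum of element-wise products at each position, as CNN layers do."""
    k_rows, k_cols = len(kernel), len(kernel[0])
    out = []
    for top in range(0, len(image) - k_rows + 1, stride):
        out_row = []
        for left in range(0, len(image[0]) - k_cols + 1, stride):
            total = sum(
                image[top + i][left + j] * kernel[i][j]
                for i in range(k_rows)
                for j in range(k_cols)
            )
            out_row.append(total)
        out.append(out_row)
    return out


def membership(char_matrix, template):
    """Degree in [0, 1] to which a binary character matrix matches a template:
    the fraction of pixels on which the two agree."""
    cells = [
        (a, b)
        for row_a, row_b in zip(char_matrix, template)
        for a, b in zip(row_a, row_b)
    ]
    return sum(1 for a, b in cells if a == b) / len(cells)


def best_match(char_matrix, templates):
    """Return (label, degree) of the template with the highest membership."""
    return max(
        ((label, membership(char_matrix, tpl)) for label, tpl in templates.items()),
        key=lambda pair: pair[1],
    )


# Toy 3x3 "trained" templates for two characters.
templates = {
    "I": [[0, 1, 0], [0, 1, 0], [0, 1, 0]],
    "L": [[1, 0, 0], [1, 0, 0], [1, 1, 1]],
}
segmented = [[0, 1, 0], [0, 1, 0], [0, 1, 1]]  # an "I" with one noisy pixel

feature_map = convolve(segmented, [[1, 0], [0, 1]])
label, degree = best_match(segmented, templates)
```

Here the noisy character agrees with the "I" template on 8 of 9 pixels and with the "L" template on 4 of 9, so "I" is selected as the match whose value is nearest to 1.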
3.3 Feature Extraction and Classification
Features are the unique signatures of an image, i.e. the properties that characterize it. Feature extraction is closely related to dimensionality reduction and is used to extract the characteristics of an input image. It describes the components of the image efficiently as a compressed feature vector, which is useful when the image is very large. Once pre-processing and segmentation of the image are done, a feature extraction technique is applied to the segmented characters to acquire features, followed by classification and post-processing techniques. Statistical and geometrical features are among those used in the feature extraction process. The process of recognizing and extracting a character is divided into two stages: feature selection and classification. The main aim of feature selection is to select a subset of input variables by eliminating features with little or no predictive information while maintaining or improving classification accuracy, whereas image classification takes the input image, extracts the characters, and assigns them to the correct class or category.
4. RESULT
Fig:3(a) Extracted text from input image
Fig:3(b) Extracted text from input image
5. CONCLUSION
Applications need several kinds of images as sources of information for elucidation and analysis. The characters that are identified are classified into meaningful words or sentences. When an image is transformed from one form to another by digitizing or scanning, processing is done through Tesseract by storing the identified data, and this identified data is compared with the trained data using fuzzy logic. Therefore, the output image has to undergo a process called image enhancement, which comprises a group of methods that seek to improve the visual appearance of an image. In this paper, we were able to successfully extract the text from
the image of any kind of font style and size with the help of fuzzy logic. The intent of the fuzzy rules is an attractive result: improved quality of edges and accurate identification of the characters in the image. This paper will act as a good survey for researchers who have begun work in the field of fuzzy character recognition.
REFERENCES
[1] L. Neumann and J. Matas, "A method for text localization and recognition in real-world images", in ACCV, pages 770–783, 2010.
[2] Reza Sarshogh and Keegan Hines, "Computer Vision Methods for Extracting Text from Images", Capital One Tech.
[3] T. Som, Sumit Saha, "Handwritten Character Recognition Using Fuzzy Membership Function", International Journal of Emerging Technologies in Sciences and Engineering, Volume 5, December 2011.
[4] L. A. Zadeh, "Fuzzy sets", Information and Control 8 (1965) 338–353.
[5] Eran Gur and Zeev Zelavsky, "Retrieval of Rashi Semi-Cursive Handwriting via Fuzzy Logic", IEEE International Conference on Frontiers in Handwriting Recognition (ICFHR), 2012.
[6] Thomas Natschläger, "Optical Character Recognition", A Tutorial for the Course Computational Intelligence.
[7] Andrei Polzounov, Artsiom Ablavatski, Sergio Escalera, Shijian Lu, Jianfei Cai, "WordFence: Text Detection in Natural Images with Border Awareness".
[8] Ø. D. Trier, A. K. Jain, T. Taxt, "Feature Extraction Methods for Character Recognition - A Survey", Pattern Recognition