ENHANCING READABILITY OF DIGITAL
IMAGE USING IMAGE PROCESSING
TECHNIQUES
A project report submitted in partial fulfillment of the requirements for
B.Tech. Project
B.Tech.
by
Akshay Kumar Lodha (2013IPG-007)
Sudesh Sindoskar (2013IPG-109)
Upendra Singh Sachan (2013IPG-118)
ABV INDIAN INSTITUTE OF INFORMATION
TECHNOLOGY AND MANAGEMENT
GWALIOR-474 010
2016
ABSTRACT
Smartphones have become part of everyday life. With advances in technology, they have
become powerful computers in a very small form factor, and the camera is an essential
feature of every smartphone. We often digitize printed documents by photographing them
with a smartphone, either to keep a digital backup for future reference or to make them
easier to carry and read on the go. But this comes with added costs:
(1) Reduced image quality, which results in reduced readability
(2) Noise, such as unwanted background, which distracts the reader
(3) Uneven appearance of the photograph caused by varying lighting conditions, such as
shadows
Reading is an experience, and readers want a noise-free environment. These problems
distract readers and reduce their concentration and attention span. Image processing
techniques can greatly increase the readability of the document by removing or reducing
noise and enhancing the image.
TABLE OF CONTENTS
ABSTRACT
LIST OF FIGURES
1 INTRODUCTION AND LITERATURE SURVEY
1.1 INTRODUCTION
1.1.1 Image Processing Techniques
1.1.1.1 Image Pre-processing
1.1.1.2 Image Enhancement
1.1.1.3 Image Segmentation
1.1.1.4 Feature Extraction
1.1.1.5 Image Classification
1.1.2 Problem Statement
1.1.3 Motivation
1.2 OpenCV
1.3 Literature Review
2 DESIGN DETAILS AND IMPLEMENTATION
2.1 Detecting the edges
2.1.1 Pre-processing
2.1.2 Canny Edge Detection
2.2 Finding the contour
2.3 Perspective Transformation
2.3.1 Perspective Transformation Matrix
2.4 Image Enhancement
3 RESULTS AND DISCUSSION
3.1 RESULTS
3.2 DISCUSSION
4 CONCLUSION
REFERENCES
LIST OF FIGURES
2.1 Basic Flow Diagram of Project
2.2 Kernel = [1 x 1]
2.3 Kernel = [3 x 3]
2.4 Kernel = [5 x 5]
2.5 Kernel = [7 x 7]
2.6 Original Image
2.7 Canny Edge Detection
2.8 Highlighting contour of document detected in the image
2.9 Image obtained after Perspective Transformation
2.10 Image enhanced using Histogram Equalization
2.11 Image enhanced using Localized Histogram Equalization
2.12 Image enhanced using Adaptive Thresholding
2.13 Detailed flowchart of project
2.14 Detailed flowchart of OpenCV code
3.1 Result: Image Set 1
3.2 Result: Document photographed from top
3.3 Result: Document photographed at an angle
3.4 Result: Only the largest rectangle is selected
3.5 Result: Image of Business Card
3.6 Result: A coloured newspaper clipping
3.7 Result: Document with tables. Challenging example for finding contour
3.8 Result: Document with more than four edges is not detected
CHAPTER 1
INTRODUCTION AND
LITERATURE SURVEY
This chapter includes the details of image processing techniques, problem statement,
platform used to implement the project and literature reviews related to work done in
this field.
1.1 INTRODUCTION
The smartphone is one of the most useful devices in daily life, and it plays a key role
in both personal and working life. With advances in technology, the cost of processing
power has fallen sharply, and today's smartphones are full-fledged computers in a very
small form factor. Many people now do much of their work on a smartphone, and more and
more information is going digital. Still, much of the information we access every day is
in printed form. We digitize it by photographing the documents and saving the images on
our smartphones for future reference, anytime access, or reading on the device itself.
But this comes with some drawbacks:
(a) Image quality depends on the quality of the smartphone camera, and a poor image is
harder to read
(b) Noise, such as unwanted background, distracts the reader
(c) Shadows cast over the document when it is photographed from above reduce
readability
(d) Tilted photographs put extra strain on the reader
Reading is an experience. The external factors that affect reading speed are the
readability (presentation) of the document and the reader's concentration. Noise such as
unwanted background or a hard-to-read image distracts the reader, which reduces
concentration and attention span.
To make a document more readable and presentable, we can do the following:
(a) Remove unwanted background
(b) Transform the perspective of the document to a top view
(c) Enhance the image by making text more readable and background less visible.
1.1.1 Image Processing Techniques
Image processing is the processing of a digital image by a digital computer, for example
to remove noise and other irregularities present in the image. Irregularities or noise
may creep into the image either during its formation or during its transformation. For
mathematical analysis, an image may be defined as a 2D function f(x, y), where x and y
are spatial (plane) coordinates and the amplitude of f at any pair of coordinates (x, y)
is the intensity or gray level of the image at that point. When x, y, and the intensity
values of f are all finite, discrete quantities, we call the image a digital image. A
digital image is composed of a finite number of elements, each of which has a particular
location and value. These elements are called image elements or pixels; pixel is the
term most frequently used (Image processing, 2016).
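As a small illustration of the definition above, a grayscale digital image can be held as a 2D array of discrete intensities. The sketch below uses NumPy; the array values are made up for illustration and are not taken from the project.

```python
import numpy as np

# A tiny 3x4 grayscale "image": rows index y, columns index x,
# and each entry is an 8-bit intensity f(x, y) in the range 0..255.
image = np.array([
    [0,  64, 128, 255],
    [32, 96, 160, 224],
    [16, 80, 144, 208],
], dtype=np.uint8)

height, width = image.shape     # a finite number of pixels: 3 * 4 = 12
x, y = 2, 1
intensity = int(image[y, x])    # NumPy indexes as [row, column], i.e. [y, x]
```

Note the index order: the pixel at coordinates (x, y) is read as `image[y, x]`.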
Numerous image processing techniques have been introduced over the last four to five
decades. Most of them were developed to enhance images obtained from military
reconnaissance flights, unmanned spacecraft and space research. Image processing
techniques have become popular thanks to the easy availability of powerful computers,
graphics software, large memory devices, etc.
The different Image Processing techniques are:
(i) Image pre-processing
(ii) Image enhancement
(iii) Image segmentation
(iv) Feature extraction
(v) Image classification
1.1.1.1 Image Pre-processing
In image pre-processing, image data recorded by sensors or cameras contain errors related
to geometry and to the brightness values of the pixels. These errors are corrected using
suitable mathematical models, which are either deterministic or statistical.
1.1.1.2 Image Enhancement
Image enhancement is the improvement of a digital image by changing pixel brightness
values to modify its visual impact. It includes a collection of techniques that are used
to improve the visual appearance of a digital image, or to convert the image to a form
better suited for human or machine interpretation (Wang et al., 2004).
1.1.1.3 Image Segmentation
Segmentation is one of the main problems in image processing. Image segmentation is the
process that subdivides a digital image into its constituent parts or objects. The level
to which this subdivision is carried depends on the problem being solved, i.e.,
segmentation should stop when the objects of interest in an application have been
isolated. For example, in autonomous air-to-ground target acquisition, suppose our
interest is in identifying vehicles on a road: the first step is to segment the road from
the image, and the next is to segment the contents of the road down to potential vehicles.
Image thresholding methods are commonly used for image segmentation.
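To make thresholding-based segmentation concrete, here is a minimal NumPy sketch that picks a global threshold with Otsu's method and splits the image into dark and bright pixels. Otsu's method is chosen here only for illustration (it also appears in the literature review); the function names `otsu_threshold` and `segment` are hypothetical and not part of the project code.

```python
import numpy as np

def otsu_threshold(gray):
    """Return the threshold t that maximises the between-class variance."""
    hist = np.bincount(gray.ravel(), minlength=256).astype(np.float64)
    prob = hist / hist.sum()
    omega = np.cumsum(prob)                  # probability of class "<= t"
    mu = np.cumsum(prob * np.arange(256))    # cumulative mean up to t
    mu_total = mu[-1]
    denom = omega * (1.0 - omega)
    with np.errstate(divide="ignore", invalid="ignore"):
        sigma_b = (mu_total * omega - mu) ** 2 / denom
    sigma_b[denom == 0] = 0.0                # thresholds with an empty class
    return int(np.argmax(sigma_b))

def segment(gray):
    """Boolean mask: True for bright pixels (paper), False for dark ones (ink)."""
    return gray > otsu_threshold(gray)
```

A single global threshold like this works when lighting is even; it is only meant to show the idea of segmenting by intensity.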
1.1.1.4 Feature Extraction
Feature extraction methods have been developed, for example, to extract features from
synthetic aperture radar images. They extract the high-level features needed to classify
targets. Features are those items that uniquely describe a target, such as size, location,
shape and composition. Segmentation methods are applied to isolate the desired object
from the scene so that measurements can subsequently be made on it. Quantitative
measurements of object features allow classification and description of the image.
1.1.1.5 Image Classification
Image classification is the labelling of a pixel or a group of pixels on the basis of its
grey value. It is one of the most frequently used methods of information extraction.
Classification usually uses multiple features for a set of pixels, i.e., many images of a
particular object are needed.
1.1.2 Problem Statement
To enhance the readability of a photographed printed document using image processing
techniques by removing noise, enhancing image and transforming perspective.
1.1.3 Motivation
The main focus of this project is to develop a system that increases the readability of a
digital image of a document using image processing and computer vision. Because reading
is all about experience, a noise-free document matters to every reader.
1.2 OpenCV
We implemented our project using OpenCV through its Python interface. OpenCV (Open
Source Computer Vision) is a library of programming functions mainly aimed at real-time
computer vision (OpenCV | OpenCV, 2016). It provides the tools needed to solve
computer-vision problems and contains a mix of low-level image-processing functions and
high-level algorithms such as face detection, pedestrian detection, feature matching, and
tracking. The library has been downloaded more than 3 million times (OpenCV, 2016).
1.3 Literature Review
We mainly reviewed literature on image processing algorithms, covering image enhancement,
feature extraction and image segmentation.
-B. Chitradevi and P. Srimathi (2014) give an overview of image processing techniques.
The principal advantages of digital image processing are preservation of the original
data precision, versatility and repeatability. The various image processing techniques
are image pre-processing, image enhancement, image segmentation, feature extraction and
image classification. Image enhancement algorithms are interactive and application
dependent; examples include contrast stretching, noise filtering and histogram
modification (Chitradevi and Srimathi, 2014).
-Bing Wang and ShaoSheng Fan (2009) pointed out that the traditional Canny edge detection
algorithm is vulnerable to various kinds of noise. They showed that it can be greatly
improved by replacing the Gaussian filter in the pre-processing step with a self-adaptive
filter, and further improved by using morphological thinning to thin the edges (Wang and
Fan, 2009). The Gaussian filter in the traditional algorithm smooths not only the noise
but also the edges, so some edges are weakened. The dual thresholds are also difficult to
choose: a poor choice either fails to remove the noise or loses some edges.
-T. L. Tan, K. S. Sim and C. P. Tso (2012) presented a new approach to nonlinear
histogram equalization that not only enhances image contrast but also preserves the
background brightness (Tan et al., 2012).
-Kaiming He, Jian Sun and Xiaoou Tang (2009) proposed a simple but effective image prior,
the dark channel prior, to remove haze from a single input image. The dark channel prior
is a statistic of haze-free outdoor images based on a key observation: most local patches
in haze-free outdoor images contain some pixels that have very low intensity in at least
one colour channel. Using this prior with the haze imaging model, they directly estimate
the thickness of the haze and recover a high-quality haze-free image (He et al., 2011).
-Michael Donoser and Horst Bischof (2006) proposed a novel concept for tracking Maximally
Stable Extremal Regions (MSERs). The approach uses temporal information to improve the
computation time and the detection stability of a single MSER. It can be used for
tracking license plates, face tracking, segmenting fibre networks, etc. (Donoser and
Bischof, 2006).
-Per-Erik Forssen and David G. Lowe (2007) introduced an affine-invariant shape
descriptor for maximally stable extremal regions. The descriptor is computed using the
scale-invariant feature transform (SIFT), with the re-sampled MSER binary mask as input.
This provides better robustness to illumination change and nearby occlusions than
existing methods (Forssen and Lowe, 2007).
-M. Cheriet, J. N. Said and C. Y. Suen (1998) presented a general recursive thresholding
technique for image segmentation that extends Otsu's method. The approach was applied to
document images, specifically real-life bank cheques. At each recursion, it first
segments the object with the lowest intensity from the given image, and the process
continues until only one object is left in the image (Cheriet et al., 1998).
-Pranay Yadav (2015) proposed a method for removing noise from colour images using a
modified adaptive threshold median filter. The proposed filter is very effective for
random-valued impulse noise because, in practice, noise is not uniform over the channel.
He used both minimum and maximum thresholds to detect positive and negative noise; the
threshold values differ for different noise densities (Yadav, 2015).
-Md. Habibur Rahman and Md. Rafiqul Islam (2013) proposed a modified version of the
watershed algorithm for image segmentation. They overcome the problem of
over-segmentation by introducing adaptive masking and a thresholding mechanism over each
colour channel before combining the segmentations from each channel into the final one.
The proposed algorithm is also faster than many other segmentation algorithms, making it
fit for real-time applications (Rahman and Islam, 2013).
CHAPTER 2
DESIGN DETAILS AND
IMPLEMENTATION
To begin, we need to analyse thoroughly the properties of the document that we want to
extract from the image. The typical properties and assumptions for an image of a document
are:
(i) The image clicked is clear enough to read the contents of the document.
(ii) Generally, the content (text) is in black ink while the background (paper) is white
coloured.
(iii) The document has exactly four straight edges (and hence the vertices)
(iv) The tilt of perspective from the top view is not more than 45 degrees.
(v) Maximum portion of the image is covered by the document.
(vi) Nothing overlaps or obstructs the document in the image (like fingers, weights,
etc.).
The task of enhancing the readability of a photographed printed document using image
processing techniques, by removing noise, enhancing the image and transforming
perspective, has four essential steps:
(1) Detecting the edges of document
(2) Using edges to find contour of the document
(3) Applying perspective transformation to obtain bird’s eye / top-down view of the
document.
(4) Enhancing image quality to increase readability
Figure 2.1: Basic Flow Diagram of Project
2.1 Detecting the edges
In an image, an edge is a curve that follows a path of rapid change in image intensity.
Edges are often associated with the boundaries of objects in a scene. Edge detection is
used to identify the edges in an image (Feature Detectors - Sobel Edge Detector, 2016).
There are many edge detection techniques, such as Roberts, Prewitt, Sobel and Canny
(Prewitt operator, 2016; Realtime Computer Vision with OpenCV - ACM Queue, 2016).
Measured by PSNR, Sobel gives the highest-quality result while Roberts gives the lowest
(Singh and Singh, 2015). Canny edge detection, however, goes a bit further: it first
removes speckle noise with a low-pass filter, then applies a Sobel filter, and then
performs non-maximum suppression to pick the best pixel for an edge when there are
multiple candidates in a local neighbourhood. We therefore chose the Canny edge detection
technique.
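The Sobel step mentioned above can be sketched in a few lines of NumPy. This is an illustrative sketch of the gradient computation only, not OpenCV's implementation; the helper names `filter3x3` and `sobel_magnitude` are hypothetical, and border pixels are simply dropped rather than padded.

```python
import numpy as np

SOBEL_X = np.array([[-1, 0, 1],
                    [-2, 0, 2],
                    [-1, 0, 1]], dtype=np.float64)
SOBEL_Y = SOBEL_X.T

def filter3x3(img, kernel):
    """Correlate img with a 3x3 kernel over the 'valid' region (no padding)."""
    h, w = img.shape
    out = np.zeros((h - 2, w - 2), dtype=np.float64)
    for dy in range(3):
        for dx in range(3):
            out += kernel[dy, dx] * img[dy:dy + h - 2, dx:dx + w - 2]
    return out

def sobel_magnitude(gray):
    """Gradient magnitude: large where intensity changes rapidly."""
    g = gray.astype(np.float64)
    gx = filter3x3(g, SOBEL_X)   # horizontal intensity change
    gy = filter3x3(g, SOBEL_Y)   # vertical intensity change
    return np.hypot(gx, gy)
```

On an image whose left half is dark and right half is bright, the magnitude is zero in the flat regions and large along the vertical boundary, which is exactly the "rapid change in intensity" that defines an edge.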
2.1.1 Pre-processing
Edge detection techniques are susceptible to noise, and no camera is perfect: the raw
image of the document contains noise that can produce false edges. We therefore apply a
Gaussian filter to the image to smooth out the noise. The kernel size must be chosen so
that as much noise as possible is filtered out without blurring the edges.
Figure 2.2: Kernel = [1 x 1]
Figure 2.3: Kernel = [3 x 3]
Figure 2.4: Kernel = [5 x 5]
Figure 2.5: Kernel = [7 x 7]
A kernel size of (5, 5) was chosen as the optimum: it removes the noise effectively
without losing edge detail.
We run edge detection on a single intensity channel, so we first convert the colour image
(which OpenCV loads in BGR order) to grayscale and then apply the Gaussian filter.
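Code:
import cv2
# args is assumed to hold the parsed command-line arguments (input image path)
image = cv2.imread(args["image"])
gray = cv2.cvtColor(image, cv2.COLOR_BGR2GRAY)  # BGR colour image to single channel
gray = cv2.GaussianBlur(gray, (5, 5), 0)        # 5x5 kernel; sigma 0 = derived from size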
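To show what a Gaussian kernel of a given size looks like, the sketch below builds one with NumPy. It is illustrative only: the function name `gaussian_kernel` is hypothetical, and `sigma = 1.0` is an arbitrary value chosen for the example, not the value OpenCV derives when sigma is passed as 0.

```python
import numpy as np

def gaussian_kernel(size, sigma):
    """Return a normalised size x size Gaussian kernel."""
    ax = np.arange(size) - (size - 1) / 2.0        # e.g. [-2, -1, 0, 1, 2] for size 5
    g1 = np.exp(-(ax ** 2) / (2.0 * sigma ** 2))   # 1D Gaussian profile
    kernel = np.outer(g1, g1)                      # the 2D Gaussian is separable
    return kernel / kernel.sum()                   # weights sum to 1

kernel = gaussian_kernel(5, 1.0)
```

Each output pixel becomes a weighted average of its neighbourhood, with the centre weighted most. A larger kernel averages over more neighbours, which removes more noise but also softens edges; that is the trade-off behind the kernel-size comparison in Figures 2.2 to 2.5.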
2.1.2 Canny Edge Detection
After the image is pre-processed, we apply Canny edge detection (Wang and Fan, 2009)
with minVal = 75 and maxVal = 200.
Code:
# 75 and 200 are the lower and upper hysteresis thresholds
edged = cv2.Canny(gray, 75, 200)
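The role of the two thresholds can be sketched as follows. Pixels whose gradient magnitude exceeds maxVal are accepted as edges, pixels below minVal are rejected, and pixels in between are kept only if they connect to an accepted edge. The NumPy code below is a simplified illustration of that hysteresis idea under these assumptions, not OpenCV's implementation; the function name `hysteresis` is hypothetical.

```python
import numpy as np

def hysteresis(magnitude, min_val, max_val):
    """Keep strong pixels plus weak pixels that are 8-connected to them."""
    strong = magnitude > max_val
    candidate = magnitude >= min_val           # strong or weak
    edges = strong.copy()
    h, w = edges.shape
    while True:
        # grow the current edge set by one pixel in all 8 directions
        padded = np.pad(edges, 1, mode="constant", constant_values=False)
        grown = np.zeros_like(edges)
        for dy in range(3):
            for dx in range(3):
                grown |= padded[dy:dy + h, dx:dx + w]
        new_edges = grown & candidate          # only candidates may join
        if np.array_equal(new_edges, edges):   # nothing new was added
            return edges
        edges = new_edges
```

With minVal = 75 and maxVal = 200, a weak response of 100 next to a strong response of 250 is kept, while an isolated response of 100 is discarded as noise.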
Figure 2.6: Original Image
Figure 2.7: Canny Edge Detection
2.2 Finding the contour
After identifying the edges, we find the contour of the document. As mentioned earlier,
we assume that the document occupies most of the image and has four edges.
We find the contours using cv2.findContours with the contour retrieval mode RETR_LIST.
We need only the contours, so we drop the hierarchy. We then sort the contours by area in
descending order and keep only the five largest (since the document is assumed to occupy
most of the image), which speeds up the process by reducing the number of iterations.
Next, going through these contours from largest to smallest, we approximate each one by a
polygon with cv2.approxPolyDP, using a tolerance of 2% of the contour perimeter, and
count the points of the approximation. The first contour whose approximation has exactly
four points is taken to be the contour of our document. Finally, we highlight the
estimated contour on the original image to verify it.
Code:
# findContours returns (contours, hierarchy) in OpenCV 2.4/4.x and
# (image, contours, hierarchy) in 3.x; [-2] picks the contours in both cases
cnts = cv2.findContours(edged.copy(), cv2.RETR_LIST, cv2.CHAIN_APPROX_SIMPLE)[-2]
# keep the five largest contours, largest first
cnts = sorted(cnts, key=cv2.contourArea, reverse=True)[:5]
document_contour = None
for c in cnts:
    peri = cv2.arcLength(c, True)                    # perimeter of the closed contour
    approx = cv2.approxPolyDP(c, 0.02 * peri, True)  # polygon approximation
    if len(approx) == 4:                             # four vertices: assume the page
        document_contour = approx
        break
if document_contour is not None:
    cv2.drawContours(image, [document_contour], -1, (0, 255, 0), 2)
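The two quantities that drive this step, the enclosed area used for ranking and the perimeter used to set the approximation tolerance, can be illustrated in plain Python. This is a sketch of the geometry only, not of how OpenCV computes them internally; `polygon_area`, `polygon_perimeter` and the sample `page` corners are hypothetical.

```python
import math

def polygon_area(points):
    """Shoelace formula: area of a closed polygon given as [(x, y), ...]."""
    total = 0.0
    n = len(points)
    for i in range(n):
        x1, y1 = points[i]
        x2, y2 = points[(i + 1) % n]
        total += x1 * y2 - x2 * y1
    return abs(total) / 2.0

def polygon_perimeter(points):
    """Sum of the side lengths of the closed polygon."""
    n = len(points)
    return sum(
        math.hypot(points[(i + 1) % n][0] - points[i][0],
                   points[(i + 1) % n][1] - points[i][1])
        for i in range(n)
    )

# A 200 x 300 pixel page outline (made-up coordinates)
page = [(10, 10), (210, 10), (210, 310), (10, 310)]
epsilon = 0.02 * polygon_perimeter(page)   # tolerance: 2% of the perimeter
```

For this outline the tolerance works out to 20 pixels, so small wiggles along the page border are absorbed and the approximation collapses to the four corners.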
Figure 2.8: Highlighting contour of document detected in the image
2.3 Perspective Transformation
Now that we have identified the document edges, we apply a perspective transformation to
obtain a top view for better readability.
2.3.1 Perspective Transformation Matrix
To perform a perspective transformation, we first need to find the transformation matrix.
For this, we name the four corner points top-left, top-right, bottom-right and
bottom-left.
In the new image, the top-left point has coordinates (0, 0). We find the maximum height
of the new image using the distance formula $\sqrt{(x_2 - x_1)^2 + (y_2 - y_1)^2}$: it is
the larger of the distances between top-right and bottom-right and between top-left and
bottom-left. Similarly, the maximum width of the new image is the larger of the distances
between top-right and top-left and between bottom-right and bottom-left. We then
construct a destination array (called distance in the code) with the values [0, 0],
[maxWidth-1, 0], [maxWidth-1, maxHeight-1], [0, maxHeight-1].
Using this array and the original coordinates of the document corners, we find the
perspective transformation matrix with M = cv2.getPerspectiveTransform(rect, distance),
where rect = [top-left, top-right, bottom-right, bottom-left], in the same order as the
destination array.
The coordinates of the pixels of the new image, which we get after perspective
transformation, can be given as (u, v) corresponding to the pixels (x, y) of the original
image:
\[
u = \frac{a_0 x + a_1 y + a_2}{c_0 x + c_1 y + 1}, \qquad
v = \frac{b_0 x + b_1 y + b_2}{c_0 x + c_1 y + 1}
\]
where $a_0, a_1, a_2, b_0, b_1, b_2, c_0, c_1$ are constants.
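The mapping above can be written directly as a small function. This is a sketch for illustration; the name `warp_point` and the way the constants are grouped into tuples are assumptions, not part of the project code.

```python
def warp_point(x, y, a, b, c):
    """Map (x, y) to (u, v) with a = (a0, a1, a2), b = (b0, b1, b2), c = (c0, c1)."""
    w = c[0] * x + c[1] * y + 1.0            # common denominator
    u = (a[0] * x + a[1] * y + a[2]) / w
    v = (b[0] * x + b[1] * y + b[2]) / w
    return u, v
```

With c = (0, 0) the denominator is 1 and the mapping reduces to an affine transform. Non-zero c0 or c1 makes the scaling depend on position, which is what lets the transform undo the foreshortening of a tilted page.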
Now, we need to find the values of these constants. For this we have four pairs of
corresponding points, i.e. top-left, top-right, bottom-right and bottom-left, each of
which gives two equations.
\[
X A = U
\]
where:
\[
X =
\begin{bmatrix}
x_0 & y_0 & 1 & 0 & 0 & 0 & -x_0 u_0 & -y_0 u_0 \\
x_1 & y_1 & 1 & 0 & 0 & 0 & -x_1 u_1 & -y_1 u_1 \\
x_2 & y_2 & 1 & 0 & 0 & 0 & -x_2 u_2 & -y_2 u_2 \\
x_3 & y_3 & 1 & 0 & 0 & 0 & -x_3 u_3 & -y_3 u_3 \\
0 & 0 & 0 & x_0 & y_0 & 1 & -x_0 v_0 & -y_0 v_0 \\
0 & 0 & 0 & x_1 & y_1 & 1 & -x_1 v_1 & -y_1 v_1 \\
0 & 0 & 0 & x_2 & y_2 & 1 & -x_2 v_2 & -y_2 v_2 \\
0 & 0 & 0 & x_3 & y_3 & 1 & -x_3 v_3 & -y_3 v_3
\end{bmatrix},
\quad
A =
\begin{bmatrix}
a_0 \\ a_1 \\ a_2 \\ b_0 \\ b_1 \\ b_2 \\ c_0 \\ c_1
\end{bmatrix},
\quad
U =
\begin{bmatrix}
u_0 \\ u_1 \\ u_2 \\ u_3 \\ v_0 \\ v_1 \\ v_2 \\ v_3
\end{bmatrix}
\]
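The linear system X A = U can be solved numerically for the eight constants. The NumPy sketch below builds X and U row by row from four point pairs and arranges the solution as a 3x3 matrix. It is an illustration of the derivation rather than OpenCV's own code, and the function name `solve_perspective` is hypothetical.

```python
import numpy as np

def solve_perspective(src, dst):
    """src: four (x, y) points, dst: four (u, v) points. Returns the 3x3 matrix."""
    X = np.zeros((8, 8), dtype=np.float64)
    U = np.zeros(8, dtype=np.float64)
    for i, ((x, y), (u, v)) in enumerate(zip(src, dst)):
        X[i] = [x, y, 1, 0, 0, 0, -x * u, -y * u]       # equation for u_i
        X[i + 4] = [0, 0, 0, x, y, 1, -x * v, -y * v]   # equation for v_i
        U[i], U[i + 4] = u, v
    A = np.linalg.solve(X, U)             # a0, a1, a2, b0, b1, b2, c0, c1
    return np.append(A, 1.0).reshape(3, 3)
```

The rows of the returned matrix are (a0, a1, a2), (b0, b1, b2) and (c0, c1, 1). Multiplying it by (x, y, 1) and dividing the first two components by the third reproduces the formulas for u and v.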
Now, using this Perspective Transformation Matrix we transform the image to top
view.
Code:
import numpy as np
# pts: the four corner points of the document contour
# order_points (helper, not listed here) returns them as
# top-left, top-right, bottom-right, bottom-left
rect = order_points(pts)
(tl, tr, br, bl) = rect
# width of the new image: the longer of the bottom and top edges
widthA = np.sqrt(((br[0] - bl[0]) ** 2) + ((br[1] - bl[1]) ** 2))
widthB = np.sqrt(((tr[0] - tl[0]) ** 2) + ((tr[1] - tl[1]) ** 2))
maxWidth = max(int(widthA), int(widthB))
# height of the new image: the longer of the right and left edges
heightA = np.sqrt(((tr[0] - br[0]) ** 2) + ((tr[1] - br[1]) ** 2))
heightB = np.sqrt(((tl[0] - bl[0]) ** 2) + ((tl[1] - bl[1]) ** 2))
maxHeight = max(int(heightA), int(heightB))
# destination corners, in the same order as rect
distance = np.array([
    [0, 0],
    [maxWidth - 1, 0],
    [maxWidth - 1, maxHeight - 1],
    [0, maxHeight - 1]], dtype="float32")
M = cv2.getPerspectiveTransform(rect, distance)
warped = cv2.warpPerspective(image, M, (maxWidth, maxHeight))
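The report does not list the order_points helper. One common way to write such a helper is sketched below; this is an assumption about its behaviour, inferred from the (tl, tr, br, bl) unpacking, not the authors' code. It relies on the page being tilted by less than 45 degrees, which matches assumption (iv).

```python
import numpy as np

def order_points(pts):
    """Return the 4 points as top-left, top-right, bottom-right, bottom-left."""
    pts = np.asarray(pts, dtype="float32").reshape(4, 2)
    s = pts.sum(axis=1)              # x + y
    d = pts[:, 1] - pts[:, 0]        # y - x
    rect = np.zeros((4, 2), dtype="float32")
    rect[0] = pts[np.argmin(s)]      # top-left: smallest x + y
    rect[2] = pts[np.argmax(s)]      # bottom-right: largest x + y
    rect[1] = pts[np.argmin(d)]      # top-right: smallest y - x
    rect[3] = pts[np.argmax(d)]      # bottom-left: largest y - x
    return rect
```

The reshape also accepts the (4, 1, 2) array that a polygon approximation typically yields, so the detected contour can be passed in directly.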
2.4 Image Enhancement
The last step is image enhancement. The cropped top-view image is ready, but we can
increase its readability further by making the background of the document (the paper)
even. Two of the most popular image enhancement methods are histogram equalization (Tan
et al., 2012) and adaptive thresholding (Point Operations - Adaptive Thresholding, 2016;
Stark, 2000).
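For reference, plain (global) histogram equalization can be sketched in NumPy as below. It remaps intensities through the cumulative histogram so that the output uses the full 0 to 255 range. This is a textbook-style sketch with a hypothetical function name, not the code used to produce Figure 2.10.

```python
import numpy as np

def equalize_hist(gray):
    """Global histogram equalization of an 8-bit grayscale image."""
    hist = np.bincount(gray.ravel(), minlength=256)
    cdf = np.cumsum(hist)                  # cumulative pixel counts
    cdf_min = cdf[cdf > 0][0]              # count at the darkest level present
    total = gray.size
    if total == cdf_min:                   # constant image: nothing to spread
        return gray.copy()
    lut = np.round((cdf - cdf_min) / (total - cdf_min) * 255.0)
    lut = np.clip(lut, 0, 255).astype(np.uint8)
    return lut[gray]                       # apply the lookup table per pixel
```

Because one mapping is applied to the whole image, a shadow across the page is stretched along with everything else, which is one reason a locally adaptive method can suit documents better.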
Although histogram equalization improves the image, adaptive thresholding does a better
job in comparison. We therefore use adaptive thresholding to enhance the readability of
the document image.
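The report gives no code for this step, so the following is only a sketch of the idea behind mean-based adaptive thresholding: each pixel is compared with the average of its own neighbourhood instead of one global value. The function name and the parameter values (`block_size = 11`, `c = 10`) are assumptions chosen for illustration.

```python
import numpy as np

def adaptive_threshold_mean(gray, block_size=11, c=10):
    """Pixel becomes 255 if brighter than (local mean - c), else 0."""
    r = block_size // 2
    k = 2 * r + 1                               # effective (odd) window size
    g = gray.astype(np.float64)
    padded = np.pad(g, r, mode="edge")          # replicate border pixels
    # integral image with a leading row and column of zeros
    ii = np.zeros((padded.shape[0] + 1, padded.shape[1] + 1))
    ii[1:, 1:] = padded.cumsum(axis=0).cumsum(axis=1)
    h, w = g.shape
    # sum over each k x k window from four integral-image lookups
    window_sum = (ii[k:k + h, k:k + w] - ii[:h, k:k + w]
                  - ii[k:k + h, :w] + ii[:h, :w])
    local_mean = window_sum / (k * k)
    return np.where(g > local_mean - c, 255, 0).astype(np.uint8)
```

Since the threshold follows the local brightness, paper under a shadow and paper in full light both turn white while ink stays black. The output is binary, so colour and grey-level information is discarded.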
Figure 2.9: Image obtained after Perspective Transformation
Figure 2.10: Image enhanced using Histogram Equalization
Figure 2.11: Image enhanced using Localized Histogram Equalization
Figure 2.12: Image enhanced using Adaptive Thresholding
Figure 2.13: Detailed flowchart of project
Figure 2.14: Detailed flowchart of OpenCV code
CHAPTER 3
RESULTS AND DISCUSSION
3.1 RESULTS
The program outputs an enhanced image that gives a better reading experience: the
unwanted background is removed and the text is clear and crisp.
3.2 DISCUSSION
Although adaptive thresholding produces a readable image, the colour information is lost.
This might be solved with a haze removal algorithm: the background of the page can be
treated as haze and subtracted while retaining the colour information (He et al., 2011).
The program also fails when four edges are not detected. This can be a problem when the
document has a different shape or is an irregular rectangle. The irregular rectangle
problem can be addressed by changing the epsilon value passed to the approxPolyDP
function. The Hough transform can also be used to detect lines (Hough transform -
Wikipedia, the free encyclopedia, 2016).
Figure 3.1: Result: Image Set 1. Panels: (a) Original Image, (b) Canny Edge Detection, (c) Contour Detection, (d) Perspective Transformation, (e) Final Result
Figure 3.2: Result: Document photographed from top. Panels: (a) to (e) as in Figure 3.1
Figure 3.3: Result: Document photographed at an angle. Panels: (a) to (e) as in Figure 3.1
Figure 3.4: Result: Only the largest rectangle is selected. Panels: (a) to (e) as in Figure 3.1
Figure 3.5: Result: Image of Business Card. Panels: (a) to (e) as in Figure 3.1
Figure 3.6: Result: A coloured newspaper clipping. Panels: (a) to (e) as in Figure 3.1
Figure 3.7: Result: Document with tables. Challenging example for finding contour. Panels: (a) to (e) as in Figure 3.1
Figure 3.8: Result: Document with more than four edges is not detected. Panels: (a) Original Image, (b) Canny Edge Detection
CHAPTER 4
CONCLUSION
The readability of digital images of documents (especially black-and-white ones) can be
greatly increased by removing the noise and enhancing the image using adaptive
thresholding. In future, an algorithm for adaptive thresholding of colour images could be
used to retain the colour information. The perspective transformation could also be
improved to a great extent if the angle of inclination (which can be approximated from
the difference in length between two opposite edges) is also taken into account. Further,
a text detection algorithm (Xiao and Yan, 2003) could be used to replace the printed
letters with computer-generated text, taking readability to the next level.
REFERENCES
[1] Cheriet, M., Said, J. N. and Suen, C. Y.: 1998, A recursive thresholding technique
for image segmentation, IEEE Transactions on Image Processing 7(6), 918–921.
[2] Chitradevi, B. and Srimathi, P.: 2014, An Overview on Image Processing Techniques,
International Journal of Innovative Research in Computer and Communication Engineering.
URL: http://www.rroij.com/abstract.php?abstractid=47175
[3] Donoser, M. and Bischof, H.: 2006, Efficient Maximally Stable Extremal Region
(MSER) Tracking, 2006 IEEE Computer Society Conference on Computer Vision
and Pattern Recognition (CVPR'06), Vol. 1, pp. 553–560.
[4] Feature Detectors - Sobel Edge Detector: 2016.
URL: http://homepages.inf.ed.ac.uk/rbf/HIPR2/sobel.htm
[5] Forssen, P. E. and Lowe, D. G.: 2007, Shape Descriptors for Maximally Stable
Extremal Regions, 2007 IEEE 11th International Conference on Computer Vision,
pp. 1–8.
[6] He, K., Sun, J. and Tang, X.: 2011, Single Image Haze Removal Using Dark
Channel Prior, IEEE Transactions on Pattern Analysis and Machine Intelligence
33(12), 2341–2353.
[7] Hough transform - Wikipedia, the free encyclopedia: 2016.
URL: https://en.wikipedia.org/wiki/Hough_transform
[8] Image processing: 2016. Page Version ID: 740950083.
URL: https://en.wikipedia.org/w/index.php?title=Image_processing&oldid=740950083
[9] OpenCV: 2016. Page Version ID: 739519893.
URL: https://en.wikipedia.org/w/index.php?title=OpenCV&oldid=739519893
[10] OpenCV | OpenCV: 2016.
URL: http://opencv.org/
[11] Point Operations - Adaptive Thresholding: 2016.
URL: http://homepages.inf.ed.ac.uk/rbf/HIPR2/adpthrsh.htm
[12] Prewitt operator: 2016. Page Version ID: 698509644.
URL: https://en.wikipedia.org/w/index.php?title=Prewitt_operator&oldid=698509644
[13] Rahman, M. H. and Islam, M. R.: 2013, Segmentation of color image using adaptive
thresholding and masking with watershed algorithm, 2013 International Conference on
Informatics, Electronics & Vision (ICIEV), pp. 1–6.
[14] Realtime Computer Vision with OpenCV - ACM Queue: 2016.
URL: http://queue.acm.org/detail.cfm?id=2206309
[15] Singh, S. and Singh, R.: 2015, Comparison of various edge detection techniques,
2015 2nd International Conference on Computing for Sustainable Global Development
(INDIACom), pp. 393–396.
[16] Stark, J. A.: 2000, Adaptive image contrast enhancement using generalizations of
histogram equalization, IEEE Transactions on Image Processing 9(5), 889–896.
[17] Tan, T. L., Sim, K. S. and Tso, C. P.: 2012, Image enhancement using background
brightness preserving histogram equalisation, Electronics Letters 48(3), 155–157.
[18] Wang, B. and Fan, S.: 2009, An Improved CANNY Edge Detection Algorithm,
Second International Workshop on Computer Science and Engineering, 2009.
WCSE '09, Vol. 1, pp. 497–500.
[19] Wang, Z., Bovik, A. C., Sheikh, H. R. and Simoncelli, E. P.: 2004, Image quality
assessment: from error visibility to structural similarity, IEEE Transactions on
Image Processing 13(4), 600–612.
[20] Xiao, Y. and Yan, H.: 2003, Text region extraction in a document image based on
the Delaunay tessellation, Pattern Recognition 36(3), 799–809.
URL: http://www.sciencedirect.com/science/article/pii/S0031320302000821
[21] Yadav, P.: 2015, Color image noise removal by modified adaptive threshold median
filter for RVIN, 2015 International Conference on Electronic Design, Computer
Networks & Automated Verification (EDCAV), pp. 175–180.

  • 5. LIST OF FIGURES
2.1 Basic Flow Diagram of Project
2.2 Kernel = [1 x 1]
2.3 Kernel = [3 x 3]
2.4 Kernel = [5 x 5]
2.5 Kernel = [7 x 7]
2.6 Original Image
2.7 Canny Edge Detection
2.8 Highlighting contour of document detected in the image
2.9 Image obtained after Perspective Transformation
2.10 Image enhanced using Histogram Equalization
2.11 Image enhanced using Localized Histogram Equalization
2.12 Image enhanced using Adaptive Thresholding
2.13 Detailed flowchart of project
2.14 Detailed flowchart of OpenCV code
3.1 Result: Image Set 1
3.2 Result: Document photographed from top
3.3 Result: Document photographed at an angle
3.4 Result: Only the largest rectangle is selected
3.5 Result: Image of Business Card
3.6 Result: A coloured newspaper clipping
3.7 Result: Document with tables. Challenging example for finding contour
3.8 Result: Document with more than four edges is not detected
  • 6. CHAPTER 1 INTRODUCTION AND LITERATURE SURVEY
This chapter describes the image processing techniques relevant to the project, the problem statement, the platform used for the implementation, and the literature on related work.
1.1 INTRODUCTION
The smartphone is one of the most useful devices in everyday life. It supports both personal and working life in many different ways. As technology has advanced, the cost of processing power has fallen sharply, and today's smartphones are full-fledged computers in a very small form factor. Most people now do part of their work on a smartphone, and more and more material is going digital. Even so, much of the information we access every day is still in printed form. We digitize it by photographing the documents and saving the pictures on the smartphone for future reference, for access at any time, or for reading on the phone itself. This comes with some drawbacks:
(a) The photograph depends on the quality of the smartphone camera, and poor image quality reduces readability.
(b) Noise such as unwanted background distracts the reader.
(c) Shadows cast over the document when it is photographed from the top reduce readability.
(d) Tilted photographs put extra strain on the reader.
  • 7. Reading is an experience. The external factors that affect a reader's speed are the readability (presentation) of the document and the reader's concentration. Noise such as unwanted background or a hard-to-read image distracts the reader and reduces their concentration and attention span. To make a document more readable and presentable, we can do the following:
(a) Remove unwanted background
(b) Transform the perspective of the document to a top view
(c) Enhance the image by making the text more readable and the background less visible.
1.1.1 Image Processing Techniques
Image processing is the processing of a digital image by a digital computer, for example to remove noise and other irregularities from the image. Noise or irregularities may creep into a digital image during its formation, its transformation, etc. For mathematical analysis, an image may be defined as a 2D function f(x, y), where x and y are spatial (plane) coordinates and the amplitude of f at any pair of coordinates (x, y) is the intensity or gray level of the image at that point. When x, y, and the intensity values of f are all finite, discrete quantities, we call the image a digital image. A digital image is made of a finite number of elements, each of which has a particular location and value. These elements are called image elements or pixels; pixel is the most frequently used term (Image processing, 2016).
Numerous techniques have been introduced in image processing during the last four to five decades. Most of them were developed for enhancing images obtained from military reconnaissance flights, unmanned spacecraft and space research. Image processing techniques have become popular because powerful computers, graphics software, large memory devices, etc. are easily accessible.
The different image processing techniques are:
(i) Image pre-processing
(ii) Image enhancement
(iii) Image segmentation
(iv) Feature extraction
(v) Image classification
  • 8. 1.1.1.1 Image Pre-processing
In image pre-processing, the image data recorded by sensors or cameras contain errors related to the geometry and brightness values of the pixels. These errors are corrected using suitable mathematical models, which are either deterministic or statistical.
1.1.1.2 Image Enhancement
Image enhancement is the improvement of a digital image by changing the pixel brightness values to modify its visual impact. It includes a collection of techniques that are used to improve the visual appearance of an image, or to convert the image to a form better suited for human or machine interpretation (Wang et al., 2004).
1.1.1.3 Image Segmentation
Segmentation is one of the main problems in image processing. Image segmentation is the process that subdivides a digital image into its constituent parts or objects. The level to which this subdivision is carried out depends on the problem being solved, i.e., segmentation should stop when the objects of interest in an application have been isolated. For example, in autonomous air-to-ground target acquisition, suppose our interest is to identify vehicles on a road: the first step is to segment the road from the image, and the next is to segment the contents of the road down to potential vehicles. Image thresholding methods are used for image segmentation.
1.1.1.4 Feature Extraction
Feature extraction methods were developed to extract features in synthetic aperture radar images. They extract the high-level features needed to classify targets. Features are those items which uniquely describe a target, such as size, location, shape, composition, etc. Segmentation methods are applied to isolate the desired object from the scene so that measurements can subsequently be made on it. Quantitative measurements of object features allow classification and description of the image.
1.1.1.5 Image Classification
Image classification is the labelling of a pixel or a group of pixels on the basis of its grey value. Classification is one of the most frequently used methods of information extraction. In classification, multiple features are usually used for a set of pixels, i.e., many images of a particular object are needed.
  • 9. 1.1.2 Problem Statement
To enhance the readability of a photographed printed document using image processing techniques by removing noise, enhancing the image and transforming the perspective.
1.1.3 Motivation
The main focus of this project is to develop a system that increases the readability of the digital image of a document using image processing and computer vision. As reading is all about experience, a noise-free document is a must for every reader.
1.2 OpenCV
We implemented the project with OpenCV through its Python interface. OpenCV (Open Source Computer Vision) is a library of programming functions mainly aimed at real-time computer vision (OpenCV | OpenCV, 2016). It provides the tools needed to solve computer-vision problems and contains a mix of low-level image-processing functions and high-level algorithms such as face detection, pedestrian detection, feature matching, and tracking. The library has been downloaded more than 3 million times (OpenCV, 2016).
1.3 Literature Review
We mainly review literature on image processing algorithms covering image enhancement, feature extraction and image segmentation.
- B. Chitradevi and P. Srimathi (2014) give an overview of image processing techniques in their report. The principal advantages of digital image processing are the preservation of original data precision, versatility and repeatability. The various image processing techniques are image pre-processing, image enhancement, image segmentation, feature extraction and image classification. Image enhancement algorithms are interactive and application dependent. Some image enhancement techniques are contrast stretching, noise filtering and histogram modification (Chitradevi and P.Srimathi, 1970).
- Bing Wang and ShaoSheng Fan (2009) point out that the traditional Canny edge detection algorithm is vulnerable to various noise disturbances.
It can be greatly improved by replacing the Gaussian filter of the pre-processing step with a self-adaptive filter, and further improved by morphological thinning of the edges. In the traditional algorithm the Gaussian filter smooths not only the noise but also the edges, so some edges are weakened. The dual threshold is also difficult to set: a poor choice either fails to remove the noise or loses some edges (Wang and Fan, 2009).
  • 10. - T. L. Tan, K. S. Sim and C. P. Tso (2012) present a new approach to nonlinear histogram equalization that not only enhances the image contrast but also preserves the background brightness (Tan et al., 2012).
- Kaiming He, Jian Sun and Xiaoou Tang (2009) proposed a simple but effective image prior, the dark channel prior, to remove haze from a single input image. The dark channel prior is a statistic of haze-free outdoor images, based on the key observation that most local patches in such images contain some pixels with very low intensity in at least one colour channel. Using this prior with the haze imaging model, they directly estimate the thickness of the haze and recover a high-quality haze-free image (He et al., 2011).
- Michael Donoser and Horst Bischof (2006) proposed a novel concept for tracking Maximally Stable Extremal Regions (MSERs). The approach uses temporal information to reduce the computation time and to detect the stability of a single MSER. It can be widely used for tracking license plates, face tracking, segmenting fibre networks, etc. (Donoser and Bischof, 2006).
- Per-Erik Forssen and David G. Lowe (2007) introduced an affine-invariant shape descriptor for maximally stable extremal regions. The descriptor is computed using the scale invariant feature transform (SIFT), with the re-sampled MSER binary mask as input. This provides better robustness to illumination change and nearby occlusions than existing methods (Forssen and Lowe, 2007).
- M. Cheriet, J. N. Said and C. Y. Suen (1998) presented a general recursive thresholding technique for image segmentation that extends Otsu's method. The approach has been applied to document images, specifically real-life bank cheques. At each recursion, it first segments the object with the lowest intensity from the given image, and the process continues until only one object is left in the image (Cheriet et al., 1998).
- Pranay Yadav (2015) proposed a method for noise removal in colour images using a modified adaptive threshold median filter. The proposed filter is very effective for random-valued impulse noise because, in practice, noise is not uniform over the channel. He used both minimum and maximum thresholds to detect positive and negative noise. The threshold values differ for different noise densities (Yadav, 2015).
  • 11. - Md. Habibur Rahman and Md. Rafiqul Islam (2013) proposed a modified version of the watershed algorithm for image segmentation. They overcome the problem of over-segmentation by introducing adaptive masking and a thresholding mechanism over each colour channel before combining the segmentations from all channels into the final one. The modified watershed algorithm is also faster than many other segmentation algorithms, which makes it fit for real-time applications (Rahman and Islam, 2013).
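Otsu's method, which the recursive technique of Cheriet et al. (1998) extends, picks the gray level that maximises the between-class variance of the image histogram. The following is a minimal plain-Python sketch of a single, non-recursive Otsu step. The function name `otsu_threshold` and the sample pixel values are illustrative assumptions; this is not code from the cited paper or from this project.

```python
def otsu_threshold(pixels, levels=256):
    """Return the gray level t that maximises the between-class variance.

    Pixels with value <= t form the dark class, the rest the bright class.
    """
    hist = [0] * levels
    for p in pixels:
        hist[p] += 1
    total = len(pixels)
    sum_all = sum(i * h for i, h in enumerate(hist))

    best_t, best_var = 0, -1.0
    w0 = 0      # number of pixels in the dark class so far
    sum0 = 0    # sum of intensities in the dark class so far
    for t in range(levels):
        w0 += hist[t]
        sum0 += t * hist[t]
        if w0 == 0:
            continue            # dark class still empty
        w1 = total - w0
        if w1 == 0:
            break               # bright class empty from here on
        mean0 = sum0 / w0
        mean1 = (sum_all - sum0) / w1
        between = w0 * w1 * (mean0 - mean1) ** 2
        if between > best_var:
            best_t, best_var = t, between
    return best_t


# Hypothetical sample: dark "ink" values around 10 and bright "paper" values around 205.
sample = [10] * 50 + [12] * 50 + [200] * 40 + [210] * 60
threshold = otsu_threshold(sample)
print(threshold)
```

Any threshold between the two clusters separates ink from paper here; the sketch returns the first gray level that reaches the maximum. In OpenCV the same criterion is available through `cv2.threshold` with the `cv2.THRESH_OTSU` flag.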
  • 12. CHAPTER 2 DESIGN DETAILS AND IMPLEMENTATION
To begin, we need to analyse the properties of the document that we want to extract from the image. The typical properties of, and our assumptions about, the image of a document are:
(i) The photograph is clear enough to read the contents of the document.
(ii) Generally, the content (text) is in black ink while the background (paper) is white.
(iii) The document has exactly four straight edges (and hence four vertices).
(iv) The perspective is tilted by no more than 45 degrees from the top view.
(v) The document covers most of the image.
(vi) Nothing (fingers, weights, etc.) overlaps or obstructs the document in the image.
The task of enhancing the readability of a photographed printed document by removing noise, enhancing the image and transforming the perspective has four essential steps:
(1) Detecting the edges of the document
(2) Using the edges to find the contour of the document
(3) Applying a perspective transformation to obtain a bird's-eye (top-down) view of the document
(4) Enhancing the image quality to increase readability
  • 13. Figure 2.1: Basic Flow Diagram of Project
  • 14. 2.1 Detecting the edges
In an image, an edge is a curve that follows a path of rapid change in image intensity. Edges are often associated with the boundaries of objects in a scene. Edge detection is used to identify the edges in an image (Feature Detectors - Sobel Edge Detector, 2016). There are many edge detection techniques, such as Roberts, Prewitt, Sobel and Canny (Prewitt operator, 2016; Realtime Computer Vision with OpenCV - ACM Queue, 2016). In terms of PSNR, Sobel gives the highest-quality image and Roberts the lowest (Singh and Singh, 2015). Canny edge detection goes a step further: it first removes speckle noise with a low-pass filter, then applies a Sobel filter, and then performs non-maximum suppression to pick the best edge pixel when there are several candidates in a local neighbourhood. We therefore chose the Canny edge detection technique.
2.1.1 Pre-processing
Edge detection techniques are susceptible to noise. No camera is perfect, so the raw image of the document contains noise that produces false edges. We therefore apply a Gaussian filter to the image to smooth out the noise. The kernel size must be chosen so that as much noise as possible is filtered out without blurring the edges.
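The Sobel filtering step mentioned in Section 2.1 can be illustrated without OpenCV. The sketch below convolves a tiny synthetic image with the two standard 3 x 3 Sobel kernels and combines the responses into a gradient magnitude. The function name `sobel_magnitude` and the test image are assumptions made for illustration; the project itself relies on `cv2.Canny`, which performs this step internally.

```python
import math

# Standard 3x3 Sobel kernels for the horizontal and vertical derivatives.
SOBEL_X = [[-1, 0, 1], [-2, 0, 2], [-1, 0, 1]]
SOBEL_Y = [[-1, -2, -1], [0, 0, 0], [1, 2, 1]]


def sobel_magnitude(img):
    """Gradient magnitude of a 2D list of intensities; border pixels stay 0."""
    h, w = len(img), len(img[0])
    out = [[0.0] * w for _ in range(h)]
    for y in range(1, h - 1):
        for x in range(1, w - 1):
            gx = gy = 0
            for j in range(3):
                for i in range(3):
                    p = img[y + j - 1][x + i - 1]
                    gx += SOBEL_X[j][i] * p
                    gy += SOBEL_Y[j][i] * p
            out[y][x] = math.hypot(gx, gy)
    return out


# Synthetic 5x6 image: dark left half (0) and bright right half (255).
step = [[0, 0, 0, 255, 255, 255] for _ in range(5)]
mag = sobel_magnitude(step)
print(mag[2])  # the response is non-zero only next to the dark/bright boundary
```

The response is two pixels wide at the boundary, which is why Canny follows the gradient computation with non-maximum suppression to thin the edge.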
  • 15. Figure 2.2: Kernel = [1 x 1]
Figure 2.3: Kernel = [3 x 3]
  • 16. Figure 2.4: Kernel = [5 x 5]
Figure 2.5: Kernel = [7 x 7]
The kernel size (5, 5) was chosen because it removes the noise effectively without losing edge detail.
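The trade-off between noise removal and edge sharpness for the kernel sizes compared in Figures 2.2 to 2.5 can be sketched in one dimension. The code below is a plain-Python illustration, not the project's code: the names `gaussian_kernel` and `smooth` and the sample row are assumptions. Sigma is derived from the kernel size with the formula the OpenCV documentation gives for a zero sigma argument; the exact coefficients OpenCV uses for small kernels, and its border handling, may differ.

```python
import math


def gaussian_kernel(ksize):
    """Normalised 1D Gaussian kernel for an odd kernel size."""
    sigma = 0.3 * ((ksize - 1) * 0.5 - 1) + 0.8
    half = ksize // 2
    weights = [math.exp(-(i * i) / (2 * sigma * sigma))
               for i in range(-half, half + 1)]
    total = sum(weights)
    return [w / total for w in weights]


def smooth(signal, ksize):
    """Convolve a 1D signal with the kernel, replicating the border samples."""
    kernel = gaussian_kernel(ksize)
    half = ksize // 2
    n = len(signal)
    out = []
    for x in range(n):
        acc = 0.0
        for k, w in enumerate(kernel):
            idx = min(max(x + k - half, 0), n - 1)
            acc += w * signal[idx]
        out.append(acc)
    return out


# One image row: a single noisy pixel at index 3, and a dark-to-bright edge
# between indices 7 and 8.
row = [0, 0, 0, 60, 0, 0, 0, 0, 255, 255, 255, 255, 255, 255, 255, 255]
for ksize in (1, 3, 5, 7):
    blurred = smooth(row, ksize)
    noise_peak = blurred[3]               # what remains of the noisy pixel
    edge_step = blurred[8] - blurred[7]   # steepest jump across the edge
    print(ksize, round(noise_peak, 1), round(edge_step, 1))
```

Both printed values shrink as the kernel grows: the noise is suppressed more strongly, but the edge becomes less steep, which is the compromise behind the choice of (5, 5).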
  • 17. We run the Gaussian filter and the edge detector on a single-channel image, so we first convert the colour image (which OpenCV loads in BGR channel order) to grayscale.
Code:
    import cv2
    # args holds the parsed command-line arguments
    image = cv2.imread(args["image"])
    gray = cv2.cvtColor(image, cv2.COLOR_BGR2GRAY)
    gray = cv2.GaussianBlur(gray, (5, 5), 0)
2.1.2 Canny Edge Detection
After pre-processing, we apply Canny edge detection (Wang and Fan, 2009) with the hysteresis thresholds minVal = 75 and maxVal = 200.
Code:
    edged = cv2.Canny(gray, 75, 200)
Figure 2.6: Original Image
  • 18. Figure 2.7: Canny Edge Detection
2.2 Finding the contour
After identifying the edges, we find the contour of the document. As mentioned earlier, we assume that the document occupies most of the image and has four edges. We find the contours using cv2.findContours with the contour retrieval mode RETR_LIST. We only need the contours, so we discard the hierarchy. We then sort the contours by area in descending order and keep only the five largest, which speeds up the process by reducing the number of iterations. Going through these contours from largest to smallest, we approximate each one by a polygon and count its vertices (four, according to our assumption). The first contour whose approximation has four vertices is taken to be the contour of our document. Finally, we highlight the estimated contour on the original image to verify it.
Code:
    # Two return values as in OpenCV 2.4 and 4.x; OpenCV 3.x returns (image, contours, hierarchy)
    (cnts, _) = cv2.findContours(edged.copy(), cv2.RETR_LIST, cv2.CHAIN_APPROX_SIMPLE)
    cnts = sorted(cnts, key=cv2.contourArea, reverse=True)[:5]
    for c in cnts:
        peri = cv2.arcLength(c, True)
        approx = cv2.approxPolyDP(c, 0.02 * peri, True)
        if len(approx) == 4:
            document_contour = approx
            break
    cv2.drawContours(image, [document_contour], -1, (0, 255, 0), 2)
  • 19. Figure 2.8: Highlighting contour of document detected in the image
2.3 Perspective Transformation
Now that we have identified the outline of the document, we apply a perspective transformation to obtain a top view for better readability.
2.3.1 Perspective Transformation Matrix
To perform a perspective transformation, we first need the transformation matrix. We name the four corner points top-left, top-right, bottom-right and bottom-left. In the new image, the top-left point has the coordinates (0, 0). The height of the new image is the larger of the two distances top-right to bottom-right and top-left to bottom-left, computed with the distance formula $\sqrt{(x_2 - x_1)^2 + (y_2 - y_1)^2}$.
  • 20. Similarly, the width of the new image is the larger of the two distances top-right to top-left and bottom-right to bottom-left. We then construct a destination array with the values [0, 0], [maxWidth - 1, 0], [maxWidth - 1, maxHeight - 1], [0, maxHeight - 1]. Using this array and the original corner coordinates of the document, we find the perspective transformation matrix with M = cv2.getPerspectiveTransform(rect, distance), where rect = [top-left, top-right, bottom-right, bottom-left].
A pixel (x, y) of the original image maps to the pixel (u, v) of the new image according to
$$u = \frac{a_0 x + a_1 y + a_2}{c_0 x + c_1 y + 1}, \qquad v = \frac{b_0 x + b_1 y + b_2}{c_0 x + c_1 y + 1}$$
where $a_0, a_1, a_2, b_0, b_1, b_2, c_0, c_1$ are constants. Together with a fixed 1 in the last position, these eight constants are the entries of the 3 x 3 matrix M. To find them we use the four point correspondences (top-left, top-right, bottom-right and bottom-left), each of which gives two linear equations:
$$X A = U$$
where
$$X = \begin{bmatrix}
x_0 & y_0 & 1 & 0 & 0 & 0 & -x_0 u_0 & -y_0 u_0 \\
x_1 & y_1 & 1 & 0 & 0 & 0 & -x_1 u_1 & -y_1 u_1 \\
x_2 & y_2 & 1 & 0 & 0 & 0 & -x_2 u_2 & -y_2 u_2 \\
x_3 & y_3 & 1 & 0 & 0 & 0 & -x_3 u_3 & -y_3 u_3 \\
0 & 0 & 0 & x_0 & y_0 & 1 & -x_0 v_0 & -y_0 v_0 \\
0 & 0 & 0 & x_1 & y_1 & 1 & -x_1 v_1 & -y_1 v_1 \\
0 & 0 & 0 & x_2 & y_2 & 1 & -x_2 v_2 & -y_2 v_2 \\
0 & 0 & 0 & x_3 & y_3 & 1 & -x_3 v_3 & -y_3 v_3
\end{bmatrix}, \quad
A = \begin{bmatrix} a_0 \\ a_1 \\ a_2 \\ b_0 \\ b_1 \\ b_2 \\ c_0 \\ c_1 \end{bmatrix}, \quad
U = \begin{bmatrix} u_0 \\ u_1 \\ u_2 \\ u_3 \\ v_0 \\ v_1 \\ v_2 \\ v_3 \end{bmatrix}$$
  • 21. Now, using this perspective transformation matrix, we transform the image to the top view.
Code:
    # order_points is a helper that returns the corners ordered as tl, tr, br, bl
    rect = order_points(pts)
    (tl, tr, br, bl) = rect
    widthA = np.sqrt(((br[0] - bl[0]) ** 2) + ((br[1] - bl[1]) ** 2))
    widthB = np.sqrt(((tr[0] - tl[0]) ** 2) + ((tr[1] - tl[1]) ** 2))
    maxWidth = max(int(widthA), int(widthB))
    heightA = np.sqrt(((tr[0] - br[0]) ** 2) + ((tr[1] - br[1]) ** 2))
    heightB = np.sqrt(((tl[0] - bl[0]) ** 2) + ((tl[1] - bl[1]) ** 2))
    maxHeight = max(int(heightA), int(heightB))
    distance = np.array([
        [0, 0],
        [maxWidth - 1, 0],
        [maxWidth - 1, maxHeight - 1],
        [0, maxHeight - 1]], dtype="float32")
    M = cv2.getPerspectiveTransform(rect, distance)
    warped = cv2.warpPerspective(image, M, (maxWidth, maxHeight))
2.4 Image Enhancement
The last step is image enhancement. The cropped top-view image is ready, but we can increase its readability further by making the background of the document (the paper) even. Two of the most popular image enhancement methods are histogram equalization (Tan et al., 2012) and adaptive thresholding (Point Operations - Adaptive Thresholding, 2016; Stark, 2000). Histogram equalization improves the image, but adaptive thresholding does a better job in our comparison (Figures 2.10 to 2.12). We therefore use adaptive thresholding to enhance the readability of the document image.
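The linear system of Section 2.3.1 can be solved directly, which shows what the eight constants are and how they map points. The sketch below is a plain-Python illustration under stated assumptions: `order_points` here is a hypothetical stand-in for the project's helper of the same name (its actual implementation is not shown in the report), `solve_linear`, `perspective_coefficients` and `warp_point` are illustrative names, and the corner coordinates are made up. In the project itself this work is done by `cv2.getPerspectiveTransform`.

```python
def order_points(pts):
    """Order four (x, y) corners as top-left, top-right, bottom-right, bottom-left.

    Heuristic (assumed, not taken from the report): with y pointing down,
    x + y is smallest at the top-left and largest at the bottom-right, while
    y - x is smallest at the top-right and largest at the bottom-left.
    """
    tl = min(pts, key=lambda p: p[0] + p[1])
    br = max(pts, key=lambda p: p[0] + p[1])
    tr = min(pts, key=lambda p: p[1] - p[0])
    bl = max(pts, key=lambda p: p[1] - p[0])
    return [tl, tr, br, bl]


def solve_linear(a, b):
    """Solve a x = b by Gauss-Jordan elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(n):
            if r != col:
                factor = m[r][col] / m[col][col]
                for c in range(col, n + 1):
                    m[r][c] -= factor * m[col][c]
    return [m[i][n] / m[i][i] for i in range(n)]


def perspective_coefficients(src, dst):
    """Return [a0, a1, a2, b0, b1, b2, c0, c1] mapping each src point to its dst point."""
    rows, rhs = [], []
    for (x, y), (u, _v) in zip(src, dst):       # the four u-equations
        rows.append([x, y, 1, 0, 0, 0, -x * u, -y * u])
        rhs.append(u)
    for (x, y), (_u, v) in zip(src, dst):       # the four v-equations
        rows.append([0, 0, 0, x, y, 1, -x * v, -y * v])
        rhs.append(v)
    return solve_linear(rows, rhs)


def warp_point(coeffs, x, y):
    """Apply the perspective mapping to one point."""
    a0, a1, a2, b0, b1, b2, c0, c1 = coeffs
    d = c0 * x + c1 * y + 1
    return ((a0 * x + a1 * y + a2) / d, (b0 * x + b1 * y + b2) / d)


# Hypothetical document corners in arbitrary order, and a 300 x 400 target image.
corners = [(320, 60), (40, 80), (20, 400), (360, 380)]
rect = order_points(corners)
target = [(0, 0), (299, 0), (299, 399), (0, 399)]
coeffs = perspective_coefficients(rect, target)
for (x, y) in rect:
    print((x, y), "->", warp_point(coeffs, x, y))
```

Each ordered corner should land on its target corner up to rounding error. The ordering matters: pairing the source corners with the destination corners in a different order produces a mirrored or twisted result.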
  • 22. Figure 2.9: Image obtained after Perspective Transformation
Figure 2.10: Image enhanced using Histogram Equalization
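Figure 2.10 relies on global histogram equalization. As a rough illustration of what that operation does, the sketch below applies the textbook formulation to an 8-bit grayscale image stored as a list of rows. The function name equalize_histogram is an assumption for this example; in OpenCV the corresponding call is cv2.equalizeHist, whose rounding may differ slightly from this version.

```python
def equalize_histogram(gray):
    """Global histogram equalization of an 8-bit grayscale image (list of rows)."""
    height, width = len(gray), len(gray[0])

    # Count how often each intensity occurs.
    histogram = [0] * 256
    for row in gray:
        for value in row:
            histogram[value] += 1

    # Cumulative distribution of the intensities.
    cdf = []
    running = 0
    for count in histogram:
        running += count
        cdf.append(running)

    total = height * width
    cdf_min = next(c for c in cdf if c > 0)
    if total == cdf_min:
        # Every pixel has the same value, so there is nothing to stretch.
        return [row[:] for row in gray]

    # Map each intensity so that the cumulative distribution becomes roughly linear.
    lookup = [max(0, round((c - cdf_min) * 255 / (total - cdf_min))) for c in cdf]
    return [[lookup[value] for value in row] for row in gray]
```

Because a single lookup table is applied to the whole image, a shadow that darkens one side of the page is stretched together with the text, which is consistent with the observation that this method evens out the background less well than adaptive thresholding.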
  • 23. Figure 2.11: Image enhanced using Localized Histogram Equalization
Figure 2.12: Image enhanced using Adaptive Thresholding
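The adaptive thresholding behind Figure 2.12 can be sketched as follows. This is a simplified mean-based version in plain Python, not the project code: the function name adaptive_mean_threshold, the parameters block_size and offset, and the choice to clip the window at the image border are all assumptions for this illustration. In OpenCV the operation is provided by cv2.adaptiveThreshold.

```python
def adaptive_mean_threshold(gray, block_size=5, offset=10):
    """Binarize a grayscale image (list of rows with values 0..255).

    A pixel becomes white (255) when it is brighter than the mean of its
    block_size x block_size neighbourhood minus offset, otherwise black (0).
    """
    height, width = len(gray), len(gray[0])

    # Integral image with one extra row and column of zeros, so that the sum
    # of any rectangle can be read off with four lookups.
    integral = [[0] * (width + 1) for _ in range(height + 1)]
    for y in range(height):
        row_sum = 0
        for x in range(width):
            row_sum += gray[y][x]
            integral[y + 1][x + 1] = integral[y][x + 1] + row_sum

    half = block_size // 2
    result = []
    for y in range(height):
        # Assumption: the window is clipped at the image border.
        y0, y1 = max(0, y - half), min(height, y + half + 1)
        out_row = []
        for x in range(width):
            x0, x1 = max(0, x - half), min(width, x + half + 1)
            window_sum = (integral[y1][x1] - integral[y0][x1]
                          - integral[y1][x0] + integral[y0][x0])
            local_mean = window_sum / ((y1 - y0) * (x1 - x0))
            out_row.append(255 if gray[y][x] > local_mean - offset else 0)
        result.append(out_row)
    return result
```

Since every pixel is compared with the mean of its own neighbourhood, a gradual shadow moves the threshold along with it, so the paper stays white and the ink stays black across the page. The price is the one noted in the discussion: the output is binary, so colour information is lost.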
  • 24. Figure 2.13: Detailed flowchart of project
  • 25. Figure 2.14: Detailed flowchart of OpenCV code
  • 26. CHAPTER 3
RESULTS AND DISCUSSION

3.1 RESULTS

The program outputs an enhanced image that offers a better reading experience: the unwanted background is removed and the text is clear and crisp.

3.2 DISCUSSION

Although adaptive thresholding produces a readable image, the colour information is lost. This might be solved by using a haze removal algorithm, since the background of the page can be treated as haze and subtracted while retaining the colour information (He et al., 2011).

The program also fails when four edges are not detected. This may pose a problem when the document has a different shape or is an irregular rectangle. The irregular rectangle problem can be addressed by changing the epsilon value of the approxPolyDP function. The Hough transform can also be used to detect lines (Hough transform - Wikipedia, the free encyclopedia, 2016).
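To illustrate why the epsilon value matters, the sketch below implements the Douglas-Peucker simplification, which the OpenCV documentation names as the algorithm behind approxPolyDP. It is a plain-Python illustration under stated assumptions: the function names, the way the closed contour is split into two chains, and the sample contour are all hypothetical, and the output is not guaranteed to match cv2.approxPolyDP vertex for vertex.

```python
import math


def _distance_to_line(point, start, end):
    """Perpendicular distance from point to the line through start and end."""
    (px, py), (sx, sy), (ex, ey) = point, start, end
    dx, dy = ex - sx, ey - sy
    length = math.hypot(dx, dy)
    if length == 0:
        return math.hypot(px - sx, py - sy)
    return abs(dx * (py - sy) - dy * (px - sx)) / length


def simplify_chain(points, epsilon):
    """Douglas-Peucker simplification of an open polyline; keeps both endpoints."""
    if len(points) < 3:
        return list(points)
    distances = [_distance_to_line(p, points[0], points[-1]) for p in points[1:-1]]
    farthest = max(range(len(distances)), key=distances.__getitem__)
    if distances[farthest] <= epsilon:
        # Every interior point is within epsilon of the chord: drop them all.
        return [points[0], points[-1]]
    split = farthest + 1
    left = simplify_chain(points[: split + 1], epsilon)
    right = simplify_chain(points[split:], epsilon)
    return left[:-1] + right


def simplify_closed_contour(points, epsilon):
    """Simplify a closed contour by splitting it at the vertex farthest from the first."""
    points = list(points)
    if len(points) < 3:
        return points
    first = points[0]
    far = max(range(len(points)),
              key=lambda i: math.hypot(points[i][0] - first[0], points[i][1] - first[1]))
    forward = simplify_chain(points[: far + 1], epsilon)
    backward = simplify_chain(points[far:] + [first], epsilon)
    return forward[:-1] + backward[:-1]


if __name__ == "__main__":
    # Hypothetical page outline whose top edge bulges by 3 pixels.
    outline = [(0, 0), (50, 3), (100, 0), (100, 75),
               (100, 150), (50, 150), (0, 150), (0, 75)]
    for eps in (1, 5):
        print(eps, len(simplify_closed_contour(outline, eps)))
```

With a small epsilon the slight bulge on the top edge survives as a fifth vertex, so a check for exactly four corners would reject the page; a larger epsilon absorbs the bulge and leaves four corners. In OpenCV the equivalent call is cv2.approxPolyDP(contour, epsilon, True), where epsilon is often chosen as a small fraction of the perimeter returned by cv2.arcLength(contour, True).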
  • 27. Figure 3.1: Result: Image Set 1. Panels: (a) Original Image, (b) Canny Edge Detection, (c) Contour Detection, (d) Perspective Transformation, (e) Final Result
  • 28. Figure 3.2: Result: Document photographed from top. Panels: (a) Original Image, (b) Canny Edge Detection, (c) Contour Detection, (d) Perspective Transformation, (e) Final Result
  • 29. Figure 3.3: Result: Document photographed at an angle. Panels: (a) Original Image, (b) Canny Edge Detection, (c) Contour Detection, (d) Perspective Transformation, (e) Final Result
  • 30. Figure 3.4: Result: Only the largest rectangle is selected. Panels: (a) Original Image, (b) Canny Edge Detection, (c) Contour Detection, (d) Perspective Transformation, (e) Final Result
  • 31. Figure 3.5: Result: Image of Business Card. Panels: (a) Original Image, (b) Canny Edge Detection, (c) Contour Detection, (d) Perspective Transformation, (e) Final Result
  • 32. Figure 3.6: Result: A coloured newspaper clipping. Panels: (a) Original Image, (b) Canny Edge Detection, (c) Contour Detection, (d) Perspective Transformation, (e) Final Result
  • 33. Figure 3.7: Result: Document with tables. Challenging example for finding contour. Panels: (a) Original Image, (b) Canny Edge Detection, (c) Contour Detection, (d) Perspective Transformation, (e) Final Result
  • 34. Figure 3.8: Result: Document with more than four edges is not detected. Panels: (a) Original Image, (b) Canny Edge Detection
  • 35. CHAPTER 4
CONCLUSION

The readability of digital images of documents (especially black and white ones) can be greatly increased by removing noise and enhancing the image using adaptive thresholding. In future, an algorithm for adaptive thresholding of coloured images can be used to retain the colour information. The perspective transformation can also be improved to a great extent if the angle of inclination (which can be approximated from the difference in length of two opposite edges) is also taken into account. Further, a text detection algorithm (Xiao and Yan, 2003) can be used to replace the printed letters with computer-generated text, taking readability to the next level.
  • 36. REFERENCES
[1] Cheriet, M., Said, J. N. and Suen, C. Y.: 1998, A recursive thresholding technique for image segmentation, IEEE Transactions on Image Processing 7(6), 918–921.
[2] Chitradevi, B. and Srimathi, P.: 1970, An Overview on Image Processing Techniques, International Journal of Innovative Research in Computer and Communication Engineering. URL: http://www.rroij.com/abstract.php?abstract_id=47175
[3] Donoser, M. and Bischof, H.: 2006, Efficient Maximally Stable Extremal Region (MSER) Tracking, 2006 IEEE Computer Society Conference on Computer Vision and Pattern Recognition (CVPR'06), Vol. 1, pp. 553–560.
[4] Feature Detectors - Sobel Edge Detector: 2016. URL: http://homepages.inf.ed.ac.uk/rbf/HIPR2/sobel.htm
[5] Forssen, P. E. and Lowe, D. G.: 2007, Shape Descriptors for Maximally Stable Extremal Regions, 2007 IEEE 11th International Conference on Computer Vision, pp. 1–8.
[6] He, K., Sun, J. and Tang, X.: 2011, Single Image Haze Removal Using Dark Channel Prior, IEEE Transactions on Pattern Analysis and Machine Intelligence 33(12), 2341–2353.
[7] Hough transform - Wikipedia, the free encyclopedia: 2016. URL: https://en.wikipedia.org/wiki/Hough_transform
[8] Image processing: 2016. Page Version ID: 740950083. URL: https://en.wikipedia.org/w/index.php?title=Image_processing&oldid=740950083
[9] OpenCV: 2016. Page Version ID: 739519893. URL: https://en.wikipedia.org/w/index.php?title=OpenCV&oldid=739519893
[10] OpenCV | OpenCV: 2016. URL: http://opencv.org/
  • 37. [11] Point Operations - Adaptive Thresholding: 2016. URL: http://homepages.inf.ed.ac.uk/rbf/HIPR2/adpthrsh.htm
[12] Prewitt operator: 2016. Page Version ID: 698509644. URL: https://en.wikipedia.org/w/index.php?title=Prewitt_operator&oldid=698509644
[13] Rahman, M. H. and Islam, M. R.: 2013, Segmentation of color image using adaptive thresholding and masking with watershed algorithm, 2013 International Conference on Informatics, Electronics & Vision (ICIEV), pp. 1–6.
[14] Realtime Computer Vision with OpenCV - ACM Queue: 2016. URL: http://queue.acm.org/detail.cfm?id=2206309
[15] Singh, S. and Singh, R.: 2015, Comparison of various edge detection techniques, 2015 2nd International Conference on Computing for Sustainable Global Development (INDIACom), pp. 393–396.
[16] Stark, J. A.: 2000, Adaptive image contrast enhancement using generalizations of histogram equalization, IEEE Transactions on Image Processing 9(5), 889–896.
[17] Tan, T. L., Sim, K. S. and Tso, C. P.: 2012, Image enhancement using background brightness preserving histogram equalisation, Electronics Letters 48(3), 155–157.
[18] Wang, B. and Fan, S.: 2009, An Improved CANNY Edge Detection Algorithm, Second International Workshop on Computer Science and Engineering, 2009. WCSE '09, Vol. 1, pp. 497–500.
[19] Wang, Z., Bovik, A. C., Sheikh, H. R. and Simoncelli, E. P.: 2004, Image quality assessment: from error visibility to structural similarity, IEEE Transactions on Image Processing 13(4), 600–612.
[20] Xiao, Y. and Yan, H.: 2003, Text region extraction in a document image based on the Delaunay tessellation, Pattern Recognition 36(3), 799–809. URL: http://www.sciencedirect.com/science/article/pii/S0031320302000821
[21] Yadav, P.: 2015, Color image noise removal by modified adaptive threshold median filter for RVIN, 2015 International Conference on Electronic Design, Computer Networks & Automated Verification (EDCAV), pp. 175–180.