Image Orientation Correction

Original article, last modified 2023-01-05 12:55:24


License: CC 4.0 BY-SA

Tags: #opencv #python #numpy

First published 2022-11-09 20:15:50


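This article discusses how to detect the skew angle of text lines with the Radon transform, shows through an example how to implement the geometric correction of an image with Python libraries, and compares DocUNet and DewarpNet for correcting warped documents. It also introduces the approach of Learning from Documents in the Wild for improving document rectification, and the model from Geometric Representation Learning for Document Image Rectification.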


1 Radon-transform-based approach

1.1 Principle of the Radon transform

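Definition of the two-dimensional Radon transform:
a straight line $p=x\cos\theta+y\sin\theta$ in the $(x,y)$ plane is mapped to a single point $(p,\theta)$ in Radon space.
The Radon transform of a continuous image is $R(p,\theta)=\iint_D f(x,y)\,\delta(p-x\cos\theta-y\sin\theta)\,dx\,dy$,
where $\delta(x)=\begin{cases}0, & x\neq 0\\1, & x=0\end{cases}$ (in the strictly continuous setting $\delta$ is the Dirac delta; the indicator form given here is its discrete counterpart).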
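For a binarized image, the Radon transform gives the accumulated value along every straight line in every direction. From the formula above, the more text pixels lie on a line, the larger its value, so the angle at which the response peaks is the skew angle of the text.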
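
The idea can be sketched in a few lines of NumPy. This is a minimal illustration under stated assumptions, not the implementation used in section 1.2: the helper names `radon_binary`, `estimate_skew` and `synthetic_lines` are made up for this example, the test image is synthetic, and the angle sign convention (image coordinates with y pointing down, skew = peak angle minus 90 degrees) is a choice of this sketch that need not match `skimage`. Each foreground pixel votes for the bin $p = x\cos\theta + y\sin\theta$, which is the delta-function integral written as a sum.

```python
import numpy as np


def radon_binary(mask, thetas_deg):
    """Discrete Radon transform of a binary mask (hypothetical helper).

    Every foreground pixel (x, y) votes for the bin p = x*cos(theta) + y*sin(theta).
    """
    ys, xs = np.nonzero(mask)
    # Centre the coordinates so that |p| is at most half the image diagonal
    xs = xs - (mask.shape[1] - 1) / 2.0
    ys = ys - (mask.shape[0] - 1) / 2.0
    half = int(np.ceil(np.hypot(*mask.shape) / 2.0)) + 1
    sinogram = np.zeros((2 * half + 1, len(thetas_deg)))
    for j, theta in enumerate(np.deg2rad(thetas_deg)):
        p = xs * np.cos(theta) + ys * np.sin(theta)
        bins = np.round(p).astype(int) + half
        sinogram[:, j] = np.bincount(bins, minlength=2 * half + 1)
    return sinogram


def estimate_skew(mask):
    """Return the line direction in degrees, searched in 1-degree steps."""
    thetas = np.arange(180)
    sinogram = radon_binary(mask, thetas)
    # RMS of each projection: largest when whole text lines collapse into few bins
    rms = np.sqrt(np.mean(sinogram ** 2, axis=0))
    # theta is the angle of the line normal, so the line direction is theta - 90
    return int(thetas[np.argmax(rms)]) - 90


def synthetic_lines(angle_deg, size=400):
    """Draw five parallel one-pixel 'text lines' tilted by angle_deg (test data only)."""
    mask = np.zeros((size, size), dtype=bool)
    xs = np.arange(40, size - 40)
    for y0 in range(80, size - 79, 60):
        ys = np.round(y0 + (xs - size / 2) * np.tan(np.deg2rad(angle_deg))).astype(int)
        mask[ys, xs] = True
    return mask


skew = estimate_skew(synthetic_lines(10))
print("estimated skew (degrees):", skew)
```

Binning pixel coordinates like this only makes sense for a binary mask; for grayscale input a library routine such as `skimage.transform.radon`, which section 1.2 uses, is the more practical choice.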

1.2 Code implementation

import cv2
import numpy as np

from skimage.transform import radon


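img = cv2.imread('C:/Users/64975/Desktop/test_image/tt.png')
# cv2.imread returns None instead of raising when the file cannot be read
if img is None:
    raise FileNotFoundError('cv2.imread could not read the input image')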

# h = 900
# ratio = img.shape[0] / h
# w = img.shape[1] / ratio
# orig = img.copy()
# resize_image = cv2.resize(orig, (int(w), h))

I = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)
h, w = I.shape
(cX, cY) = (w // 2, h // 2)


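# Downscale wide images to a width of 240 px to keep the Radon transform fast
if w > 240:
    I = cv2.resize(I, (240, int((h / w) * 240)))

# Remove the mean so a uniform background does not dominate the projections
I = I - np.mean(I)
# Default theta is 0..179 degrees, so column i is the projection at i degrees;
# circle=False keeps the whole non-square image instead of cropping it
sinogram = radon(I, circle=False)

# RMS of each projection; it peaks at the angle aligned with the text lines
r = np.array([np.sqrt(np.mean(np.abs(line) ** 2)) for line in sinogram.transpose()])
rotation = np.argmax(r)
print('Rotation: {:.2f} degrees'.format(90 - rotation))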

# Rotate and save with the original resolution
M = cv2.getRotationMatrix2D((w/2, h/2), 90 - rotation, 1)

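# Absolute cosine and sine of the rotation angle, read from the matrix
cos = np.abs(M[0, 0])
sin = np.abs(M[0, 1])

# Size of the canvas that contains the whole rotated image
nW = int((h * sin) + (w * cos))
nH = int((h * cos) + (w * sin))
# Shift the translation so the image centre lands on the new canvas centre
M[0, 2] += (nW / 2) - cX
M[1, 2] += (nH / 2) - cY
dst = cv2.warpAffine(img, M, (nW, nH))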

cv2.imwrite('C:/Users/64975/Desktop/test_image/out_image.jpg', dst)
cv2.imwrite('rotated.jpg', dst)
cv2.imshow('dst', dst)
cv2.waitKey()
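
The canvas enlargement in the last part of the script can be checked without OpenCV. The sketch below is an illustration, not part of the original script: the function name `rotation_matrix_expanded` is hypothetical, and the matrix is built by hand following the layout documented for `cv2.getRotationMatrix2D` with scale 1. It maps the four image corners through the adjusted matrix to show that they land inside the enlarged canvas, up to the truncation introduced by `int()`.

```python
import numpy as np


def rotation_matrix_expanded(w, h, angle_deg):
    """Return a 2x3 affine matrix and the enlarged canvas size (hypothetical helper)."""
    cx, cy = w / 2.0, h / 2.0
    a = np.cos(np.deg2rad(angle_deg))
    b = np.sin(np.deg2rad(angle_deg))
    # Same layout as the matrix documented for cv2.getRotationMatrix2D, scale = 1
    M = np.array([[a, b, (1 - a) * cx - b * cy],
                  [-b, a, b * cx + (1 - a) * cy]])
    # Bounding box of the rotated w x h rectangle
    new_w = int(h * abs(b) + w * abs(a))
    new_h = int(h * abs(a) + w * abs(b))
    # Shift so the old centre lands on the centre of the new canvas
    M[0, 2] += new_w / 2.0 - cx
    M[1, 2] += new_h / 2.0 - cy
    return M, (new_w, new_h)


M, (new_w, new_h) = rotation_matrix_expanded(400, 200, 30)
# Homogeneous coordinates of the four corners of a 400 x 200 image
corners = np.array([[0, 0, 1], [400, 0, 1], [0, 200, 1], [400, 200, 1]], dtype=float)
mapped = corners @ M.T
print("canvas:", new_w, new_h)
print(mapped.round(1))
```

Without the two translation adjustments the rotated corners would fall outside the canvas and `cv2.warpAffine` would clip them, which is why the script enlarges the output size before warping.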

1.3 Results

2 DewarpNet- and DocUNet-based approach

2.1 Literature survey

A survey of existing solutions shows that TextIn's edge cropping and enhancement and its dewarping perform very well; the papers its technology builds on are DocUNet and DewarpNet.
DocUNet code: https://github.com/teresasun/docUnet.pytorch
DewarpNet code: https://github.com/cvlab-stonybrook/DewarpNet