
Character Recognition with cnocr and cnstd

nanyue 2024-11-27 18:16:28 Technical Articles

Original article by 春风视觉

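1. Introduction

1.1 cnocr

cnocr is a Python package for text recognition. It supports Chinese, English, and several other languages, and mainly targets images of printed text with simple layouts. It is usually combined with a text detection engine. The package is released under the Apache Software License (listed as OSI Approved :: Apache Software License), and its GitHub repository is https://github.com/breezedeus/cnocr/#readme. It requires Python 3 or later.

1.2 cnstd

cnstd is a text detection tool. It supports detecting Chinese and English text, ships with several trained models, and installs with pip. It is usually used together with cnocr. It is released under the same license (OSI Approved :: Apache Software License), supports Python 3.x and later, and its GitHub repository is https://github.com/breezedeus/cnstd.

2. Installation and Dependencies

Image processing requires the opencv package. For the full dependency list, see the requirements.txt file in each GitHub repository, then install everything in one batch with pip: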

pip install -r requirements.txt

After that command finishes, run the following:

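# Install cnocr. The install also pulls in torch, so consider testing in a fresh virtual environment.
pip install cnocr
# or, with an explicit package index:
pip install cnocr -i https://pypi.doubanio.com/simple
# Install cnstd
pip install cnstd
# or, with an explicit package index:
pip install cnstd -i https://pypi.doubanio.com/simple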
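
To confirm that both packages ended up in the active environment, a quick check along these lines can help. It is a minimal sketch that uses only the Python standard library; the helper name `installed_version` is made up for this illustration and is not part of cnocr or cnstd.

```python
from importlib.metadata import PackageNotFoundError, version


def installed_version(package_name):
    """Return the installed version string of a package, or None if it is missing."""
    try:
        return version(package_name)
    except PackageNotFoundError:
        return None


if __name__ == "__main__":
    # Report the status of the two packages this article installs.
    for name in ("cnocr", "cnstd"):
        found = installed_version(name)
        print(f"{name}: {found if found else 'not installed'}")
```

If either package is reported as not installed, check that pip ran inside the same virtual environment as the Python interpreter you are using.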

3. Detection and Usage

3.1 Using cnstd

cnstd is simple to use. Run the following command from the command line to get the detection result.

cnstd predict -i examples/taobao.jpg -o outputs

The output shows that cnstd locates the text positions accurately. Besides the command line, you can also run the following Python code.

from cnstd import CnStd
from cnocr import CnOcr
import cv2 as cv

std = CnStd(auto_rotate_whole_image=True)  # specify the model and other parameters here
cn_ocr = CnOcr()
box_info_list = std.detect('E:/100.png')
image = cv.imread('E:/100.png')
point_color = (0, 255, 0)  # BGR
thickness = 1
for box_info in box_info_list['detected_texts']:
    # Get the box coordinates. This code assumes the layout
    # (center_x, center_y, width, height, angle); other cnstd versions
    # may return a different box format, so print box_info to check.
    cor = box_info['box']
    # print(box_info)
    box = cor[0:4]
    angle = cor[-1]  # last element of the full box, read before slicing
    if abs(angle) > 80:
        # The box is tilted or rotated: the center stays the same,
        # and width and height swap places.
        ptLeftTop = (int(box[0] - box[3] / 2), int(box[1] - box[2] / 2))
        ptRightBottom = (int(box[0] + box[3] / 2), int(box[1] + box[2] / 2))
    else:
        ptLeftTop = (int(box[0] - box[2] / 2), int(box[1] - box[3] / 2))
        ptRightBottom = (int(box[0] + box[2] / 2), int(box[1] + box[3] / 2))
    # Draw the detected text box on the image
    cv.rectangle(image, ptLeftTop, ptRightBottom, point_color, thickness)
cv.imwrite("result_cnstd.png", image)

Running this code writes the image with the detected boxes drawn on it to result_cnstd.png.
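
The corner arithmetic in that loop can be pulled out into a small pure function, which makes the width/height swap easier to test without loading a model. The sketch below assumes the same box layout as the code above, (center_x, center_y, width, height, angle). The function name `center_box_to_corners` and the `swap_threshold` parameter are illustrative and do not come from cnstd.

```python
def center_box_to_corners(cx, cy, w, h, angle=0.0, swap_threshold=80.0):
    """Convert a center-format box to (left_top, right_bottom) integer points.

    Assumed layout: center (cx, cy), size (w, h), rotation angle in degrees.
    When |angle| exceeds swap_threshold the box is treated as rotated by
    roughly 90 degrees, so width and height are swapped; the center is kept.
    """
    if abs(angle) > swap_threshold:
        w, h = h, w
    left_top = (int(cx - w / 2), int(cy - h / 2))
    right_bottom = (int(cx + w / 2), int(cy + h / 2))
    return left_top, right_bottom


if __name__ == "__main__":
    # An upright 40x20 box centered at (100, 50).
    print(center_box_to_corners(100, 50, 40, 20))
    # The same box reported with a -90 degree angle: width and height swap.
    print(center_box_to_corners(100, 50, 40, 20, angle=-90))
```

The two returned points can be passed directly to cv.rectangle. Note that this only produces an axis-aligned rectangle; boxes tilted at intermediate angles are approximated rather than drawn rotated.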

3.2 Using cnocr

Continuing from the code above, run the following block:

for box_info in box_info_list['detected_texts']:
    cropped_img = box_info['cropped_img']
    ocr_res = cn_ocr.ocr_for_single_line(cropped_img)
    print('ocr result: %s' % str(ocr_res))

When the code finishes, each result is printed as a tuple: the first element is the recognized text, and the second is the confidence score.
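
Since each result carries a confidence score, one common follow-up is to discard low-confidence lines. The sketch below assumes results shaped as (text, score) tuples as described above. The helper name `keep_confident`, the 0.5 threshold, and the sample data are all made up for illustration, and other cnocr versions may return a different structure that would need adapting.

```python
def keep_confident(results, min_score=0.5):
    """Return the text of every (text, score) pair whose score is at least min_score."""
    kept = []
    for text, score in results:
        # The text may arrive as a list of characters instead of a string;
        # join it so callers always get a plain string.
        if not isinstance(text, str):
            text = "".join(text)
        if score >= min_score:
            kept.append(text)
    return kept


if __name__ == "__main__":
    # Hypothetical recognition results, not real cnocr output.
    sample = [("hello", 0.93), (["a", "b"], 0.61), ("noise", 0.18)]
    print(keep_confident(sample))
```

The right threshold depends on the images involved, so it is worth printing the raw scores for a few samples before choosing one.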

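Besides single-line recognition, cnocr can also recognize a whole image at once. The results are largely the same as those above. The code snippet is as follows.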

from cnocr import CnOcr
ocr = CnOcr()
res = ocr.ocr('E:/100.png')
print("Predicted Chars:", res)

For other cnocr test scenarios, see the test cases in the GitHub repository: https://github.com/breezedeus/cnocr/blob/master/tests/test_cnocr.py.
