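CAPTCHAs come in many varieties; the method presented here only works for CAPTCHAs whose distortion consists of speckle noise.
First, this is the original image I prepared: 4.png
The implementation code: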
import tesserocr
from PIL import Image, ImageDraw

# Dump an image as text (test helper, left disabled)
# image = Image.open("img/4_1.png")
# fh = open("img/1.txt", "w")
# w, h = image.size
# for i in range(h):
#     for j in range(w):
#         cl = image.getpixel((j, i))
#         clall = cl[0] + cl[1] + cl[2]
#         # clall == 0 means the current pixel is black
#         if clall == 0:
#             fh.write("0")
#         else:
#             fh.write("1")
#     fh.write("\n")
# fh.close()


# Convert the image to pure black and white
def black_white(image):
    w, h = image.size
    for i in range(h):
        for j in range(w):
            cl = image.getpixel((j, i))
            clall = cl[0] + cl[1] + cl[2]
            # Bright pixels become white, everything else black
            if clall >= 155 * 3:  # adjust this threshold for the specific image
                image.putpixel((j, i), (255, 255, 255))
            else:
                image.putpixel((j, i), (0, 0, 0))


# Binary array: maps (x, y) to 1 (white) or 0 (black)
t2val = {}


# G: Integer, binarization threshold for the grayscale image
def twoValue(image, G):
    for y in range(0, image.size[1]):
        for x in range(0, image.size[0]):
            g = image.getpixel((x, y))
            if g > G:
                t2val[(x, y)] = 1
            else:
                t2val[(x, y)] = 0


# Denoising
# Compare the value of a point A with the values of its 8 neighbours.
# Given a value N (0 < N < 8), A is treated as noise when fewer than N
# of its neighbours have the same value as A.
# N: Integer, denoising rate, 0 < N < 8    Z: Integer, number of passes
def clearNoise(image, N, Z):
    for i in range(0, Z):
        t2val[(0, 0)] = 1
        t2val[(image.size[0] - 1, image.size[1] - 1)] = 1

        for x in range(1, image.size[0] - 1):
            for y in range(1, image.size[1] - 1):
                nearDots = 0
                L = t2val[(x, y)]
                if L == t2val[(x - 1, y - 1)]:
                    nearDots += 1
                if L == t2val[(x - 1, y)]:
                    nearDots += 1
                if L == t2val[(x - 1, y + 1)]:
                    nearDots += 1
                if L == t2val[(x, y - 1)]:
                    nearDots += 1
                if L == t2val[(x, y + 1)]:
                    nearDots += 1
                if L == t2val[(x + 1, y - 1)]:
                    nearDots += 1
                if L == t2val[(x + 1, y)]:
                    nearDots += 1
                if L == t2val[(x + 1, y + 1)]:
                    nearDots += 1

                # Too few matching neighbours: treat as noise, paint it white
                if nearDots < N:
                    t2val[(x, y)] = 1


# Write the binary array out as a 1-bit image
def saveImage(filename, size):
    image = Image.new("1", size)
    draw = ImageDraw.Draw(image)

    for x in range(0, size[0]):
        for y in range(0, size[1]):
            draw.point((x, y), t2val[(x, y)])

    image.save(filename)


def start(img_path, save_img_path):
    # Convert to RGB so getpixel() returns (R, G, B) even for palette images
    image = Image.open(img_path).convert("RGB")
    black_white(image)
    image = image.convert("L")
    twoValue(image, 100)
    clearNoise(image, 4, 1)
    saveImage(save_img_path, image.size)
    print(tesserocr.file_to_text(save_img_path))


img_path = "img/4.png"
save_img_path = "img/4_1.png"
start(img_path, save_img_path)
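To make the 8-neighbour rule easier to follow, here is a minimal sketch of it on a tiny hand-written grid, with no image library involved. The names `count_matching_neighbors`, `denoise_once` and the sample grid are made up for illustration and are not part of the code above. One deliberate difference: this sketch decides every pixel from an unmodified copy of the grid, whereas `clearNoise` updates `t2val` in place while it scans.

```python
def count_matching_neighbors(grid, x, y):
    """Count how many of the 8 neighbours of (x, y) hold the same value."""
    value = grid[y][x]
    matches = 0
    for dy in (-1, 0, 1):
        for dx in (-1, 0, 1):
            if (dx, dy) != (0, 0) and grid[y + dy][x + dx] == value:
                matches += 1
    return matches


def denoise_once(grid, n):
    """Return a copy of grid where interior pixels with fewer than n
    matching neighbours are set to 1 (white). Border pixels are skipped."""
    height, width = len(grid), len(grid[0])
    result = [row[:] for row in grid]
    for y in range(1, height - 1):
        for x in range(1, width - 1):
            if count_matching_neighbors(grid, x, y) < n:
                result[y][x] = 1
    return result


# 0 = black, 1 = white (hypothetical sample data)
noisy = [
    [1, 1, 1, 1, 1, 1],
    [1, 0, 1, 1, 1, 1],  # a lone black speck at x=1, y=1
    [1, 1, 1, 0, 0, 1],  # a 2x2 black block standing in for a stroke
    [1, 1, 1, 0, 0, 1],
    [1, 1, 1, 1, 1, 1],
]

cleaned = denoise_once(noisy, 3)
for row in cleaned:
    print("".join(str(v) for v in row))
```

With `n = 3` the lone speck disappears and the 2x2 block survives, because each of its pixels has exactly three matching neighbours. With `n = 4`, the value passed to `clearNoise` above, this sketch would erase that block as well, so thin strokes can be lost when N is set too high.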
After processing, the result is the following image: 4_1.png
Console output:
ziri
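However, the above only works under ideal conditions; for some images the recognition rate is low.
I plan to add further algorithms later to improve the recognition rate.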
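One setting that probably has to change from image to image is the binarization threshold: the code hard-codes 155 per channel and its own comment says to adjust it for the specific image. The sketch below, which uses a hypothetical `binarize` helper and made-up pixel values, shows how the same pixels come out under different thresholds. It mirrors the summed-RGB comparison in `black_white` but is not taken from the original code.

```python
def binarize(pixels, per_channel_threshold):
    """Map each RGB pixel to 1 (white) or 0 (black) by summed brightness."""
    limit = per_channel_threshold * 3
    return [[1 if sum(rgb) >= limit else 0 for rgb in row] for row in pixels]


# Hypothetical 1x4 strip: background, light noise, mid-grey text, dark text
strip = [[(250, 250, 250), (170, 170, 170), (120, 120, 120), (20, 20, 20)]]

for threshold in (100, 155, 200):
    print(threshold, binarize(strip, threshold)[0])
```

A low threshold turns the mid-grey text white and loses it, while a high threshold turns the light noise black and keeps it, which is why a value tuned for one image may not suit another.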
Original article: https://www.cnblogs.com/YLTzxzy/p/11331128.html
Time: 2024-11-10 13:20:20