TensorFlow CAPTCHA solver

Posted January 17, 2019 by Rokas Balsys

TensorFlow CAPTCHA solver final part

You can download the code for this tutorial from this link.

In this tutorial we continue with the code from part 3, so if you haven't watched that part, I recommend doing so first. Here we only write the code that turns the detections into a correct CAPTCHA result: we grab all detected symbols from the CAPTCHA, then check their order, detection accuracy and overlaps. We use all these components to write a final CAPTCHA solver that works out of the box.

The text version is not as detailed as the video tutorial. If you are interested in what every line of code does and why we need it, you should watch my YouTube tutorial, where I tried to explain every line while writing the code. In this text version I added a few comments to the code to help you understand at least what each part does.

Below is the final code, which you can copy and use. To use it you need to install TensorFlow and all the necessary libraries. You also need "CAPTCHA_frozen_inference_graph.pb", which is my trained CAPTCHA detection model, and "CAPTCHA_labelmap.pbtxt". When you have all the files prepared, you should be able to use this detection model. The link to download my code is above; it also includes a few example images that you can use to test whether it works for you.
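
Before running the detection code, it can help to confirm that both model files are in place. The following is a minimal sketch, not part of the tutorial code: the helper name missing_files is hypothetical, and it assumes the two files sit in the same folder as the script.

```python
import os

# File names taken from the tutorial text above.
REQUIRED_FILES = (
    "CAPTCHA_frozen_inference_graph.pb",
    "CAPTCHA_labelmap.pbtxt",
)


def missing_files(directory="."):
    """Return the required model files that are not present in directory."""
    return [
        name for name in REQUIRED_FILES
        if not os.path.isfile(os.path.join(directory, name))
    ]


if __name__ == "__main__":
    missing = missing_files()
    if missing:
        print("Missing files:", ", ".join(missing))
    else:
        print("All model files found.")
```

If the script reports missing files, download them again from the link above before loading the model.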

Here is the detection code:

# Welcome to CAPTCHA break tutorial !

# Imports
import cv2
import numpy as np
import os
import sys
# run on CPU, to run on GPU comment this line or write '0'
os.environ['CUDA_VISIBLE_DEVICES'] = '-1'
import tensorflow as tf
from distutils.version import StrictVersion
from collections import defaultdict

# title of our window
title = "CAPTCHA"

# Env setup
from object_detection.utils import ops as utils_ops
from object_detection.utils import label_map_util
from object_detection.utils import visualization_utils as vis_util

# Model preparation 
PATH_TO_FROZEN_GRAPH = 'CAPTCHA_frozen_inference_graph.pb'
# List of the strings that is used to add correct label for each box.
PATH_TO_LABELS = 'CAPTCHA_labelmap.pbtxt'

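# Upper bound on the class ids in the label map. The value was missing here;
# 37 is an assumption that matches the range(37) loop in the filtering code.
NUM_CLASSES = 37

# Load the label map and build the class id -> name index.
label_map = label_map_util.load_labelmap(PATH_TO_LABELS)
categories = label_map_util.convert_label_map_to_categories(label_map, max_num_classes=NUM_CLASSES, use_display_name=True)
category_index = label_map_util.create_category_index(categories)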

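# Load a (frozen) TensorFlow model into memory.
detection_graph = tf.Graph()
with detection_graph.as_default():
    od_graph_def = tf.GraphDef()
    with tf.gfile.GFile(PATH_TO_FROZEN_GRAPH, 'rb') as fid:
        serialized_graph = fid.read()
        # Parse the serialized bytes into the GraphDef before importing it
        od_graph_def.ParseFromString(serialized_graph)
        tf.import_graph_def(od_graph_def, name='')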

# Detection
def Captcha_detection(image, average_distance_error=3):
    with detection_graph.as_default():
        with tf.Session(graph=detection_graph) as sess:
            # Open image
            image_np = cv2.imread(image)
            # Resize image if needed
            image_np = cv2.resize(image_np, (0,0), fx=3, fy=3) 
            # To get real color we do this:
            image_np = cv2.cvtColor(image_np, cv2.COLOR_BGR2RGB)
            # Expand dimensions since the model expects images to have shape: [1, None, None, 3]
            image_np_expanded = np.expand_dims(image_np, axis=0)
            # Actual detection.
            image_tensor = detection_graph.get_tensor_by_name('image_tensor:0')
            boxes = detection_graph.get_tensor_by_name('detection_boxes:0')
            scores = detection_graph.get_tensor_by_name('detection_scores:0')
            classes = detection_graph.get_tensor_by_name('detection_classes:0')
            num_detections = detection_graph.get_tensor_by_name('num_detections:0')
            # Run the detection on the image.
            (boxes, scores, classes, num_detections) = sess.run(
              [boxes, scores, classes, num_detections],
              feed_dict={image_tensor: image_np_expanded})
            # Show image with detection
            #cv2.imshow(title, cv2.cvtColor(image_np, cv2.COLOR_BGR2RGB))
            # Save image with detection
            cv2.imwrite("Predicted_captcha.jpg", cv2.cvtColor(image_np, cv2.COLOR_BGR2RGB))

            # Below we filter the detections
            captcha_array = []
            # loop over all detection boxes
            for i,b in enumerate(boxes[0]):
                for Symbol in range(37):
                    if classes[0][i] == Symbol: # check if detected class equal to our symbols
                        if scores[0][i] >= 0.65: # keep only detections with a score of at least 0.65
                                            # x-left        # x-right
                            mid_x = (boxes[0][i][1]+boxes[0][i][3])/2 # find x coordinates center of letter
                            # to captcha_array array save detected Symbol, middle X coordinates and detection percentage
                            captcha_array.append([category_index[Symbol].get('name'), mid_x, scores[0][i]])

            # Rearrange the array according to the detected X coordinates
            # (a plain sort, so it works for any number of detections)
            captcha_array.sort(key=lambda symbol: symbol[1])

            # Find average distance between detected symbols
            average = 0
            captcha_len = len(captcha_array)-1
            while captcha_len > 0:
                average += captcha_array[captcha_len][1]- captcha_array[captcha_len-1][1]
                captcha_len -= 1
            # Increase average distance error
            average = average/(len(captcha_array)+average_distance_error)

            captcha_array_filtered = list(captcha_array)
            captcha_len = len(captcha_array)-1
            while captcha_len > 0:
                # if two neighbouring symbols are closer together than the error distance
                if captcha_array[captcha_len][1]- captcha_array[captcha_len-1][1] < average:
                    # keep the symbol with the higher detection percentage
                    if captcha_array[captcha_len][2] > captcha_array[captcha_len-1][2]:
                        del captcha_array_filtered[captcha_len-1]
                    else:
                        del captcha_array_filtered[captcha_len]
                captcha_len -= 1

            # Get final string from filtered CAPTCHA array
            captcha_string = ""
            for captcha_letter in range(len(captcha_array_filtered)):
                captcha_string += captcha_array_filtered[captcha_letter][0]
            return captcha_string
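
The post-processing above (score filter, left-to-right ordering, overlap removal) can also be tried without TensorFlow. The sketch below is a simplified stand-in, not the tutorial's exact loop: decode_symbols is a hypothetical helper, the input is assumed to be (label, mid_x, score) tuples, and each symbol is compared with the last symbol that was kept instead of walking the array backwards.

```python
def decode_symbols(detections, min_score=0.65, average_distance_error=3):
    """Turn (label, mid_x, score) tuples into a CAPTCHA string.

    Hypothetical helper for illustration; not part of the tutorial code.
    """
    # Keep confident detections only and order them left to right.
    symbols = sorted(
        (d for d in detections if d[2] >= min_score),
        key=lambda d: d[1],
    )
    if len(symbols) < 2:
        return "".join(label for label, _, _ in symbols)

    # Same threshold as in the tutorial: the sum of all gaps (which equals
    # last x minus first x) divided by symbol count + average_distance_error.
    total_gap = symbols[-1][1] - symbols[0][1]
    threshold = total_gap / (len(symbols) + average_distance_error)

    kept = [symbols[0]]
    for current in symbols[1:]:
        previous = kept[-1]
        if current[1] - previous[1] < threshold:
            # Two boxes on the same spot: keep the higher-scoring one.
            if current[2] > previous[2]:
                kept[-1] = current
        else:
            kept.append(current)
    return "".join(label for label, _, _ in kept)


# Made-up detections, for illustration only.
detections = [
    ("B", 0.80, 0.99),
    ("U", 0.10, 0.98),
    ("0", 0.31, 0.70),  # overlaps "O" and has a lower score
    ("O", 0.30, 0.95),
    ("Q", 0.20, 0.97),
    ("Z", 0.90, 0.40),  # below the score threshold
]
print(decode_symbols(detections))  # prints UQOB
```

Working on plain tuples makes it easy to experiment with min_score and average_distance_error before running the full model.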

To use the code above, we can add the line "print(Captcha_detection("10.jpg"))" to it, or we can call the function from another file. So I created a new Python script called "main_.py" and wrote the following lines in it. This way we will try to solve the "10.jpg" CAPTCHA:

from CAPTCHA_object_detection import *
print(Captcha_detection("10.jpg"))

I added one line to the code that enlarges each CAPTCHA three times, because they are usually very small. Without resizing, the model detects the symbols poorly.
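
A side note on the resizing step: the boxes returned by the TensorFlow Object Detection API are normalized [ymin, xmin, ymax, xmax] values between 0 and 1, so enlarging the image does not change the symbol order computed from them. A minimal sketch of mapping one box to pixel coordinates follows; box_to_pixels is a hypothetical helper and the box values are made up.

```python
def box_to_pixels(box, width, height):
    """Convert a normalized [ymin, xmin, ymax, xmax] box to pixel coordinates."""
    ymin, xmin, ymax, xmax = box
    left, right = round(xmin * width), round(xmax * width)
    top, bottom = round(ymin * height), round(ymax * height)
    return left, top, right, bottom


# Made-up box on a 200x60 pixel CAPTCHA image.
box = [0.25, 0.10, 0.75, 0.30]
print(box_to_pixels(box, width=200, height=60))  # prints (20, 15, 60, 45)
```

The same box maps to three times larger pixel values on the enlarged image, while its normalized centre (xmin + xmax) / 2 stays the same.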

If you tried this tutorial and it worked for you, please write about your experience under my YouTube video, so I will know whether it worked. If you tried it and it didn't work, you can ask me for advice under my YouTube video and I will try to help you.

Here are a few CAPTCHA image results:

Returned result: UQBOZ

Returned result: POQQP

Returned result: IAAFB

That’s all for this CAPTCHA solver tutorial. I got the results that I wanted, so you can do the same. Now I am able to solve any simpler CAPTCHA with this code. With this code we can't solve very difficult CAPTCHAs that contain several words. But I think this way we can solve at least 95% of the CAPTCHAs used on the internet, except Google reCAPTCHA. In the future I may write another tutorial on solving hard CAPTCHAs with word combinations or Google reCAPTCHA, but for now I am satisfied with what I have. As I mentioned before, if you want to improve this model, please send me your data set; I will retrain the model for you and upload a better detection model. Thank you all for following me, and keep learning with my other tutorials!