PHPFixing
Showing posts with label computer-vision. Show all posts

Tuesday, September 6, 2022

[FIXED] Why is there an additional "None" dimension in the tensor shape when uploading a dataset to Activeloop Hub?

 September 06, 2022     artificial-intelligence, computer-vision, hub, machine-learning, python

Issue

I am trying to upload an image dataset to Hub (a dataset format with an API for creating, storing, and collaborating on AI datasets). I only uploaded part of the dataset, but upon inspecting the uploaded data I noticed an additional None dimension in the tensor shape. Can someone explain why this occurred?

I am using the following tensor relationship:

ds 
-> images (htype = image)
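
For context, a dataset with this layout would typically be created along the following lines (a hedged sketch based on the Hub v2 API - exact names may vary by version, and the file paths are hypothetical):

import hub

ds = hub.empty('./my_dataset')  # local path or an Activeloop cloud path
with ds:
    ds.create_tensor('images', htype='image', sample_compression='jpeg')
    for path in ['a.jpg', 'b.jpg']:       # hypothetical image files
        ds.images.append(hub.read(path))  # per-sample shapes may differ

print(ds.images.shape)  # e.g. (2, 512, 512, None) if channel counts differ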

Solution

The None dimension is present because some of the images have three channels while others have four; dimensions that vary from sample to sample are dynamic and are therefore shown as None.
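
As a rough illustration of why this happens (a pure-NumPy sketch, not the actual Hub implementation): when samples disagree along an axis, a per-dataset shape summary can only report None for that axis.

import numpy as np

# Two images with the same height and width but different channel
# counts, e.g. an RGB (3-channel) and an RGBA (4-channel) image.
samples = [np.zeros((512, 512, 3), np.uint8),
           np.zeros((512, 512, 4), np.uint8)]

def summary_shape(arrays):
    """Report a common shape, with None wherever the samples disagree."""
    dims = zip(*(a.shape for a in arrays))
    return tuple(d[0] if len(set(d)) == 1 else None for d in dims)

print(summary_shape(samples))  # (512, 512, None)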



Answered By - Kristina from Activeloop
Answer Checked By - Robin (PHPFixing Admin)

Friday, July 29, 2022

[FIXED] How does mean image subtraction work?

 July 29, 2022     caffe, computer-vision, conv-neural-network, image, machine-learning

Issue

To preface, I am new to the field of ML/CV, and am currently in the process of training a custom conv net using Caffe.

I am interested in mean image subtraction to achieve basic data normalization on my training images. However, I am confused as to how mean subtraction works and exactly what benefits it has.

I know that a "mean image" can be calculated from the training set, which is then subtracted from the training, validation, and testing sets to make the network less sensitive to differing background and lighting conditions.

Does this involve calculating the mean of all pixels in each image, and averaging these? Or, is the value from each pixel coordinate averaged across all images in the set (i.e. average values of pixels at location (1,1) for all images)? This may require that all images are the same size...

Also, for colored images (3-channels), is the value for each channel individually averaged?

Any clarity would be appreciated.


Solution

In deep learning, there are in fact different practices as to how to subtract the mean image.

Subtract mean image

The first way is to subtract the mean image, as @lejlot described. But there is an issue if your dataset images are not all the same size: you need to make sure all dataset images are the same size before using this method (e.g., by resizing the original image or cropping same-sized patches from it). This is used in the original ResNet paper; see the reference here.

Subtract the per-channel mean

The second way is to subtract the per-channel mean from the original image, which is more popular. With this approach you do not need to resize or crop the original image; you just calculate the per-channel mean from the training set. It is used widely in deep learning, e.g., Caffe: here and here. Keras: here. PyTorch: here. (PyTorch also divides the per-channel value by the standard deviation.)
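
For concreteness, here is a minimal NumPy sketch of both conventions (the array shapes are illustrative assumptions, not values from the question):

import numpy as np

# Training set as one array: (num_images, height, width, channels).
# Mean-image subtraction requires all images to be the same size.
train = np.random.rand(100, 224, 224, 3).astype(np.float32)

# Method 1: mean image - average over the image axis only, so every
# pixel location and channel keeps its own mean. Shape: (224, 224, 3).
mean_image = train.mean(axis=0)
normalized = train - mean_image

# Method 2: per-channel mean - average over images, height, and width,
# leaving one scalar per channel. Shape: (3,).
channel_mean = train.mean(axis=(0, 1, 2))
normalized = train - channel_mean  # broadcasts over height and width

# PyTorch-style normalization additionally divides by the per-channel std.
channel_std = train.std(axis=(0, 1, 2))
normalized = (train - channel_mean) / channel_std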



Answered By - jdhao
Answer Checked By - Clifford M. (PHPFixing Volunteer)

Thursday, July 28, 2022

[FIXED] How to apply perspective transformation to the image using open cv?

 July 28, 2022     computer-vision, image, image-processing, opencv

Issue

I am trying to apply a perspective transformation to an image using OpenCV. I have an image of a card in which I converted the background to black and the foreground object to white, as shown in the image below. Now I want to apply a perspective transformation to it so that the card is viewed properly. My code displays just a completely black image.

Image:

Binary image

Code:

import cv2,numpy as np
from operator import itemgetter
from glob import glob
import matplotlib.pyplot as plt
input_image2 = cv2.imread("/home/hamza/Desktop/card_in_polygon_format.jpeg")

orig_im_coor = np.float32([[90, 261], [235, 386], [417, 178], [268, 83]])
height , width = 450,350
new_image_coor =  np.float32([[0, 0], [width, 0], [0, height], [width, height]])

P = cv2.getPerspectiveTransform(orig_im_coor,new_image_coor)

perspective = cv2.warpPerspective(input_image2,P,(width,height))
cv2.imshow("Perspective transformation", perspective)
cv2.waitKey(0)
cv2.destroyAllWindows()

Note: my code will always receive a black-and-white image like this. It would also be appreciated if the code captured the corners by itself instead of me picking them out manually.


Solution

Automatic quadrangle fitting is not so trivial...

  • There is a good example in the following post, but it's implemented in C++.
  • The method I use is more like the following post - simpler, but less accurate.

The suggested solution uses the following stages:

  • Find contours (and take the largest one - needed in case there is more than one).
  • Approximate the contour to polygon using cv2.approxPolyDP.
    Assume the polygon is a quadrangle.
  • Sort the 4 corners in the right order.
    Note: The method I used for sorting the corners is overly complicated - you may instead use the simpler logic sketched just below.
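
For reference, the simpler corner-ordering logic mentioned in the last bullet could look like this (an illustrative helper, separate from the full solution below): the top-left corner minimizes x + y, the bottom-right maximizes it, and y - x separates top-right from bottom-left.

import numpy as np

def order_corners(pts):
    """Order 4 points as top-left, top-right, bottom-left, bottom-right."""
    pts = np.asarray(pts, dtype=np.float32)
    s = pts.sum(axis=1)          # x + y
    d = pts[:, 1] - pts[:, 0]    # y - x
    tl = pts[np.argmin(s)]       # smallest x + y
    br = pts[np.argmax(s)]       # largest x + y
    tr = pts[np.argmin(d)]       # smallest y - x
    bl = pts[np.argmax(d)]       # largest y - x
    return np.float32([tl, tr, bl, br])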

Here is a code sample:

import cv2
import numpy as np


def find_corners(im):
    """ 
    Find "card" corners in a binary image.
    Return a list of points in the following format: [[640, 184], [1002, 409], [211, 625], [589, 940]] 
    The points order is top-left, top-right, bottom-left, bottom-right.
    """

    # Better approach: https://stackoverflow.com/questions/44127342/detect-card-minarea-quadrilateral-from-contour-opencv

    # Find contours in img.
    cnts = cv2.findContours(im, cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_NONE)[-2]  # [-2] indexing takes return value before last (due to OpenCV compatibility issues).

    # Find the contour with the maximum area (required if there is more than one contour).
    c = max(cnts, key=cv2.contourArea)

    # https://stackoverflow.com/questions/41138000/fit-quadrilateral-tetragon-to-a-blob
    epsilon = 0.1*cv2.arcLength(c, True)
    box = cv2.approxPolyDP(c, epsilon, True)

    # Draw box for testing
    tmp_im = cv2.cvtColor(im, cv2.COLOR_GRAY2BGR)
    cv2.drawContours(tmp_im, [box], 0, (0, 255, 0), 2)
    cv2.imshow("tmp_im", tmp_im)

    box = np.squeeze(box).astype(np.float32)  # Remove redundant dimensions


    # Sort the points in the order: top-left, top-right, bottom-right, bottom-left.
    # Note: 
    # The method I am using is a bit of an "overkill".
    # I am not sure if the implementation is correct.
    # You may sort the corners using simple logic - find top left, bottom right, and match the other two points.
    ############################################################################
    # Find the center of the contour
    # https://docs.opencv.org/3.4/dd/d49/tutorial_py_contour_features.html
    M = cv2.moments(c)
    cx = M['m10']/M['m00']
    cy = M['m01']/M['m00']
    center_xy = np.array([cx, cy])

    cbox = box - center_xy  # Subtract the center from each corner

    # For a square the angles of the corners are:
    # -135   -45
    #
    #
    # 135     45
    ang = np.arctan2(cbox[:,1], cbox[:,0]) * 180 / np.pi  # Compute the angles from the center to each corner

    # Sort the corners of box counterclockwise (sort box elements according the order of ang).
    box = box[ang.argsort()]
    ############################################################################

    # Reorder points: top-left, top-right, bottom-left, bottom-right
    coor = np.float32([box[0], box[1], box[3], box[2]])

    return coor


input_image2 = cv2.imread("card_in_polygon_format.jpeg", cv2.IMREAD_GRAYSCALE)  # Read image as Grayscale
input_image2 = cv2.threshold(input_image2, 0, 255, cv2.THRESH_OTSU)[1]  # Convert to binary image (just in case...)

# orig_im_coor = np.float32([[640, 184], [1002, 409], [211, 625], [589, 940]])

# Find the corners of the card, and sort them
orig_im_coor = find_corners(input_image2)

height, width = 450, 350
new_image_coor =  np.float32([[0, 0], [width, 0], [0, height], [width, height]])

P = cv2.getPerspectiveTransform(orig_im_coor, new_image_coor)

perspective = cv2.warpPerspective(input_image2, P, (width, height))
cv2.imshow("Perspective transformation", perspective)
cv2.waitKey(0)
cv2.destroyAllWindows()

Quadrangle fitting (not most accurate):



Answered By - Rotem
Answer Checked By - Willingham (PHPFixing Volunteer)

[FIXED] How does one convert a grayscale image to RGB in OpenCV (Python)?

 July 28, 2022     computer-vision, image, image-processing, opencv, python

Issue

I'm learning image processing using OpenCV for a real-time application. I did some thresholding on an image and want to label the contours in green, but they aren't showing up in green because my image is in black and white.

Early in the program I used gray = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY) to convert from RGB to grayscale, but to go back I'm confused, and the function backtorgb = cv2.cvtColor(gray,cv2.CV_GRAY2RGB) is giving:

AttributeError: 'module' object has no attribute 'CV_GRAY2RGB'.

The code below does not appear to be drawing contours in green. Is this because it's a grayscale image? If so, can I convert the grayscale image back to RGB to visualize the contours in green?

import numpy as np
import cv2
import time

cap = cv2.VideoCapture(0)
while(cap.isOpened()):

    ret, frame = cap.read()

    gray = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)

    ret, gb = cv2.threshold(gray,128,255,cv2.THRESH_BINARY)

    gb = cv2.bitwise_not(gb)

    contour,hier = cv2.findContours(gb,cv2.RETR_CCOMP,cv2.CHAIN_APPROX_SIMPLE)

    for cnt in contour:
        cv2.drawContours(gb,[cnt],0,255,-1)
    gray = cv2.bitwise_not(gb)

    cv2.drawContours(gray,contour,-1,(0,255,0),3)

    cv2.imshow('test', gray)

    if cv2.waitKey(1) & 0xFF == ord('q'):
        break

cap.release()
cv2.destroyAllWindows()

Solution

I am promoting my comment to an answer:

The easy way is:

You could draw in the original 'frame' itself instead of using gray image.

The hard way (method you were trying to implement):

backtorgb = cv2.cvtColor(gray,cv2.COLOR_GRAY2RGB)

is the correct syntax.
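
Here is a minimal, self-contained sketch of the round trip (any BGR image stands in for the webcam frame; the filename is an assumption):

import cv2

frame = cv2.imread('1.jpg')  # stand-in for a captured webcam frame
gray = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)
gb = cv2.threshold(gray, 128, 255, cv2.THRESH_BINARY_INV)[1]

contours, _ = cv2.findContours(gb, cv2.RETR_CCOMP, cv2.CHAIN_APPROX_SIMPLE)[-2:]

# Convert the single-channel image back to 3 channels so a colored
# (green) drawing is actually visible, then draw the contours.
vis = cv2.cvtColor(gray, cv2.COLOR_GRAY2BGR)
cv2.drawContours(vis, contours, -1, (0, 255, 0), 3)

cv2.imshow('contours in green', vis)
cv2.waitKey(0)
cv2.destroyAllWindows()

Note that for display with cv2.imshow, COLOR_GRAY2BGR is the conventional choice; COLOR_GRAY2RGB produces the same three duplicated channels, so either works here.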



Answered By - Anoop K. Prabhu
Answer Checked By - Mary Flores (PHPFixing Volunteer)

Wednesday, July 27, 2022

[FIXED] How to extract object in high resolution images?

 July 27, 2022     computer-vision, crop, deep-learning, image, opencv

Issue

I have an image, enclosed below, taken with a DSLR camera, but the background on which the object is placed is also visible. I want to crop the object from the background. The image size is (3456, 5184, 3).

Sample image:


I tried a variety of available solutions, i.e., OpenCV methods like foreground extraction using GrabCut, image thresholding and masking, and edge detection, with unsatisfactory results.

Please suggest the right approach.


Solution

Here's a method using thresholding + contour extraction

  • Grayscale then Gaussian blur
  • Otsu's threshold for binary image
  • Dilate to connect into a single contour
  • Extract ROI with numpy slicing

After converting to grayscale and applying a Gaussian blur, we apply Otsu's threshold

Now we have the desired foreground object in white, so we dilate to connect the contours to form a single contour

Finally we obtain the bounding box coordinates and extract the ROI

import cv2

# Grayscale, Blur, Otsu's threshold then dilate
image = cv2.imread('1.jpg')
gray = cv2.cvtColor(image, cv2.COLOR_BGR2GRAY)
blur = cv2.GaussianBlur(gray, (5,5), 0)
thresh = cv2.threshold(blur, 0, 255, cv2.THRESH_BINARY_INV + cv2.THRESH_OTSU)[1]
kernel = cv2.getStructuringElement(cv2.MORPH_RECT, (25,25))
dilate = cv2.dilate(thresh, kernel, iterations=3)

# Extract ROI
x,y,w,h = cv2.boundingRect(dilate)
ROI = image[y:y+h, x:x+w]

cv2.imshow('thresh', thresh)
cv2.imshow('dilate', dilate)
cv2.imshow('ROI', ROI)
cv2.waitKey()


Answered By - nathancy
Answer Checked By - Terry (PHPFixing Volunteer)

[FIXED] How to crop the given Irregularly shaped object along its outline in OpenCV

 July 27, 2022     computer-vision, contour, crop, image-processing, opencv

Issue

I have been working on code where an image is given, as shown, and I have to place this knife onto some other image. The condition is that I have to crop the knife along its outline and not in a rectangular box.

import numpy as np
import cv2 
from matplotlib import pyplot as plt

img = cv2.imread('2.jpg')
img = cv2.cvtColor(img, cv2.COLOR_BGR2RGB)
plt.imshow(img)


img_blur = cv2.bilateralFilter(img, d = 7, 
                                sigmaSpace = 75, sigmaColor =75)

img_gray = cv2.cvtColor(img_blur, cv2.COLOR_RGB2GRAY)

a = img_gray.max()  
_, thresh = cv2.threshold(img_gray, a/2+60, a,cv2.THRESH_BINARY_INV)
plt.imshow(thresh, cmap = 'gray')


contours, hierarchy = cv2.findContours(
                                    image = thresh, 
                                    mode = cv2.RETR_TREE, 
                                    method = cv2.CHAIN_APPROX_SIMPLE)


contours = sorted(contours, key = cv2.contourArea, reverse = True)

img_copy = img.copy()
final = cv2.drawContours(img_copy, contours, contourIdx = -1, 
                      color = (255, 0, 0), thickness = 2)
plt.imshow(img_copy)

This is what I have tried but it doesn't seem to work well.

Input image:

Output image:


Solution

You can do it starting from a bounding box, using the snake algorithm with a balloon force added.

The snake algorithm is defined so that it minimizes 3 energies - continuity, curvature, and gradient. The first two (together called the internal energy) are minimized when the points on the curve are pulled closer and closer, i.e., the contour contracts. If it expands, the energy increases, which the snake algorithm does not allow.

But this initial algorithm, proposed in 1987, has a few problems. One of them is that in flat areas (where the gradient is zero) the algorithm fails to converge and does nothing. Several modifications have been proposed to solve this; the one of interest here is the balloon force, proposed by L. D. Cohen in 1989.

Balloon force guides the contour in non-informative areas of the image, i.e., areas where the gradient of the image is too small to push the contour towards a border. A negative value will shrink the contour, while a positive value will expand the contour in these areas. Setting this to zero will disable the balloon force.

Another improvement is Morphological Snakes, which use morphological operators (such as dilation or erosion) over a binary array instead of solving PDEs over a floating-point array, the standard approach for active contours. This makes Morphological Snakes faster and numerically more stable than their traditional counterpart.

Scikit-image's implementation combining the above two improvements is morphological_geodesic_active_contour. It has a balloon parameter.

Using your image

import numpy as np
import matplotlib.pyplot as plt
from skimage.segmentation import morphological_geodesic_active_contour, inverse_gaussian_gradient
from skimage.color import rgb2gray
from skimage.util import img_as_float
from PIL import Image, ImageDraw

im = Image.open('knife.jpg')
im = np.array(im)
im = rgb2gray(im)
im = img_as_float(im)
plt.imshow(im, cmap='gray')


Now let us create a function which will help us to store iterations:

def store_evolution_in(lst):
    """Returns a callback function to store the evolution of the level sets in
    the given list.
    """

    def _store(x):
        lst.append(np.copy(x))

    return _store

This method needs the image to be preprocessed to highlight the contours. This can be done using the function inverse_gaussian_gradient, although the user might want to define their own version. The quality of the MorphGAC segmentation depends greatly on this preprocessing step.

gimage = inverse_gaussian_gradient(im)

Below we define our starting point - a bounding box.

init_ls = np.zeros(im.shape, dtype=np.int8)
init_ls[200:-400, 20:-30] = 1

A list to store intermediate results for plotting the evolution:

evolution = []
callback = store_evolution_in(evolution)

The required magic line, morphological_geodesic_active_contour with balloon contraction, is below:

ls = morphological_geodesic_active_contour(gimage, 200, init_ls, 
                                           smoothing=1, balloon=-0.75,
                                            threshold=0.7,
                                           iter_callback=callback)

Now let us plot the results:

fig, axes = plt.subplots(1, 2, figsize=(8, 8))
ax = axes.flatten()

ax[0].imshow(im, cmap="gray")
ax[0].set_axis_off()
ax[0].contour(ls, [0.5], colors='b')
ax[0].set_title("Morphological GAC segmentation", fontsize=12)

ax[1].imshow(ls, cmap="gray")
ax[1].set_axis_off()
contour = ax[1].contour(evolution[0], [0.5], colors='r')
contour.collections[0].set_label("Starting Contour")
contour = ax[1].contour(evolution[25], [0.5], colors='g')
contour.collections[0].set_label("Iteration 25")
contour = ax[1].contour(evolution[-1], [0.5], colors='b')
contour.collections[0].set_label("Last Iteration")
ax[1].legend(loc="upper right")
title = "Morphological GAC Curve evolution"
ax[1].set_title(title, fontsize=12)

plt.show()


With more balloon force you can get only the blade of knife as well.

ls = morphological_geodesic_active_contour(gimage, 100, init_ls, 
                                           smoothing=1, balloon=-1,
                                            threshold=0.7,
                                           iter_callback=callback)


Play with the smoothing, balloon, and threshold parameters to get your perfect curve.
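
Since the original goal was to crop the knife along its outline, note that the converged level set ls is already a binary mask; a short sketch of the final cut-out, continuing from the variables above:

import numpy as np

mask = ls.astype(bool)           # True inside the fitted outline
cutout = np.where(mask, im, 0)   # keep the knife, black out the rest

# To paste the knife onto another image of the same shape, the same
# mask can select the source pixels: destination[mask] = im[mask]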



Answered By - Abhi25t
Answer Checked By - Willingham (PHPFixing Volunteer)

Wednesday, July 13, 2022

[FIXED] How to prune a Detectron2 model?

 July 13, 2022     computer-vision, faster-rcnn, object-detection, pytorch, web-deployment

Issue

I'm a teacher who is studying computer vision for months. I was very excited when I was able to train my first object detection model using Detectron2's Faster R-CNN model. And it works like a charm! Super cool!

But the problem is that, in order to increase the accuracy, I used the largest model in the model zoo.

Now I want to deploy this as something people can use to ease their job. But the model is so large that it takes ~10 seconds to infer a single image on my CPU, an Intel i7-8750H.

Therefore, it's really difficult to deploy this model even on a regular cloud server. I need to use either GPU servers or latest model CPU servers which are really expensive and I'm not sure if I can even compensate for server expenses for months.

I need to make it smaller and faster for deployment.

So, yesterday I found that there's something like pruning the model!! I was very excited (since I'm not a computer or data scientists, don't blame me (((: )

I read the official PyTorch pruning documentation, but it's really difficult for me to understand.

I found that global pruning is one of the easiest to do.

But the problem is, I have no idea what parameters I should pass to prune.

Like I said, I used the Faster R-CNN X-101 model. I have it as "model_final.pth". It uses Base-RCNN-FPN.yaml, and its meta-architecture is "GeneralizedRCNN".

It seems like an easy configuration to do. But like I said, since it's not my field it's very hard for a person like me.

I'd be more than happy if you could help me on this step by step.

I'm leaving my cfg.yaml, which I used to train the model and saved using the "dump" method of Detectron2's config class, just in case. Here's the Drive link.

Thank you very much in advance.


Solution

So I guess you are trying to optimize inference time while achieving satisfactory accuracy. Without knowing details about your object types, training-set size, and image size, it is hard to provide suggestions. However, as you know, ML project development is an iterative process; you can have a look at the following page and check inference time and accuracy.

https://github.com/facebookresearch/detectron2/blob/master/MODEL_ZOO.md#coco-object-detection-baselines

I would suggest you try the R50-FPN backbone and see how your accuracy comes out. Then you will have a better understanding of what to do next.
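
If you still want to experiment with the global pruning mentioned in the question, below is a minimal sketch using torch.nn.utils.prune. It assumes model is your already-built Detectron2 model and prunes only the Conv2d weights. One caveat: unstructured pruning merely zeroes weights, so on its own it will not reduce CPU inference time or file size without sparse inference support - which is why trying a smaller backbone first is the more practical step.

import torch
import torch.nn.utils.prune as prune

# Collect every Conv2d weight in the model for global pruning.
parameters_to_prune = [
    (module, 'weight')
    for module in model.modules()
    if isinstance(module, torch.nn.Conv2d)
]

# Zero the 30% of conv weights with the smallest L1 magnitude,
# selected globally across all layers.
prune.global_unstructured(
    parameters_to_prune,
    pruning_method=prune.L1Unstructured,
    amount=0.3,
)

# Make the pruning permanent (removes the reparametrization hooks).
for module, name in parameters_to_prune:
    prune.remove(module, name)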



Answered By - CognitiveRobot
Answer Checked By - Willingham (PHPFixing Volunteer)

Tuesday, May 10, 2022

[FIXED] How to remove specific tag/sticker/object from images using OpenCV?

 May 10, 2022     computer-vision, image, image-processing, opencv, python

Issue

I have hundreds of images of jewelry products. Some of them have a "best-seller" tag on them. The position of the tag differs from image to image. I want to iterate over all images and, if an image has this tag, remove it. The resulting image should render the background over the removed object's pixels.

Example of an image with Tag/sticker/object:


Tag/sticker/object to remove:


import numpy as np
import cv2

img = cv2.imread('./images/001.jpg')
sticker = cv2.imread('./images/tag.png', 1)
diff_im = cv2.absdiff(img, sticker)

I want the resulting image to look like this:


Solution

Here's a method using a modified scale-invariant template matching approach. The overall strategy:

  • Load template, convert to grayscale, perform canny edge detection
  • Load original image, convert to grayscale
  • Continuously rescale image, apply template matching using edges, and keep track of the correlation coefficient (higher value means better match)
  • Find coordinates of best fit bounding box then erase unwanted ROI

To begin, we load the template and perform Canny edge detection. Applying template matching with edges instead of the raw image removes color-variation differences and gives a more robust result. Extracting edges from the template image:


Next we continuously scale down the image and apply template matching on our resized image. I maintain the aspect ratio with each resize using an old answer. Here's a visualization of the strategy:


The reason we resize the image is because standard template matching using cv2.matchTemplate will not be robust and may give false positives if the dimensions of the template and the image do not match. To overcome this dimension issue, we use this modified approach:

  • Continuously resize the input image at various smaller scales
  • Apply template matching using cv2.matchTemplate and keep track of the largest correlation coefficient
  • The ratio/scale with the largest correlation coefficient will have the best matched ROI

Once the ROI is obtained, we can "delete" the logo by filling in the rectangle with white using

cv2.rectangle(final, (start_x, start_y), (end_x, end_y), (255,255,255), -1)

Detected -> Removed

import cv2
import numpy as np

# Resizes a image and maintains aspect ratio
def maintain_aspect_ratio_resize(image, width=None, height=None, inter=cv2.INTER_AREA):
    # Grab the image size and initialize dimensions
    dim = None
    (h, w) = image.shape[:2]

    # Return original image if no need to resize
    if width is None and height is None:
        return image

    # We are resizing height if width is none
    if width is None:
        # Calculate the ratio of the height and construct the dimensions
        r = height / float(h)
        dim = (int(w * r), height)
    # We are resizing width if height is none
    else:
        # Calculate the ratio of the width and construct the dimensions
        r = width / float(w)
        dim = (width, int(h * r))

    # Return the resized image
    return cv2.resize(image, dim, interpolation=inter)

# Load template, convert to grayscale, perform canny edge detection
template = cv2.imread('template.png')
template = cv2.cvtColor(template, cv2.COLOR_BGR2GRAY)
template = cv2.Canny(template, 50, 200)
(tH, tW) = template.shape[:2]
cv2.imshow("template", template)

# Load original image, convert to grayscale
original_image = cv2.imread('1.png')
final = original_image.copy()
gray = cv2.cvtColor(original_image, cv2.COLOR_BGR2GRAY)
found = None

# Dynamically rescale image for better template matching
for scale in np.linspace(0.2, 1.0, 20)[::-1]:

    # Resize image to scale and keep track of ratio
    resized = maintain_aspect_ratio_resize(gray, width=int(gray.shape[1] * scale))
    r = gray.shape[1] / float(resized.shape[1])

    # Stop if template image size is larger than resized image
    if resized.shape[0] < tH or resized.shape[1] < tW:
        break

    # Detect edges in resized image and apply template matching
    canny = cv2.Canny(resized, 50, 200)
    detected = cv2.matchTemplate(canny, template, cv2.TM_CCOEFF)
    (_, max_val, _, max_loc) = cv2.minMaxLoc(detected)

    # Uncomment this section for visualization
    '''
    clone = np.dstack([canny, canny, canny])
    cv2.rectangle(clone, (max_loc[0], max_loc[1]), (max_loc[0] + tW, max_loc[1] + tH), (0,255,0), 2)
    cv2.imshow('visualize', clone)
    cv2.waitKey(0)
    '''

    # Keep track of correlation value
    # Higher correlation means better match
    if found is None or max_val > found[0]:
        found = (max_val, max_loc, r)

# Compute coordinates of bounding box
(_, max_loc, r) = found
(start_x, start_y) = (int(max_loc[0] * r), int(max_loc[1] * r))
(end_x, end_y) = (int((max_loc[0] + tW) * r), int((max_loc[1] + tH) * r))

# Draw bounding box on ROI to remove
cv2.rectangle(original_image, (start_x, start_y), (end_x, end_y), (0,255,0), 2)
cv2.imshow('detected', original_image)

# Erase unwanted ROI (Fill ROI with white)
cv2.rectangle(final, (start_x, start_y), (end_x, end_y), (255,255,255), -1)
cv2.imshow('final', final)
cv2.waitKey(0)


Answered By - nathancy
Answer Checked By - Terry (PHPFixing Volunteer)

[FIXED] How to detect checkboxes by removing noise using Python OpenCV?

 May 10, 2022     computer-vision, image, image-processing, opencv, python

Issue

I am trying to identify the checkboxes in the image

The top 4 are identified but the bottom 2 are not. At the same time, I would like to get rid of the peppering to avoid false positives, since there are other docs with checkmarks that are much smaller. I've tried various dilation and kernel sizes but haven't been able to successfully get the box.

I've tried dilating it and then eroding it:

kernel = np.ones((2, 2), np.uint8)
image_dilat = cv2.dilate(image, kernel, iterations=1)
kernel = np.ones((4, 4), np.uint8)
image_erosion = cv2.erode(image_dilat, kernel, iterations=1)

I've tried morphological opening and closing as well:

kernel = np.ones((3, 3), np.uint8)
image = cv2.morphologyEx(image, cv2.MORPH_OPEN, kernel, iterations=1)
kernel = np.ones((3, 3), np.uint8)
image = cv2.morphologyEx(image, cv2.MORPH_CLOSE, kernel, iterations=1)

Any suggestion will be appreciated.


Solution

Here's a potential approach using simple image processing:

  1. Obtain binary image. Load the image, convert to grayscale, and Otsu's threshold.

  2. Remove small pixels of noise. Find contours and filter out noise using contour area filtering. We effectively remove the noise by "drawing in" the contours with black.

  3. Repair checkbox walls. From here we create a horizontal and vertical repair kernel then perform morphological close to fix any holes in the checkbox walls.

  4. Detect checkboxes. Next find contours on the repaired image then filter for checkbox contours using shape approximation and aspect ratio filtering. The idea is that a checkbox is a square and should have roughly the same width and height.


Binary image with noise -> Removed tiny noise

Repaired checkbox walls -> Detected checkboxes

Code

import cv2

# Load image, convert to grayscale, Otsu's threshold
image = cv2.imread('1.png')
original = image.copy()
gray = cv2.cvtColor(image, cv2.COLOR_BGR2GRAY)
thresh = cv2.threshold(gray, 0, 255, cv2.THRESH_BINARY_INV + cv2.THRESH_OTSU)[1]
cv2.imshow('thresh before', thresh)

# Find contours and filter using contour area filtering to remove noise
cnts, _ = cv2.findContours(thresh, cv2.RETR_TREE, cv2.CHAIN_APPROX_SIMPLE)[-2:]
AREA_THRESHOLD = 10
for c in cnts:
    area = cv2.contourArea(c)
    if area < AREA_THRESHOLD:
        cv2.drawContours(thresh, [c], -1, 0, -1)

# Repair checkbox horizontal and vertical walls
repair_kernel1 = cv2.getStructuringElement(cv2.MORPH_RECT, (5,1))
repair = cv2.morphologyEx(thresh, cv2.MORPH_CLOSE, repair_kernel1, iterations=1)
repair_kernel2 = cv2.getStructuringElement(cv2.MORPH_RECT, (1,5))
repair = cv2.morphologyEx(repair, cv2.MORPH_CLOSE, repair_kernel2, iterations=1)

# Detect checkboxes using shape approximation and aspect ratio filtering
cnts, _ = cv2.findContours(repair, cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_SIMPLE)[-2:]
for c in cnts:
    peri = cv2.arcLength(c, True)
    approx = cv2.approxPolyDP(c, 0.05 * peri, True)
    x,y,w,h = cv2.boundingRect(approx)
    aspect_ratio = w / float(h)
    if aspect_ratio > 0.9 and aspect_ratio < 1.1:
        cv2.rectangle(original, (x, y), (x + w, y + h), (36,255,12), 3)

cv2.imshow('thresh', thresh)
cv2.imshow('repair', repair)
cv2.imshow('original', original)
cv2.waitKey()

Note: The assumption is that the checkboxes are square-shaped and that there is no noise overlapping the checkboxes. Depending on the image, you may want to add another layer of contour area filtering to ensure that you don't get false positives.



Answered By - nathancy
Answer Checked By - Gilberto Lyons (PHPFixing Admin)

[FIXED] How to extract multiple objects from an image using Python OpenCV?

 May 10, 2022     computer-vision, image, image-processing, opencv, python

Issue

I am trying to extract objects from an image by color using OpenCV. I have tried inverse thresholding and grayscale combined with cv2.findContours(), but I am unable to use it recursively. Furthermore, I can't figure out how to "cut out" the match from the original image and save it to a single file.


EDIT

import cv2
import numpy as np

# load the images
empty = cv2.imread("empty.jpg")
full = cv2.imread("test.jpg")

# save color copy for visualization
full_c = full.copy()

# convert to grayscale
empty_g = cv2.cvtColor(empty, cv2.COLOR_BGR2GRAY)
full_g = cv2.cvtColor(full, cv2.COLOR_BGR2GRAY)

empty_g = cv2.GaussianBlur(empty_g, (51, 51), 0)
full_g = cv2.GaussianBlur(full_g, (51, 51), 0)
diff = full_g - empty_g

#  thresholding

diff_th = cv2.adaptiveThreshold(full_g, 255, cv2.ADAPTIVE_THRESH_GAUSSIAN_C,
                                cv2.THRESH_BINARY, 11, 2)

# combine the difference image and the inverse threshold
zone = cv2.bitwise_and(diff, diff_th, None)

# threshold to get the mask instead of gray pixels
_, zone = cv2.threshold(zone, 100, 255, 0)

# dilate to account for the blurring in the beginning
kernel = np.ones((15, 15), np.uint8)
zone = cv2.dilate(zone, kernel, iterations=1)

# find contours, sort and draw the biggest one
contours, _ = cv2.findContours(zone, cv2.RETR_TREE,
                               cv2.CHAIN_APPROX_SIMPLE)
contours = sorted(contours, key=cv2.contourArea, reverse=True)[:3]
i = 0
while i < len(contours):
    x, y, width, height = cv2.boundingRect(contours[i])
    roi = full_c[y:y+height, x:x+width]
    cv2.imwrite("piece"+str(i)+".png", roi)
    i += 1

Here, empty is just a white image of size 1500 x 1000 like the one above, and test is the image above.

This is what I came up with; the only downside is that I now get a third image, showing a shadow zone, instead of only the two expected...


Solution

Here's a simple approach:

  1. Obtain binary image. Load the image, grayscale, Gaussian blur, Otsu's threshold, then dilate to obtain a binary black/white image.

  2. Extract ROI. Find contours, obtain bounding boxes, extract ROI using Numpy slicing, and save each ROI


Binary image (Otsu's thresholding + dilation)


Detected ROIs highlighted in green


To extract each ROI, you can find the bounding box coordinates using cv2.boundingRect(), crop the desired region, then save the image

x,y,w,h = cv2.boundingRect(c)
ROI = original[y:y+h, x:x+w]

First object

Second object

import cv2

# Load image, grayscale, Gaussian blur, Otsu's threshold, dilate
image = cv2.imread('1.jpg')
original = image.copy()
gray = cv2.cvtColor(image,cv2.COLOR_BGR2GRAY)
blur = cv2.GaussianBlur(gray, (5,5), 0)
thresh = cv2.threshold(blur, 0, 255, cv2.THRESH_BINARY_INV + cv2.THRESH_OTSU)[1]
kernel = cv2.getStructuringElement(cv2.MORPH_RECT, (7,7))
dilate = cv2.dilate(thresh, kernel, iterations=1)

# Find contours, obtain bounding box coordinates, and extract ROI
cnts = cv2.findContours(dilate, cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_SIMPLE)
cnts = cnts[0] if len(cnts) == 2 else cnts[1]
image_number = 0
for c in cnts:
    x,y,w,h = cv2.boundingRect(c)
    cv2.rectangle(image, (x, y), (x + w, y + h), (36,255,12), 2)
    ROI = original[y:y+h, x:x+w]
    cv2.imwrite("ROI_{}.png".format(image_number), ROI)
    image_number += 1

cv2.imshow('image', image)
cv2.imshow('thresh', thresh)
cv2.imshow('dilate', dilate)
cv2.waitKey()


Answered By - nathancy
Answer Checked By - Cary Denson (PHPFixing Admin)

[FIXED] How to detect and find checkboxes in a form using Python OpenCV?

 May 10, 2022     computer-vision, image, image-processing, opencv, python

Issue

I have several images for which I need to do OMR by detecting checkboxes using computer vision.

I'm using findContours to draw contours only on the checkboxes in a scanned document, but the algorithm also extracts each and every contour of the text.

from imutils.perspective import four_point_transform
from imutils import contours
import numpy as np
import argparse, imutils, cv2, matplotlib
import matplotlib.pyplot as plt
import matplotlib.image as mpimg

image = cv2.imread("1.jpg")
gray = cv2.cvtColor(image, cv2.COLOR_BGR2GRAY)
blurred = cv2.GaussianBlur(gray, (5, 5), 0)
edged = cv2.Canny(blurred, 75, 200)

im_test = [blurred, cv2.GaussianBlur(gray, (7, 7), 0), cv2.GaussianBlur(gray, (5, 5), 5), cv2.GaussianBlur(gray, (11, 11), 0)]
im_thresh = [ cv2.threshold(i, 127, 255, 0)  for i in im_test ]
im_thresh_0 = [i[1] for i in im_thresh ]
im_cnt = [cv2.findContours(thresh, cv2.RETR_TREE, cv2.CHAIN_APPROX_SIMPLE)[0] for thresh in im_thresh_0]

im_drawn = [cv2.drawContours(image.copy(), contours, -1, (0,255,0), 1) for contours in im_cnt]

plt.imshow(im_drawn[0])
plt.show()

Input Image:



Solution

  1. Obtain binary image. Load the image, grayscale, Gaussian blur, and Otsu's threshold to obtain a binary black/white image.

  2. Remove small noise particles. Find contours and filter using contour area filtering to remove noise.

  3. Repair checkbox horizontal and vertical walls. This step is optional but in the case where the checkboxes may be damaged, we repair the walls for easier detection. The idea is to create a rectangular kernel then perform morphological operations.

  4. Detect checkboxes. From here we find contours, obtain the bounding rectangle coordinates, and filter using shape approximation + aspect ratio. The idea is that a checkbox is essentially a square so its contour dimensions should be within a range.


Input image -> Binary image

Detected checkboxes highlighted in green


Checkboxes: 52

Another input image -> Binary image

Detected checkboxes highlighted in green


Checkboxes: 2

Code

import cv2

# Load image, convert to grayscale, Gaussian blur, Otsu's threshold
image = cv2.imread('1.jpg')
original = image.copy()
gray = cv2.cvtColor(image, cv2.COLOR_BGR2GRAY)
blur = cv2.GaussianBlur(gray, (3,3), 0)
thresh = cv2.threshold(blur, 0, 255, cv2.THRESH_BINARY_INV + cv2.THRESH_OTSU)[1]

# Find contours and filter using contour area filtering to remove noise
cnts, _ = cv2.findContours(thresh, cv2.RETR_TREE, cv2.CHAIN_APPROX_SIMPLE)[-2:]
AREA_THRESHOLD = 10
for c in cnts:
    area = cv2.contourArea(c)
    if area < AREA_THRESHOLD:
        cv2.drawContours(thresh, [c], -1, 0, -1)

# Repair checkbox horizontal and vertical walls
repair_kernel1 = cv2.getStructuringElement(cv2.MORPH_RECT, (5,1))
repair = cv2.morphologyEx(thresh, cv2.MORPH_CLOSE, repair_kernel1, iterations=1)
repair_kernel2 = cv2.getStructuringElement(cv2.MORPH_RECT, (1,5))
repair = cv2.morphologyEx(repair, cv2.MORPH_CLOSE, repair_kernel2, iterations=1)

# Detect checkboxes using shape approximation and aspect ratio filtering
checkbox_contours = []
cnts, _ = cv2.findContours(repair, cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_SIMPLE)[-2:]
for c in cnts:
    peri = cv2.arcLength(c, True)
    approx = cv2.approxPolyDP(c, 0.035 * peri, True)
    x,y,w,h = cv2.boundingRect(approx)
    aspect_ratio = w / float(h)
    if len(approx) == 4 and (aspect_ratio >= 0.8 and aspect_ratio <= 1.2):
        cv2.rectangle(original, (x, y), (x + w, y + h), (36,255,12), 3)
        checkbox_contours.append(c)

print('Checkboxes:', len(checkbox_contours))
cv2.imshow('thresh', thresh)
cv2.imshow('repair', repair)
cv2.imshow('original', original)
cv2.waitKey()


Answered By - nathancy
Answer Checked By - Mary Flores (PHPFixing Volunteer)

[FIXED] How to detect corners of a square with Python OpenCV?

 May 10, 2022     computer-vision, image, image-processing, opencv, python

Issue

In the image below, I am using the OpenCV Harris corner detector to detect only the corners of the squares (and the smaller squares within the outer squares). However, corners are also being detected for the numbers on the side of the image. How do I get this to focus only on the squares and not the numbers? I need a method to ignore the numbers when performing corner detection. The code, input image, and output image are below:

import cv2 as cv
import numpy as np
img = cv.imread(filename)
gray = cv.cvtColor(img,cv.COLOR_BGR2GRAY)
gray = np.float32(gray)
dst = cv.cornerHarris(gray, 2, 3, 0.04)
dst = cv.dilate(dst,None)
# Threshold for an optimal value, it may vary depending on the image.
img[dst>0.01*dst.max()]=[0,0,255]
cv.imshow('dst', img)

Input image


Output from Harris corner detector



Solution

Here's a potential approach using traditional image processing:

  1. Obtain binary image. We load the image, convert to grayscale, Gaussian blur, then adaptive threshold to obtain a black/white binary image. We then remove small noise using contour area filtering. At this stage we also create two blank masks.

  2. Detect horizontal and vertical lines. Now we isolate horizontal lines by creating a horizontal shaped kernel and perform morphological operations. To detect vertical lines, we do the same but with a vertical shaped kernel. We draw the detected lines onto separate masks.

  3. Find intersection points. The idea is that if we combine the horizontal and vertical masks, the intersection points will be the corners. We can perform a bitwise-and operation on the two masks. Finally we find the centroid of each intersection point and highlight corners by drawing a circle.


Here's a visualization of the pipeline

Input image -> binary image

Detected horizontal lines -> horizontal mask

Detected vertical lines -> vertical mask

Bitwise-and both masks -> detected intersection points -> corners -> cleaned up corners

The results aren't perfect, but they're pretty close. The problem comes from the noise on the vertical mask due to the slanted image. If the image were centered without an angle, the results would be ideal. You can probably fine-tune the kernel sizes or iterations to get better results.

Code

import cv2
import numpy as np

# Load image, create horizontal/vertical masks, Gaussian blur, Adaptive threshold
image = cv2.imread('1.png')
original = image.copy()
horizontal_mask = np.zeros(image.shape, dtype=np.uint8)
vertical_mask = np.zeros(image.shape, dtype=np.uint8)
gray = cv2.cvtColor(image,cv2.COLOR_BGR2GRAY)
blur = cv2.GaussianBlur(gray, (3,3), 0)
thresh = cv2.adaptiveThreshold(blur, 255, cv2.ADAPTIVE_THRESH_GAUSSIAN_C, cv2.THRESH_BINARY_INV, 23, 7)

# Remove small noise on thresholded image
cnts = cv2.findContours(thresh, cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_SIMPLE)
cnts = cnts[0] if len(cnts) == 2 else cnts[1]
for c in cnts:
    area = cv2.contourArea(c)
    if area < 150:
        cv2.drawContours(thresh, [c], -1, 0, -1)

# Detect horizontal lines
dilate_horizontal_kernel = cv2.getStructuringElement(cv2.MORPH_RECT, (10,1))
dilate_horizontal = cv2.morphologyEx(thresh, cv2.MORPH_CLOSE, dilate_horizontal_kernel, iterations=1)
horizontal_kernel = cv2.getStructuringElement(cv2.MORPH_RECT, (40,1))
detected_lines = cv2.morphologyEx(dilate_horizontal, cv2.MORPH_OPEN, horizontal_kernel, iterations=1)
cnts = cv2.findContours(detected_lines, cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_SIMPLE)
cnts = cnts[0] if len(cnts) == 2 else cnts[1]
for c in cnts:
    cv2.drawContours(image, [c], -1, (36,255,12), 2)
    cv2.drawContours(horizontal_mask, [c], -1, (255,255,255), 2)

# Remove extra horizontal lines using contour area filtering
horizontal_mask = cv2.cvtColor(horizontal_mask,cv2.COLOR_BGR2GRAY)
cnts = cv2.findContours(horizontal_mask, cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_SIMPLE)
cnts = cnts[0] if len(cnts) == 2 else cnts[1]
for c in cnts:
    area = cv2.contourArea(c)
    if area > 1000 or area < 100:
        cv2.drawContours(horizontal_mask, [c], -1, 0, -1)

# Detect vertical 
dilate_vertical_kernel = cv2.getStructuringElement(cv2.MORPH_CROSS, (1,7))
dilate_vertical = cv2.morphologyEx(thresh, cv2.MORPH_CLOSE, dilate_vertical_kernel, iterations=1)
vertical_kernel = cv2.getStructuringElement(cv2.MORPH_ELLIPSE, (1,2))
detected_lines = cv2.morphologyEx(dilate_vertical, cv2.MORPH_OPEN, vertical_kernel, iterations=4)
cnts = cv2.findContours(detected_lines, cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_SIMPLE)
cnts = cnts[0] if len(cnts) == 2 else cnts[1]
for c in cnts:
    cv2.drawContours(image, [c], -1, (36,255,12), 2)
    cv2.drawContours(vertical_mask, [c], -1, (255,255,255), 2)

# Find intersection points
vertical_mask = cv2.cvtColor(vertical_mask,cv2.COLOR_BGR2GRAY)
combined = cv2.bitwise_and(horizontal_mask, vertical_mask)
kernel = cv2.getStructuringElement(cv2.MORPH_ELLIPSE, (2,2))
combined = cv2.morphologyEx(combined, cv2.MORPH_OPEN, kernel, iterations=1)

# Highlight corners
cnts = cv2.findContours(combined, cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_SIMPLE)
cnts = cnts[0] if len(cnts) == 2 else cnts[1]
for c in cnts:
    # Find centroid and draw center point
    try:
        M = cv2.moments(c)
        cx = int(M['m10']/M['m00'])
        cy = int(M['m01']/M['m00'])
        cv2.circle(original, (cx, cy), 3, (36,255,12), -1)
    except ZeroDivisionError:
        pass

cv2.imshow('thresh', thresh)
cv2.imshow('horizontal_mask', horizontal_mask)
cv2.imshow('vertical_mask', vertical_mask)
cv2.imshow('combined', combined)
cv2.imshow('original', original)
cv2.imshow('image', image)
cv2.waitKey()


Answered By - nathancy
Answer Checked By - Clifford M. (PHPFixing Volunteer)

[FIXED] How to approximate jagged edges as lines using Python OpenCV?

 May 10, 2022     computer-vision, image, image-processing, opencv, python

Issue

I am trying to find accurate locations for the corners on ink blotches as seen below:

My idea is to fit lines to the edges and then find where they intersect. So far, I've tried using cv2.approxPolyDP() with various values of epsilon to approximate the edges, but this doesn't look like the way to go. My cv2.approxPolyDP code gives the following result:

Ideally, this is what I want to produce (drawn in Paint):

Are there CV functions in place for this sort of problem? I've considered using Gaussian blurring before the threshold step although that method does not seem like it would be very accurate for corner finding. Additionally, I would like this to be robust to rotated images, so filtering for vertical and horizontal lines won't necessarily work without other considerations.

Code:

import numpy as np
from PIL import ImageGrab
import cv2


def process_image4(original_image):  # Douglas-peucker approximation
    # Convert to black and white threshold map
    gray = cv2.cvtColor(original_image, cv2.COLOR_BGR2GRAY)
    gray = cv2.GaussianBlur(gray, (5, 5), 0)
    (thresh, bw) = cv2.threshold(gray, 128, 255, cv2.THRESH_BINARY + cv2.THRESH_OTSU)

    # Convert bw image back to colored so that red, green and blue contour lines are visible, draw contours
    modified_image = cv2.cvtColor(bw, cv2.COLOR_GRAY2BGR)
    contours, hierarchy = cv2.findContours(bw, cv2.RETR_TREE, cv2.CHAIN_APPROX_SIMPLE)
    cv2.drawContours(modified_image, contours, -1, (255, 0, 0), 3)

    # Contour approximation
    try:  # Just to be sure it doesn't crash while testing!
        for cnt in contours:
            epsilon = 0.005 * cv2.arcLength(cnt, True)
            approx = cv2.approxPolyDP(cnt, epsilon, True)
            # cv2.drawContours(modified_image, [approx], -1, (0, 0, 255), 3)
    except:
        pass
    return modified_image


def screen_record():
    while(True):
        screen = np.array(ImageGrab.grab(bbox=(100, 240, 750, 600)))
        image = process_image4(screen)
        cv2.imshow('window', image)
        if cv2.waitKey(25) & 0xFF == ord('q'):
            cv2.destroyAllWindows()
            break

screen_record()
  • A note about my code: I'm using screen capture so that I can process these images live. I have a digital microscope that can display live feed on a screen, so the constant screen recording will allow me to sample from the video feed and locate the corners live on the other half of my screen.

Solution

Here's a potential solution using thresholding + morphological operations:

  1. Obtain binary image. We load the image, blur with cv2.bilateralFilter(), grayscale, then Otsu's threshold

  2. Morphological operations. We perform a series of morphological open and close to smooth the image and remove noise

  3. Find distorted approximated mask. We find the bounding rectangle coordinates of the object with cv2.arcLength() and cv2.approxPolyDP() then draw this onto a mask

  4. Find corners. We use the Shi-Tomasi Corner Detector already implemented as cv2.goodFeaturesToTrack() for corner detection. Take a look at this for an explanation of each parameter


Here's a visualization of each step:

Binary image -> Morphological operations -> Approximated mask -> Detected corners

Here are the corner coordinates:

(103, 550)
(1241, 536)

Here's the result for the other images

(558, 949)
(558, 347)

Finally for the rotated image

(201, 99)
(619, 168)

Code

import cv2
import numpy as np

# Load image, bilaterial blur, and Otsu's threshold
image = cv2.imread('1.png')
mask = np.zeros(image.shape, dtype=np.uint8)
gray = cv2.cvtColor(image, cv2.COLOR_BGR2GRAY)
blur = cv2.bilateralFilter(gray,9,75,75)
thresh = cv2.threshold(blur, 0, 255, cv2.THRESH_BINARY_INV + cv2.THRESH_OTSU)[1]

# Perform morpholgical operations
kernel = cv2.getStructuringElement(cv2.MORPH_RECT, (10,10))
opening = cv2.morphologyEx(thresh, cv2.MORPH_OPEN, kernel, iterations=1)
close = cv2.morphologyEx(opening, cv2.MORPH_CLOSE, kernel, iterations=1)

# Find distorted rectangle contour and draw onto a mask
cnts = cv2.findContours(close, cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_SIMPLE)
cnts = cnts[0] if len(cnts) == 2 else cnts[1]
rect = cv2.minAreaRect(cnts[0])
box = cv2.boxPoints(rect)
box = np.int0(box)
cv2.drawContours(image,[box],0,(36,255,12),4)
cv2.fillPoly(mask, [box], (255,255,255))

# Find corners
mask = cv2.cvtColor(mask, cv2.COLOR_BGR2GRAY)
corners = cv2.goodFeaturesToTrack(mask,4,.8,100)
offset = 25
for corner in corners:
    x, y = corner.ravel()
    x, y = int(x), int(y)
    cv2.circle(image, (x, y), 5, (36,255,12), -1)
    cv2.rectangle(image, (x - offset, y - offset), (x + offset, y + offset), (36,255,12), 3)
    print("({}, {})".format(x, y))
    
cv2.imshow('image', image)
cv2.imshow('thresh', thresh)
cv2.imshow('close', close)
cv2.imshow('mask', mask)
cv2.waitKey()

Note: The idea for the distorted bounding box came from a previous answer in How to find accurate corner positions of a distorted rectangle from blurry image



Answered By - nathancy
Answer Checked By - Terry (PHPFixing Volunteer)