Detecting image corner features in images with ASN in python

Source code : https://github.com/intellar/ASN


This post gives a concise example of image processing to find corner like features in images with python. Corner are localized  with good precision using quadratic interpolation. We focus on the detection of special corner like those used in camera calibration, that are formed by a single square or two square joining at their corner. The next images are two examples of these "corners":


To find the position of the corners, we use an image processing method published in 2008 ( thats  long long ago !) by jnouellet,  the ASN operator. 

ASN: Image Keypoint Detection from Adaptive Shape Neighborhood, J-N. Ouellet, P. Hebert, in European Conference on Computer Vision (ECCV), 2008, published by Springer Berlin / Heidelberg. PDF

This detector is particularly well suited to detect and localize these kind of features. Throughout this blog, I will use figures taken from the author Ph.D. thesis, with his permission. ASN stands for adaptive shape neighborhood, because the method will find image features and also find a region of the image where all pixels define/support each feature.  The next images illustrate the ASN workflow, on the left we see the location estimations, in the center we see the formation of poles of attraction when the estimation is done near a corner, and on the right we see the region of attraction in white where all estimations found the same pole of attraction. 

The idea behind the ASN operator is to view each pixel of the image as defining a line oriented normal to the image gradient, and passing though the pixel center. The gradient is the derivative of the image in the horiontal and vertical direction (dx,dy) and its value indicates how fast the intensity is changing in each direction. It can be obtained with Sobel derivative operator, or any other derivative filtering method. A line passing through pixel p=[x0, y0,1] with gradient [dx0, dy0]  has coeficient  L = [dx0, dy0, -(x0dx0+y0dy0) ] such that Lp=0. 

With that in mind, given a region of the image encompassing a corner, we can find the position of the corner which should be at the intersection of all the lines. If the n rows matrix A contain the n lines normal vectors of the region, 

A = [ [dx0, dy0],
[dx1, dy1],
...,
[dxn, dyn] ],   

and column vector b has the third line coeficient,  

b=[[(x0dx0+y0dy0)],

[(x1dx1+y1dy1)],

…,
      [(xndxn+yndyn)] ],  

we can see it forms a equation system Ax=b,   the solution being x=(AtA)-1Atb ,  t is the transpose. A clever thing here is that the weight of each line in the linear solution is proportional to the gradient magnitude : a pixel closer to an edge will count more in the estimation than a pixel in a flat uniform region.

The last step is to solve the equation system for all pixel of the image. By looking again at the solution equation, we can derive an interesting trick that will speed up tremendously the generation of all solution.  We explicit the components of (AtA)-1 and  Atb,  to obtain 

(AtA)-1= [ [sum(dxi2), sum(dxidyi)],
[sum(dxidyi), sum(dyi2)]].  

if we rewrite sum(dxi2) as Dx2 , and so on, (AtA)-1  is 

(AtA)-1  =(    [Dx2,DxDy],
  [DxDy,Dy2])-1  

from linear algebra, we can find the inverse of a 2x2 matrix as a simple expression, is this case 

(AtA)-1 = [ [Dy2, -DxDy],
[-DxDy,Dx2] ]  /  det(A).  
where det(A) is the determinant of A. Again, the determinant of a 2x2 matrix is a simple expression from its coeficient, that is  det(A)= Dx2Dy2-(DxDy)**2.   The same thing can be done for the Atb product, giving
Atb= [ [ XDx2+YDxDy],
XDxDy+YDy2]]
where the matrices X and Y contain the x, and y coordinates of each position in the image.

Now why is this interesting. 

Check the term sum(dxi2). It is the sum of all squared x derivative in the image region. Suppose the region is a mask, a circular disk of ones. The sum can be computed very efficiently at each pixel by filtering/convoluting an image composed of dx^2 with this mask. The same goes for sum(dyi2) and  sum(dxidyi)  The following code will find the solution at each pixels, creating a solution map, where each position of a matrix as big as the image will contain the solution of the linear regression, localSolutionX and localSolutionY : 

The next step is to go through the solution map, and generate a sort of heat map of all the solutions, a vote for probable location of the corners. For this, we initialize a zeros matrix big as the image and loop all positions of the solutionx and solutiony map to increment its value. Since a solution will be a subpixel position (solution is not at the pixel center, but somewhere inside), we also increment the pixel neighboors using a bilinear coeficient computed from the distance to each pixel. See the next code extract for the details. 

The accumulator has local maximums where the corners are probably located. Still, this is not precise since the pixels are a sampling of a surface having a much higher resolution. The figure on the left below illustrate the continuous surface resulting from the image formation process, generating the discrete pixel value on the cmos of the camera. In this implementation, I interpolate the corner maximum using a quadratic function fitted on the 9 values at the maximum.  This is a good enough approximation and results in the green cross in the image on the right below. The article ouellet2008 explains a method to extract the actual corner position with a much better precision using all pixels that contributed to the maximum formation.  This is shown next.

To extract the convergence region, you need to find the position that voted for the max. This can be accumulated at the same time we create the accumulator. Instead of accumulating votes, we accumulate positions. Afterward, when the max of the accumulator are identified, it is possible to recompute a solution using all the Dx2,Dy2,DxDx,... pointed by the accumulator. The code on github is self explanatory. You can show the region and the solution overlayed on the original image.  The code does not show the full extent of the region, it shows the region of the filtered image Dx2,... , which are created with a convolution on the raw gradient. To show the full extent, the region should be increased with a morphological disk filter having the same size as the convolution mask.

The result of the ASN detector on a test image :