Harris corner detector

Introduction

So called “corners” are distinctive points in an image. Since they are distinctive, corners can be used for object detection and recognition, image matching and video tracking among other applications.

What makes a point a good corner? It might be easier to describe a bad corner. Think about an image patch which consists of a single color only. It would be hard to identify the corresponding patch in a similar image. Conversely, a good corner is characterized by the presence of irregularities or variations in the form of edges or confluence of edges.

How to find corners

Consider a point $(x,y)$ in an image. We want to determine whether this point, or rather an image patch around this point, makes for a good corner.

Harris' intuition is that a bad corner (say, a uniform image patch) is indistinguishable from neighboring image patches. In other words, the quantity $I(x,y)-I(x+\Delta x,y+\Delta y)$ should be small for small displacements $\Delta x$, $\Delta y$. For robustness, this should hold for all points $(u,v)$ in a window $W(x,y)$ around $(x,y)$. We express this by saying that the quantity $E_{x,y}(\Delta x, \Delta y)$ defined as follows:

$$ E_{x,y}(\Delta x, \Delta y) = \sum_{(u,v) \in W(x,y)} (I(u+\Delta x,v+\Delta y) - I(u,v))^2 $$

should be small in bad corners. We square the difference because we don't care about the sign (and quadratic functions are nicer that absolute values). We sum over all contributions of the window to account for the fact that the quantity should be small in all the neighborhood.

Conversely, $E$ should better have big values in a good corner!

For calculus experts, it is very tempting to express $I(u+\Delta x,v+\Delta y)$ as a Taylor expansion, so let's do that:

$$ \begin{aligned} I(u+\Delta x,v+\Delta y) &\approx I(u,v) + \nabla I(u,v) \cdot \begin{pmatrix} \Delta x \\ \Delta y \end{pmatrix} \\ &\approx I(u,v) + \begin{pmatrix} I_x(u,v) \\ I_y(u,v) \end{pmatrix}^{T} \begin{pmatrix} \Delta x \\ \Delta y \end{pmatrix} \end{aligned} $$

It follows that:

$$ \begin{aligned} E_{x,y}(\Delta x, \Delta y) &\approx \sum_{(u,v) \in W(x,y)} \left( \begin{pmatrix} I_x \\ I_y \end{pmatrix}^{T} \begin{pmatrix} \Delta x \\ \Delta y \end{pmatrix} \right)^2 \\ &\approx \sum_{(u,v) \in W(x,y)} \begin{pmatrix} \Delta x \\ \Delta y \end{pmatrix}^{T} \begin{pmatrix} I_x \\ I_y \end{pmatrix} \begin{pmatrix} I_x \\ I_y \end{pmatrix}^{T} \begin{pmatrix} \Delta x \\ \Delta y \end{pmatrix} \\ &\approx \sum_{(u,v) \in W(x,y)} \begin{pmatrix} \Delta x \\ \Delta y \end{pmatrix}^{T} \begin{pmatrix} I_x^{2} & I_x I_y \\ I_x I_y & I_y^{2} \end{pmatrix} \begin{pmatrix} \Delta x \\ \Delta y \end{pmatrix} \\ &\approx \begin{pmatrix} \Delta x \\ \Delta y \end{pmatrix}^{T} M \begin{pmatrix} \Delta x \\ \Delta y \end{pmatrix} \end{aligned} $$

Where the matrix $M$ is:

$$ \begin{aligned} M = \begin{pmatrix} \sum_{W} I_x^{2} & \sum_{W} I_x I_y \\ \sum_{W} I_x I_y & \sum_{W} I_y^{2} \end{pmatrix} \end{aligned} $$

What have we done? We have managed to express $E$, our measure of dissimilarity, as a quadratic form in the displacement $(\Delta x, \Delta y)$.

Of course, the value of $E$ will be zero when the displacement is zero (the patch is identical to itself). What we care about is whether the dissimilarity increases for small displacements, and in which directions.

Recall from linear algebra that $M$ can be diagonalized since it is symmetric. This allows expressing our approximation of $E$ in the form $E = \lambda_1 \Delta\tilde{x}^2 + \lambda_2 \Delta\tilde{y}^2$, where $\lambda_1$ and $\lambda_2$ are the eigenvalues of $M$ and $\Delta\tilde{x}$ and $\Delta\tilde{y}$ are the displacement values in the new base defined by the eigenvectors. Furthermore, it can be verified that it $M$ is semi-definite positive (it better be! since it is a non-negative quantity we are dealing with), so the eigenvalues are non-negative.

Thus, we can state that:

  1. If $\lambda_1$ and $\lambda_2$ are small, then $E$ is small, thus the image patch is similar to its neighbors and $(x,y)$ doesn't make for a good corner.
  2. If one of the eigenvalues is small and the other one is big, then $E$ increases in one direction and remains flat in the other. This indicates the presence of an edge.
  3. It both eigenvalues are big, then $E$ increases in any direction. This indicates a corner.

Practical matters

Since computing eigenvalues can be expensive, it is preferred to calculate the score $R = det(M)-k\cdot trace(M)^2$, where $k$ is an empirically determined constant, in order to compute the “cornerness” of a point. $R$ preserves information about the eigenvalues since $det(M) = \lambda_1\lambda_2$ and $trace(M) = \lambda_1 + \lambda_2$ but it is also easier to calculate.

MATLAB implementations can be found here and here.