This is a page about »Histogram of Gradients«
Principles
Histogram of Oriented Gradients (HOG) is an image feature descriptor that counts occurrences of gradient orientations in localized portions of an image. It is computed on a relatively dense grid of cells and uses overlapping local contrast normalization for improved accuracy.
The algorithm consists of four steps: gradient computation, orientation binning, normalization over descriptor blocks, and object recognition. This note focuses on the first three.
First, compute the gradient with the 1D centered discrete derivative masks \([-1, 0, 1]\) and \([-1, 0, 1]^{T}\), a 3x3 Sobel mask, or a diagonal mask. The latter two generally perform worse for human detection.
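import cv2
import numpy as np

def calculate_coarse_gradient(img: np.ndarray, axis: int) -> np.ndarray:
    # Centered difference [-1, 0, 1] along `axis`: img[i + 1] - img[i - 1].
    # Cast to float first so uint8 images do not wrap around on subtraction.
    # np.roll wraps at the image border, so the first and last row/column
    # mix values from opposite edges.
    img = img.astype(np.float32)
    return np.roll(img, -1, axis=axis) - np.roll(img, 1, axis=axis)

def calculate_sobel_gradient(img, axis):
    # axis 0 runs along rows (y), axis 1 along columns (x).
    if axis == 0:
        dx, dy = 0, 1
    else:
        dx, dy = 1, 0
    # ksize=1 applies the plain 3x1 / 1x3 [-1, 0, 1] kernel without smoothing;
    # ksize=3 gives the 3x3 Sobel mask.
    return cv2.Sobel(img, cv2.CV_32F, dx, dy, ksize=1)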
The coarse gradient has limitations: it is very sensitive to noise and can produce highly similar histograms across the target image.
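As a rough illustration of how the two directional gradients combine into the per-pixel magnitude and orientation used in the next step, here is a minimal sketch. The helper names centered_gradients and magnitude_and_orientation are hypothetical, and leaving the border pixels at zero is an assumption made for simplicity rather than something any library prescribes.

```python
import numpy as np

def centered_gradients(img):
    """Apply the [-1, 0, 1] mask along rows and columns; borders stay 0."""
    img = np.asarray(img, dtype=np.float64)
    grad_y = np.zeros_like(img)
    grad_x = np.zeros_like(img)
    grad_y[1:-1, :] = img[2:, :] - img[:-2, :]   # difference along rows (y)
    grad_x[:, 1:-1] = img[:, 2:] - img[:, :-2]   # difference along columns (x)
    return grad_y, grad_x

def magnitude_and_orientation(grad_y, grad_x, signed=False):
    """Return gradient magnitude and orientation in degrees.

    Orientation is folded into [0, 180) for unsigned gradients
    and into [0, 360) for signed gradients.
    """
    magnitude = np.hypot(grad_x, grad_y)
    orientation = np.degrees(np.arctan2(grad_y, grad_x))
    period = 360.0 if signed else 180.0
    return magnitude, np.mod(orientation, period)

# A horizontal intensity ramp: every row is [0, 1, 2, 3, 4].
ramp = np.tile(np.arange(5, dtype=float), (5, 1))
gy, gx = centered_gradients(ramp)
mag, ori = magnitude_and_orientation(gy, gx)
```

On the ramp, the interior pixels have a horizontal gradient of 2 and no vertical gradient, so the magnitude is 2 and the orientation is 0 degrees.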
Next, build the cell histograms: each pixel in a cell casts a weighted vote for an orientation histogram bin based on its gradient. The histogram channels are spread evenly over 0 to 180 degrees (unsigned gradient) or 0 to 360 degrees (signed gradient). A pixel's vote weight can be the gradient magnitude or some function of it. The key point is that votes are interpolated bilinearly between neighbouring bin centers in both orientation and position.
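To make the vote splitting concrete, the following sketch handles a single pixel and interpolates in orientation only; spatial interpolation between cells is left out. The function name split_vote is hypothetical, and placing bin i at angle i * bin_width (rather than at the middle of its interval) is an assumption chosen to keep the arithmetic easy to follow.

```python
import math

def split_vote(angle_deg, magnitude, n_bins=9, max_angle=180.0):
    """Split one pixel's vote between the two nearest orientation bins.

    Returns ((lower_bin, lower_vote), (upper_bin, upper_vote)).
    """
    bin_width = max_angle / n_bins
    position = (angle_deg % max_angle) / bin_width   # fractional bin position
    lower = int(math.floor(position)) % n_bins
    upper = (lower + 1) % n_bins                     # last bin wraps to bin 0
    upper_weight = position - math.floor(position)
    lower_weight = 1.0 - upper_weight
    return ((lower, magnitude * lower_weight), (upper, magnitude * upper_weight))

print(split_vote(25.0, 2.0))   # ((1, 1.5), (2, 0.5))
print(split_vote(170.0, 1.0))  # ((8, 0.5), (0, 0.5))
```

With 9 unsigned bins of 20 degrees each, an angle of 25 degrees sits a quarter of the way from bin 1 to bin 2, so bin 1 receives three quarters of the magnitude. At 170 degrees the vote is shared equally between the last bin and bin 0, which shows the wrap-around.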
Unsigned gradients make visualization easier, which is how skimage implements it. For plotting, it extends the \(O\) orientation lines into the 3rd and 4th quadrants.
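import numpy as np

# orientation calculation
# grad_y, grad_x: gradients along rows and columns from the previous step
orientation = np.arctan2(grad_y, grad_x)  # gradient direction in radians, in [-pi, pi]
orientation += np.pi / 2  # rotate by 90 degrees so the angle runs parallel to the edge
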
# orientation binning
def calculate_histogram(cell_size, magnitude, orientation,
                        orientation_bins=9, _min=0.0, _max=np.pi):
    # The defaults (9 unsigned bins over 0..pi) are assumed values;
    # pass the bin count and angular range that match your setup.
    histogram = np.zeros(orientation_bins, dtype=float)
    bin_width = (_max - _min) / orientation_bins
    for r in range(cell_size):
        for c in range(cell_size):
            pixel_magnitude = magnitude[r, c]
            pixel_orientation = orientation[r, c]

            if pixel_magnitude == 0:
                continue

            # shift so that _min maps to bin position 0
            normalized_orientation = pixel_orientation - _min
            bin_index_float = normalized_orientation / bin_width

            bin1_idx = int(np.floor(bin_index_float))
            bin1_idx = bin1_idx % orientation_bins  # wrap angles outside [_min, _max)
            bin2_idx = (bin1_idx + 1) % orientation_bins  # wrap around for the last bin

            # linear interpolation: the closer bin receives the larger share
            weight2 = bin_index_float - np.floor(bin_index_float)
            weight1 = 1.0 - weight2

            histogram[bin1_idx] += pixel_magnitude * weight1
            histogram[bin2_idx] += pixel_magnitude * weight2

    # L2-normalize, then scale so a plotted line spans at most half a cell
    hist_norm = np.linalg.norm(histogram)
    if hist_norm > 0:
        histogram = histogram / hist_norm * (cell_size / 2)
    return histogram

# prepare lines for plot
def cell_plot_lines(histogram, sx, sy, cell_size, plot_bins, full=False):
    # Helper name is illustrative. sx, sy: offsets of the current cell;
    # plot_bins: one angle in radians per orientation bin.
    lines = []
    center_x_plot = sy + cell_size / 2.0
    center_y_plot = sx + cell_size / 2.0
    for iob in range(len(histogram)):
        angle = plot_bins[iob]
        length = histogram[iob]
        end_x_rel = length * np.cos(angle)
        end_y_rel = length * np.sin(angle)
        if full:
            # signed: draw a ray from the cell center outwards
            start = (center_x_plot, center_y_plot)
            end = (center_x_plot + end_x_rel, center_y_plot + end_y_rel)
        else:
            # unsigned: draw a line centered on the cell center
            start = (center_x_plot - end_x_rel / 2, center_y_plot - end_y_rel / 2)
            end = (center_x_plot + end_x_rel / 2, center_y_plot + end_y_rel / 2)
        lines.append([start, end])
    return lines

# per cell: all_lines.extend(cell_plot_lines(histogram, sx, sy, cell_size, plot_bins, full))
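
The normalization step is not shown in the code above. A minimal sketch of overlapping block normalization follows, assuming the per-cell histograms are stacked in an array of shape (cells_y, cells_x, bins). The function name normalize_blocks, the 2x2 block size, the one-cell stride, and the plain L2 norm with a small eps are illustrative assumptions, not the choices of any particular library.

```python
import numpy as np

def normalize_blocks(cell_hists, block_size=2, eps=1e-5):
    """L2-normalize overlapping blocks of cell histograms.

    cell_hists: array of shape (cells_y, cells_x, bins).
    Blocks of block_size x block_size cells slide with a stride of one cell,
    so each interior cell contributes to several blocks.
    Returns the concatenated block vectors as a 1D descriptor.
    """
    n_y, n_x, _ = cell_hists.shape
    blocks = []
    for by in range(n_y - block_size + 1):
        for bx in range(n_x - block_size + 1):
            block = cell_hists[by:by + block_size, bx:bx + block_size, :].ravel()
            # eps keeps the division stable for blocks with no gradient energy
            blocks.append(block / np.sqrt(np.sum(block ** 2) + eps ** 2))
    return np.concatenate(blocks)

# 3x3 cells with 9 bins each give 2x2 = 4 overlapping blocks of 36 values.
descriptor = normalize_blocks(np.ones((3, 3, 9)))
```

Because the blocks overlap, the same cell histogram appears several times in the descriptor, each time scaled by the contrast of a different neighbourhood.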
References
Hannibunny’s blog provides an alternative reference in addition to skimage.