手撕非極大值抑制演算法NMS

演算法 · 發表 2019-05-09 02:04:55

摘要：前言非極大值抑制演算法（Non-maximum suppression, NMS）是有anchor系列目標檢測的標配，如今大部分的 One-Stage 和 Two-Stage 演算法在推斷(Inference)階段都使用了NMS作為網路的最後一層，例如YOLOv3、SSD、Fa...

前言

非極大值抑制演算法（Non-maximum suppression, NMS）是有anchor系列目標檢測的標配，如今大部分的 One-Stage 和 Two-Stage 演算法在推斷(Inference)階段都使用了NMS作為網路的最後一層，例如YOLOv3、SSD、Faster-RCNN等。

當然NMS在目前最新的 anchor-free 目標檢測演算法中(CornerNet、CenterNet等)並不是必須的，對這種檢測演算法提升的精度也有限，但是這並不影響我們學習NMS。

NMS的本質是搜尋區域性極大值，抑制非極大值元素，在目標檢測中，我們經常將其用於消除多餘的檢測框(從左到右消除了重複的檢測框，只保留當前最大confidence的檢測框)：

NMS有很多種變體，這裡只介紹最為常見的Hard-NMS，我們通常所說的NMS就是指Hard-NMS。

Hard-NMS

Hard-NMS就是我們傳統意義上的NMS，也是最常用的NMS演算法。

因為NMS主要用於消除多餘的檢測框，那麼消除的標準是什麼，我們使用IOU作為標準來進行演示，IOU的原稱為Intersection over Union，也就是兩個box區域的交集比上並集，下圖可以方便理解：

具體介紹可以看這裡：深度學習中IU、IoU(Intersection over Union)的概念理解以及python程式實現

因為我們要手擼麼，所以廢話不多說，直接開始看程式碼，首先使用Pytorch來看一篇：

def hard_nms(box_scores, iou_threshold, top_k=-1, candidate_size=200):
"""
Args:
box_scores (N, 5): box的集合，N為框的數量，5即4(位置資訊)+1(可能為物體的概率)
iou_threshold: 我們用IOU標準去除多餘檢測框的閾值
top_k: 保留多少個計算後留下來的候選框，如果為-1則全保留
candidate_size: 參與計算的boxes數量
Returns:
picked: 經過nms計算後保留下來的box
"""
scores = box_scores[:, -1]# 首先我們取出box中的最後一個元素也就是當前box檢測到物體的概率
boxes = box_scores[:, :-1]# 取出box中的四個座標(左上、右下)
picked = []
_, indexes = scores.sort(descending=True) # 按照降序排列所有的物體的概率，得到排序後在原陣列中的索引資訊 indexes
indexes = indexes[:candidate_size]# 只保留前 candidate_size 個 boxes 其餘的不考慮了
while len(indexes) > 0:
current = indexes[0]# 每次取出當前在 indexes 中 檢測到物體概率最大的一個 
picked.append(current.item())# 將這個最大的存在結果中
if 0 < top_k == len(picked) or len(indexes) == 1:
break
current_box = boxes[current, :]# 當前第一個也就是最高概率的box
indexes = indexes[1:]
rest_boxes = boxes[indexes, :]# 剩下其餘的box
iou = iou_of(# 將當前的box與剩下其餘的boxes用IOU標準進行篩選
rest_boxes,
current_box.unsqueeze(0),
)
indexes = indexes[iou <= iou_threshold]# 保留與當前box的IOU小於一定閾值的boxes，

return box_scores[picked, :]

看了上面的程式碼，我們可以知道大概的流程：

選取這類box中scores最大的那一個，記為current_box，並保留它(為什麼保留它，因為它預測出當前位置有物體的概率最大啊，對於我們來說當前confidence越大說明當前box中包含物體的可能行就越大)
計算current_box與其餘的box的IOU
如果其IOU大於我們設定的閾值，那麼就捨棄這些boxes（由於可能這兩個box表示同一目標，因此這兩個box的IOU就比較大，會超過我們設定的閾值，所以就保留分數高的那一個）
從最後剩餘的boxes中，再找出最大scores的那一個(之前那個大的已經儲存到輸出的陣列中，這個是從剩下的裡面再挑一個最大的)，如此迴圈往復

至於上述程式碼中 iou_of 部分：

def area_of(left_top, right_bottom) -> torch.Tensor:
"""Compute the areas of rectangles given two corners.

Args:
left_top (N, 2): left top corner.
right_bottom (N, 2): right bottom corner.

Returns:
area (N): return the area.
"""
hw = torch.clamp(right_bottom - left_top, min=0.0)
return hw[..., 0] * hw[..., 1]


def iou_of(boxes0, boxes1, eps=1e-5):
"""Return intersection-over-union (Jaccard index) of boxes.

Args:
boxes0 (N, 4): ground truth boxes.
boxes1 (N or 1, 4): predicted boxes.
eps: a small number to avoid 0 as denominator.
Returns:
iou (N): IoU values.
"""
overlap_left_top = torch.max(boxes0[..., :2], boxes1[..., :2])
overlap_right_bottom = torch.min(boxes0[..., 2:], boxes1[..., 2:])

overlap_area = area_of(overlap_left_top, overlap_right_bottom)
area0 = area_of(boxes0[..., :2], boxes0[..., 2:])
area1 = area_of(boxes1[..., :2], boxes1[..., 2:])
return overlap_area / (area0 + area1 - overlap_area + eps)

手撕NMS

手撕程式碼用什麼撕，當然是用C++撕，這才爽麼！

直接看程式碼，其中使用了OpenCV庫中的 Point2f 結構體：

// 這是一個模板函式，接受一個已經排好序的vector，然後降序返回其索引
template <typename T>
vector<int> sort_indexes(const vector<T> &v) {


vector<int> idx(v.size());
iota(idx.begin(), idx.end(), 0);
sort(idx.begin(), idx.end(),
[&v](int i1, int i2) {return v[i1] > v[i2];});
return idx;
}
// 這就是我們的NMS函式 輸入的座標已經標準化，所有數值的範圍為 0-1
/*
* numBoxes：視窗數目
* points：視窗左上角座標點
* oppositePoints：視窗右下角座標點
* score：視窗得分
* overlapThreshold：重疊閾值控制
* numBoxesOut：輸出視窗數目
* pointsOut：輸出視窗左上角座標點
* oppositePoints：輸出視窗右下角座標點
* scoreOut：輸出視窗得分
*/

int nonMaximumSuppression(int numBoxes, const vector<Point2f>& points,
const vector<Point2f>& oppositePoints, const vector<float>& score,
float overlapThreshold,
int *numBoxesOut, vector<Point2f>& pointsOut,
vector<Point2f>& oppositePointsOut, vector<float>& scoreOut)
{

const float eps = 1e-5;
int i, j, index;
float* box_area = (float*)malloc(numBoxes * sizeof(float));// 定義視窗面積變數並分配空間
vector<int> indices;
int* is_suppressed = (int*)malloc(numBoxes * sizeof(int));// 定義是否抑制表標誌並分配空間

// 初始化indices、is_supperssed、box_area資訊
for (i = 0; i < numBoxes; i++)
{
indices.push_back(i);
is_suppressed[i] = 0;
box_area[i] = ((oppositePoints[i].x - points[i].x + eps) *
(oppositePoints[i].y - points[i].y + eps));
}

// 對輸入視窗按照分數比值進行排序，排序後的編號放在indices中
indices = sort_indexes(score);

for (i = 0; i < numBoxes; i++)// 迴圈所有視窗
{
if (!is_suppressed[indices[i]])// 判斷視窗是否被抑制
{
for (j = i + 1; j < numBoxes; j++)// 迴圈當前視窗之後的視窗
{
if (!is_suppressed[indices[j]])// 判斷視窗是否被抑制
{
float x1max = max(points[indices[i]].x, points[indices[j]].x);// 求兩個視窗左上角x座標最大值
float x2min = min(oppositePoints[indices[i]].x, oppositePoints[indices[j]].x);// 求兩個視窗右下角x座標最小值
float y1max = max(points[indices[i]].y, points[indices[j]].y);// 求兩個視窗左上角y座標最大值
float y2min = min(oppositePoints[indices[i]].y, oppositePoints[indices[j]].y);// 求兩個視窗右下角y座標最小值
float overlapWidth = x2min - x1max + eps;// 計算兩矩形重疊的寬度
float overlapHeight = y2min - y1max + eps;// 計算兩矩形重疊的高度
if (overlapWidth > 0 && overlapHeight > 0)
{
float overlapPart = (overlapWidth * overlapHeight) / box_area[indices[j]];// 計算重疊的比率
if (overlapPart > overlapThreshold)// 判斷重疊比率是否超過重疊閾值
{
is_suppressed[indices[j]] = 1;// 將視窗j標記為抑制
}
}
}
}
}
}

*numBoxesOut = 0;// 初始化輸出視窗數目0
for (i = 0; i < numBoxes; i++)
{
if (!is_suppressed[i]) (*numBoxesOut)++;// 統計輸出視窗數目
}

for (i = 0; i < numBoxes; i++)// 遍歷所有輸入視窗
{
if (!is_suppressed[indices[i]])// 將未發生抑制的視窗資訊儲存到輸出資訊中
{
Point2f temp_xy(points[indices[i]].x, points[indices[i]].y);
Point2f temp_Opxy(oppositePoints[indices[i]].x, oppositePoints[indices[i]].y);
pointsOut.push_back(temp_xy);
oppositePointsOut.push_back(temp_Opxy);
scoreOut.push_back(score[indices[i]]);
}
}

indices.clear();
free(box_area);// 釋放box_area空間
free(is_suppressed);// 釋放is_suppressed空間

return 0;
}

好了，程式碼撕完了，我們只需要輸入相應的資料就可以使用，親測。

但是要注意，此程式碼僅供學習演示，並沒有進行過多的優化，大家有能力者自行優化吧！

手撕非極大值抑制演算法NMS

前言

Hard-NMS

手撕NMS

您可能也會喜歡…