1.IOU
Paper: https://arxiv.org/pdf/1608.01471.pdf
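- IoU stands for Intersection over Union: the ratio of the intersection to the union of the predicted box and the ground-truth box. It was first proposed mainly for face detection. IoU has the advantage of scale invariance, but it cannot be used directly as a loss function: when the prediction and the ground truth do not overlap, IoU is 0 and provides no gradient, so the network cannot learn from those samples.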
import torch


def IOU(box1, box2):
    """
    IoU of predicted and ground-truth boxes (a loss can be formed as, e.g., 1 - IoU).
    :param box1: tensor [batch, w, h, num_anchor, 4], xywh, predictions
    :param box2: tensor [batch, w, h, num_anchor, 4], xywh, ground truth
    :return: tensor [batch, w, h, num_anchor]
    """
    # Convert (cx, cy, w, h) to top-left and bottom-right corners
    box1_xy, box1_wh = box1[..., :2], box1[..., 2:4]
    box1_wh_half = box1_wh / 2.
    box1_mins = box1_xy - box1_wh_half
    box1_maxes = box1_xy + box1_wh_half
    box2_xy, box2_wh = box2[..., :2], box2[..., 2:4]
    box2_wh_half = box2_wh / 2.
    box2_mins = box2_xy - box2_wh_half
    box2_maxes = box2_xy + box2_wh_half
    # Intersection rectangle; width and height are clamped at 0 for disjoint boxes
    intersect_mins = torch.max(box1_mins, box2_mins)
    intersect_maxes = torch.min(box1_maxes, box2_maxes)
    intersect_wh = torch.max(intersect_maxes - intersect_mins, torch.zeros_like(intersect_maxes))
    intersect_area = intersect_wh[..., 0] * intersect_wh[..., 1]
    box1_area = box1_wh[..., 0] * box1_wh[..., 1]
    box2_area = box2_wh[..., 0] * box2_wh[..., 1]
    union_area = box1_area + box2_area - intersect_area
    # Clamp the denominator to avoid division by zero
    iou = intersect_area / torch.clamp(union_area, min=1e-6)
    return iou
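The two claims made about IoU above (scale invariance, and a value stuck at 0 for any pair of disjoint boxes) can be checked on single boxes. The following is a minimal sketch in plain Python rather than PyTorch; the helper names `xywh_to_xyxy` and `iou_scalar` and the sample boxes are illustrative assumptions, not part of the original code.

```python
def xywh_to_xyxy(box):
    """Convert (cx, cy, w, h) to corner form (x1, y1, x2, y2)."""
    cx, cy, w, h = box
    return cx - w / 2, cy - h / 2, cx + w / 2, cy + h / 2


def iou_scalar(box1, box2):
    """IoU of two single boxes given as (cx, cy, w, h) tuples."""
    ax1, ay1, ax2, ay2 = xywh_to_xyxy(box1)
    bx1, by1, bx2, by2 = xywh_to_xyxy(box2)
    # Intersection width/height, clamped at 0 when the boxes are disjoint
    inter_w = max(min(ax2, bx2) - max(ax1, bx1), 0.0)
    inter_h = max(min(ay2, by2) - max(ay1, by1), 0.0)
    inter = inter_w * inter_h
    union = box1[2] * box1[3] + box2[2] * box2[3] - inter
    return inter / max(union, 1e-6)


gt = (0.0, 0.0, 2.0, 2.0)
overlapping = iou_scalar((1.0, 0.0, 2.0, 2.0), gt)   # shifted by half a width
near_miss = iou_scalar((3.0, 0.0, 2.0, 2.0), gt)     # disjoint, close by
far_miss = iou_scalar((30.0, 0.0, 2.0, 2.0), gt)     # disjoint, far away
# Same geometry as `overlapping`, with every coordinate multiplied by 10
scaled = iou_scalar((10.0, 0.0, 20.0, 20.0), (0.0, 0.0, 20.0, 20.0))
print(overlapping, near_miss, far_miss, scaled)
```

Both disjoint predictions score exactly 0 no matter how far they are from the target, which is why IoU alone gives the optimizer nothing to work with in that case, while the scaled pair scores the same as the unscaled one.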
2.GIOU
Paper: https://arxiv.org/abs/1902.09630
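- In object detection, boxes with the same regression loss can have very different IoU values. In figure (a) the l2 loss is the same but the IoU values are completely different, and in figure (b) the l1 loss is the same but the IoU values again differ. In addition, when the boxes do not overlap, IoU is 0 and its gradient vanishes, so the network cannot be trained on such samples. GIoU loss is designed to fix this problem of IoU loss when **the ground truth and the prediction** do not intersect.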
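- GIoU has the following properties:
1. For any ground truth and prediction, GIoU <= IoU.
2. -1 < GIoU <= 1, and GIoU equals 1 when IoU equals 1.
Compared with IoU loss, GIoU loss remains trainable in every case, including non-overlapping boxes.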
def GIOU(box1, box2):
    """
    GIoU of predicted and ground-truth boxes (a loss can be formed as 1 - GIoU).
    :param box1: tensor [batch, w, h, num_anchor, 4], xywh, predictions
    :param box2: tensor [batch, w, h, num_anchor, 4], xywh, ground truth
    :return: tensor [batch, w, h, num_anchor]
    """
    # Convert (cx, cy, w, h) to top-left and bottom-right corners
    box1_xy, box1_wh = box1[..., :2], box1[..., 2:4]
    box1_wh_half = box1_wh / 2.
    box1_mins = box1_xy - box1_wh_half
    box1_maxes = box1_xy + box1_wh_half
    box2_xy, box2_wh = box2[..., :2], box2[..., 2:4]
    box2_wh_half = box2_wh / 2.
    box2_mins = box2_xy - box2_wh_half
    box2_maxes = box2_xy + box2_wh_half
    # IoU between every prediction and its ground truth
    intersect_mins = torch.max(box1_mins, box2_mins)
    intersect_maxes = torch.min(box1_maxes, box2_maxes)
    intersect_wh = torch.max(intersect_maxes - intersect_mins, torch.zeros_like(intersect_maxes))
    intersect_area = intersect_wh[..., 0] * intersect_wh[..., 1]
    box1_area = box1_wh[..., 0] * box1_wh[..., 1]
    box2_area = box2_wh[..., 0] * box2_wh[..., 1]
    union_area = box1_area + box2_area - intersect_area
    iou = intersect_area / torch.clamp(union_area, min=1e-6)
    # Width and height of the smallest enclosing box
    cw = torch.max(box1_maxes[..., 0], box2_maxes[..., 0]) - torch.min(box1_mins[..., 0], box2_mins[..., 0])
    ch = torch.max(box1_maxes[..., 1], box2_maxes[..., 1]) - torch.min(box1_mins[..., 1], box2_mins[..., 1])
    c_area = cw * ch + 1e-16  # enclosing-box area
    # Penalize the part of the enclosing box not covered by the union
    return iou - (c_area - union_area) / c_area
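The GIoU properties listed earlier in this section can be checked numerically on a few hand-picked boxes. This is a minimal plain-Python sketch; the helper `iou_and_giou` and the sample boxes are assumptions made for illustration, and it mirrors the formula used in `GIOU` without the numerical clamps.

```python
def iou_and_giou(box1, box2):
    """Return (IoU, GIoU) for two single boxes given as (cx, cy, w, h)."""
    (ax, ay, aw, ah), (bx, by, bw, bh) = box1, box2
    ax1, ay1, ax2, ay2 = ax - aw / 2, ay - ah / 2, ax + aw / 2, ay + ah / 2
    bx1, by1, bx2, by2 = bx - bw / 2, by - bh / 2, bx + bw / 2, by + bh / 2
    inter_w = max(min(ax2, bx2) - max(ax1, bx1), 0.0)
    inter_h = max(min(ay2, by2) - max(ay1, by1), 0.0)
    inter = inter_w * inter_h
    union = aw * ah + bw * bh - inter
    iou = inter / union
    # Area of the smallest box enclosing both inputs
    c_area = (max(ax2, bx2) - min(ax1, bx1)) * (max(ay2, by2) - min(ay1, by1))
    return iou, iou - (c_area - union) / c_area


gt = (0.0, 0.0, 2.0, 2.0)
cases = {
    "identical": (0.0, 0.0, 2.0, 2.0),
    "overlapping": (1.0, 1.0, 2.0, 2.0),
    "near_miss": (3.0, 0.0, 2.0, 2.0),
    "far_miss": (30.0, 0.0, 2.0, 2.0),
}
results = {name: iou_and_giou(box, gt) for name, box in cases.items()}
for name, (iou, giou) in results.items():
    print(f"{name:12s} IoU={iou:.4f} GIoU={giou:.4f}")
```

Unlike IoU, GIoU keeps decreasing toward -1 as a disjoint box moves further away, so a loss such as 1 - GIoU still distinguishes a near miss from a far one.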
3.DIOU
Paper: https://arxiv.org/pdf/1911.08287.pdf
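- In the scenario above, computing the different IoU-based losses shows that L-DIoU captures the position of the detection box relative to the gt box. In the figure above, when the detection box sits at the center of the gt box (third panel), IoU = GIoU = DIoU. When the position differs, as in the first and second panels, the second is clearly the better prediction, yet GIoU degrades to IoU and cannot tell them apart, whereas the DIoU loss is larger for the worse-placed box and therefore describes the position well.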
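- The advantages of DIoU are:
1. Like GIoU loss, DIoU loss still provides a moving direction for the bounding box when it does not overlap the target box.
2. When the bounding boxes match perfectly, the IoU, GIoU and DIoU losses are all 0; when the boxes are far apart, the GIoU and DIoU losses both approach 2.
3. DIoU loss directly minimizes the distance between the two boxes, whereas GIoU loss shrinks the area between them, so DIoU loss converges much faster than GIoU loss.
4. When one box contains the other, or the two boxes are aligned horizontally or vertically, DIoU loss makes regression very fast, while GIoU loss almost degrades to IoU loss.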
def DIOU(box1, box2):
    """
    DIoU of predicted and ground-truth boxes (a loss can be formed as 1 - DIoU).
    :param box1: tensor [batch, w, h, num_anchor, 4], xywh, predictions
    :param box2: tensor [batch, w, h, num_anchor, 4], xywh, ground truth
    :return: tensor [batch, w, h, num_anchor]
    """
    # Convert (cx, cy, w, h) to top-left and bottom-right corners
    box1_xy, box1_wh = box1[..., :2], box1[..., 2:4]
    box1_wh_half = box1_wh / 2.
    box1_mins = box1_xy - box1_wh_half
    box1_maxes = box1_xy + box1_wh_half
    box2_xy, box2_wh = box2[..., :2], box2[..., 2:4]
    box2_wh_half = box2_wh / 2.
    box2_mins = box2_xy - box2_wh_half
    box2_maxes = box2_xy + box2_wh_half
    # IoU between every prediction and its ground truth
    intersect_mins = torch.max(box1_mins, box2_mins)
    intersect_maxes = torch.min(box1_maxes, box2_maxes)
    intersect_wh = torch.max(intersect_maxes - intersect_mins, torch.zeros_like(intersect_maxes))
    intersect_area = intersect_wh[..., 0] * intersect_wh[..., 1]
    box1_area = box1_wh[..., 0] * box1_wh[..., 1]
    box2_area = box2_wh[..., 0] * box2_wh[..., 1]
    union_area = box1_area + box2_area - intersect_area
    iou = intersect_area / torch.clamp(union_area, min=1e-6)
    # Width and height of the smallest enclosing box
    cw = torch.max(box1_maxes[..., 0], box2_maxes[..., 0]) - torch.min(box1_mins[..., 0], box2_mins[..., 0])
    ch = torch.max(box1_maxes[..., 1], box2_maxes[..., 1]) - torch.min(box1_mins[..., 1], box2_mins[..., 1])
    c2 = cw ** 2 + ch ** 2 + 1e-16  # squared diagonal of the enclosing box
    # Squared distance between the two box centers
    rho2 = torch.sum((box2_xy - box1_xy) ** 2, dim=-1)
    return iou - rho2 / c2
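The inclusion case described in this section, where GIoU degrades to IoU while DIoU still reacts to position, can be reproduced with a small box placed inside a larger target. The sketch below is plain Python on single boxes; the helper `iou_giou_diou` and the chosen coordinates are illustrative assumptions.

```python
def iou_giou_diou(box1, box2):
    """Return (IoU, GIoU, DIoU) for two single boxes given as (cx, cy, w, h)."""
    (ax, ay, aw, ah), (bx, by, bw, bh) = box1, box2
    ax1, ay1, ax2, ay2 = ax - aw / 2, ay - ah / 2, ax + aw / 2, ay + ah / 2
    bx1, by1, bx2, by2 = bx - bw / 2, by - bh / 2, bx + bw / 2, by + bh / 2
    inter_w = max(min(ax2, bx2) - max(ax1, bx1), 0.0)
    inter_h = max(min(ay2, by2) - max(ay1, by1), 0.0)
    inter = inter_w * inter_h
    union = aw * ah + bw * bh - inter
    iou = inter / union
    # Smallest enclosing box: width, height, area and squared diagonal
    cw = max(ax2, bx2) - min(ax1, bx1)
    ch = max(ay2, by2) - min(ay1, by1)
    giou = iou - (cw * ch - union) / (cw * ch)
    rho2 = (ax - bx) ** 2 + (ay - by) ** 2  # squared center distance
    diou = iou - rho2 / (cw ** 2 + ch ** 2)
    return iou, giou, diou


gt = (0.0, 0.0, 4.0, 4.0)
centered = iou_giou_diou((0.0, 0.0, 2.0, 2.0), gt)  # 2x2 box at the gt center
corner = iou_giou_diou((1.0, 1.0, 2.0, 2.0), gt)    # same box pushed into a corner
print("centered:", centered)  # IoU = GIoU = DIoU = 0.25
print("corner:  ", corner)    # IoU = GIoU = 0.25, DIoU = 0.1875
```

Because the prediction stays inside the target, the enclosing box equals the target and the GIoU penalty is zero in both positions; only the center-distance term of DIoU separates the two placements.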
4.CIOU
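- Three important geometric factors of a predicted bbox are overlap area, center-point distance and aspect ratio. IoU considers the overlap area, GIoU relies heavily on the IoU loss, and DIoU considers both the overlap area and the center-point distance. Going one step further, consistency of the bounding boxes' aspect ratios is also an important geometric factor. The authors therefore build CIoU loss on top of DIoU by adding an aspect-ratio consistency term.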
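- Advantages of CIoU:
- It describes the overlap better during regression.
- The paper's experiments report that training with CIoU loss and then applying DIoU-NMS gives very good results.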
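Written out, the three factors combine as follows. This is a sketch of the loss in the notation commonly used for CIoU, stated here as an assumption since the post gives no formula: $\mathbf{b}$ and $\mathbf{b}^{gt}$ are the box centers, $\rho$ is the Euclidean distance, $c$ is the diagonal of the smallest enclosing box, and $w, h$ are width and height. In the `CIOU` function of this section, `center_distance` plays the role of $\rho^2$ and `enclose_diagonal` the role of $c^2$.

```latex
\mathcal{L}_{CIoU} = 1 - IoU + \frac{\rho^2(\mathbf{b}, \mathbf{b}^{gt})}{c^2} + \alpha v,
\qquad
v = \frac{4}{\pi^2}\left(\arctan\frac{w^{gt}}{h^{gt}} - \arctan\frac{w}{h}\right)^2,
\qquad
\alpha = \frac{v}{(1 - IoU) + v}
```

The function itself returns $IoU - \rho^2/c^2 - \alpha v$, so the loss is one minus its return value.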
import math


def CIOU(box1, box2):
    """
    CIoU of predicted and ground-truth boxes (a loss can be formed as 1 - CIoU).
    :param box1: tensor [batch, w, h, num_anchor, 4], xywh, predictions
    :param box2: tensor [batch, w, h, num_anchor, 4], xywh, ground truth
    :return: tensor [batch, w, h, num_anchor]
    """
    # Convert (cx, cy, w, h) to top-left and bottom-right corners
    box1_xy, box1_wh = box1[..., :2], box1[..., 2:4]
    box1_wh_half = box1_wh / 2.
    box1_mins = box1_xy - box1_wh_half
    box1_maxes = box1_xy + box1_wh_half
    box2_xy, box2_wh = box2[..., :2], box2[..., 2:4]
    box2_wh_half = box2_wh / 2.
    box2_mins = box2_xy - box2_wh_half
    box2_maxes = box2_xy + box2_wh_half
    # IoU between every prediction and its ground truth
    intersect_mins = torch.max(box1_mins, box2_mins)
    intersect_maxes = torch.min(box1_maxes, box2_maxes)
    intersect_wh = torch.max(intersect_maxes - intersect_mins, torch.zeros_like(intersect_maxes))
    intersect_area = intersect_wh[..., 0] * intersect_wh[..., 1]
    box1_area = box1_wh[..., 0] * box1_wh[..., 1]
    box2_area = box2_wh[..., 0] * box2_wh[..., 1]
    union_area = box1_area + box2_area - intersect_area
    iou = intersect_area / torch.clamp(union_area, min=1e-6)
    # Squared distance between the two box centers
    center_distance = torch.sum(torch.pow((box1_xy - box2_xy), 2), dim=-1)
    # Top-left and bottom-right corners of the smallest box enclosing both boxes
    enclose_mins = torch.min(box1_mins, box2_mins)
    enclose_maxes = torch.max(box1_maxes, box2_maxes)
    enclose_wh = torch.max(enclose_maxes - enclose_mins, torch.zeros_like(intersect_maxes))
    # Squared diagonal of the enclosing box
    enclose_diagonal = torch.sum(torch.pow(enclose_wh, 2), dim=-1)
    # DIoU part: IoU minus the normalized center distance
    ciou = iou - 1. * center_distance / torch.clamp(enclose_diagonal, min=1e-6)
    # Aspect-ratio consistency term v and its trade-off weight alpha
    box1_angle = torch.atan(box1_wh[..., 0] / torch.clamp(box1_wh[..., 1], min=1e-6))
    box2_angle = torch.atan(box2_wh[..., 0] / torch.clamp(box2_wh[..., 1], min=1e-6))
    v = (4 / (math.pi ** 2)) * torch.pow(box1_angle - box2_angle, 2)
    alpha = v / torch.clamp((1. - iou + v), min=1e-6)
    ciou = ciou - alpha * v
    return ciou
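To see what the last three lines of `CIOU` contribute, the aspect-ratio term can be evaluated on its own. The following plain-Python sketch uses hypothetical helpers `aspect_ratio_term` and `trade_off`, and the IoU value of 0.5 is an arbitrary number picked for illustration.

```python
import math


def aspect_ratio_term(pred_wh, gt_wh):
    """v = (4 / pi^2) * (atan(w_gt / h_gt) - atan(w / h))^2."""
    (w, h), (w_gt, h_gt) = pred_wh, gt_wh
    return (4 / math.pi ** 2) * (math.atan(w_gt / h_gt) - math.atan(w / h)) ** 2


def trade_off(iou, v, eps=1e-6):
    """alpha = v / ((1 - IoU) + v), with the denominator clamped as in CIOU."""
    return v / max(1.0 - iou + v, eps)


# Same 2:1 aspect ratio at different scales: the term vanishes
v_same = aspect_ratio_term((4.0, 2.0), (2.0, 1.0))
# 2:1 prediction against a square target: the term is positive
v_diff = aspect_ratio_term((4.0, 2.0), (2.0, 2.0))

iou = 0.5  # hypothetical IoU, chosen only for this example
alpha = trade_off(iou, v_diff)
penalty = alpha * v_diff  # amount subtracted from DIoU to obtain CIoU
print(v_same, v_diff, alpha, penalty)
```

The term depends only on the ratio of width to height, not on box size, and since `atan` of a positive ratio lies between 0 and pi/2, `v` always stays below 1.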