IoU-based losses: how each one is computed, with Python implementations

Article link: https://blog.csdn.net/ffllyy2019/article/details/117389747

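This article covers four loss functions commonly used in object detection: IoU, GIoU, DIoU, and CIoU. IoU is scale-invariant but gives no training signal when the boxes do not overlap; each later variant addresses a weakness of the ones before it. CIoU combines overlap area, center-point distance, and aspect-ratio consistency to improve detection accuracy.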


1.IOU

Paper: https://arxiv.org/pdf/1608.01471.pdf

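IoU stands for Intersection over Union: the ratio of the intersection to the union of the predicted box and the ground-truth box. It was first proposed mainly for face detection. IoU has the advantage of scale invariance, but it cannot be used directly as a loss function in every case: when the ground-truth and predicted boxes do not overlap, IoU is 0 and provides no gradient, so the network cannot learn from those boxes.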

import torch


def IOU(box1, box2):
    """
    Compute IoU between predicted and ground-truth boxes (use 1 - IoU as the loss).
    :param box1: tensor [batch, w, h, num_anchor, 4], xywh (center x, center y, width, height), predictions
    :param box2: tensor [batch, w, h, num_anchor, 4], xywh (center x, center y, width, height), ground truth
    :return: tensor [batch, w, h, num_anchor]
    """
    # Convert center/size to top-left (mines) and bottom-right (maxes) corners
    box1_xy, box1_wh = box1[..., :2], box1[..., 2:4]
    box1_wh_half = box1_wh / 2.
    box1_mines = box1_xy - box1_wh_half
    box1_maxes = box1_xy + box1_wh_half

    box2_xy, box2_wh = box2[..., :2], box2[..., 2:4]
    box2_wh_half = box2_wh / 2.
    box2_mines = box2_xy - box2_wh_half
    box2_maxes = box2_xy + box2_wh_half

    # IoU between every ground-truth box and its prediction
    intersect_mines = torch.max(box1_mines, box2_mines)
    intersect_maxes = torch.min(box1_maxes, box2_maxes)
    # Clamp at zero so disjoint boxes get an empty intersection
    intersect_wh = torch.max(intersect_maxes-intersect_mines, torch.zeros_like(intersect_maxes))
    intersect_area = intersect_wh[..., 0]*intersect_wh[..., 1]
    box1_area = box1_wh[..., 0]*box1_wh[..., 1]
    box2_area = box2_wh[..., 0]*box2_wh[..., 1]
    union_area = box1_area+box2_area-intersect_area
    iou = intersect_area / torch.clamp(union_area, min=1e-6)
    return iou
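As a quick check of the two points made in this section (scale invariance, and the flat zero for disjoint boxes), here is a minimal scalar sketch in plain Python. The helper `iou_xywh` and the sample boxes are illustrative assumptions, not part of the original code; boxes are (cx, cy, w, h) tuples, matching the xywh layout used by the function above.

```python
def iou_xywh(a, b):
    """IoU of two boxes given as (cx, cy, w, h) tuples (hypothetical helper)."""
    ax1, ay1, ax2, ay2 = a[0] - a[2] / 2, a[1] - a[3] / 2, a[0] + a[2] / 2, a[1] + a[3] / 2
    bx1, by1, bx2, by2 = b[0] - b[2] / 2, b[1] - b[3] / 2, b[0] + b[2] / 2, b[1] + b[3] / 2
    # Intersection width/height, clamped at zero for disjoint boxes
    inter_w = max(min(ax2, bx2) - max(ax1, bx1), 0.0)
    inter_h = max(min(ay2, by2) - max(ay1, by1), 0.0)
    inter = inter_w * inter_h
    union = a[2] * a[3] + b[2] * b[3] - inter
    return inter / max(union, 1e-6)


pred, gt = (2.0, 2.0, 2.0, 2.0), (3.0, 2.0, 2.0, 2.0)

# Partially overlapping boxes: intersection 2, union 6
overlap = iou_xywh(pred, gt)

# Scale invariance: multiplying every coordinate by 10 leaves IoU unchanged
scaled = iou_xywh(tuple(10 * v for v in pred), tuple(10 * v for v in gt))

# Disjoint boxes: IoU is 0 whether the miss is small or large,
# so IoU alone cannot tell the network which prediction is closer
near_miss = iou_xywh((0.0, 0.0, 2.0, 2.0), (3.0, 0.0, 2.0, 2.0))
far_miss = iou_xywh((0.0, 0.0, 2.0, 2.0), (30.0, 0.0, 2.0, 2.0))

print(overlap, scaled, near_miss, far_miss)
```

The last two values are both 0, which is the situation the later variants are meant to fix.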

2.GIOU

Paper: https://arxiv.org/abs/1902.09630

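In object detection, boxes with the same regression loss can have very different IoU values. In figure (a) the l2 loss is identical across the cases, yet the IoU values differ completely; in figure (b) the l1 loss is identical, and the IoU values again differ completely. In addition, when the boxes do not overlap, IoU is 0 and so is its gradient, so the network cannot be trained on those samples. GIoU loss is designed to solve exactly this problem: IoU loss gives no useful signal when the "ground truth" and the "prediction" do not intersect.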
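- GIoU has the following properties:
  1. For any ground-truth and predicted boxes, GIoU <= IoU.
  2. -1 < GIoU <= 1; when IoU equals 1, GIoU also equals 1.
  Unlike IoU loss, GIoU loss can be trained on in every case, including when the boxes do not overlap.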
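For reference, the definition behind these properties can be written as follows. The notation is an assumption borrowed from the usual statement of GIoU: A and B are the two boxes, and C is the smallest axis-aligned box enclosing both.

```latex
\mathrm{GIoU} = \mathrm{IoU} - \frac{\lvert C \setminus (A \cup B) \rvert}{\lvert C \rvert},
\qquad
L_{\mathrm{GIoU}} = 1 - \mathrm{GIoU}
```

The subtracted term is the fraction of the enclosing box not covered by either box. It is never negative, which gives GIoU <= IoU, and it keeps changing as disjoint boxes move apart, which is where the training signal comes from.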

def GIOU(box1, box2):
    """
    Compute GIoU between predicted and ground-truth boxes (use 1 - GIoU as the loss).
    :param box1: tensor [batch, w, h, num_anchor, 4], xywh (center x, center y, width, height), predictions
    :param box2: tensor [batch, w, h, num_anchor, 4], xywh (center x, center y, width, height), ground truth
    :return: tensor [batch, w, h, num_anchor]
    """
    # Corner coordinates of both boxes
    b1_x1, b1_x2 = box1[..., 0] - box1[..., 2] / 2, box1[..., 0] + box1[..., 2] / 2
    b1_y1, b1_y2 = box1[..., 1] - box1[..., 3] / 2, box1[..., 1] + box1[..., 3] / 2
    b2_x1, b2_x2 = box2[..., 0] - box2[..., 2] / 2, box2[..., 0] + box2[..., 2] / 2
    b2_y1, b2_y2 = box2[..., 1] - box2[..., 3] / 2, box2[..., 1] + box2[..., 3] / 2

    box1_xy, box1_wh = box1[..., :2], box1[..., 2:4]
    box1_wh_half = box1_wh / 2.
    box1_mines = box1_xy - box1_wh_half
    box1_maxes = box1_xy + box1_wh_half

    box2_xy, box2_wh = box2[..., :2], box2[..., 2:4]
    box2_wh_half = box2_wh / 2.
    box2_mines = box2_xy - box2_wh_half
    box2_maxes = box2_xy + box2_wh_half

    # IoU between every ground-truth box and its prediction
    intersect_mines = torch.max(box1_mines, box2_mines)
    intersect_maxes = torch.min(box1_maxes, box2_maxes)
    intersect_wh = torch.max(intersect_maxes-intersect_mines, torch.zeros_like(intersect_maxes))
    intersect_area = intersect_wh[..., 0]*intersect_wh[..., 1]
    box1_area = box1_wh[..., 0]*box1_wh[..., 1]
    box2_area = box2_wh[..., 0]*box2_wh[..., 1]
    union_area = box1_area+box2_area-intersect_area
    iou = intersect_area / torch.clamp(union_area, min=1e-6)

    # Width and height of the smallest enclosing box
    cw = torch.max(b1_x2, b2_x2) - torch.min(b1_x1, b2_x1)
    ch = torch.max(b1_y2, b2_y2) - torch.min(b1_y1, b2_y1)
    c_area = cw * ch + 1e-16  # convex area
    # Penalize the part of the enclosing box covered by neither box
    return iou - (c_area - union_area) / c_area
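The GIoU properties listed in this section can be checked numerically with a small scalar sketch. The helper names `corners` and `iou_and_giou` and the sample boxes are assumptions made for illustration; the arithmetic mirrors the tensor code above, without the epsilon terms.

```python
def corners(box):
    """Convert a (cx, cy, w, h) tuple to (x1, y1, x2, y2)."""
    cx, cy, w, h = box
    return cx - w / 2, cy - h / 2, cx + w / 2, cy + h / 2


def iou_and_giou(a, b):
    """Return (IoU, GIoU) for two (cx, cy, w, h) boxes (hypothetical helper)."""
    ax1, ay1, ax2, ay2 = corners(a)
    bx1, by1, bx2, by2 = corners(b)
    inter_w = max(min(ax2, bx2) - max(ax1, bx1), 0.0)
    inter_h = max(min(ay2, by2) - max(ay1, by1), 0.0)
    inter = inter_w * inter_h
    union = a[2] * a[3] + b[2] * b[3] - inter
    iou = inter / union
    # Area of the smallest axis-aligned box enclosing both boxes
    c_area = (max(ax2, bx2) - min(ax1, bx1)) * (max(ay2, by2) - min(ay1, by1))
    return iou, iou - (c_area - union) / c_area


gt = (0.0, 0.0, 2.0, 2.0)

# Perfect match: IoU = GIoU = 1
iou_same, giou_same = iou_and_giou(gt, gt)

# Two disjoint predictions: IoU is 0 for both, but GIoU still ranks them
iou_near, giou_near = iou_and_giou((3.0, 0.0, 2.0, 2.0), gt)
iou_far, giou_far = iou_and_giou((30.0, 0.0, 2.0, 2.0), gt)

print(giou_same, giou_near, giou_far)  # 1.0, then about -0.2, then -0.875
```

Because the farther box gets the lower GIoU, 1 - GIoU keeps decreasing as a disjoint prediction moves toward the target, which plain IoU cannot do.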

3.DIOU

Paper: https://arxiv.org/pdf/1911.08287.pdf

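For the scenario in the figure, computing each IoU-based loss shows that L-DIoU is the one that captures the position of the predicted box relative to the gt box. In the third panel the predicted box and the gt share the same center, so IoU = GIoU = DIoU. When the positions differ, as in the first and second panels (the second is clearly the better prediction), GIoU degenerates to IoU, while the DIoU loss takes a larger value and so still reflects the positional offset.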
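- The advantages of DIoU are:
  1、Like GIoU loss, DIoU loss still gives the bounding box a direction to move in when it does not overlap the target box.
  2、When the bounding boxes match perfectly, the three losses satisfy L-IoU = L-GIoU = L-DIoU = 0; when the boxes are very far apart, L-GIoU = L-DIoU -> 2.
  3、DIoU loss directly minimizes the distance between the two boxes, whereas GIoU loss optimizes the area between them, so DIoU loss converges much faster than GIoU loss.
  4、When one box contains the other, or the two boxes are aligned horizontally or vertically, DIoU loss makes regression very fast, while GIoU loss almost degenerates to IoU loss.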
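Written out, the quantity behind these points is the following. The symbols are assumptions taken from the usual statement of DIoU: b and b^{gt} are the center points of the predicted and ground-truth boxes, \rho is the Euclidean distance, and c is the diagonal length of the smallest box enclosing both.

```latex
\mathrm{DIoU} = \mathrm{IoU} - \frac{\rho^{2}\left(b, b^{gt}\right)}{c^{2}},
\qquad
L_{\mathrm{DIoU}} = 1 - \mathrm{DIoU}
```

The penalty is a normalized center distance, so it is 0 when the centers coincide and approaches 1 as the boxes move far apart, which is why the loss tends to 2 in that limit.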


def DIOU(box1, box2):
    """
    Compute DIoU between predicted and ground-truth boxes (use 1 - DIoU as the loss).
    :param box1: tensor [batch, w, h, num_anchor, 4], xywh (center x, center y, width, height), predictions
    :param box2: tensor [batch, w, h, num_anchor, 4], xywh (center x, center y, width, height), ground truth
    :return: tensor [batch, w, h, num_anchor]
    """
    # Corner coordinates of both boxes
    b1_x1, b1_x2 = box1[..., 0] - box1[..., 2] / 2, box1[..., 0] + box1[..., 2] / 2
    b1_y1, b1_y2 = box1[..., 1] - box1[..., 3] / 2, box1[..., 1] + box1[..., 3] / 2
    b2_x1, b2_x2 = box2[..., 0] - box2[..., 2] / 2, box2[..., 0] + box2[..., 2] / 2
    b2_y1, b2_y2 = box2[..., 1] - box2[..., 3] / 2, box2[..., 1] + box2[..., 3] / 2

    box1_xy, box1_wh = box1[..., :2], box1[..., 2:4]
    box1_wh_half = box1_wh / 2.
    box1_mines = box1_xy - box1_wh_half
    box1_maxes = box1_xy + box1_wh_half

    box2_xy, box2_wh = box2[..., :2], box2[..., 2:4]
    box2_wh_half = box2_wh / 2.
    box2_mines = box2_xy - box2_wh_half
    box2_maxes = box2_xy + box2_wh_half

    # IoU between every ground-truth box and its prediction
    intersect_mines = torch.max(box1_mines, box2_mines)
    intersect_maxes = torch.min(box1_maxes, box2_maxes)
    intersect_wh = torch.max(intersect_maxes-intersect_mines, torch.zeros_like(intersect_maxes))
    intersect_area = intersect_wh[..., 0]*intersect_wh[..., 1]
    box1_area = box1_wh[..., 0]*box1_wh[..., 1]
    box2_area = box2_wh[..., 0]*box2_wh[..., 1]
    union_area = box1_area+box2_area-intersect_area
    iou = intersect_area / torch.clamp(union_area, min=1e-6)

    # Width and height of the smallest enclosing box
    cw = torch.max(b1_x2, b2_x2) - torch.min(b1_x1, b2_x1)  # convex (smallest enclosing box) width
    ch = torch.max(b1_y2, b2_y2) - torch.min(b1_y1, b2_y1)
    # Squared diagonal of the enclosing box
    c2 = cw ** 2 + ch ** 2 + 1e-16

    # Squared distance between the two box centers
    rho2 = ((b2_x1 + b2_x2) - (b1_x1 + b1_x2)) ** 2 / 4 + ((b2_y1 + b2_y2) - (b1_y1 + b1_y2)) ** 2 / 4
    return iou - rho2 / c2
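The inclusion case discussed in this section is easy to reproduce with scalars. In the sketch below a 2x2 prediction sits inside a 4x4 ground-truth box, once centered and once shifted; the helper `iou_giou_diou` and the box values are assumptions chosen for illustration, and the epsilon terms of the tensor code are omitted.

```python
def iou_giou_diou(a, b):
    """Return (IoU, GIoU, DIoU) for two (cx, cy, w, h) boxes (hypothetical helper)."""
    ax1, ay1, ax2, ay2 = a[0] - a[2] / 2, a[1] - a[3] / 2, a[0] + a[2] / 2, a[1] + a[3] / 2
    bx1, by1, bx2, by2 = b[0] - b[2] / 2, b[1] - b[3] / 2, b[0] + b[2] / 2, b[1] + b[3] / 2
    inter_w = max(min(ax2, bx2) - max(ax1, bx1), 0.0)
    inter_h = max(min(ay2, by2) - max(ay1, by1), 0.0)
    inter = inter_w * inter_h
    union = a[2] * a[3] + b[2] * b[3] - inter
    iou = inter / union
    # Smallest enclosing box: width, height, area
    cw = max(ax2, bx2) - min(ax1, bx1)
    ch = max(ay2, by2) - min(ay1, by1)
    giou = iou - (cw * ch - union) / (cw * ch)
    # Squared center distance over squared enclosing-box diagonal
    rho2 = (a[0] - b[0]) ** 2 + (a[1] - b[1]) ** 2
    diou = iou - rho2 / (cw ** 2 + ch ** 2)
    return iou, giou, diou


gt = (0.0, 0.0, 4.0, 4.0)
centered = iou_giou_diou((0.0, 0.0, 2.0, 2.0), gt)  # prediction at the gt center
offset = iou_giou_diou((1.0, 1.0, 2.0, 2.0), gt)    # same size, shifted toward a corner

print(centered)  # (0.25, 0.25, 0.25)
print(offset)    # (0.25, 0.25, 0.1875)
```

IoU and GIoU are identical for the two placements because the enclosing box is the gt box itself, so only DIoU distinguishes the centered prediction from the shifted one.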

4.CIOU

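Three geometric factors matter for a predicted bbox: overlap area, center-point distance, and aspect ratio. IoU considers only the overlap area, and GIoU still relies heavily on the IoU term. DIoU considers both overlap area and center-point distance. Going one step further, the consistency of the aspect ratio between the two boxes is also an important geometric factor, so CIoU loss is built on DIoU by adding an aspect-ratio consistency term.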
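- Advantages of CIoU:
  - It gives a better description of how the boxes overlap during regression, since it accounts for aspect ratio as well as overlap area and center distance.
  - Experiments in the paper report that training with CIoU loss and then applying DIoU-NMS at inference improves results further.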
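The extra term can be written out as below. The notation is an assumption following the usual statement of CIoU: w and h are the predicted width and height, w^{gt} and h^{gt} those of the ground truth, and \rho and c are the center distance and enclosing-box diagonal used by DIoU.

```latex
\mathrm{CIoU} = \mathrm{IoU} - \frac{\rho^{2}\left(b, b^{gt}\right)}{c^{2}} - \alpha v,
\qquad
v = \frac{4}{\pi^{2}} \left( \arctan\frac{w^{gt}}{h^{gt}} - \arctan\frac{w}{h} \right)^{2},
\qquad
\alpha = \frac{v}{(1 - \mathrm{IoU}) + v}
```

Here v measures how far apart the two aspect ratios are, and \alpha is a trade-off weight that grows as the overlap improves, so aspect ratio matters more once the boxes already overlap well.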

import math


def CIOU(box1, box2):
    """
    Compute CIoU between predicted and ground-truth boxes (use 1 - CIoU as the loss).
    :param box1: tensor [batch, w, h, num_anchor, 4], xywh (center x, center y, width, height), predictions
    :param box2: tensor [batch, w, h, num_anchor, 4], xywh (center x, center y, width, height), ground truth
    :return: tensor [batch, w, h, num_anchor]
    """
    box1_xy, box1_wh = box1[..., :2], box1[..., 2:4]
    box1_wh_half = box1_wh / 2.
    box1_mines = box1_xy - box1_wh_half
    box1_maxes = box1_xy + box1_wh_half

    box2_xy, box2_wh = box2[..., :2], box2[..., 2:4]
    box2_wh_half = box2_wh / 2.
    box2_mines = box2_xy - box2_wh_half
    box2_maxes = box2_xy + box2_wh_half

    # IoU between every ground-truth box and its prediction
    intersect_mines = torch.max(box1_mines, box2_mines)
    intersect_maxes = torch.min(box1_maxes, box2_maxes)
    intersect_wh = torch.max(intersect_maxes-intersect_mines, torch.zeros_like(intersect_maxes))
    intersect_area = intersect_wh[..., 0]*intersect_wh[..., 1]
    box1_area = box1_wh[..., 0]*box1_wh[..., 1]
    box2_area = box2_wh[..., 0]*box2_wh[..., 1]
    union_area = box1_area+box2_area-intersect_area
    iou = intersect_area / torch.clamp(union_area, min=1e-6)

    # Squared distance between the two box centers
    center_distance = torch.sum(torch.pow((box1_xy-box2_xy), 2), dim=-1)

    # Top-left and bottom-right corners of the smallest box enclosing both boxes
    enclose_mines = torch.min(box1_mines, box2_mines)
    enclose_maxes = torch.max(box1_maxes, box2_maxes)
    enclose_wh = torch.max(enclose_maxes-enclose_mines, torch.zeros_like(intersect_maxes))

    # Squared diagonal of the enclosing box; subtracting the ratio gives the DIoU part
    enclose_diagonal = torch.sum(torch.pow(enclose_wh, 2), dim=-1)
    ciou = iou - 1. * center_distance / torch.clamp(enclose_diagonal, min=1e-6)

    # Aspect-ratio consistency term v and its trade-off weight alpha
    box1_atan = torch.atan(box1_wh[..., 0]/torch.clamp(box1_wh[..., 1], min=1e-6))
    box2_atan = torch.atan(box2_wh[..., 0]/torch.clamp(box2_wh[..., 1], min=1e-6))
    v = (4/(math.pi**2))*torch.pow(box1_atan-box2_atan, 2)
    alpha = v / torch.clamp((1.-iou+v), min=1e-6)
    ciou = ciou - alpha * v
    return ciou
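To see what the aspect-ratio term adds, the scalar sketch below compares two predictions that share the ground truth's center and have the same IoU, but differ in shape. The helper `diou_and_ciou` and the box values are assumptions for illustration; the formulas follow the tensor code above, without the clamping.

```python
import math


def diou_and_ciou(pred, gt):
    """Return (DIoU, CIoU) for two (cx, cy, w, h) boxes (hypothetical helper)."""
    px1, py1, px2, py2 = pred[0] - pred[2] / 2, pred[1] - pred[3] / 2, pred[0] + pred[2] / 2, pred[1] + pred[3] / 2
    gx1, gy1, gx2, gy2 = gt[0] - gt[2] / 2, gt[1] - gt[3] / 2, gt[0] + gt[2] / 2, gt[1] + gt[3] / 2
    inter_w = max(min(px2, gx2) - max(px1, gx1), 0.0)
    inter_h = max(min(py2, gy2) - max(py1, gy1), 0.0)
    inter = inter_w * inter_h
    union = pred[2] * pred[3] + gt[2] * gt[3] - inter
    iou = inter / union
    # DIoU part: squared center distance over squared enclosing-box diagonal
    cw = max(px2, gx2) - min(px1, gx1)
    ch = max(py2, gy2) - min(py1, gy1)
    rho2 = (pred[0] - gt[0]) ** 2 + (pred[1] - gt[1]) ** 2
    diou = iou - rho2 / (cw ** 2 + ch ** 2)
    # Aspect-ratio consistency term and its weight
    v = (4 / math.pi ** 2) * (math.atan(pred[2] / pred[3]) - math.atan(gt[2] / gt[3])) ** 2
    alpha = v / (1.0 - iou + v)
    return diou, diou - alpha * v


gt = (0.0, 0.0, 4.0, 2.0)                               # wide box, w/h = 2
same_shape = diou_and_ciou((0.0, 0.0, 2.0, 1.0), gt)    # w/h = 2, same ratio as gt
wrong_shape = diou_and_ciou((0.0, 0.0, 1.0, 2.0), gt)   # w/h = 0.5, same area and center

print(same_shape)   # DIoU and CIoU are both 0.25
print(wrong_shape)  # DIoU is still 0.25, CIoU is lower
```

Both predictions get the same DIoU, so only the CIoU value penalizes the one whose aspect ratio disagrees with the ground truth.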