SSD Network Architecture Analysis and Default Box Design

Latest recommended article published on 2025-01-12 23:09:06

Gallant Hu

License: CC 4.0 BY-SA

Category: Computer Vision

Article link: https://blog.csdn.net/weixin_42108090/article/details/109269727

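The SSD model combines default boxes with feature maps of different resolutions and predicts offsets relative to the default boxes rather than absolute bounding-box coordinates. It uses VGG16 as the feature-extraction backbone and places default boxes of different scales and aspect ratios on selected layers. Each of these layers predicts class scores and offsets, and the final detections are produced after post-processing such as NMS. Default box sizes are derived from the scale bounds s_min and s_max together with several aspect ratios. Different feature map layers use different numbers of default boxes per cell; in total, SSD300 predicts 8732 boxes.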

Figure 1. SSD300 model architecture

Default boxes and aspect ratios. We associate a set of default bounding boxes with each feature map cell, for multiple feature maps at the top of the network. The default boxes tile the feature map in a convolutional manner, so that the position of each box relative to its corresponding cell is fixed. At each feature map cell (a single location on the feature map is called a cell), we predict the offsets relative to the default box shapes in the cell, as well as the per-class scores that indicate the presence of a class instance in each of those boxes. Specifically, for each box out of $k$ at a given location, we compute $c$ class scores and the 4 offsets relative to the original default box shape. This results in a total of $(c + 4) k$ filters that are applied around each location in the feature map, yielding $(c + 4) k m n$ outputs for an $m \times n$ feature map.
For an illustration of default boxes, please refer to Fig. 1. Our default boxes are similar to the anchor boxes used in Faster R-CNN [2]; however, we apply them to several feature maps of different resolutions. Allowing different default box shapes in several feature maps lets us efficiently discretize the space of possible output box shapes.
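The counting argument in the excerpt can be checked with a few lines of Python. This is a minimal sketch, not code from the original post: the feature map sizes (38, 19, 10, 5, 3, 1) and the boxes per cell (4, 6, 6, 6, 4, 4) are the SSD300 settings as I recall them from the SSD paper, and the function names are made up for illustration.

```python
# Assumed SSD300 configuration (from the SSD paper, not stated in this post):
# side length of each square prediction feature map and default boxes per cell.
FEATURE_MAP_SIZES = [38, 19, 10, 5, 3, 1]
BOXES_PER_CELL = [4, 6, 6, 6, 4, 4]


def filters_per_location(c, k):
    """(c + 4) * k filters: c class scores plus 4 offsets for each of k boxes."""
    return (c + 4) * k


def outputs_per_feature_map(c, k, m, n):
    """(c + 4) * k * m * n outputs for an m x n feature map."""
    return filters_per_location(c, k) * m * n


def total_default_boxes(sizes=FEATURE_MAP_SIZES, boxes_per_cell=BOXES_PER_CELL):
    """Sum of m * n * k over all prediction layers."""
    return sum(s * s * k for s, k in zip(sizes, boxes_per_cell))


if __name__ == "__main__":
    c = 21  # assumed: 20 Pascal VOC classes plus background
    print(filters_per_location(c, 4))             # filters on the 38x38 layer
    print(outputs_per_feature_map(c, 4, 38, 38))  # outputs of the 38x38 layer
    print(total_default_boxes())                  # 8732 under the assumed config
```

Under these assumed settings the per-layer box counts are 5776, 2166, 600, 150, 36, and 4, which add up to the 8732 boxes usually quoted for SSD300.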

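The SSD model does not regress the center coordinates, width, and height of a bounding box directly. Instead, it works with the offsets of the bounding box relative to a default box.

An anchor is a prior box: it is fixed, and what the network learns is the offset relative to that prior box.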

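Suppose the position of a default box is written as $d=(d_{cx}, d_{cy}, d_{w}, d_{h})$ and the corresponding bounding box as $b=(b_{cx}, b_{cy}, b_w, b_h)$, where $cx, cy$ are the center coordinates and $w, h$ are the width and height of the box. The model's prediction target $t=(t_x, t_y, t_w, t_h)$ for the bounding box can then be written as:
$t_x=(b_{cx}-d_{cx})/d_w$ $t_y=(b_{cy}-d_{cy})/d_h$ $t_w=\ln(b_w/d_w)$ $t_h=\ln(b_h/d_h)$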
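The offset encoding and its inverse can be sketched in plain Python as follows. This is an illustrative sketch rather than the post's or any library's implementation: `encode` and `decode` are hypothetical names, boxes are `(cx, cy, w, h)` tuples, and the height term $t_h=\ln(b_h/d_h)$ is assumed by symmetry with the width term. Many practical implementations also divide the offsets by "variance" constants; that step is left out here.

```python
import math


def encode(b, d):
    """Offsets t of bounding box b relative to default box d (both cx, cy, w, h)."""
    b_cx, b_cy, b_w, b_h = b
    d_cx, d_cy, d_w, d_h = d
    t_x = (b_cx - d_cx) / d_w   # center shift, normalized by default box width
    t_y = (b_cy - d_cy) / d_h   # center shift, normalized by default box height
    t_w = math.log(b_w / d_w)   # log-scale width ratio
    t_h = math.log(b_h / d_h)   # log-scale height ratio (assumed by symmetry)
    return (t_x, t_y, t_w, t_h)


def decode(t, d):
    """Invert encode: recover the bounding box from offsets t and default box d."""
    t_x, t_y, t_w, t_h = t
    d_cx, d_cy, d_w, d_h = d
    b_cx = d_cx + t_x * d_w
    b_cy = d_cy + t_y * d_h
    b_w = d_w * math.exp(t_w)
    b_h = d_h * math.exp(t_h)
    return (b_cx, b_cy, b_w, b_h)


if __name__ == "__main__":
    default_box = (0.5, 0.5, 0.2, 0.4)
    target_box = (0.6, 0.4, 0.4, 0.2)
    offsets = encode(target_box, default_box)
    print(offsets)
    print(decode(offsets, default_box))  # should reproduce target_box up to rounding
```

Because the default box is fixed, the network only has to output `offsets`; at inference time, `decode` turns the predicted offsets back into box coordinates before NMS is applied.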