YOLOv8 Object Detection: Innovative Improvements and Hands-On Case Studies Column
Column table of contents: the YOLOv8 effective-improvement series and project practice catalog, covering innovations in convolutions, backbones, attention mechanisms, detection heads, and more, plus a range of hands-on object detection and segmentation projects
Column link: YOLOv8 fundamentals + innovative improvements + practical case studies
Introduction
Abstract
In recent years, deep-learning-based face detection algorithms have made significant progress. These algorithms can generally be divided into two categories: two-stage detectors such as Faster R-CNN, and one-stage detectors such as YOLO. Because they strike a better balance between accuracy and speed, one-stage detectors have been widely adopted in many applications. In this paper, we propose a real-time face detector based on the one-stage detector YOLOv5, named YOLO-FaceV2. We design a receptive-field enhancement module called RFE to enlarge the receptive field for small faces, and use NWD Loss to compensate for the sensitivity of IoU to location deviations of tiny objects. To address face occlusion, we propose an attention module named SEAM and introduce Repulsion Loss. In addition, we use a weighting function called Slide to address the imbalance between easy and hard samples, and use information from the effective receptive field to design the anchors. Experimental results on the WiderFace dataset show that our face detector outperforms YOLO and its variants on all of the Easy, Medium, and Hard subsets. The source code is available at https://github.com/Krasjet-Yu/YOLO-FaceV2.
Links
Paper: [paper link]
Code: [code link]
Basic Principle
SEAM (Separated and Enhanced Attention Module) is a module introduced in YOLO-FaceV2 to strengthen the learning of facial features, especially when faces are occluded.
- Multi-head attention: SEAM employs a multi-head attention mechanism to emphasize face regions in the image while suppressing the background, so the model focuses on the facial features that matter most and detection accuracy improves.
- Depthwise separable convolution: the first part of SEAM uses depthwise separable convolution, which operates channel by channel. This sharply reduces the parameter count while still learning the importance of each channel, letting the model extract more representative features.
- Learning channel relations: although depthwise convolution saves parameters, it ignores the information flow between channels. To compensate, SEAM follows the depthwise convolution with pointwise 1x1 convolutions that recombine the channels, strengthening inter-channel connections. This helps the model relate occluded faces to unoccluded ones in occlusion scenes.
- Fully connected fusion: after the channel relations are learned, SEAM fuses the per-channel information with a two-layer fully connected network, further tightening the connections between channels so the model handles face occlusion more effectively.
- Exponential normalization: the module's output is passed through an exponential function, expanding the value range from [0, 1] to [1, e]. This monotonic mapping makes the result more tolerant of position errors and improves robustness.
- Applying the attention: finally, the SEAM output is used as attention weights and multiplied with the original features, making the model more responsive to occluded faces.
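The exponential normalization step can be illustrated numerically with a minimal, model-independent sketch (`exp_attention` is a hypothetical helper for illustration only): a sigmoid score in [0, 1] is mapped by exp into [1, e], so even weakly-scored channels are scaled by at least 1 rather than suppressed toward 0.

```python
import math

def exp_attention(sigmoid_scores):
    """Map sigmoid outputs in [0, 1] to attention weights in [1, e]."""
    return [math.exp(s) for s in sigmoid_scores]

weights = exp_attention([0.0, 0.5, 1.0])
print(weights)  # [1.0, 1.6487..., 2.7182...]
# All weights are >= 1, so the re-weighting never zeroes a channel out; this
# monotonic mapping is what makes the result more tolerant of small errors.
```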
Core Code
class SEAM(nn.Module):
    def __init__(self, c1, c2, n, reduction=16):
        super(SEAM, self).__init__()
        if c1 != c2:
            c2 = c1  # SEAM preserves the channel count
        # n stacked blocks: a residual depthwise 3x3 conv (groups=c2) followed
        # by a pointwise 1x1 conv that re-mixes the channels
        self.DCovN = nn.Sequential(
            *[nn.Sequential(
                Residual(nn.Sequential(
                    nn.Conv2d(in_channels=c2, out_channels=c2, kernel_size=3, stride=1, padding=1, groups=c2),
                    nn.GELU(),
                    nn.BatchNorm2d(c2)
                )),
                nn.Conv2d(in_channels=c2, out_channels=c2, kernel_size=1, stride=1, padding=0, groups=1),
                nn.GELU(),
                nn.BatchNorm2d(c2)
            ) for _ in range(n)]
        )
        self.avg_pool = torch.nn.AdaptiveAvgPool2d(1)
        # two-layer bottleneck MLP that fuses the per-channel statistics
        self.fc = nn.Sequential(
            nn.Linear(c2, c2 // reduction, bias=False),
            nn.ReLU(inplace=True),
            nn.Linear(c2 // reduction, c2, bias=False),
            nn.Sigmoid()
        )
        self._initialize_weights()
        self.initialize_layer(self.fc)

    def forward(self, x):
        b, c, _, _ = x.size()
        y = self.DCovN(x)
        y = self.avg_pool(y).view(b, c)
        y = self.fc(y).view(b, c, 1, 1)
        y = torch.exp(y)           # map sigmoid outputs from [0, 1] to [1, e]
        return x * y.expand_as(x)  # channel-wise re-weighting of the input

    def _initialize_weights(self):
        for m in self.modules():
            if isinstance(m, nn.Conv2d):
                nn.init.xavier_uniform_(m.weight, gain=1)
            elif isinstance(m, nn.BatchNorm2d):
                nn.init.constant_(m.weight, 1)
                nn.init.constant_(m.bias, 0)

    def initialize_layer(self, layer):
        if isinstance(layer, (nn.Conv2d, nn.Linear)):
            torch.nn.init.normal_(layer.weight, mean=0., std=0.001)
            if layer.bias is not None:
                torch.nn.init.constant_(layer.bias, 0)
Download the YOLOv8 Code
Direct download
Git Clone
git clone https://github.com/ultralytics/ultralytics
Install the Environment
Enter the code root directory and install the dependencies.
pip install -r requirements.txt -i https://pypi.tuna.tsinghua.edu.cn/simple
In the latest versions, the official repository has deprecated the requirements.txt file; all necessary code and dependencies are now bundled into the ultralytics package. Installing this single ultralytics library therefore provides all the required functionality and environment dependencies.
pip install ultralytics
Add the Code
Under the ultralytics/nn/ directory in the repo root, create a new attention directory, then create a Python file named SEAM.py and copy the following code into it.
import torch
import torch.nn as nn


class Residual(nn.Module):
    """Wrap a sub-network with a skip connection: y = fn(x) + x."""
    def __init__(self, fn):
        super(Residual, self).__init__()
        self.fn = fn

    def forward(self, x):
        return self.fn(x) + x


class SEAM(nn.Module):
    def __init__(self, c1, n=1, reduction=16):
        super(SEAM, self).__init__()
        c2 = c1  # SEAM preserves the channel count
        # n stacked blocks: a residual depthwise 3x3 conv (groups=c2) followed
        # by a pointwise 1x1 conv that re-mixes the channels
        self.DCovN = nn.Sequential(
            *[nn.Sequential(
                Residual(nn.Sequential(
                    nn.Conv2d(in_channels=c2, out_channels=c2, kernel_size=3, stride=1, padding=1, groups=c2),
                    nn.GELU(),
                    nn.BatchNorm2d(c2)
                )),
                nn.Conv2d(in_channels=c2, out_channels=c2, kernel_size=1, stride=1, padding=0, groups=1),
                nn.GELU(),
                nn.BatchNorm2d(c2)
            ) for _ in range(n)]
        )
        self.avg_pool = torch.nn.AdaptiveAvgPool2d(1)
        # two-layer bottleneck MLP that fuses the per-channel statistics
        self.fc = nn.Sequential(
            nn.Linear(c2, c2 // reduction, bias=False),
            nn.ReLU(inplace=True),
            nn.Linear(c2 // reduction, c2, bias=False),
            nn.Sigmoid()
        )
        self._initialize_weights()
        self.initialize_layer(self.fc)

    def forward(self, x):
        b, c, _, _ = x.size()
        y = self.DCovN(x)
        y = self.avg_pool(y).view(b, c)
        y = self.fc(y).view(b, c, 1, 1)
        y = torch.exp(y)           # map sigmoid outputs from [0, 1] to [1, e]
        return x * y.expand_as(x)  # channel-wise re-weighting of the input

    def _initialize_weights(self):
        for m in self.modules():
            if isinstance(m, nn.Conv2d):
                nn.init.xavier_uniform_(m.weight, gain=1)
            elif isinstance(m, nn.BatchNorm2d):
                nn.init.constant_(m.weight, 1)
                nn.init.constant_(m.bias, 0)

    def initialize_layer(self, layer):
        if isinstance(layer, (nn.Conv2d, nn.Linear)):
            torch.nn.init.normal_(layer.weight, mean=0., std=0.001)
            if layer.bias is not None:
                torch.nn.init.constant_(layer.bias, 0)
Registration
Make the following changes in ultralytics/nn/tasks.py:
Step 1:
from ultralytics.nn.attention.SEAM import SEAM
Step 2: modify def parse_model(d, ch, verbose=True):
        elif m in {SEAM}:
            c2 = ch[f]
            args = [c2, *args]
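What this branch does can be sketched with a toy, stdlib-only mock (resolve_seam_args is a hypothetical helper for illustration, not part of the real Ultralytics parse_model): since SEAM preserves the channel count, the output channels c2 are taken directly from the source layer's channels ch[f], and c2 is prepended to the yaml args so it arrives as the module's c1.

```python
# Toy mock of the parse_model branch above (not the real Ultralytics code).
def resolve_seam_args(ch, f, args):
    """ch: per-layer output channels; f: the 'from' index; args: yaml args."""
    c2 = ch[f]               # SEAM keeps channels unchanged, so out == in
    return c2, [c2, *args]   # prepend c1 so SEAM(c1, ...) knows its width

ch = [64, 128, 256]                        # channels of previously built layers
c2, args = resolve_seam_args(ch, f=-1, args=[])
print(c2, args)  # 256 [256] -- SEAM here would run on 256-channel features
```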
Configure yolov8-SEAM.yaml
ultralytics/cfg/models/v8/yolov8-SEAM.yaml
# Ultralytics YOLO 🚀, AGPL-3.0 license
# YOLOv8 object detection model with P3-P5 outputs. For Usage examples see https://docs.ultralytics.com/tasks/detect

# Parameters
nc: 80 # number of classes
scales: # model compound scaling constants, i.e. 'model=yolov8n.yaml' will call yolov8.yaml with scale 'n'
  # [depth, width, max_channels]
  n: [0.33, 0.25, 1024] # YOLOv8n summary: 225 layers, 3157200 parameters, 3157184 gradients, 8.9 GFLOPs
  s: [0.33, 0.50, 1024] # YOLOv8s summary: 225 layers, 11166560 parameters, 11166544 gradients, 28.8 GFLOPs
  m: [0.67, 0.75, 768] # YOLOv8m summary: 295 layers, 25902640 parameters, 25902624 gradients, 79.3 GFLOPs
  l: [1.00, 1.00, 512] # YOLOv8l summary: 365 layers, 43691520 parameters, 43691504 gradients, 165.7 GFLOPs
  x: [1.00, 1.25, 512] # YOLOv8x summary: 365 layers, 68229648 parameters, 68229632 gradients, 258.5 GFLOPs

# YOLOv8.0n backbone
backbone:
  # [from, repeats, module, args]
  - [-1, 1, Conv, [64, 3, 2]] # 0-P1/2
  - [-1, 1, Conv, [128, 3, 2]] # 1-P2/4
  - [-1, 3, C2f, [128, True]]
  - [-1, 1, Conv, [256, 3, 2]] # 3-P3/8
  - [-1, 6, C2f, [256, True]]
  - [-1, 1, Conv, [512, 3, 2]] # 5-P4/16
  - [-1, 6, C2f, [512, True]]
  - [-1, 1, Conv, [1024, 3, 2]] # 7-P5/32
  - [-1, 3, C2f, [1024, True]]
  - [-1, 1, SPPF, [1024, 5]] # 9

# YOLOv8.0n head
head:
  - [-1, 1, nn.Upsample, [None, 2, 'nearest']] # 10
  - [[-1, 6], 1, Concat, [1]] # 11 cat backbone P4
  - [-1, 3, C2f, [512]] # 12
  - [-1, 1, nn.Upsample, [None, 2, 'nearest']] # 13
  - [[-1, 4], 1, Concat, [1]] # 14 cat backbone P3
  - [-1, 3, C2f, [256]] # 15 (P3/8-small)
  - [-1, 1, SEAM, []] # 16
  - [-1, 1, Conv, [256, 3, 2]] # 17
  - [[-1, 12], 1, Concat, [1]] # 18 cat head P4
  - [-1, 3, C2f, [512]] # 19 (P4/16-medium)
  - [-1, 1, SEAM, []] # 20
  - [-1, 1, Conv, [512, 3, 2]] # 21
  - [[-1, 9], 1, Concat, [1]] # 22 cat head P5
  - [-1, 3, C2f, [1024]] # 23 (P5/32-large)
  - [-1, 1, SEAM, []] # 24
  - [[16, 20, 24], 1, Detect, [nc]] # 25 Detect(P3, P4, P5), fed by the three SEAM outputs
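Index bookkeeping is the most error-prone part of editing these yaml files: every SEAM row shifts the absolute index of all later layers. A small stdlib helper can enumerate the head rows above (the backbone occupies indices 0-9) and print where each SEAM lands, which is how the sources passed to Detect can be double-checked:

```python
# Enumerate the head rows of the yaml above; the backbone occupies indices 0-9.
head = [
    "nn.Upsample", "Concat", "C2f",   # 10-12: P4 upsample path
    "nn.Upsample", "Concat", "C2f",   # 13-15: P3 path (small objects)
    "SEAM", "Conv", "Concat", "C2f",  # P3 SEAM, then downsample to P4
    "SEAM", "Conv", "Concat", "C2f",  # P4 SEAM, then downsample to P5
    "SEAM",                           # P5 SEAM
    "Detect",
]
seam_indices = [i + 10 for i, m in enumerate(head) if m == "SEAM"]
print(seam_indices)  # [16, 20, 24] -- the per-scale SEAM outputs
```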
Experiment
Training script
from ultralytics import YOLO

if __name__ == "__main__":
    # Build the model from the modified config and print a layer summary;
    # keeping this inside the main guard avoids issues with workers > 0
    # on platforms that spawn dataloader processes (e.g. Windows).
    yaml = 'ultralytics/cfg/models/v8/yolov8-SEAM.yaml'
    model = YOLO(yaml)
    model.info()

    results = model.train(data='coco128.yaml',
                          name='SEAM',
                          epochs=10,
                          workers=8,
                          batch=1,
                          )