About the DFL module in YOLOv8 (PyTorch and TensorRT)

lindsayshuo

Last modified: 2024-12-04 08:36:34

License: CC 4.0 BY-SA

Tags: YOLO, pytorch, artificial intelligence

First published: 2024-03-07 10:45:17

Article link: https://blog.csdn.net/weixin_43269994/article/details/136524684

For reference, see the project I adapted:

https://github.com/lindsayshuo/yolov8-cls-tensorrtx

First, the code:

import torch
import torch.nn as nn


class DFL(nn.Module):
    """
    Integral module of Distribution Focal Loss (DFL).

    Proposed in Generalized Focal Loss https://ieeexplore.ieee.org/document/9792391
    """

    def __init__(self, c1=16):
        """Build a frozen 1x1 conv whose c1 weights are the bin indices 0..c1-1."""
        super().__init__()
        self.conv = nn.Conv2d(c1, 1, 1, bias=False).requires_grad_(False)
        x = torch.arange(c1, dtype=torch.float)
        self.conv.weight.data[:] = nn.Parameter(x.view(1, c1, 1, 1))
        self.c1 = c1

    def forward(self, x):
        """Decode bin logits of shape (b, 4 * c1, a) into expected values of shape (b, 4, a)."""
        b, c, a = x.shape  # batch, channels (4 * c1), anchors

        # Debug output: the fixed conv weights, and the input before and after reshaping.
        print("self.conv.weight.data[:] is : ", self.conv.weight.data[:])
        print("self.conv.weight.data[:] shape is : ", self.conv.weight.data[:].shape)
        print("x is : ", x)
        print("x.shape is : ", x.shape)
        print("x.view(b,4,self.c1,a) is : ", x.view(b, 4, self.c1, a))
        print("x.view(b,4,self.c1,a).shape is : ", x.view(b, 4, self.c1, a).shape)

        # (b, 4*c1, a) -> (b, 4, c1, a) -> (b, c1, 4, a); softmax over the c1 bins,
        # then the frozen 1x1 conv sums index * probability: (b, 1, 4, a) -> (b, 4, a).
        return self.conv(x.view(b, 4, self.c1, a).transpose(2, 1).softmax(1)).view(b, 4, a)
        # return self.conv(x.view(b, self.c1, 4, a).softmax(1)).view(b, 4, a)
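
What the module computes can also be sketched in plain Python, without PyTorch. The helper names below (softmax, dfl_decode, dfl_decode_box) are hypothetical and exist only for this illustration. The sketch assumes that one anchor's 4 * c1 channel values are laid out as four consecutive groups of c1 logits, which is what x.view(b, 4, self.c1, a) implies.

```python
import math


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    m = max(logits)
    exps = [math.exp(v - m) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]


def dfl_decode(logits):
    """Expected bin index: sum over i of i * softmax(logits)[i].

    This mirrors the frozen 1x1 conv, whose weights are 0, 1, ..., c1-1.
    """
    probs = softmax(logits)
    return sum(i * p for i, p in enumerate(probs))


def dfl_decode_box(channels, c1=16):
    """Decode one anchor's 4 * c1 logits into 4 expected values.

    Assumption: the channels are four consecutive groups of c1 logits.
    """
    if len(channels) != 4 * c1:
        raise ValueError("expected 4 * c1 channel values")
    return [dfl_decode(channels[k * c1:(k + 1) * c1]) for k in range(4)]


# Uniform logits: every bin is equally likely, so the result is the mean of 0..15.
print(dfl_decode([0.0] * 16))  # 7.5

# A distribution sharply peaked at bin 3 decodes to a value very close to 3.
peaked = [0.0] * 16
peaked[3] = 100.0
print(dfl_decode(peaked))
```

Because the output is a probability-weighted average of the bin indices, it always lies between 0 and c1 - 1 and can take fractional values, even though each bin is an integer.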

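The DFL class is a neural-network module that inherits from nn.Module, the standard way to define a custom layer in PyTorch. Despite its name, the class does not compute a loss. As its docstring says, it is the integral module that accompanies Distribution Focal Loss (DFL), a concept proposed in the paper "Generalized Focal Loss": it converts the distribution predicted over c1 bins into a single expected value. The code is explained in detail below: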
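
Before the line-by-line walkthrough, the computation can be written as a formula. The symbols $z_i$, $p_i$, and $\hat{d}$ are introduced here for illustration and do not appear in the code; $z_0, \dots, z_{c_1-1}$ stand for the c1 logits of one box side at one anchor.

```latex
p_i = \frac{\exp(z_i)}{\sum_{j=0}^{c_1-1} \exp(z_j)},
\qquad
\hat{d} = \sum_{i=0}^{c_1-1} i \, p_i
```

The softmax yields the probabilities $p_i$, and the weights $0, 1, \dots, c_1 - 1$ stored in the 1x1 convolution supply the factor $i$. With the default c1 = 16, $\hat{d}$ therefore lies in the range $[0, 15]$.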

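1. class DFL(nn.Module): defines a class named DFL that inherits from nn.Module, which makes it a PyTorch layer.
2. def __init__(self, c1=16): the initializer of the DFL class. It takes one parameter, c1, with a default value of 16, which is the number of input channels.
3. super().__init__(): calls the initializer of the parent class nn.Module, which is standard practice when defining a PyTorch model.
4. self.conv = nn.Conv2d(c1, 1, 1, bias=False).requires_grad_(False): defines a convolution layer with c1 input channels, 1 output channel, a 1x1 kernel, and no bias. It does not require gradients, so its weights are not updated during training.
5. x = torch.arange(c1, dtype=torch.float): creates a one-dimensional tensor of length c1 holding the consecutive values 0 to c1-1 as floats.
6. self.conv.weight.data[:] = nn.Parameter(x.view(1, c1, 1, 1)): initializes the weights of the convolution layer. x is reshaped into a four-dimensional tensor of shape (1, c1, 1, 1) and copied into the layer's weights.
7. self.c1 = c1: stores the number of input channels as an attribute.
8. def forward(self, x): defines the module's forward pass, where x is the input tensor.
9. b, c, a = x.shape: reads the shape of the input tensor x, which is assumed to be three-dimensional. b is the batch size, c is the number of channels (it must equal 4 * c1 for the later view call to succeed), and a is the number of anchors (note: anchors are commonly used in object detection).
10. The code also contains several print statements that output debugging information such as the convolution weights and the shape of the input tensor.
11. return self.conv(x.view(b, 4, self.c1, a).transpose(2, 1).softmax(1)).view(b, 4, a): this is the forward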