Deploying a Neural Network on STM32 with PyTorch

雷古小狮子

Last modified 2024-08-05 21:16:49

License: CC 4.0 BY-SA

Tags: stm32, neural network, pytorch

First published 2024-07-17 21:25:05

Article link: https://blog.csdn.net/qq_45876576/article/details/140505857

Training the model

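When deploying a neural network on an STM32 microcontroller, model size is a key constraint, because the microcontroller has limited storage and compute resources. The following strategies reduce model size and adapt the model to the STM32: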

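Model architecture selection:
Choose a lightweight neural network architecture such as MobileNet, SqueezeNet, or Tiny YOLO. These architectures are designed for resource-constrained devices.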
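When comparing candidate architectures, a quick way to judge whether one can fit at all is to multiply its parameter count by the storage size of one weight. The sketch below does that for float32 weights. The parameter counts and the 1 MiB flash budget are hypothetical numbers chosen for illustration, not figures from this article or from any particular STM32 part, and the helper names are made up for this example.

```python
# Rough flash-footprint estimate for a network's weights.
# All concrete numbers below are hypothetical, for illustration only.

BYTES_PER_FLOAT32 = 4


def weight_footprint_bytes(num_params: int, bytes_per_param: int = BYTES_PER_FLOAT32) -> int:
    """Storage needed for the weights alone (activations and code are extra)."""
    return num_params * bytes_per_param


def fits_in_flash(num_params: int, flash_budget_bytes: int,
                  bytes_per_param: int = BYTES_PER_FLOAT32) -> bool:
    """True if the weights fit inside the given flash budget."""
    return weight_footprint_bytes(num_params, bytes_per_param) <= flash_budget_bytes


HYPOTHETICAL_FLASH_BUDGET = 1024 * 1024  # assumed 1 MiB reserved for weights

small_model_params = 200_000   # hypothetical lightweight model
large_model_params = 300_000   # hypothetical slightly larger model

small_fits = fits_in_flash(small_model_params, HYPOTHETICAL_FLASH_BUDGET)
large_fits = fits_in_flash(large_model_params, HYPOTHETICAL_FLASH_BUDGET)

print(weight_footprint_bytes(small_model_params), small_fits)
print(weight_footprint_bytes(large_model_params), large_fits)
```

The estimate covers weights only. Activation buffers live in RAM and need their own budget.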

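This article uses the lightweight network MobileNetV1. MobileNet is a convolutional neural network (CNN) architecture designed for mobile and embedded vision applications. Developed by Google, it aims to reduce compute cost while maintaining high accuracy. MobileNet is particularly well suited to devices with limited computing power, such as smartphones and embedded systems.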
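Much of MobileNetV1's saving comes from replacing standard convolutions with depthwise separable ones (a per-channel k×k depthwise convolution followed by a 1×1 pointwise convolution). The sketch below compares the weight counts of the two layer types. The channel sizes are illustrative assumptions rather than values from this article's model, and bias and batch-norm parameters are ignored.

```python
# Weight counts for a standard convolution versus a depthwise separable one.
# Bias terms and batch-norm parameters are ignored in this sketch.

def standard_conv_params(c_in: int, c_out: int, k: int) -> int:
    """A k x k convolution mixing all input channels into each output channel."""
    return k * k * c_in * c_out


def depthwise_separable_params(c_in: int, c_out: int, k: int) -> int:
    """Depthwise k x k (one filter per input channel) plus pointwise 1 x 1."""
    depthwise = k * k * c_in
    pointwise = c_in * c_out
    return depthwise + pointwise


# Illustrative layer shape (assumed, not taken from the article's model).
c_in, c_out, k = 64, 128, 3

std = standard_conv_params(c_in, c_out, k)
sep = depthwise_separable_params(c_in, c_out, k)
reduction = std / sep

print(f"standard: {std}, separable: {sep}, reduction: {reduction:.1f}x")
```

For 3×3 kernels the reduction approaches 9× as the number of output channels grows, since the ratio equals 1 / (1/c_out + 1/k²).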

Pruning the model

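Parameter pruning:
Remove unimportant weights or neurons during or after training to reduce the number of model parameters.
Unstructured pruning: implemented through regularization (such as L1 regularization) or a post-training pruning strategy.
Structured pruning: implemented by analyzing the importance of the model's channels or layers and removing the unimportant ones.
Incremental pruning: raise the pruning ratio step by step, optimizing the model gradually.
Dynamic pruning: dynamically remove or keep parts of the network at inference time.
Reference articles:
http://t.csdnimg.cn/AdPYh
http://t.csdnimg.cn/yhkSa
This article chooses structured pruning, because it can significantly reduce computation and memory usage. (The sparsity produced by unstructured pruning may need special hardware support, such as a GPU that supports sparse matrix operations, to be fully exploited.)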
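As a minimal sketch of what structured pruning looks like with `torch.nn.utils.prune`, the example below zeroes half of the output channels of a single convolution using `prune.ln_structured` with the L1 norm. The layer is a hypothetical stand-in (its shape is an assumption), not one of this article's MobileNetV1 layers.

```python
import torch
import torch.nn as nn
import torch.nn.utils.prune as prune

torch.manual_seed(0)

# Hypothetical stand-in layer; the shape is an assumption for illustration.
conv = nn.Conv2d(in_channels=8, out_channels=16, kernel_size=3, padding=1, bias=False)

# Structured pruning: mask the 50% of output channels (dim=0)
# whose weights have the smallest L1 norm (n=1).
prune.ln_structured(conv, name="weight", amount=0.5, n=1, dim=0)

# Every output channel is now either entirely zero or left untouched.
channel_l1 = conv.weight.detach().abs().sum(dim=(1, 2, 3))
num_zero_channels = int((channel_l1 == 0).sum())

# Make the pruning permanent: removes weight_orig / weight_mask
# and keeps the masked tensor as the ordinary weight parameter.
prune.remove(conv, "weight")

print(f"zeroed output channels: {num_zero_channels} of {conv.out_channels}")
print(f"weight shape is unchanged: {tuple(conv.weight.shape)}")
```

Note that the mask only sets channels to zero; the weight tensor keeps its original shape. To actually save flash and compute on the microcontroller, the zeroed channels still have to be physically removed, for example by rebuilding the layer with fewer output channels.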

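import torch
import torch.nn.utils.prune as prune

DEVICE = torch.device("cuda" if torch.cuda.is_available() else "cpu")
# Train the model
# Set hyperparameters
learning_rate = 0.01
num_epochs = 2
best_acc = 0
# MobileNetV1 is assumed to be defined earlier in the project
model = MobileNetV1(num_classes=3)
# Define the pruning parameters
pruning_rate = 0.5  # Fraction of weights to prune in each selected layer

# Specify the pruning method (e.g., L1Unstructured, RandomUnstructured, etc.)
# Note: L1Unstructured zeroes individual weights (unstructured pruning);
# prune.LnStructured is the class that prunes whole channels.
pruning_method = prune.L1Unstructured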

# Create a pruning instance for each layer you want to prune
pruning_instances = [
#     (model.conv_dw1[0], "weight", pruning_method, pruning_rate),
    (model.conv_dw2[0], "weight", pruning_method, pruning_rate),
    # Add other layers to prune
    (model.conv_dw3[0], "weight", pruning_method, pruning_rate),
    (model.conv_dw4[0], "weight", pruning_method, pruning_rate),
    (model.conv_dw5[0], "weight", pruning_method, pruning_rate),
#     (model.conv_dw6[0], "weight", pruning_method, pruning_rate),
#     (model.conv_d