Instance Segmentation: A Detailed Record of Training a YOLOv5 Instance Segmentation Model and Deploying It on RK3588
This article is a detailed record of training a YOLOv5 (v7.0) instance segmentation model.
The official YOLOv5 v7.0 code can be downloaded from GitHub:
https://github.com/ultralytics/yolov5/tree/v7.0
Since my actual use case requires deployment on an RK3588 board, I used the code from Rockchip's official GitHub instead:
# Clone the master branch
https://github.com/airockchip/yolov5?tab=readme-ov-file
Note: Rockchip's YOLOv5 code is a modified fork of the official YOLOv5 (v7.0) code. For deployment on RK3588-series boards, the Rockchip fork can be trained and exported directly with no further changes, whereas the official YOLO code requires structural modifications first. The training workflow of the two projects is otherwise identical.
# Rockchip's official model zoo
https://github.com/airockchip/rknn_model_zoo/tree/main/examples/yolov5_seg
Dataset Annotation
Building the segmentation dataset requires the labelme tool; the version I used is 3.16.7.
Run the following commands inside your virtual environment:
# Ubuntu
pip install pycocotools
# Windows
pip install pycocotools-windows
# Install labelme 3.16.7
pip install labelme==3.16.7
Run labelme:
labelme
The resulting .json files look like this:
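Only a few fields of the labelme output matter for the conversion script below; a trimmed, illustrative sketch of a single annotation file (the values are made up):
{
  "shapes": [
    {"label": "bolt", "points": [[123.0, 456.0], [150.0, 470.0], [118.0, 490.0]]},
    {"label": "nut",  "points": [[321.0, 210.0], [340.0, 228.0], [310.0, 236.0]]}
  ],
  "imagePath": "1.jpg",
  "imageHeight": 480,
  "imageWidth": 640
}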
Convert the annotation files from .json format to .txt format:
import json
import glob
import os
import cv2
import numpy as np
json_path = r"/home/data/project/customer_AAA/rk3588/yolov5-airockchip/Data_save/data_nut_bolt"; #此处填写存放json文件的地址
txt_dir=r"/home/data/project/customer_AAA/rk3588/yolov5-airockchip/Data_save/txts" # 用于存放生成的txt文件的目录
labels = ['bolt','nut']#此处填写你标注的标签名称 除了修改这里,还需要修改下面的for循环中类别相关内容
json_files = glob.glob(json_path + "/*.json")
for json_file in json_files:
print(json_file)
f = open(json_file)
json_info = json.load(f)
# print(json_info.keys())
img = cv2.imread(os.path.join(json_path, json_info["imagePath"]))
height, width, _ = img.shape
np_w_h = np.array([[width, height]], np.int32)
file_name = os.path.basename(json_file)
txt_name = file_name.replace(".json", ".txt")
txt_path=os.path.join(txt_dir,txt_name)
f = open(txt_path, "w")
txt_content = ""
for point_json in json_info["shapes"]:
np_points = np.array(point_json["points"], np.int32)
norm_points = np_points / np_w_h
norm_points_list = norm_points.tolist()
if point_json['label'] == labels[0]:
txt_content += "0 " + " ".join([" ".join([str(cell[0]), str(cell[1])]) for cell in norm_points_list]) + "\n"
elif point_json['label'] == labels[1]:
txt_content += "1 " + " ".join([" ".join([str(cell[0]), str(cell[1])]) for cell in norm_points_list]) + "\n"
f.write(txt_content)
After the conversion, each image has a corresponding **.txt** file.
Each line contains the following data:
class label, followed by the normalized polygon point coordinates
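For example, a bolt polygon with three vertices would be stored as a line like this (the numbers are purely illustrative):
0 0.4531 0.3208 0.4712 0.3401 0.4650 0.3625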
Dataset split:
Run the following code:
import os
import random
import shutil
rootpath = r'/home/data/project/customer_AAA/rk3588/yolov5-airockchip/Data_save/data_nut_bolt_result/'  # location of the img and txt folders; the path must end with /
set1 = ['images','labels']
set2 = ['train','val']
for s1 in set1:
if not os.path.exists(rootpath+s1):
os.mkdir(rootpath+s1)
for s2 in set2:
if not os.path.exists(rootpath+s1+'/'+s2):
os.mkdir(rootpath+s1+'/'+s2)
# path to the original images
img_path = rootpath+'img'
# path to the generated txt labels
txt_path = rootpath+'txt'
file_names = os.listdir(img_path)
l = 0.8
n = len(file_names)
train_files = random.sample(file_names, int(n*l))
for file in file_names:
print(file)
if not os.path.exists(txt_path+'/'+file[:-3]+'txt'):
os.remove(img_path+'/'+file)
        print(file[:-3] + 'txt does not exist')
continue
if file in train_files:
shutil.copy(img_path+'/'+file,rootpath+'images/train/'+file)
shutil.copy(txt_path+'/'+file[:-3]+'txt',rootpath+'labels/train/'+file[:-3]+'txt')
else:
shutil.copy(img_path+'/'+file,rootpath+'images/val/'+file)
shutil.copy(txt_path+'/'+file[:-3]+'txt',rootpath+'labels/val/'+file[:-3]+'txt')
print('ok!!')
print(len(train_files))
Note that the script expects a data_nut_bolt_result directory at the same level, containing the folder that holds only the image data (img) and the folder of generated .txt files (txt); create it and move both folders into it before running.
In the code above, pay attention to how rootpath is written, because all paths are built by simple string concatenation (so it must end with a trailing /).
After the script finishes, the corresponding directories are created under that same root.
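If everything ran correctly, the layout under data_nut_bolt_result should look roughly like this:
data_nut_bolt_result
|-- img
|-- txt
|-- images
|   |-- train
|   `-- val
`-- labels
    |-- train
    `-- val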
Setting Up the Dataset Configuration File
In the yolov5/data directory, make a copy of coco128-seg.yaml and rename it coco128-seg_nut_bolt.yaml:
# YOLOv5 🚀 by Ultralytics, GPL-3.0 license
# COCO128-seg dataset https://www.kaggle.com/ultralytics/coco128 (first 128 images from COCO train2017) by Ultralytics
# Example usage: python train.py --data coco128.yaml
# parent
# ├── yolov5
# └── datasets
# └── coco128-seg ← downloads here (7 MB)
# Train/val/test sets as 1) dir: path/to/imgs, 2) file: path/to/imgs.txt, or 3) list: [path/to/imgs1, path/to/imgs2, ..]
path: /home/data/project/customer_AAA/rk3588/yolov5-airockchip/Data_seg/data_nut_bolt # dataset root dir
train: images/train # train images (relative to 'path') 128 images
val: images/val # val images (relative to 'path') 128 images
test: # test images (optional)
# Classes
names:
0: bolt
1: nut
Since my project segments bolts and nuts, there are two classes; the indices here must match the ones written by the json-to-txt conversion script above (0: bolt, 1: nut).
Setting Up the Model Configuration File
In the yolov5/models/segment directory, copy yolov5n-seg.yaml and rename it yolov5n-seg_nut_bolt.yaml:
(python3.8-tk2-2.0) root@f95e42ca3a90:/home/data/project/customer_AAA/rk3588/yolov5-airockchip/models# tree
.
|-- __init__.py
|-- __pycache__
| |-- __init__.cpython-38.pyc
| |-- common.cpython-38.pyc
| |-- common_rk_plug_in.cpython-38.pyc
| |-- experimental.cpython-38.pyc
| `-- yolo.cpython-38.pyc
|-- common.py
|-- common_rk_plug_in.py
|-- experimental.py
|-- hub
| |-- anchors.yaml
| |-- yolov3-spp.yaml
| |-- yolov3-tiny.yaml
| |-- yolov3.yaml
| |-- yolov5-bifpn.yaml
| |-- yolov5-fpn.yaml
| |-- yolov5-p2.yaml
| |-- yolov5-p34.yaml
| |-- yolov5-p6.yaml
| |-- yolov5-p7.yaml
| |-- yolov5-panet.yaml
| |-- yolov5l6.yaml
| |-- yolov5m6.yaml
| |-- yolov5n6.yaml
| |-- yolov5s-LeakyReLU.yaml
| |-- yolov5s-ghost.yaml
| |-- yolov5s-transformer.yaml
| |-- yolov5s6.yaml
| `-- yolov5x6.yaml
|-- segment
| |-- yolov5l-seg.yaml
| |-- yolov5m-seg.yaml
| |-- yolov5n-seg.yaml
| |-- yolov5n-seg_nut_bolt.yaml
| |-- yolov5s-seg.yaml
| `-- yolov5x-seg.yaml
|-- tf.py
|-- yolo.py
|-- yolov5l.yaml
|-- yolov5m.yaml
|-- yolov5n.yaml
|-- yolov5s.yaml
|-- yolov5s_2007_2012.yaml
`-- yolov5x.yaml
The contents of the configuration file are as follows:
# YOLOv5 🚀 by Ultralytics, GPL-3.0 license
# Parameters
nc: 2 # number of classes
depth_multiple: 0.33 # model depth multiple
width_multiple: 0.25 # layer channel multiple
anchors:
- [10,13, 16,30, 33,23] # P3/8
- [30,61, 62,45, 59,119] # P4/16
- [116,90, 156,198, 373,326] # P5/32
# YOLOv5 v6.0 backbone
backbone:
# [from, number, module, args]
[[-1, 1, Conv, [64, 6, 2, 2]], # 0-P1/2
[-1, 1, Conv, [128, 3, 2]], # 1-P2/4
[-1, 3, C3, [128]],
[-1, 1, Conv, [256, 3, 2]], # 3-P3/8
[-1, 6, C3, [256]],
[-1, 1, Conv, [512, 3, 2]], # 5-P4/16
[-1, 9, C3, [512]],
[-1, 1, Conv, [1024, 3, 2]], # 7-P5/32
[-1, 3, C3, [1024]],
[-1, 1, SPPF, [1024, 5]], # 9
]
# YOLOv5 v6.0 head
head:
[[-1, 1, Conv, [512, 1, 1]],
[-1, 1, nn.Upsample, [None, 2, 'nearest']],
[[-1, 6], 1, Concat, [1]], # cat backbone P4
[-1, 3, C3, [512, False]], # 13
[-1, 1, Conv, [256, 1, 1]],
[-1, 1, nn.Upsample, [None, 2, 'nearest']],
[[-1, 4], 1, Concat, [1]], # cat backbone P3
[-1, 3, C3, [256, False]], # 17 (P3/8-small)
[-1, 1, Conv, [256, 3, 2]],
[[-1, 14], 1, Concat, [1]], # cat head P4
[-1, 3, C3, [512, False]], # 20 (P4/16-medium)
[-1, 1, Conv, [512, 3, 2]],
[[-1, 10], 1, Concat, [1]], # cat head P5
[-1, 3, C3, [1024, False]], # 23 (P5/32-large)
[[17, 20, 23], 1, Segment, [nc, anchors, 32, 256]], # Detect(P3, P4, P5)
]
Starting Model Training
Run the following command from the project root to start training:
torchrun --nproc-per-node 2 ./segment/train.py --workers 0 --data data/coco128-seg_nut_bolt.yaml --cfg models/segment/yolov5n-seg_nut_bolt.yaml --img 640 --weights yolov5n-seg.pt --batch-size 128 --epochs 100
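The command above launches 2-GPU distributed training via torchrun. On a single GPU, an equivalent plain invocation should work as well (the batch size is lowered here as an assumption; adjust it to your card's memory):
python segment/train.py --workers 0 --data data/coco128-seg_nut_bolt.yaml --cfg models/segment/yolov5n-seg_nut_bolt.yaml --img 640 --weights yolov5n-seg.pt --batch-size 32 --epochs 100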
Once training starts, the plotted training images look like this:
The training curves for each stage are shown below:
The model's validation-set statistics are as follows:
Validating runs/train-seg/exp/weights/best.pt...
Fusing layers...
YOLOv5n-seg_nut_bolt summary: 224 layers, 1881103 parameters, 0 gradients, 6.7 GFLOPs
Class Images Instances Box(P R mAP50 mAP50-95) Mask(P R mAP50 mAP50-95): 100%|██████████| 1/1 00:03
all 85 604 0.995 0.998 0.994 0.841 0.972 0.975 0.962 0.736
bolt 85 300 0.994 0.997 0.995 0.856 0.947 0.95 0.93 0.641
nut 85 304 0.997 1 0.993 0.826 0.997 1 0.993 0.831
Results saved to runs/train-seg/exp
Model Testing
Run the following inference command:
python segment/predict.py --weights runs/train-seg/exp/weights/best.pt --source Data_seg/data_nut_bolt/images/val/6.jpg
The detection results look like this:
Deploying the YOLOv5 Segmentation Model on the RK3588 Board
Convert the trained model to a .rknn model file.
**Note:** the model above was trained with Rockchip's YOLOv5 (v7.0) fork, so it can be exported to ONNX directly; if you trained with the official YOLOv5 code, the relevant output structure must be modified before conversion.
First export to ONNX by running the following:
# run from the project root to produce the onnx file
python export.py --rknpu --weight runs/train-seg/exp/weights/yolov5n-seg.pt
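Before converting, it is worth sanity-checking the exported file. A minimal sketch using the onnx package (the output path is my assumption; point it at wherever export.py actually wrote the .onnx):
import onnx
# Assumed export location; adjust if your .onnx landed elsewhere
model = onnx.load("runs/train-seg/exp/weights/yolov5n-seg.onnx")
for out in model.graph.output:
    dims = [d.dim_value for d in out.type.tensor_type.shape.dim]
    print(out.name, dims)
Judging from the post-processing code later in this article, the RKNPU-style export should expose seven outputs: three detection branches, three mask-coefficient branches, and one proto tensor.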
Then convert it to a .rknn file.
This relies on Rockchip's official GitHub project:
https://github.com/airockchip/rknn_model_zoo/tree/v2.0.0
Use the yolov5_seg example it provides.
The yolov5_seg example directory structure is as follows:
(python3.8-tk2-2.0) root@f95e42ca3a90:/home/data/project/customer_AAA/rk3588/rknn_model_zoo-2.0.0/examples/yolov5_seg# tree
.
|-- README.md
|-- cpp
| |-- CMakeLists.txt
| |-- easy_timer.h
| |-- main.cc
| |-- postprocess.h
| |-- rknpu1
| | |-- postprocess.cc
| | `-- yolov5_seg.cc
| |-- rknpu2
| | |-- postprocess.cc
| | `-- yolov5_seg.cc
| `-- yolov5_seg.h
|-- model
| |-- anchors_yolov5.txt
| |-- bus.jpg
| |-- coco_80_labels_list.txt
| `-- download_model.sh
|-- model_comparison
| |-- yolov5_seg_graph_comparison.jpg
| `-- yolov5_seg_output_comparison.jpg
|-- python
| |-- convert.py
| `-- yolov5_seg.py
`-- reference_results
|-- yolov5s_seg_c_demo_result.png
`-- yolov5s_seg_python_demo_result.png
7 directories, 20 files
Copy the yolov5n-seg.onnx file obtained above into the rknn_model_zoo-2.0.0/examples/yolov5_seg/python directory. The example's stock usage is:
python convert.py ../model/yolov5s-seg.onnx rk3588
I modified convert.py for my model; the final version is shown below.
import sys
from rknn.api import RKNN
DATASET_PATH = './data_list.txt' # list of calibration image paths used for quantization
DEFAULT_RKNN_PATH = './yolov5_seg.rknn'
DEFAULT_QUANT = True
def parse_arg():
if len(sys.argv) < 3:
print("Usage: python3 {} onnx_model_path [platform] [dtype(optional)] [output_rknn_path(optional)]".format(sys.argv[0]));
print(" platform choose from [rk3562, rk3566, rk3568, rk3588, rk1808, rv1109, rv1126]")
print(" dtype choose from [i8, fp] for [rk3562,rk3566,rk3568,rk3588]")
print(" dtype choose from [u8, fp] for [rk1808,rv1109,rv1126]")
exit(1)
model_path = sys.argv[1]
platform = sys.argv[2]
do_quant = DEFAULT_QUANT
if len(sys.argv) > 3:
model_type = sys.argv[3]
if model_type not in ['i8', 'u8', 'fp']:
print("ERROR: Invalid model type: {}".format(model_type))
exit(1)
elif model_type in ['i8', 'u8']:
do_quant = True
else:
do_quant = False
if len(sys.argv) > 4:
output_path = sys.argv[4]
else:
output_path = DEFAULT_RKNN_PATH
return model_path, platform, do_quant, output_path
if __name__ == '__main__':
model_path, platform, do_quant, output_path = parse_arg()
# Create RKNN object
rknn = RKNN(verbose=False)
# Pre-process config
print('--> Config model')
rknn.config(mean_values=[[0, 0, 0]], std_values=[[255, 255, 255]], target_platform=platform)
print('done')
# Load model
print('--> Loading model')
ret = rknn.load_onnx(model=model_path)
if ret != 0:
print('Load model failed!')
exit(ret)
print('done')
# Build model
print('--> Building model')
ret = rknn.build(do_quantization=do_quant, dataset=DATASET_PATH)
if ret != 0:
print('Build model failed!')
exit(ret)
print('done')
# Export rknn model
print('--> Export rknn model')
ret = rknn.export_rknn(output_path)
if ret != 0:
print('Export rknn model failed!')
exit(ret)
print('done')
# Release
rknn.release()
Run the following command:
python convert.py ./yolov5n-seg.onnx rk3588
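Because do_quant defaults to True, convert.py also needs the data_list.txt pointed to by DATASET_PATH: a plain text file listing calibration image paths, one per line. A minimal sketch for generating it (the image folder name is my assumption; use a couple dozen representative training images):
import glob
# Assumed folder of calibration images; any 20-50 representative training images will do
image_paths = sorted(glob.glob("./data_set/*.jpg"))[:20]
with open("data_list.txt", "w") as f:
    f.write("\n".join(image_paths) + "\n")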
Test script:
# test by loading the onnx file
python yolov5_seg.py --model_path yolov5n-seg.onnx --img_folder data_set --img_save
The segmentation quality is only so-so; 100 training epochs is probably not enough, haha.
If the server is connected to the RK3588, you can run the following command from the server terminal and use the board's NPU for inference.
What I did was upload the rknn_model_zoo-2.0.0 project to the board and work from the rknn_model_zoo-2.0.0/examples/yolov5_seg/python directory.
Upload the model file to the **rknn_model_zoo-2.0.0/examples/yolov5_seg/model/** directory, then run the Python script below (it is the model zoo's yolov5_seg.py adapted to run on the board with RKNNLite, with CLASSES changed to bolt/nut).
Detect the images in a local directory:
python yolov5_seg_gqr.py --model_path yolov5s_seg.rknn --img_folder ./test/ --img_save
import os
import json  # used by COCO_test_helper.export_to_json
import cv2
import sys
import argparse
import numpy as np
from pathlib import Path
from pycocotools.coco import COCO
from pycocotools.cocoeval import COCOeval
import torch.nn.functional as F
import torch
import torchvision
from rknnlite.api import RKNNLite
from copy import copy
class Letter_Box_Info():
def __init__(self, shape, new_shape, w_ratio, h_ratio, dw, dh, pad_color) -> None:
self.origin_shape = shape
self.new_shape = new_shape
self.w_ratio = w_ratio
self.h_ratio = h_ratio
self.dw = dw
self.dh = dh
self.pad_color = pad_color
class COCO_test_helper():
def __init__(self, enable_letter_box = False) -> None:
self.record_list = []
self.enable_ltter_box = enable_letter_box
if self.enable_ltter_box is True:
self.letter_box_info_list = []
else:
self.letter_box_info_list = None
def letter_box(self, im, new_shape, pad_color=(0,0,0), info_need=False):
# Resize and pad image while meeting stride-multiple constraints
shape = im.shape[:2] # current shape [height, width]
if isinstance(new_shape, int):
new_shape = (new_shape, new_shape)
# Scale ratio
r = min(new_shape[0] / shape[0], new_shape[1] / shape[1])
# Compute padding
ratio = r # width, height ratios
new_unpad = int(round(shape[1] * r)), int(round(shape[0] * r))
dw, dh = new_shape[1] - new_unpad[0], new_shape[0] - new_unpad[1] # wh padding
dw /= 2 # divide padding into 2 sides
dh /= 2
if shape[::-1] != new_unpad: # resize
im = cv2.resize(im, new_unpad, interpolation=cv2.INTER_LINEAR)
top, bottom = int(round(dh - 0.1)), int(round(dh + 0.1))
left, right = int(round(dw - 0.1)), int(round(dw + 0.1))
im = cv2.copyMakeBorder(im, top, bottom, left, right, cv2.BORDER_CONSTANT, value=pad_color) # add border
if self.enable_ltter_box is True:
self.letter_box_info_list.append(Letter_Box_Info(shape, new_shape, ratio, ratio, dw, dh, pad_color))
if info_need is True:
return im, ratio, (dw, dh)
else:
return im
def direct_resize(self, im, new_shape, info_need=False):
shape = im.shape[:2]
h_ratio = new_shape[0]/ shape[0]
w_ratio = new_shape[1]/ shape[1]
if self.enable_ltter_box is True:
self.letter_box_info_list.append(Letter_Box_Info(shape, new_shape, w_ratio, h_ratio, 0, 0, (0,0,0)))
im = cv2.resize(im, (new_shape[1], new_shape[0]))
return im
def get_real_box(self, box, in_format='xyxy'):
bbox = copy(box)
if self.enable_ltter_box == True:
# unletter_box result
if in_format=='xyxy':
bbox[:,0] -= self.letter_box_info_list[-1].dw
bbox[:,0] /= self.letter_box_info_list[-1].w_ratio
bbox[:,0] = np.clip(bbox[:,0], 0, self.letter_box_info_list[-1].origin_shape[1])
bbox[:,1] -= self.letter_box_info_list[-1].dh
bbox[:,1] /= self.letter_box_info_list[-1].h_ratio
bbox[:,1] = np.clip(bbox[:,1], 0, self.letter_box_info_list[-1].origin_shape[0])
bbox[:,2] -= self.letter_box_info_list[-1].dw
bbox[:,2] /= self.letter_box_info_list[-1].w_ratio
bbox[:,2] = np.clip(bbox[:,2], 0, self.letter_box_info_list[-1].origin_shape[1])
bbox[:,3] -= self.letter_box_info_list[-1].dh
bbox[:,3] /= self.letter_box_info_list[-1].h_ratio
bbox[:,3] = np.clip(bbox[:,3], 0, self.letter_box_info_list[-1].origin_shape[0])
return bbox
def get_real_seg(self, seg):
#! fix side effect
dh = int(self.letter_box_info_list[-1].dh)
dw = int(self.letter_box_info_list[-1].dw)
origin_shape = self.letter_box_info_list[-1].origin_shape
new_shape = self.letter_box_info_list[-1].new_shape
if (dh == 0) and (dw == 0) and origin_shape == new_shape:
return seg
elif dh == 0 and dw != 0:
seg = seg[:, :, dw:-dw] # a[0:-0] = []
elif dw == 0 and dh != 0 :
seg = seg[:, dh:-dh, :]
seg = np.where(seg, 1, 0).astype(np.uint8).transpose(1,2,0)
seg = cv2.resize(seg, (origin_shape[1], origin_shape[0]), interpolation=cv2.INTER_LINEAR)
if len(seg.shape) < 3:
return seg[None,:,:]
else:
return seg.transpose(2,0,1)
def add_single_record(self, image_id, category_id, bbox, score, in_format='xyxy', pred_masks = None):
if self.enable_ltter_box == True:
# unletter_box result
if in_format=='xyxy':
bbox[0] -= self.letter_box_info_list[-1].dw
bbox[0] /= self.letter_box_info_list[-1].w_ratio
bbox[1] -= self.letter_box_info_list[-1].dh
bbox[1] /= self.letter_box_info_list[-1].h_ratio
bbox[2] -= self.letter_box_info_list[-1].dw
bbox[2] /= self.letter_box_info_list[-1].w_ratio
bbox[3] -= self.letter_box_info_list[-1].dh
bbox[3] /= self.letter_box_info_list[-1].h_ratio
# bbox = [value/self.letter_box_info_list[-1].ratio for value in bbox]
if in_format=='xyxy':
# change xyxy to xywh
bbox[2] = bbox[2] - bbox[0]
bbox[3] = bbox[3] - bbox[1]
else:
assert False, "now only support xyxy format, please add code to support others format"
def single_encode(x):
from pycocotools.mask import encode
rle = encode(np.asarray(x[:, :, None], order="F", dtype="uint8"))[0]
rle["counts"] = rle["counts"].decode("utf-8")
return rle
if pred_masks is None:
self.record_list.append({"image_id": image_id,
"category_id": category_id,
"bbox":[round(x, 3) for x in bbox],
'score': round(score, 5),
})
else:
rles = single_encode(pred_masks)
self.record_list.append({"image_id": image_id,
"category_id": category_id,
"bbox":[round(x, 3) for x in bbox],
'score': round(score, 5),
'segmentation': rles,
})
def export_to_json(self, path):
with open(path, 'w') as f:
json.dump(self.record_list, f)
OBJ_THRESH = 0.25
NMS_THRESH = 0.45
MAX_DETECT = 300
# The following two parameters are for the mAP test
# OBJ_THRESH = 0.001
# NMS_THRESH = 0.65
IMG_SIZE = (640, 640) # (width, height), such as (1280, 736)
CLASSES = ("bolt","nut")
coco_id_list = [1, 2]
class Colors:
# Ultralytics color palette https://ultralytics.com/
def __init__(self):
# hex = matplotlib.colors.TABLEAU_COLORS.values()
hexs = ('FF3838', 'FF9D97', 'FF701F', 'FFB21D', 'CFD231', '48F90A', '92CC17', '3DDB86', '1A9334', '00D4BB',
'2C99A8', '00C2FF', '344593', '6473FF', '0018EC', '8438FF', '520085', 'CB38FF', 'FF95C8', 'FF37C7')
self.palette = [self.hex2rgb(f'#{c}') for c in hexs]
self.n = len(self.palette)
def __call__(self, i, bgr=False):
c = self.palette[int(i) % self.n]
return (c[2], c[1], c[0]) if bgr else c
@staticmethod
def hex2rgb(h): # rgb order (PIL)
return tuple(int(h[1 + i:1 + i + 2], 16) for i in (0, 2, 4))
def sigmoid(x):
return 1 / (1 + np.exp(-x))
def filter_boxes(boxes, box_confidences, box_class_probs, seg_part):
"""Filter boxes with object threshold.
"""
box_confidences = box_confidences.reshape(-1)
candidate, class_num = box_class_probs.shape
class_max_score = np.max(box_class_probs, axis=-1)
classes = np.argmax(box_class_probs, axis=-1)
_class_pos = np.where(class_max_score * box_confidences >= OBJ_THRESH)
scores = (class_max_score * box_confidences)[_class_pos]
boxes = boxes[_class_pos]
classes = classes[_class_pos]
seg_part = (seg_part * box_confidences.reshape(-1, 1))[_class_pos]
return boxes, classes, scores, seg_part
def box_process(position, anchors):
grid_h, grid_w = position.shape[2:4]
col, row = np.meshgrid(np.arange(0, grid_w), np.arange(0, grid_h))
col = col.reshape(1, 1, grid_h, grid_w)
row = row.reshape(1, 1, grid_h, grid_w)
grid = np.concatenate((col, row), axis=1)
stride = np.array([IMG_SIZE[1]//grid_h, IMG_SIZE[0]//grid_w]).reshape(1,2,1,1)
col = col.repeat(len(anchors), axis=0)
row = row.repeat(len(anchors), axis=0)
anchors = np.array(anchors)
anchors = anchors.reshape(*anchors.shape, 1, 1)
box_xy = position[:,:2,:,:]*2 - 0.5
box_wh = pow(position[:,2:4,:,:]*2, 2) * anchors
box_xy += grid
box_xy *= stride
box = np.concatenate((box_xy, box_wh), axis=1)
# Convert [c_x, c_y, w, h] to [x1, y1, x2, y2]
xyxy = np.copy(box)
xyxy[:, 0, :, :] = box[:, 0, :, :] - box[:, 2, :, :]/ 2 # top left x
xyxy[:, 1, :, :] = box[:, 1, :, :] - box[:, 3, :, :]/ 2 # top left y
xyxy[:, 2, :, :] = box[:, 0, :, :] + box[:, 2, :, :]/ 2 # bottom right x
xyxy[:, 3, :, :] = box[:, 1, :, :] + box[:, 3, :, :]/ 2 # bottom right y
return xyxy
def post_process(input_data, anchors):
# input_data[0], input_data[2], and input_data[4] are detection box information
# input_data[1], input_data[3], and input_data[5] are segmentation information
# input_data[6] is the proto information
boxes, scores, classes_conf = [], [], []
# 1*255*h*w -> 3*85*h*w
detect_part = [input_data[i*2].reshape([len(anchors[0]), -1]+list(input_data[i*2].shape[-2:])) for i in range(len(anchors))]
seg_part = [input_data[i*2+1].reshape([len(anchors[0]), -1]+list(input_data[i*2+1].shape[-2:])) for i in range(len(anchors))]
proto = input_data[-1]
for i in range(len(detect_part)):
boxes.append(box_process(detect_part[i][:, :4, :, :], anchors[i]))
scores.append(detect_part[i][:, 4:5, :, :])
classes_conf.append(detect_part[i][:, 5:, :, :])
def sp_flatten(_in):
ch = _in.shape[1]
_in = _in.transpose(0, 2, 3, 1)
return _in.reshape(-1, ch)
boxes = [sp_flatten(_v) for _v in boxes]
classes_conf = [sp_flatten(_v) for _v in classes_conf]
scores = [sp_flatten(_v) for _v in scores]
seg_part = [sp_flatten(_v) for _v in seg_part]
boxes = np.concatenate(boxes)
classes_conf = np.concatenate(classes_conf)
scores = np.concatenate(scores)
seg_part = np.concatenate(seg_part)
# filter according to threshold
boxes, classes, scores, seg_part = filter_boxes(boxes, scores, classes_conf, seg_part)
zipped = zip(boxes, classes, scores, seg_part)
sort_zipped = sorted(zipped, key=lambda x: (x[2]), reverse=True)
result = zip(*sort_zipped)
max_nms = 30000
n = boxes.shape[0] # number of boxes
if not n:
return None, None, None, None
elif n > max_nms: # excess boxes
boxes, classes, scores, seg_part = [np.array(x[:max_nms]) for x in result]
else:
boxes, classes, scores, seg_part = [np.array(x) for x in result]
# nms
nboxes, nclasses, nscores, nseg_part = [], [], [], []
agnostic = 0
max_wh = 7680
c = classes * (0 if agnostic else max_wh)
ids = torchvision.ops.nms(torch.tensor(boxes, dtype=torch.float32) + torch.tensor(c, dtype=torch.float32).unsqueeze(-1),
torch.tensor(scores, dtype=torch.float32), NMS_THRESH)
real_keeps = ids.tolist()[:MAX_DETECT]
nboxes.append(boxes[real_keeps])
nclasses.append(classes[real_keeps])
nscores.append(scores[real_keeps])
nseg_part.append(seg_part[real_keeps])
if not nclasses and not nscores:
return None, None, None, None
boxes = np.concatenate(nboxes)
classes = np.concatenate(nclasses)
scores = np.concatenate(nscores)
seg_part = np.concatenate(nseg_part)
ph, pw = proto.shape[-2:]
proto = proto.reshape(seg_part.shape[-1], -1)
seg_img = np.matmul(seg_part, proto)
seg_img = sigmoid(seg_img)
seg_img = seg_img.reshape(-1, ph, pw)
seg_threadhold = 0.5
# crop seg outside box
seg_img = F.interpolate(torch.tensor(seg_img)[None], torch.Size([640, 640]), mode='bilinear', align_corners=False)[0]
seg_img_t = _crop_mask(seg_img,torch.tensor(boxes) )
seg_img = seg_img_t.numpy()
seg_img = seg_img > seg_threadhold
return boxes, classes, scores, seg_img
def draw(image, boxes, scores, classes):
for box, score, cl in zip(boxes, scores, classes):
top, left, right, bottom = [int(_b) for _b in box]
print("%s @ (%d %d %d %d) %.3f" % (CLASSES[cl], top, left, right, bottom, score))
cv2.rectangle(image, (top, left), (right, bottom), (255, 0, 0), 2)
cv2.putText(image, '{0} {1:.2f}'.format(CLASSES[cl], score),
(top, left - 6), cv2.FONT_HERSHEY_SIMPLEX, 0.6, (0, 0, 255), 2)
def _crop_mask(masks, boxes):
"""
"Crop" predicted masks by zeroing out everything not in the predicted bbox.
Vectorized by Chong (thanks Chong).
Args:
- masks should be a size [h, w, n] tensor of masks
- boxes should be a size [n, 4] tensor of bbox coords in relative point form
"""
n, h, w = masks.shape
x1, y1, x2, y2 = torch.chunk(boxes[:, :, None], 4, 1) # x1 shape(1,1,n)
r = torch.arange(w, device=masks.device, dtype=x1.dtype)[None, None, :] # rows shape(1,w,1)
c = torch.arange(h, device=masks.device, dtype=x1.dtype)[None, :, None] # cols shape(h,1,1)
return masks * ((r >= x1) * (r < x2) * (c >= y1) * (c < y2))
def merge_seg(image, seg_img, classes):
color = Colors()
for i in range(len(seg_img)):
seg = seg_img[i]
seg = seg.astype(np.uint8)
seg = cv2.cvtColor(seg, cv2.COLOR_GRAY2BGR)
seg = seg * color(classes[i])
seg = seg.astype(np.uint8)
image = cv2.add(image, seg)
return image
def setup_model(args):
model_path = args.model_path
    # create the RKNNLite object
    rknn = RKNNLite()
    ret = rknn.load_rknn(model_path)  # load the model passed in via args.model_path
if ret != 0:
print('Load RKNN model failed')
exit(ret)
print('done')
ret = rknn.init_runtime()
if ret != 0:
print('Init runtime environment failed!')
exit(ret)
print('done')
    return rknn, args.target  # return the runtime object and the target platform
def img_check(path):
img_type = ['.jpg', '.jpeg', '.png', '.bmp']
for _type in img_type:
if path.endswith(_type) or path.endswith(_type.upper()):
return True
return False
def letterbox(im, new_shape=(640, 640), color=(0, 0, 0)):
# Resize and pad image while meeting stride-multiple constraints
shape = im.shape[:2] # current shape [height, width]
if isinstance(new_shape, int):
new_shape = (new_shape, new_shape)
# Scale ratio (new / old)
r = min(new_shape[0] / shape[0], new_shape[1] / shape[1])
# Compute padding
ratio = r, r # width, height ratios
new_unpad = int(round(shape[1] * r)), int(round(shape[0] * r))
dw, dh = new_shape[1] - new_unpad[0], new_shape[0] - new_unpad[1] # wh padding
dw /= 2 # divide padding into 2 sides
dh /= 2
if shape[::-1] != new_unpad: # resize
im = cv2.resize(im, new_unpad, interpolation=cv2.INTER_LINEAR)
top, bottom = int(round(dh - 0.1)), int(round(dh + 0.1))
left, right = int(round(dw - 0.1)), int(round(dw + 0.1))
im = cv2.copyMakeBorder(im, top, bottom, left, right, cv2.BORDER_CONSTANT, value=color) # add border
return im, ratio, (dw, dh)
if __name__ == '__main__':
parser = argparse.ArgumentParser(description='Process some integers.')
# basic params
parser.add_argument('--model_path', type=str, required= True, help='model path, could be .onnx or .rknn file')
parser.add_argument('--target', type=str, default='rk3566', help='target RKNPU platform')
parser.add_argument('--device_id', type=str, default=None, help='device id')
parser.add_argument('--img_show', action='store_true', default=False, help='draw the result and show')
parser.add_argument('--img_save', action='store_true', default=False, help='save the result')
# data params
parser.add_argument('--anno_json', type=str, default='../../../datasets/COCO/annotations/instances_val2017.json', help='coco annotation path')
# coco val folder: '../../../datasets/COCO//val2017'
parser.add_argument('--img_folder', type=str, default='../model', help='img folder path')
parser.add_argument('--coco_map_test', action='store_true', help='enable coco mAP test')
# This anchors_yolov5.txt is defined in the configuration file in the yolov5 official project.
# For example, one of the configuration file paths is <yolov5_root_path>/models/yolov5n.yaml.
    # If you modified the anchors configuration when training the model, update this file with the corresponding values.
parser.add_argument('--anchors', type=str, default='../model/anchors_yolov5.txt', help='target to anchor file')
args = parser.parse_args()
# load anchor
with open(args.anchors, 'r') as f:
values = [float(_v) for _v in f.readlines()]
anchors = np.array(values).reshape(3,-1,2).tolist()
print("use anchors from '{}', which is {}".format(args.anchors, anchors))
# init model
model_path = args.model_path
    # create the RKNNLite object
    rknn = RKNNLite()
    ret = rknn.load_rknn(model_path)  # load the model passed in via --model_path
if ret != 0:
print('Load RKNN model failed')
exit(ret)
print('done')
ret = rknn.init_runtime()
if ret != 0:
print('Init runtime environment failed!')
exit(ret)
print('done')
file_list = sorted(os.listdir(args.img_folder))
img_list = []
for path in file_list:
if img_check(path):
img_list.append(path)
co_helper = COCO_test_helper(enable_letter_box=True)
# run test
for i in range(len(img_list)):
print('infer {}/{}'.format(i+1, len(img_list)), end='\r')
img_name = img_list[i]
img_path = os.path.join(args.img_folder, img_name)
if not os.path.exists(img_path):
print("{} is not found", img_name)
continue
img_src = cv2.imread(img_path)
if img_src is None:
continue
img = co_helper.letter_box(img_src.copy(), new_shape=(640, 640), info_need=False)
img = cv2.cvtColor(img, cv2.COLOR_BGR2RGB)
img2 = np.expand_dims(img, 0)
outputs = rknn.inference(inputs=[img2], data_format=['nhwc'])
boxes, classes, scores, seg_img = post_process(outputs, anchors)
if boxes is not None:
real_boxs = co_helper.get_real_box(boxes)
real_segs = co_helper.get_real_seg(seg_img)
img_p = merge_seg(img_src, real_segs, classes)
if args.img_show or args.img_save:
print('\n\nIMG: {}'.format(img_name))
draw(img_p, real_boxs, scores, classes)
if args.img_save:
if not os.path.exists('./result'):
os.mkdir('./result')
result_path = os.path.join('./result', img_name)
cv2.imwrite(result_path, img_p)
print('Detection result save to {}'.format(result_path))
if args.img_show:
cv2.imshow("full post process result", img_p)
cv2.waitKeyEx(0)
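One detail worth checking before running on the board: the --anchors file is just the 18 anchor values (3 scales x 3 anchors x width,height), one number per line, and it must match the anchors used during training. Since I kept the defaults from the model yaml shown earlier, a hedged sketch for writing such a file is:
# Default YOLOv5 anchors from the model yaml above (P3/8, P4/16, P5/32)
anchors = [10, 13, 16, 30, 33, 23,
           30, 61, 62, 45, 59, 119,
           116, 90, 156, 198, 373, 326]
with open("anchors_yolov5.txt", "w") as f:
    f.write("\n".join(str(v) for v in anchors) + "\n")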
To save time, the model was only trained for 100 epochs:
The code for real-time detection with a local camera is shown below.
Run command:
python yolov5_seg_gqr_cam.py --model_path ./yolov5_seg.rknn
import os
import json  # used by COCO_test_helper.export_to_json
import cv2
import sys
import argparse
import numpy as np
from pathlib import Path
from pycocotools.coco import COCO
from pycocotools.cocoeval import COCOeval
import torch.nn.functional as F
import torch
import torchvision
from rknnlite.api import RKNNLite
from copy import copy
class Letter_Box_Info():
def __init__(self, shape, new_shape, w_ratio, h_ratio, dw, dh, pad_color) -> None:
self.origin_shape = shape
self.new_shape = new_shape
self.w_ratio = w_ratio
self.h_ratio = h_ratio
self.dw = dw
self.dh = dh
self.pad_color = pad_color
class COCO_test_helper():
def __init__(self, enable_letter_box = False) -> None:
self.record_list = []
self.enable_ltter_box = enable_letter_box
if self.enable_ltter_box is True:
self.letter_box_info_list = []
else:
self.letter_box_info_list = None
def letter_box(self, im, new_shape, pad_color=(0,0,0), info_need=False):
# Resize and pad image while meeting stride-multiple constraints
shape = im.shape[:2] # current shape [height, width]
if isinstance(new_shape, int):
new_shape = (new_shape, new_shape)
# Scale ratio
r = min(new_shape[0] / shape[0], new_shape[1] / shape[1])
# Compute padding
ratio = r # width, height ratios
new_unpad = int(round(shape[1] * r)), int(round(shape[0] * r))
dw, dh = new_shape[1] - new_unpad[0], new_shape[0] - new_unpad[1] # wh padding
dw /= 2 # divide padding into 2 sides
dh /= 2
if shape[::-1] != new_unpad: # resize
im = cv2.resize(im, new_unpad, interpolation=cv2.INTER_LINEAR)
top, bottom = int(round(dh - 0.1)), int(round(dh + 0.1))
left, right = int(round(dw - 0.1)), int(round(dw + 0.1))
im = cv2.copyMakeBorder(im, top, bottom, left, right, cv2.BORDER_CONSTANT, value=pad_color) # add border
if self.enable_ltter_box is True:
self.letter_box_info_list.append(Letter_Box_Info(shape, new_shape, ratio, ratio, dw, dh, pad_color))
if info_need is True:
return im, ratio, (dw, dh)
else:
return im
def direct_resize(self, im, new_shape, info_need=False):
shape = im.shape[:2]
h_ratio = new_shape[0]/ shape[0]
w_ratio = new_shape[1]/ shape[1]
if self.enable_ltter_box is True:
self.letter_box_info_list.append(Letter_Box_Info(shape, new_shape, w_ratio, h_ratio, 0, 0, (0,0,0)))
im = cv2.resize(im, (new_shape[1], new_shape[0]))
return im
def get_real_box(self, box, in_format='xyxy'):
bbox = copy(box)
if self.enable_ltter_box == True:
# unletter_box result
if in_format=='xyxy':
bbox[:,0] -= self.letter_box_info_list[-1].dw
bbox[:,0] /= self.letter_box_info_list[-1].w_ratio
bbox[:,0] = np.clip(bbox[:,0], 0, self.letter_box_info_list[-1].origin_shape[1])
bbox[:,1] -= self.letter_box_info_list[-1].dh
bbox[:,1] /= self.letter_box_info_list[-1].h_ratio
bbox[:,1] = np.clip(bbox[:,1], 0, self.letter_box_info_list[-1].origin_shape[0])
bbox[:,2] -= self.letter_box_info_list[-1].dw
bbox[:,2] /= self.letter_box_info_list[-1].w_ratio
bbox[:,2] = np.clip(bbox[:,2], 0, self.letter_box_info_list[-1].origin_shape[1])
bbox[:,3] -= self.letter_box_info_list[-1].dh
bbox[:,3] /= self.letter_box_info_list[-1].h_ratio
bbox[:,3] = np.clip(bbox[:,3], 0, self.letter_box_info_list[-1].origin_shape[0])
return bbox
def get_real_seg(self, seg):
#! fix side effect
dh = int(self.letter_box_info_list[-1].dh)
dw = int(self.letter_box_info_list[-1].dw)
origin_shape = self.letter_box_info_list[-1].origin_shape
new_shape = self.letter_box_info_list[-1].new_shape
if (dh == 0) and (dw == 0) and origin_shape == new_shape:
return seg
elif dh == 0 and dw != 0:
seg = seg[:, :, dw:-dw] # a[0:-0] = []
elif dw == 0 and dh != 0 :
seg = seg[:, dh:-dh, :]
seg = np.where(seg, 1, 0).astype(np.uint8).transpose(1,2,0)
seg = cv2.resize(seg, (origin_shape[1], origin_shape[0]), interpolation=cv2.INTER_LINEAR)
if len(seg.shape) < 3:
return seg[None,:,:]
else:
return seg.transpose(2,0,1)
def add_single_record(self, image_id, category_id, bbox, score, in_format='xyxy', pred_masks = None):
if self.enable_ltter_box == True:
# unletter_box result
if in_format=='xyxy':
bbox[0] -= self.letter_box_info_list[-1].dw
bbox[0] /= self.letter_box_info_list[-1].w_ratio
bbox[1] -= self.letter_box_info_list[-1].dh
bbox[1] /= self.letter_box_info_list[-1].h_ratio
bbox[2] -= self.letter_box_info_list[-1].dw
bbox[2] /= self.letter_box_info_list[-1].w_ratio
bbox[3] -= self.letter_box_info_list[-1].dh
bbox[3] /= self.letter_box_info_list[-1].h_ratio
# bbox = [value/self.letter_box_info_list[-1].ratio for value in bbox]
if in_format=='xyxy':
# change xyxy to xywh
bbox[2] = bbox[2] - bbox[0]
bbox[3] = bbox[3] - bbox[1]
else:
assert False, "now only support xyxy format, please add code to support others format"
def single_encode(x):
from pycocotools.mask import encode
rle = encode(np.asarray(x[:, :, None], order="F", dtype="uint8"))[0]
rle["counts"] = rle["counts"].decode("utf-8")
return rle
if pred_masks is None:
self.record_list.append({"image_id": image_id,
"category_id": category_id,
"bbox":[round(x, 3) for x in bbox],
'score': round(score, 5),
})
else:
rles = single_encode(pred_masks)
self.record_list.append({"image_id": image_id,
"category_id": category_id,
"bbox":[round(x, 3) for x in bbox],
'score': round(score, 5),
'segmentation': rles,
})
def export_to_json(self, path):
with open(path, 'w') as f:
json.dump(self.record_list, f)
OBJ_THRESH = 0.60
NMS_THRESH = 0.45
MAX_DETECT = 300
# The following two parameters are for the mAP test
# OBJ_THRESH = 0.001
# NMS_THRESH = 0.65
IMG_SIZE = (640, 640) # (width, height), such as (1280, 736)
CLASSES = ("bolt","nut")
coco_id_list = [1, 2]
class Colors:
# Ultralytics color palette https://ultralytics.com/
def __init__(self):
# hex = matplotlib.colors.TABLEAU_COLORS.values()
hexs = ('FF3838', 'FF9D97', 'FF701F', 'FFB21D', 'CFD231', '48F90A', '92CC17', '3DDB86', '1A9334', '00D4BB',
'2C99A8', '00C2FF', '344593', '6473FF', '0018EC', '8438FF', '520085', 'CB38FF', 'FF95C8', 'FF37C7')
self.palette = [self.hex2rgb(f'#{c}') for c in hexs]
self.n = len(self.palette)
def __call__(self, i, bgr=False):
c = self.palette[int(i) % self.n]
return (c[2], c[1], c[0]) if bgr else c
@staticmethod
def hex2rgb(h): # rgb order (PIL)
return tuple(int(h[1 + i:1 + i + 2], 16) for i in (0, 2, 4))
def sigmoid(x):
return 1 / (1 + np.exp(-x))
def filter_boxes(boxes, box_confidences, box_class_probs, seg_part):
"""Filter boxes with object threshold.
"""
box_confidences = box_confidences.reshape(-1)
candidate, class_num = box_class_probs.shape
class_max_score = np.max(box_class_probs, axis=-1)
classes = np.argmax(box_class_probs, axis=-1)
_class_pos = np.where(class_max_score * box_confidences >= OBJ_THRESH)
scores = (class_max_score * box_confidences)[_class_pos]
boxes = boxes[_class_pos]
classes = classes[_class_pos]
seg_part = (seg_part * box_confidences.reshape(-1, 1))[_class_pos]
return boxes, classes, scores, seg_part
def box_process(position, anchors):
grid_h, grid_w = position.shape[2:4]
col, row = np.meshgrid(np.arange(0, grid_w), np.arange(0, grid_h))
col = col.reshape(1, 1, grid_h, grid_w)
row = row.reshape(1, 1, grid_h, grid_w)
grid = np.concatenate((col, row), axis=1)
stride = np.array([IMG_SIZE[1]//grid_h, IMG_SIZE[0]//grid_w]).reshape(1,2,1,1)
col = col.repeat(len(anchors), axis=0)
row = row.repeat(len(anchors), axis=0)
anchors = np.array(anchors)
anchors = anchors.reshape(*anchors.shape, 1, 1)
box_xy = position[:,:2,:,:]*2 - 0.5
box_wh = pow(position[:,2:4,:,:]*2, 2) * anchors
box_xy += grid
box_xy *= stride
box = np.concatenate((box_xy, box_wh), axis=1)
# Convert [c_x, c_y, w, h] to [x1, y1, x2, y2]
xyxy = np.copy(box)
xyxy[:, 0, :, :] = box[:, 0, :, :] - box[:, 2, :, :]/ 2 # top left x
xyxy[:, 1, :, :] = box[:, 1, :, :] - box[:, 3, :, :]/ 2 # top left y
xyxy[:, 2, :, :] = box[:, 0, :, :] + box[:, 2, :, :]/ 2 # bottom right x
xyxy[:, 3, :, :] = box[:, 1, :, :] + box[:, 3, :, :]/ 2 # bottom right y
return xyxy
def post_process(input_data, anchors):
# input_data[0], input_data[2], and input_data[4] are detection box information
# input_data[1], input_data[3], and input_data[5] are segmentation information
# input_data[6] is the proto information
boxes, scores, classes_conf = [], [], []
# 1*255*h*w -> 3*85*h*w
detect_part = [input_data[i*2].reshape([len(anchors[0]), -1]+list(input_data[i*2].shape[-2:])) for i in range(len(anchors))]
seg_part = [input_data[i*2+1].reshape([len(anchors[0]), -1]+list(input_data[i*2+1].shape[-2:])) for i in range(len(anchors))]
proto = input_data[-1]
for i in range(len(detect_part)):
boxes.append(box_process(detect_part[i][:, :4, :, :], anchors[i]))
scores.append(detect_part[i][:, 4:5, :, :])
classes_conf.append(detect_part[i][:, 5:, :, :])
def sp_flatten(_in):
ch = _in.shape[1]
_in = _in.transpose(0, 2, 3, 1)
return _in.reshape(-1, ch)
boxes = [sp_flatten(_v) for _v in boxes]
classes_conf = [sp_flatten(_v) for _v in classes_conf]
scores = [sp_flatten(_v) for _v in scores]
seg_part = [sp_flatten(_v) for _v in seg_part]
boxes = np.concatenate(boxes)
classes_conf = np.concatenate(classes_conf)
scores = np.concatenate(scores)
seg_part = np.concatenate(seg_part)
# filter according to threshold
boxes, classes, scores, seg_part = filter_boxes(boxes, scores, classes_conf, seg_part)
zipped = zip(boxes, classes, scores, seg_part)
sort_zipped = sorted(zipped, key=lambda x: (x[2]), reverse=True)
result = zip(*sort_zipped)
max_nms = 30000
n = boxes.shape[0] # number of boxes
if not n:
return None, None, None, None
elif n > max_nms: # excess boxes
boxes, classes, scores, seg_part = [np.array(x[:max_nms]) for x in result]
else:
boxes, classes, scores, seg_part = [np.array(x) for x in result]
# nms
nboxes, nclasses, nscores, nseg_part = [], [], [], []
agnostic = 0
max_wh = 7680
c = classes * (0 if agnostic else max_wh)
ids = torchvision.ops.nms(torch.tensor(boxes, dtype=torch.float32) + torch.tensor(c, dtype=torch.float32).unsqueeze(-1),
torch.tensor(scores, dtype=torch.float32), NMS_THRESH)
real_keeps = ids.tolist()[:MAX_DETECT]
nboxes.append(boxes[real_keeps])
nclasses.append(classes[real_keeps])
nscores.append(scores[real_keeps])
nseg_part.append(seg_part[real_keeps])
if not nclasses and not nscores:
return None, None, None, None
boxes = np.concatenate(nboxes)
classes = np.concatenate(nclasses)
scores = np.concatenate(nscores)
seg_part = np.concatenate(nseg_part)
ph, pw = proto.shape[-2:]
proto = proto.reshape(seg_part.shape[-1], -1)
seg_img = np.matmul(seg_part, proto)
seg_img = sigmoid(seg_img)
seg_img = seg_img.reshape(-1, ph, pw)
seg_threadhold = 0.5
# crop seg outside box
seg_img = F.interpolate(torch.tensor(seg_img)[None], torch.Size([640, 640]), mode='bilinear', align_corners=False)[0]
seg_img_t = _crop_mask(seg_img,torch.tensor(boxes) )
seg_img = seg_img_t.numpy()
seg_img = seg_img > seg_threadhold
return boxes, classes, scores, seg_img
def draw(image, boxes, scores, classes):
for box, score, cl in zip(boxes, scores, classes):
top, left, right, bottom = [int(_b) for _b in box]
print("%s @ (%d %d %d %d) %.3f" % (CLASSES[cl], top, left, right, bottom, score))
cv2.rectangle(image, (top, left), (right, bottom), (255, 0, 0), 2)
cv2.putText(image, '{0} {1:.2f}'.format(CLASSES[cl], score),
(top, left - 6), cv2.FONT_HERSHEY_SIMPLEX, 0.6, (0, 0, 255), 2)
def _crop_mask(masks, boxes):
"""
"Crop" predicted masks by zeroing out everything not in the predicted bbox.
Vectorized by Chong (thanks Chong).
Args:
- masks should be a size [h, w, n] tensor of masks
- boxes should be a size [n, 4] tensor of bbox coords in relative point form
"""
n, h, w = masks.shape
x1, y1, x2, y2 = torch.chunk(boxes[:, :, None], 4, 1) # x1 shape(1,1,n)
r = torch.arange(w, device=masks.device, dtype=x1.dtype)[None, None, :] # rows shape(1,w,1)
c = torch.arange(h, device=masks.device, dtype=x1.dtype)[None, :, None] # cols shape(h,1,1)
return masks * ((r >= x1) * (r < x2) * (c >= y1) * (c < y2))
def merge_seg(image, seg_img, classes):
color = Colors()
for i in range(len(seg_img)):
seg = seg_img[i]
seg = seg.astype(np.uint8)
seg = cv2.cvtColor(seg, cv2.COLOR_GRAY2BGR)
seg = seg * color(classes[i])
seg = seg.astype(np.uint8)
image = cv2.add(image, seg)
return image
if __name__ == '__main__':
parser = argparse.ArgumentParser(description='Process some integers.')
# basic params
parser.add_argument('--model_path', type=str, required= True, help='model path, could be .onnx or .rknn file')
parser.add_argument('--target', type=str, default='rk3566', help='target RKNPU platform')
# This anchors_yolov5.txt is defined in the configuration file in the yolov5 official project.
# For example, one of the configuration file paths is <yolov5_root_path>/models/yolov5n.yaml.
    # If you modified the anchors configuration when training the model, update this file with the corresponding values.
parser.add_argument('--anchors', type=str, default='../model/anchors_yolov5.txt', help='target to anchor file')
args = parser.parse_args()
# load anchor
with open(args.anchors, 'r') as f:
values = [float(_v) for _v in f.readlines()]
anchors = np.array(values).reshape(3,-1,2).tolist()
print("use anchors from '{}', which is {}".format(args.anchors, anchors))
# init model
model_path = args.model_path
    # create the RKNNLite object
    rknn = RKNNLite()
    ret = rknn.load_rknn(model_path)  # load the model passed in via --model_path
if ret != 0:
print('Load RKNN model failed')
exit(ret)
print('done')
ret = rknn.init_runtime()
if ret != 0:
print('Init runtime environment failed!')
exit(ret)
print('done')
co_helper = COCO_test_helper(enable_letter_box=True)
cap = cv2.VideoCapture('/dev/video20', cv2.CAP_V4L2)
    # manually set the resolution
cap.set(cv2.CAP_PROP_FRAME_WIDTH, 640)
cap.set(cv2.CAP_PROP_FRAME_HEIGHT, 640)
if not cap.isOpened():
print("打开摄像头失败")
exit()
print("按 q 退出")
import time
    # FPS counter
frame_count=0
fps=0
start_time=time.time()
while True:
ret, img_src = cap.read()
if not ret:
print("读取帧失败")
break
        # update the FPS estimate
frame_count+=1
if frame_count >=10:
end_time=time.time()
fps=frame_count / (end_time-start_time)
frame_count=0
start_time=end_time
img = co_helper.letter_box(img_src.copy(), new_shape=(640, 640), info_need=False)
img = cv2.cvtColor(img, cv2.COLOR_BGR2RGB)
img2 = np.expand_dims(img, 0)
outputs = rknn.inference(inputs=[img2], data_format=['nhwc'])
boxes, classes, scores, seg_img = post_process(outputs, anchors)
img_p=img_src
if boxes is not None:
real_boxs = co_helper.get_real_box(boxes)
real_segs = co_helper.get_real_seg(seg_img)
img_p = merge_seg(img_src, real_segs, classes)
draw(img_p, real_boxs, scores, classes)
        # draw the FPS
cv2.putText(img_p, f"FPS: {fps:.2f}",
(img_p.shape[1] - 150, 30),
cv2.FONT_HERSHEY_SIMPLEX,
0.7, (0, 255, 0), 2)
#img_1=cv2.resize(img_1,(800,600))
cv2.imshow("My Detection Window", img_p)
if cv2.waitKey(1) & 0xFF == ord('q'):
break
cap.release()
cv2.destroyAllWindows()
rknn.release()
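The RK3588 NPU has three cores, and by default RKNNLite schedules the model onto a single core. If your installed rknn-toolkit-lite2 exposes the core_mask option (an assumption; check your version), initializing the runtime as below may raise the camera FPS:
from rknnlite.api import RKNNLite
rknn = RKNNLite()
rknn.load_rknn('./yolov5_seg.rknn')
# Assumption: NPU_CORE_0_1_2 is available in rknn-toolkit-lite2 on RK3588
rknn.init_runtime(core_mask=RKNNLite.NPU_CORE_0_1_2)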