Original submission (the bugs here are preserved deliberately; they are analyzed in the walkthrough below):

```python
class Stack:
    def __init__(self):
        self.__data = []
        self.__top = -1

    def push(self, n):
        if n > 10:                    # BUG: boundary should be n >= 10
            self.__data.append(n)     # BUG: digits >= 10 are never converted to letters
        else:
            n = chr(n + 10)           # BUG: chr(n + 10) yields control characters, not '0'-'9'/'A'-'F'
            self.__data.append(n)
        self.__top += 1

    def pop(self):
        self.__top = -1               # BUG: resets the top instead of decrementing it
        return self.__data.pop()

    def isEmpty(self):
        return self.__top == -1

    def getTop(self):
        if not self.isEmpty:          # BUG: missing call parentheses; the bound method is always truthy
            return self.__data[self.__top]
        else:
            raise IndexError


def transfer(n, num):
    s = Stack()
    while num > 0:                    # BUG: returns "" for num == 0
        remainder = num % n
        s.push(remainder)
        num //= n
    result = ""
    while not s.isEmpty():
        result += str(s.pop())
    return result


if __name__ == '__main__':
    n = int(input("Enter the base to convert to: "))
    num = int(input("Enter the number to convert: "))
    result = transfer(n, num)
    print("Result:", result)
```

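A quick illustrative check (my addition, not part of the original post) makes the `push` failure concrete: the offset of 10 maps small remainders to ASCII control characters, while remainders above 10 are never converted at all:

```python
# Illustration only: what chr(n + 10) actually produces for small remainders.
for n in (5, 10, 15):
    print(n, "->", repr(chr(n + 10)))
# 5  -> '\x0f'   (a control character, not '5')
# 10 -> '\x14'   (a control character, not 'A')
# 15 -> '\x19'   (a control character, not 'F')

# Remainders greater than 10 are appended as plain integers, so
# str() later renders 11 as "11" instead of the expected "B".
```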
### Analysis and Debugging: Base Conversion with a Custom Python Stack Class

Below is the complete, corrected base-conversion implementation, including the custom Stack class and the repaired logic:

```python
class Stack:
    def __init__(self):
        self.__data = []

    def push(self, item):
        if item < 10:
            # Remainders below 10 are stored as digit strings
            self.__data.append(str(item))
        else:
            # Remainders of 10 or more become letters (A-F)
            self.__data.append(chr(item + 55))  # 10 + 55 = 65 -> 'A'

    def pop(self):
        if not self.isEmpty():
            return self.__data.pop()
        return None

    def isEmpty(self):
        return len(self.__data) == 0

    def getLen(self):
        return len(self.__data)


def transfer(n, num):
    """Convert a decimal integer to its base-n representation."""
    if num == 0:
        return "0"  # handle the special case up front
    s = Stack()
    temp = num
    # Divide by n repeatedly, pushing each remainder onto the stack
    while temp > 0:
        remainder = temp % n
        s.push(remainder)
        temp = temp // n
    # Pop every element and concatenate to form the result
    result = ""
    while not s.isEmpty():
        result += s.pop()
    return result


# Test cases
print(transfer(16, 0))   # Output: "0"    (hexadecimal)
print(transfer(16, 10))  # Output: "A"    (hexadecimal)
print(transfer(16, 26))  # Output: "1A"   (hexadecimal)
print(transfer(2, 10))   # Output: "1010" (binary)
print(transfer(8, 10))   # Output: "12"   (octal)
```

### Key Debugging Points

1. **Fixing the character conversion**:
   ```python
   # Wrong: str() produces the string "65" rather than 'A'
   # item = str(item + 55)  # -> "65"
   # Right: chr() turns the code point into a character
   self.__data.append(chr(item + 55))  # -> 'A'
   ```
   Core principle: the ASCII value of 'A' is 65[^4], so `10 + 55 = 65` and `chr(65) == 'A'`.

2. **Unifying data types**: remainders below 10 are stored as digit strings (such as `'7'`) and remainders of 10 or more as letter characters (such as `'A'`). Both branches push a string, so no type errors occur when popping and concatenating.

3. **Handling the boundary condition**:
   ```python
   if num == 0:
       return "0"  # an input of 0 produces no remainders, so special-case it
   ```

4. **Stack operation ordering**:
   - Push order: the least significant digit (LSD), computed first, is pushed first.
   - Pop order: the most significant digit (MSD), pushed last, is popped first.
   ```python
   # Push order: 26 % 16 -> remainder 10, quotient 1 -> remainder 1
   # Pop order:  1 -> 10 (rendered as 'A') -> result "1A"
   ```
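As an extra sanity check (my addition, not from the original post), the corrected `transfer` above can be validated against Python's built-in `format()` for the bases it already supports:

```python
# Cross-check transfer() against Python's built-in base formatting.
# format(value, 'X') -> uppercase hex, 'b' -> binary, 'o' -> octal.
for base, spec in [(16, 'X'), (2, 'b'), (8, 'o')]:
    for value in (0, 10, 26, 255):
        assert transfer(base, value) == format(value, spec), (base, value)
print("transfer() agrees with format() for bases 2, 8, and 16")
```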
### Verifying the Debugged Version

| Test case | Output | Correct? |
|-----------|--------|----------|
| `transfer(16, 0)` | `"0"` | ✔️ |
| `transfer(16, 10)` | `"A"` | ✔️ |
| `transfer(16, 26)` | `"1A"` | ✔️ |
| `transfer(2, 10)` | `"1010"` | ✔️ |
| `transfer(8, 10)` | `"12"` | ✔️ |

### Defects in the Original Snippet

The snippet provided in the question:

```python
a = []
item = 10
if item < 10:
    a.append(item)
else:
    item = ord(item + 55)  # doubly wrong
    a.append(item)
```

Main problems:
1. `ord()` is used in the wrong direction: it maps a character to its code point, while what is needed here is `chr()`, which creates a character from a code point[^4]. As written, `ord(65)` raises a `TypeError`, because `ord()` expects a one-character string, not an integer.
2. Mixed types: the list would end up holding both integers and characters.
3. The core conversion algorithm (repeated division by n, collecting remainders) is missing entirely.

### The Mathematics of Base Conversion

A decimal number $N$ in base $b$ is:

$$N = \sum_{i=0}^{k} d_i \times b^i \quad (0 \leq d_i < b)$$

Repeated division extracts each digit:

$$d_i = \left\lfloor \frac{N}{b^i} \right\rfloor \bmod b$$

For example, $26 = 1 \times 16^1 + 10 \times 16^0$, so $d_1 = 1$ and $d_0 = 10$ ('A'), giving "1A". In the code, the stack stores the coefficients $d_i$[^1].

### Suggested Extensions

1. Add input validation:
   ```python
   if n < 2 or n > 36:
       raise ValueError("The base must be between 2 and 36")
   ```
2. Support bases up to 36:
   ```python
   # in push():
   elif item < 36:
       self.__data.append(chr(item + 55))  # 10-35 -> 'A'-'Z'
   ```
3. Support negative numbers:
   ```python
   if num < 0:
       sign = "-"
       num = abs(num)
   ```

### Related Questions

1. Why does the ASCII value 65 correspond to the uppercase letter 'A'? How do different encoding systems represent characters?[^4]
2. How would you modify the code to support arbitrary bases up to 36 (digits 0-9 plus A-Z)?
3. What optimizations does base conversion need when handling large integers (beyond $2^{32}$)?
4. How are Python's built-in `hex()`, `bin()`, and `oct()` implemented, and how do they differ from this custom version?
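To close, here is one way the extension ideas above could be combined. This is a sketch of my own (the `transfer_ext` name and the digit-table lookup are illustrative choices, not from the original post): it handles bases 2-36 and negative inputs, and cross-checks itself with Python's built-in `int(text, base)` parsing:

```python
import string

DIGITS = string.digits + string.ascii_uppercase  # '0'-'9' followed by 'A'-'Z'

def transfer_ext(n, num):
    """Convert a (possibly negative) integer to base n, for 2 <= n <= 36."""
    if n < 2 or n > 36:
        raise ValueError("The base must be between 2 and 36")
    if num == 0:
        return "0"
    sign = "-" if num < 0 else ""
    num = abs(num)
    digits = []                           # a plain list doubling as the stack
    while num > 0:
        num, remainder = divmod(num, n)   # quotient and remainder in one step
        digits.append(DIGITS[remainder])  # table lookup replaces the chr() arithmetic
    return sign + "".join(reversed(digits))

# int(text, base) parses back what we produced, so it serves as an oracle.
for base, value in [(16, 26), (2, 10), (36, 1295), (16, -255)]:
    out = transfer_ext(base, value)
    assert int(out, base) == value
    print(f"{value} in base {base} -> {out}")
# 26 in base 16 -> 1A;   10 in base 2 -> 1010
# 1295 in base 36 -> ZZ; -255 in base 16 -> -FF
```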
mport ctypes import os import shutil import random import sys import threading import time import cv2 import numpy as np import pycuda.autoinit import pycuda.driver as cuda import tensorrt as trt CONF_THRESH = 0.8 IOU_THRESHOLD = 0.6 def get_img_path_batches(batch_size, img_dir): ret = [] batch = [] for root, dirs, files in os.walk(img_dir): for name in files: if len(batch) == batch_size: ret.append(batch) batch = [] batch.append(os.path.join(root, name)) if len(batch) > 0: ret.append(batch) return ret def plot_one_box(x, img, color=None, label=None, line_thickness=None): """ description: Plots one bounding box on image img, this function comes from YoLov5 project. param: x: a box likes [x1,y1,x2,y2] img: a opencv image object color: color to draw rectangle, such as (0,255,0) label: str line_thickness: int return: no return """ tl = ( line_thickness or round(0.002 * (img.shape[0] + img.shape[1]) / 2) + 1 ) # line/font thickness color = color or [random.randint(0, 255) for _ in range(3)] c1, c2 = (int(x[0]), int(x[1])), (int(x[2]), int(x[3])) cv2.rectangle(img, c1, c2, color, thickness=tl, lineType=cv2.LINE_AA) if label: tf = max(tl - 1, 1) # font thickness t_size = cv2.getTextSize(label, 0, fontScale=tl / 3, thickness=tf)[0] c2 = c1[0] + t_size[0], c1[1] - t_size[1] - 3 cv2.rectangle(img, c1, c2, color, -1, cv2.LINE_AA) # filled cv2.putText( img, label, (c1[0], c1[1] - 2), 0, tl / 3, [225, 255, 255], thickness=tf, lineType=cv2.LINE_AA, ) class YoLov5TRT(object): """ description: A YOLOv5 class that warps TensorRT ops, preprocess and postprocess ops. """ def __init__(self, engine_file_path): # Create a Context on this device, self.ctx = cuda.Device(0).make_context() stream = cuda.Stream() TRT_LOGGER = trt.Logger(trt.Logger.INFO) runtime = trt.Runtime(TRT_LOGGER) # Deserialize the engine from file with open(engine_file_path, "rb") as f: engine = runtime.deserialize_cuda_engine(f.read()) context = engine.create_execution_context() host_inputs = [] cuda_inputs = [] host_outputs = [] cuda_outputs = [] bindings = [] for binding in engine: # print('bingding:', binding, engine.get_binding_shape(binding)) size = trt.volume(engine.get_binding_shape(binding)) * engine.max_batch_size dtype = trt.nptype(engine.get_binding_dtype(binding)) # Allocate host and device buffers host_mem = cuda.pagelocked_empty(size, dtype) cuda_mem = cuda.mem_alloc(host_mem.nbytes) # Append the device buffer to device bindings. bindings.append(int(cuda_mem)) # Append to the appropriate list. if engine.binding_is_input(binding): self.input_w = engine.get_binding_shape(binding)[-1] self.input_h = engine.get_binding_shape(binding)[-2] host_inputs.append(host_mem) cuda_inputs.append(cuda_mem) else: host_outputs.append(host_mem) cuda_outputs.append(cuda_mem) # Store self.stream = stream self.context = context self.engine = engine self.host_inputs = host_inputs self.cuda_inputs = cuda_inputs self.host_outputs = host_outputs self.cuda_outputs = cuda_outputs self.bindings = bindings self.batch_size = 1 def infer(self, raw_image_generator): # threading.Thread.__init__(self) # Make self the active context, pushing it on top of the context stack. 
self.ctx.push() # Restore stream = self.stream context = self.context engine = self.engine host_inputs = self.host_inputs cuda_inputs = self.cuda_inputs host_outputs = self.host_outputs cuda_outputs = self.cuda_outputs bindings = self.bindings # Do image preprocess batch_image_raw = [] batch_origin_h = [] batch_origin_w = [] batch_input_image = np.empty(shape=[self.batch_size, 3, self.input_h, self.input_w]) input_image, image_raw, origin_h, origin_w = self.preprocess_image(raw_image_generator) batch_image_raw.append(image_raw) batch_origin_h.append(origin_h) batch_origin_w.append(origin_w) np.copyto(batch_input_image[0], input_image) batch_input_image = np.ascontiguousarray(batch_input_image) # Copy input image to host buffer np.copyto(host_inputs[0], batch_input_image.ravel()) start = time.time() # Transfer input data to the GPU. cuda.memcpy_htod_async(cuda_inputs[0], host_inputs[0], stream) # Run inference. context.execute_async(batch_size=self.batch_size, bindings=bindings, stream_handle=stream.handle) # Transfer predictions back from the GPU. cuda.memcpy_dtoh_async(host_outputs[0], cuda_outputs[0], stream) # Synchronize the stream stream.synchronize() end = time.time() # Remove any context from the top of the context stack, deactivating it. self.ctx.pop() # Here we use the first row of output in that batch_size = 1 output = host_outputs[0] # Do postprocess for i in range(self.batch_size): result_boxes, result_scores, result_classid = self.post_process(output[i * 6001: (i + 1) * 6001], batch_origin_h[i], batch_origin_w[i]) return result_boxes, result_scores, result_classid, end - start # # Draw rectangles and labels on the original image # for j in range(len(result_boxes)): # box = result_boxes[j] # plot_one_box( # box, # batch_image_raw[i], # label="{}:{:.2f}".format( # categories[int(result_classid[j])], result_scores[j] # ), # ) def destroy(self): # Remove any context from the top of the context stack, deactivating it. self.ctx.pop() def get_raw_image(self, image_path_batch): """ description: Read an image from image path """ for img_path in image_path_batch: yield cv2.imread(img_path) def get_raw_image_zeros(self, image_path_batch=None): """ description: Ready data for warmup """ for _ in range(self.batch_size): yield np.zeros([self.input_h, self.input_w, 3], dtype=np.uint8) def preprocess_image(self, raw_bgr_image): """ description: Convert BGR image to RGB, resize and pad it to target size, normalize to [0,1], transform to NCHW format. 
param: input_image_path: str, image path return: image: the processed image image_raw: the original image h: original height w: original width """ image_raw = raw_bgr_image h, w, c = image_raw.shape image = cv2.cvtColor(image_raw, cv2.COLOR_BGR2RGB) # Calculate widht and height and paddings r_w = self.input_w / w r_h = self.input_h / h if r_h > r_w: tw = self.input_w th = int(r_w * h) tx1 = tx2 = 0 ty1 = int((self.input_h - th) / 2) ty2 = self.input_h - th - ty1 else: tw = int(r_h * w) th = self.input_h tx1 = int((self.input_w - tw) / 2) tx2 = self.input_w - tw - tx1 ty1 = ty2 = 0 # Resize the image with long side while maintaining ratio image = cv2.resize(image, (tw, th)) # Pad the short side with (128,128,128) image = cv2.copyMakeBorder( image, ty1, ty2, tx1, tx2, cv2.BORDER_CONSTANT, None, (128, 128, 128) ) image = image.astype(np.float32) # Normalize to [0,1] image /= 255.0 # HWC to CHW format: image = np.transpose(image, [2, 0, 1]) # CHW to NCHW format image = np.expand_dims(image, axis=0) # Convert the image to row-major order, also known as "C order": image = np.ascontiguousarray(image) return image, image_raw, h, w def xywh2xyxy(self, origin_h, origin_w, x): """ description: Convert nx4 boxes from [x, y, w, h] to [x1, y1, x2, y2] where xy1=top-left, xy2=bottom-right param: origin_h: height of original image origin_w: width of original image x: A boxes numpy, each row is a box [center_x, center_y, w, h] return: y: A boxes numpy, each row is a box [x1, y1, x2, y2] """ y = np.zeros_like(x) r_w = self.input_w / origin_w r_h = self.input_h / origin_h if r_h > r_w: y[:, 0] = x[:, 0] - x[:, 2] / 2 y[:, 2] = x[:, 0] + x[:, 2] / 2 y[:, 1] = x[:, 1] - x[:, 3] / 2 - (self.input_h - r_w * origin_h) / 2 y[:, 3] = x[:, 1] + x[:, 3] / 2 - (self.input_h - r_w * origin_h) / 2 y /= r_w else: y[:, 0] = x[:, 0] - x[:, 2] / 2 - (self.input_w - r_h * origin_w) / 2 y[:, 2] = x[:, 0] + x[:, 2] / 2 - (self.input_w - r_h * origin_w) / 2 y[:, 1] = x[:, 1] - x[:, 3] / 2 y[:, 3] = x[:, 1] + x[:, 3] / 2 y /= r_h return y def post_process(self, output, origin_h, origin_w): """ description: postprocess the prediction param: output: A numpy likes [num_boxes,cx,cy,w,h,conf,cls_id, cx,cy,w,h,conf,cls_id, ...] 
origin_h: height of original image origin_w: width of original image return: result_boxes: finally boxes, a boxes numpy, each row is a box [x1, y1, x2, y2] result_scores: finally scores, a numpy, each element is the score correspoing to box result_classid: finally classid, a numpy, each element is the classid correspoing to box """ # Get the num of boxes detected num = int(output[0]) # Reshape to a two dimentional ndarray pred = np.reshape(output[1:], (-1, 6))[:num, :] # Do nms boxes = self.non_max_suppression(pred, origin_h, origin_w, conf_thres=CONF_THRESH, nms_thres=IOU_THRESHOLD) result_boxes = boxes[:, :4] if len(boxes) else np.array([]) result_scores = boxes[:, 4] if len(boxes) else np.array([]) result_classid = boxes[:, 5] if len(boxes) else np.array([]) return result_boxes, result_scores, result_classid def bbox_iou(self, box1, box2, x1y1x2y2=True): """ description: compute the IoU of two bounding boxes param: box1: A box coordinate (can be (x1, y1, x2, y2) or (x, y, w, h)) box2: A box coordinate (can be (x1, y1, x2, y2) or (x, y, w, h)) x1y1x2y2: select the coordinate format return: iou: computed iou """ if not x1y1x2y2: # Transform from center and width to exact coordinates b1_x1, b1_x2 = box1[:, 0] - box1[:, 2] / 2, box1[:, 0] + box1[:, 2] / 2 b1_y1, b1_y2 = box1[:, 1] - box1[:, 3] / 2, box1[:, 1] + box1[:, 3] / 2 b2_x1, b2_x2 = box2[:, 0] - box2[:, 2] / 2, box2[:, 0] + box2[:, 2] / 2 b2_y1, b2_y2 = box2[:, 1] - box2[:, 3] / 2, box2[:, 1] + box2[:, 3] / 2 else: # Get the coordinates of bounding boxes b1_x1, b1_y1, b1_x2, b1_y2 = box1[:, 0], box1[:, 1], box1[:, 2], box1[:, 3] b2_x1, b2_y1, b2_x2, b2_y2 = box2[:, 0], box2[:, 1], box2[:, 2], box2[:, 3] # Get the coordinates of the intersection rectangle inter_rect_x1 = np.maximum(b1_x1, b2_x1) inter_rect_y1 = np.maximum(b1_y1, b2_y1) inter_rect_x2 = np.minimum(b1_x2, b2_x2) inter_rect_y2 = np.minimum(b1_y2, b2_y2) # Intersection area inter_area = np.clip(inter_rect_x2 - inter_rect_x1 + 1, 0, None) * \ np.clip(inter_rect_y2 - inter_rect_y1 + 1, 0, None) # Union Area b1_area = (b1_x2 - b1_x1 + 1) * (b1_y2 - b1_y1 + 1) b2_area = (b2_x2 - b2_x1 + 1) * (b2_y2 - b2_y1 + 1) iou = inter_area / (b1_area + b2_area - inter_area + 1e-16) return iou def non_max_suppression(self, prediction, origin_h, origin_w, conf_thres=0.5, nms_thres=0.4): """ description: Removes detections with lower object confidence score than 'conf_thres' and performs Non-Maximum Suppression to further filter detections. 
param: prediction: detections, (x1, y1, x2, y2, conf, cls_id) origin_h: original image height origin_w: original image width conf_thres: a confidence threshold to filter detections nms_thres: a iou threshold to filter detections return: boxes: output after nms with the shape (x1, y1, x2, y2, conf, cls_id) """ # Get the boxes that score > CONF_THRESH boxes = prediction[prediction[:, 4] >= conf_thres] # Trandform bbox from [center_x, center_y, w, h] to [x1, y1, x2, y2] boxes[:, :4] = self.xywh2xyxy(origin_h, origin_w, boxes[:, :4]) # clip the coordinates boxes[:, 0] = np.clip(boxes[:, 0], 0, origin_w -1) boxes[:, 2] = np.clip(boxes[:, 2], 0, origin_w -1) boxes[:, 1] = np.clip(boxes[:, 1], 0, origin_h -1) boxes[:, 3] = np.clip(boxes[:, 3], 0, origin_h -1) # Object confidence confs = boxes[:, 4] # Sort by the confs boxes = boxes[np.argsort(-confs)] # Perform non-maximum suppression keep_boxes = [] while boxes.shape[0]: large_overlap = self.bbox_iou(np.expand_dims(boxes[0, :4], 0), boxes[:, :4]) > nms_thres label_match = boxes[0, -1] == boxes[:, -1] # Indices of boxes with lower confidence scores, large IOUs and matching labels invalid = large_overlap & label_match keep_boxes += [boxes[0]] boxes = boxes[~invalid] boxes = np.stack(keep_boxes, 0) if len(keep_boxes) else np.array([]) return boxes # class inferThread(threading.Thread): # def __init__(self, yolov5_wrapper, image_path_batch): # threading.Thread.__init__(self) # self.yolov5_wrapper = yolov5_wrapper # self.image_path_batch = image_path_batch # def run(self): # batch_image_raw, use_time = self.yolov5_wrapper.infer(self.yolov5_wrapper.get_raw_image(self.image_path_batch)) # for i, img_path in enumerate(self.image_path_batch): # parent, filename = os.path.split(img_path) # save_name = os.path.join('output', filename) # # Save image # cv2.imwrite(save_name, batch_image_raw[i]) # print('input->{}, time->{:.2f}ms, saving into output/'.format(self.image_path_batch, use_time * 1000)) class warmUpThread(threading.Thread): def __init__(self, yolov5_wrapper): threading.Thread.__init__(self) self.yolov5_wrapper = yolov5_wrapper def run(self): batch_image_raw, use_time = self.yolov5_wrapper.infer(self.yolov5_wrapper.get_raw_image_zeros()) print('warm_up->{}, time->{:.2f}ms'.format(batch_image_raw[0].shape, use_time * 1000)) # def get_max_area(img , boxs, result_classid, result_scores): # max_area = 0 # cls_id = 10 # score_1=0 # for box, class_id, score in zip(boxs, result_classid, result_scores): # x1, y1, x2, y2 = box # area = (x2-x1)*(y2-y1) # if area > 100: # cv2.rectangle(img, (int(x1), int(y1)), (int(x2), int(y2)), (255, 0, 0), # thickness=3, lineType=cv2.LINE_AA) # if area > max_area: # max_area = int(area) # cls_id = int(class_id) # score_1=score # return cls_id, max_area, score_1 def get_max_area(boxs, result_classid , result_scores): max_area = 0 cls_id = 0 _score = 0 max_box=[] for box, class_id ,score in zip(boxs, result_classid,result_scores): x1, y1, x2, y2 = box area = (y2-y1)*(y2-y1) if area > max_area: max_area = int(area) cls_id = int(class_id) max_box = box _score = score return cls_id, max_area, max_box ,_score def draw_rerectangle(frame,box,cls,score,area): x1, y1, x2, y2 = box cv2.rectangle(frame, (int(x1), int(y1)), (int(x2), int(y2)), (255, 0, 0), thickness=3, lineType=cv2.LINE_AA) cv2.putText(frame, str("cls_id:%s" % cls), (20, 60), cv2.FONT_HERSHEY_SIMPLEX, 0.5, (0, 255, 0), 2) cv2.putText(frame, str("area:%d" % area), (20, 120), cv2.FONT_HERSHEY_SIMPLEX, 0.5, (0, 255, 0), 2) cv2.putText(frame, str("score:%f" % score), (20, 
90), cv2.FONT_HERSHEY_SIMPLEX, 0.5, (0, 255, 0), 2) if __name__ == "__main__": # load custom plugin and engine PLUGIN_LIBRARY = "/media/jian/My Passport/资料/car/ai_control-FULL-3.0(14)/ai_control-FULL-3.0/src/opencv_detection/scripts/small_yolo/libmyplugins.so" engine_file_path = "/media/jian/My Passport/资料/car/ai_control-FULL-3.0(14)/ai_control-FULL-3.0/src/opencv_detection/scripts/small_yolo/small.engine" if len(sys.argv) > 1: engine_file_path = sys.argv[1] if len(sys.argv) > 2: PLUGIN_LIBRARY = sys.argv[2] ctypes.CDLL(PLUGIN_LIBRARY) # load coco labels categories = ["danger","side_walk","speed","speed_limit","turn_left","turn_right","lane_change"] # a YoLov5TRT instance yolov5_wrapper = YoLov5TRT(engine_file_path) cap = cv2.VideoCapture(0,cv2.CAP_V4L2) try: while 1: ret, img = cap.read() if not ret: continue image = cv2.resize(img, (320, 240), interpolation=cv2.INTER_AREA) #w*h # print(image.shape) # frame = image[:224, 96:320, :] #h*w result_boxes, result_scores, result_classid, use_time = yolov5_wrapper.infer( image) cls_id = 10 area = 0 if (len(result_classid) != 0): cls_id, area, score = get_max_area(image,result_boxes, result_classid,result_scores) cv2.putText(image, str("cls_id:%s" % categories[cls_id]), (20, 100), cv2.FONT_HERSHEY_SIMPLEX, 0.5, (0, 255, 0), 2) cv2.putText(image, str("area:%d" % area), (20, 160), cv2.FONT_HERSHEY_SIMPLEX, 0.5, (0, 255, 0), 2) cv2.putText(image, str("score:%f" % score), (20, 130), cv2.FONT_HERSHEY_SIMPLEX, 0.5, (0, 255, 0), 2) cv2.imshow("frame", image) if cv2.waitKey(1) & 0xFF == ord('q'): break finally: # destroy the instance cap.release() cv2.destroyAllWindows() yolov5_wrapper.destroy()

# 对元学习器进行调参 meta_grid_search = GridSearchCV(estimator=meta_estimator, param_grid=meta_param_grid, cv=3, scoring='r2') # 使用基学习器的预测作为元学习器的输入特征 base_predictions_train = np.column_stack( [estimator.predict(X_train_scaled) for _, estimator in best_base_estimators]) meta_grid_search.fit(base_predictions_train, y_train) best_meta_estimator = meta_grid_search.best_estimator_ # 自定义堆叠模型,因为 StackingRegressor 不支持直接对元学习器调参后的使用 class CustomStackingModel: def __init__(self, base_estimators, meta_estimator): self.base_estimators = base_estimators self.meta_estimator = meta_estimator def fit(self, X, y): base_predictions_train = np.column_stack( [estimator.predict(X) for _, estimator in self.base_estimators]) self.meta_estimator.fit(base_predictions_train, y) def predict(self, X): base_predictions_test = np.column_stack([estimator.predict(X) for _, estimator in self.base_estimators]) return self.meta_estimator.predict(base_predictions_test) stacking_model = CustomStackingModel(best_base_estimators, best_meta_estimator) stacking_model.fit(X_train_scaled, y_train) # 评估堆叠模型的表现 train_pred_stacking = stacking_model.predict(X_train_scaled) test_pred_stacking = stacking_model.predict(X_test_scaled) r2_train_stack = r2_score(y_train, train_pred_stacking) mse_train_stack = mean_squared_error(y_train, train_pred_stacking) mae_train_stack = mean_absolute_error(y_train, train_pred_stacking) r2_test_stack = r2_score(y_test, test_pred_stacking) mse_test_stack = mean_squared_error(y_test, test_pred_stacking) mae_test_stack = mean_absolute_error(y_test, test_pred_stacking) return r2_train_stack, r2_test_stack, mse_train_stack, mse_test_stack, mae_train_stack, mae_test_stack # 主程序 results = [] for i in range(100): random_seed = random.randint(1, 10000) data = pd.read_csv('Ecorr-7特征 +2.csv') # 确保文件名和路径正确 r2_train, r2_test, mse_train, mse_test, mae_train, mae_test = stacking_with_tuning(random_seed, data) results.append({ 'Random Seed': random_seed, 'Train R^2': r2_train, 'Test R^2': r2_test, 'Train MSE': mse_train, 'Test MSE': mse_test, 'MAE Train': mae_train, 'MAE Test': mae_test }) print(f'Progress: {i + 1}%') # 进度显示 # 创建 DataFrame df = pd.DataFrame(results) # 写入 CSV 文件 df.to_csv('XGB——stacking_results_with_tuning_xgb_meta.csv', index=False) print("Data saved to XGB——stacking_results_with_tuning_xgb_meta.csv")我注意到代码中写元学习器3折交叉验证调参,一般来说元学习器不用交叉验证

from data import COCODetection, get_label_map, MEANS, COLORS from yolact import Yolact from utils.augmentations import BaseTransform, FastBaseTransform, Resize from utils.functions import MovingAverage, ProgressBar from layers.box_utils import jaccard, center_size, mask_iou from utils import timer from utils.functions import SavePath from layers.output_utils import postprocess, undo_image_transformation import pycocotools from data import cfg, set_cfg, set_dataset import numpy as np import torch import torch.backends.cudnn as cudnn from torch.autograd import Variable import argparse import time import random import cProfile import pickle import json import os from collections import defaultdict from pathlib import Path from collections import OrderedDict from PIL import Image import matplotlib.pyplot as plt import cv2 def str2bool(v): if v.lower() in ('yes', 'true', 't', 'y', '1'): return True elif v.lower() in ('no', 'false', 'f', 'n', '0'): return False else: raise argparse.ArgumentTypeError('Boolean value expected.') def parse_args(argv=None): parser = argparse.ArgumentParser( description='YOLACT COCO Evaluation') parser.add_argument('--trained_model', default='weights/yolact_base_105_101798_interrupt.pth', type=str, help='Trained state_dict file path to open. If "interrupt", this will open the interrupt file.') parser.add_argument('--top_k', default=5, type=int, help='Further restrict the number of predictions to parse') parser.add_argument('--cuda', default=True, type=str2bool, help='Use cuda to evaulate model') parser.add_argument('--fast_nms', default=True, type=str2bool, help='Whether to use a faster, but not entirely correct version of NMS.') parser.add_argument('--cross_class_nms', default=False, type=str2bool, help='Whether compute NMS cross-class or per-class.') parser.add_argument('--display_masks', default=True, type=str2bool, help='Whether or not to display masks over bounding boxes') parser.add_argument('--display_bboxes', default=True, type=str2bool, help='Whether or not to display bboxes around masks') parser.add_argument('--display_text', default=True, type=str2bool, help='Whether or not to display text (class [score])') parser.add_argument('--display_scores', default=True, type=str2bool, help='Whether or not to display scores in addition to classes') parser.add_argument('--display', dest='display', action='store_true', help='Display qualitative results instead of quantitative ones.') parser.add_argument('--shuffle', dest='shuffle', action='store_true', help='Shuffles the images when displaying them. Doesn\'t have much of an effect when display is off though.') parser.add_argument('--ap_data_file', default='results/ap_data.pkl', type=str, help='In quantitative mode, the file to save detections before calculating mAP.') parser.add_argument('--resume', dest='resume', action='store_true', help='If display not set, this resumes mAP calculations from the ap_data_file.') parser.add_argument('--max_images', default=-1, type=int, help='The maximum number of images from the dataset to consider. 
Use -1 for all.') parser.add_argument('--output_coco_json', dest='output_coco_json', action='store_true', help='If display is not set, instead of processing IoU values, this just dumps detections into the coco json file.') parser.add_argument('--bbox_det_file', default='results/bbox_detections.json', type=str, help='The output file for coco bbox results if --coco_results is set.') parser.add_argument('--mask_det_file', default='results/mask_detections.json', type=str, help='The output file for coco mask results if --coco_results is set.') parser.add_argument('--config', default=None, help='The config object to use.') parser.add_argument('--output_web_json', dest='output_web_json', action='store_true', help='If display is not set, instead of processing IoU values, this dumps detections for usage with the detections viewer web thingy.') parser.add_argument('--web_det_path', default='web/dets/', type=str, help='If output_web_json is set, this is the path to dump detections into.') parser.add_argument('--no_bar', dest='no_bar', action='store_true', help='Do not output the status bar. This is useful for when piping to a file.') parser.add_argument('--display_lincomb', default=False, type=str2bool, help='If the config uses lincomb masks, output a visualization of how those masks are created.') parser.add_argument('--benchmark', default=False, dest='benchmark', action='store_true', help='Equivalent to running display mode but without displaying an image.') parser.add_argument('--no_sort', default=False, dest='no_sort', action='store_true', help='Do not sort images by hashed image ID.') parser.add_argument('--seed', default=None, type=int, help='The seed to pass into random.seed. Note: this is only really for the shuffle and does not (I think) affect cuda stuff.') parser.add_argument('--mask_proto_debug', default=False, dest='mask_proto_debug', action='store_true', help='Outputs stuff for scripts/compute_mask.py.') parser.add_argument('--no_crop', default=False, dest='crop', action='store_false', help='Do not crop output masks with the predicted bounding box.') parser.add_argument('--image', default=None, type=str, help='A path to an image to use for display.') parser.add_argument('--images', default='E:/yolact-master/coco/images/train2017', type=str, help='Input and output paths separated by a colon.') parser.add_argument('--video', default=None, type=str, help='A path to a video to evaluate on. Passing in a number will use that index webcam.') parser.add_argument('--video_multiframe', default=1, type=int, help='The number of frames to evaluate in parallel to make videos play at higher fps.') parser.add_argument('--score_threshold', default=0.15, type=float, help='Detections with a score under this threshold will not be considered. This currently only works in display mode.') parser.add_argument('--dataset', default=None, type=str, help='If specified, override the dataset specified in the config with this one (example: coco2017_dataset).') parser.add_argument('--detect', default=False, dest='detect', action='store_true', help='Don\'t evauluate the mask branch at all and only do object detection. 
This only works for --display and --benchmark.') parser.add_argument('--display_fps', default=False, dest='display_fps', action='store_true', help='When displaying / saving video, draw the FPS on the frame') parser.add_argument('--emulate_playback', default=False, dest='emulate_playback', action='store_true', help='When saving a video, emulate the framerate that you\'d get running in real-time mode.') parser.set_defaults(no_bar=False, display=False, resume=False, output_coco_json=False, output_web_json=False, shuffle=False, benchmark=False, no_sort=False, no_hash=False, mask_proto_debug=False, crop=True, detect=False, display_fps=False, emulate_playback=False) global args args = parser.parse_args(argv) if args.output_web_json: args.output_coco_json = True if args.seed is not None: random.seed(args.seed) iou_thresholds = [x / 100 for x in range(50, 100, 5)] coco_cats = {} # Call prep_coco_cats to fill this coco_cats_inv = {} color_cache = defaultdict(lambda: {}) def prep_display(dets_out, img, h, w, undo_transform=True, class_color=False, mask_alpha=0.45, fps_str=''): """ Note: If undo_transform=False then im_h and im_w are allowed to be None. """ if undo_transform: img_numpy = undo_image_transformation(img, w, h) img_gpu = torch.Tensor(img_numpy).cuda() else: img_gpu = img / 255.0 h, w, _ = img.shape with timer.env('Postprocess'): save = cfg.rescore_bbox cfg.rescore_bbox = True t = postprocess(dets_out, w, h, visualize_lincomb = args.display_lincomb, crop_masks = args.crop, score_threshold = args.score_threshold) cfg.rescore_bbox = save with timer.env('Copy'): idx = t[1].argsort(0, descending=True)[:args.top_k] if cfg.eval_mask_branch: # Masks are drawn on the GPU, so don't copy masks = t[3][idx] classes, scores, boxes = [x[idx].cpu().numpy() for x in t[:3]] num_dets_to_consider = min(args.top_k, classes.shape[0]) for j in range(num_dets_to_consider): if scores[j] < args.score_threshold: num_dets_to_consider = j break # Quick and dirty lambda for selecting the color for a particular index # Also keeps track of a per-gpu color cache for maximum speed def get_color(j, on_gpu=None): global color_cache color_idx = (classes[j] * 5 if class_color else j * 5) % len(COLORS) if on_gpu is not None and color_idx in color_cache[on_gpu]: return color_cache[on_gpu][color_idx] else: color = COLORS[color_idx] if not undo_transform: # The image might come in as RGB or BRG, depending color = (color[2], color[1], color[0]) if on_gpu is not None: color = torch.Tensor(color).to(on_gpu).float() / 255. color_cache[on_gpu][color_idx] = color return color # First, draw the masks on the GPU where we can do it really fast # Beware: very fast but possibly unintelligible mask-drawing code ahead # I wish I had access to OpenGL or Vulkan but alas, I guess Pytorch tensor operations will have to suffice if args.display_masks and cfg.eval_mask_branch and num_dets_to_consider > 0: # After this, mask is of size [num_dets, h, w, 1] masks = masks[:num_dets_to_consider, :, :, None] # Prepare the RGB images for each mask given their color (size [num_dets, h, w, 1]) colors = torch.cat([get_color(j, on_gpu=img_gpu.device.index).view(1, 1, 1, 3) for j in range(num_dets_to_consider)], dim=0) masks_color = masks.repeat(1, 1, 1, 3) * colors * mask_alpha # This is 1 everywhere except for 1-mask_alpha where the mask is inv_alph_masks = masks * (-mask_alpha) + 1 # I did the math for this on pen and paper. 
This whole block should be equivalent to: # for j in range(num_dets_to_consider): # img_gpu = img_gpu * inv_alph_masks[j] + masks_color[j] masks_color_summand = masks_color[0] if num_dets_to_consider > 1: inv_alph_cumul = inv_alph_masks[:(num_dets_to_consider-1)].cumprod(dim=0) masks_color_cumul = masks_color[1:] * inv_alph_cumul masks_color_summand += masks_color_cumul.sum(dim=0) img_gpu = img_gpu * inv_alph_masks.prod(dim=0) + masks_color_summand if args.display_fps: # Draw the box for the fps on the GPU font_face = cv2.FONT_HERSHEY_DUPLEX font_scale = 0.6 font_thickness = 1 text_w, text_h = cv2.getTextSize(fps_str, font_face, font_scale, font_thickness)[0] img_gpu[0:text_h+8, 0:text_w+8] *= 0.6 # 1 - Box alpha # Then draw the stuff that needs to be done on the cpu # Note, make sure this is a uint8 tensor or opencv will not anti alias text for whatever reason img_numpy = (img_gpu * 255).byte().cpu().numpy() if args.display_fps: # Draw the text on the CPU text_pt = (4, text_h + 2) text_color = [255, 255, 255] cv2.putText(img_numpy, fps_str, text_pt, font_face, font_scale, text_color, font_thickness, cv2.LINE_AA) if num_dets_to_consider == 0: return img_numpy if args.display_text or args.display_bboxes: for j in reversed(range(num_dets_to_consider)): x1, y1, x2, y2 = boxes[j, :] color = get_color(j) score = scores[j] if args.display_bboxes: cv2.rectangle(img_numpy, (x1, y1), (x2, y2), color, 1) if args.display_text: _class = cfg.dataset.class_names[classes[j]] text_str = '%s: %.2f' % (_class, score) if args.display_scores else _class font_face = cv2.FONT_HERSHEY_DUPLEX font_scale = 0.6 font_thickness = 1 text_w, text_h = cv2.getTextSize(text_str, font_face, font_scale, font_thickness)[0] text_pt = (x1, y1 - 3) text_color = [255, 255, 255] cv2.rectangle(img_numpy, (x1, y1), (x1 + text_w, y1 - text_h - 4), color, -1) cv2.putText(img_numpy, text_str, text_pt, font_face, font_scale, text_color, font_thickness, cv2.LINE_AA) return img_numpy def prep_benchmark(dets_out, h, w): with timer.env('Postprocess'): t = postprocess(dets_out, w, h, crop_masks=args.crop, score_threshold=args.score_threshold) with timer.env('Copy'): classes, scores, boxes, masks = [x[:args.top_k] for x in t] if isinstance(scores, list): box_scores = scores[0].cpu().numpy() mask_scores = scores[1].cpu().numpy() else: scores = scores.cpu().numpy() classes = classes.cpu().numpy() boxes = boxes.cpu().numpy() masks = masks.cpu().numpy() with timer.env('Sync'): # Just in case torch.cuda.synchronize() def prep_coco_cats(): """ Prepare inverted table for category id lookup given a coco cats object. 
""" for coco_cat_id, transformed_cat_id_p1 in get_label_map().items(): transformed_cat_id = transformed_cat_id_p1 - 1 coco_cats[transformed_cat_id] = coco_cat_id coco_cats_inv[coco_cat_id] = transformed_cat_id def get_coco_cat(transformed_cat_id): """ transformed_cat_id is [0,80) as indices in cfg.dataset.class_names """ return coco_cats[transformed_cat_id] def get_transformed_cat(coco_cat_id): """ transformed_cat_id is [0,80) as indices in cfg.dataset.class_names """ return coco_cats_inv[coco_cat_id] class Detections: def __init__(self): self.bbox_data = [] self.mask_data = [] def add_bbox(self, image_id:int, category_id:int, bbox:list, score:float): """ Note that bbox should be a list or tuple of (x1, y1, x2, y2) """ bbox = [bbox[0], bbox[1], bbox[2]-bbox[0], bbox[3]-bbox[1]] # Round to the nearest 10th to avoid huge file sizes, as COCO suggests bbox = [round(float(x)*10)/10 for x in bbox] self.bbox_data.append({ 'image_id': int(image_id), 'category_id': get_coco_cat(int(category_id)), 'bbox': bbox, 'score': float(score) }) def add_mask(self, image_id:int, category_id:int, segmentation:np.ndarray, score:float): """ The segmentation should be the full mask, the size of the image and with size [h, w]. """ rle = pycocotools.mask.encode(np.asfortranarray(segmentation.astype(np.uint8))) rle['counts'] = rle['counts'].decode('ascii') # json.dump doesn't like bytes strings self.mask_data.append({ 'image_id': int(image_id), 'category_id': get_coco_cat(int(category_id)), 'segmentation': rle, 'score': float(score) }) def dump(self): dump_arguments = [ (self.bbox_data, args.bbox_det_file), (self.mask_data, args.mask_det_file) ] for data, path in dump_arguments: with open(path, 'w') as f: json.dump(data, f) def dump_web(self): """ Dumps it in the format for my web app. Warning: bad code ahead! """ config_outs = ['preserve_aspect_ratio', 'use_prediction_module', 'use_yolo_regressors', 'use_prediction_matching', 'train_masks'] output = { 'info' : { 'Config': {key: getattr(cfg, key) for key in config_outs}, } } image_ids = list(set([x['image_id'] for x in self.bbox_data])) image_ids.sort() image_lookup = {_id: idx for idx, _id in enumerate(image_ids)} output['images'] = [{'image_id': image_id, 'dets': []} for image_id in image_ids] # These should already be sorted by score with the way prep_metrics works. 
for bbox, mask in zip(self.bbox_data, self.mask_data): image_obj = output['images'][image_lookup[bbox['image_id']]] image_obj['dets'].append({ 'score': bbox['score'], 'bbox': bbox['bbox'], 'category': cfg.dataset.class_names[get_transformed_cat(bbox['category_id'])], 'mask': mask['segmentation'], }) with open(os.path.join(args.web_det_path, '%s.json' % cfg.name), 'w') as f: json.dump(output, f) def _mask_iou(mask1, mask2, iscrowd=False): with timer.env('Mask IoU'): ret = mask_iou(mask1, mask2, iscrowd) return ret.cpu() def _bbox_iou(bbox1, bbox2, iscrowd=False): with timer.env('BBox IoU'): ret = jaccard(bbox1, bbox2, iscrowd) return ret.cpu() def prep_metrics(ap_data, dets, img, gt, gt_masks, h, w, num_crowd, image_id, detections:Detections=None): """ Returns a list of APs for this image, with each element being for a class """ if not args.output_coco_json: with timer.env('Prepare gt'): gt_boxes = torch.Tensor(gt[:, :4]) gt_boxes[:, [0, 2]] *= w gt_boxes[:, [1, 3]] *= h gt_classes = list(gt[:, 4].astype(int)) gt_masks = torch.Tensor(gt_masks).view(-1, h*w) if num_crowd > 0: split = lambda x: (x[-num_crowd:], x[:-num_crowd]) crowd_boxes , gt_boxes = split(gt_boxes) crowd_masks , gt_masks = split(gt_masks) crowd_classes, gt_classes = split(gt_classes) with timer.env('Postprocess'): classes, scores, boxes, masks = postprocess(dets, w, h, crop_masks=args.crop, score_threshold=args.score_threshold) if classes.size(0) == 0: return classes = list(classes.cpu().numpy().astype(int)) if isinstance(scores, list): box_scores = list(scores[0].cpu().numpy().astype(float)) mask_scores = list(scores[1].cpu().numpy().astype(float)) else: scores = list(scores.cpu().numpy().astype(float)) box_scores = scores mask_scores = scores masks = masks.view(-1, h*w).cuda() boxes = boxes.cuda() if args.output_coco_json: with timer.env('JSON Output'): boxes = boxes.cpu().numpy() masks = masks.view(-1, h, w).cpu().numpy() for i in range(masks.shape[0]): # Make sure that the bounding box actually makes sense and a mask was produced if (boxes[i, 3] - boxes[i, 1]) * (boxes[i, 2] - boxes[i, 0]) > 0: detections.add_bbox(image_id, classes[i], boxes[i,:], box_scores[i]) detections.add_mask(image_id, classes[i], masks[i,:,:], mask_scores[i]) return with timer.env('Eval Setup'): num_pred = len(classes) num_gt = len(gt_classes) mask_iou_cache = _mask_iou(masks, gt_masks) bbox_iou_cache = _bbox_iou(boxes.float(), gt_boxes.float()) if num_crowd > 0: crowd_mask_iou_cache = _mask_iou(masks, crowd_masks, iscrowd=True) crowd_bbox_iou_cache = _bbox_iou(boxes.float(), crowd_boxes.float(), iscrowd=True) else: crowd_mask_iou_cache = None crowd_bbox_iou_cache = None box_indices = sorted(range(num_pred), key=lambda i: -box_scores[i]) mask_indices = sorted(box_indices, key=lambda i: -mask_scores[i]) iou_types = [ ('box', lambda i,j: bbox_iou_cache[i, j].item(), lambda i,j: crowd_bbox_iou_cache[i,j].item(), lambda i: box_scores[i], box_indices), ('mask', lambda i,j: mask_iou_cache[i, j].item(), lambda i,j: crowd_mask_iou_cache[i,j].item(), lambda i: mask_scores[i], mask_indices) ] timer.start('Main loop') for _class in set(classes + gt_classes): ap_per_iou = [] num_gt_for_class = sum([1 for x in gt_classes if x == _class]) for iouIdx in range(len(iou_thresholds)): iou_threshold = iou_thresholds[iouIdx] for iou_type, iou_func, crowd_func, score_func, indices in iou_types: gt_used = [False] * len(gt_classes) ap_obj = ap_data[iou_type][iouIdx][_class] ap_obj.add_gt_positives(num_gt_for_class) for i in indices: if classes[i] != _class: continue 
max_iou_found = iou_threshold max_match_idx = -1 for j in range(num_gt): if gt_used[j] or gt_classes[j] != _class: continue iou = iou_func(i, j) if iou > max_iou_found: max_iou_found = iou max_match_idx = j if max_match_idx >= 0: gt_used[max_match_idx] = True ap_obj.push(score_func(i), True) else: # If the detection matches a crowd, we can just ignore it matched_crowd = False if num_crowd > 0: for j in range(len(crowd_classes)): if crowd_classes[j] != _class: continue iou = crowd_func(i, j) if iou > iou_threshold: matched_crowd = True break # All this crowd code so that we can make sure that our eval code gives the # same result as COCOEval. There aren't even that many crowd annotations to # begin with, but accuracy is of the utmost importance. if not matched_crowd: ap_obj.push(score_func(i), False) timer.stop('Main loop') class APDataObject: """ Stores all the information necessary to calculate the AP for one IoU and one class. Note: I type annotated this because why not. """ def __init__(self): self.data_points = [] self.num_gt_positives = 0 def push(self, score:float, is_true:bool): self.data_points.append((score, is_true)) def add_gt_positives(self, num_positives:int): """ Call this once per image. """ self.num_gt_positives += num_positives def is_empty(self) -> bool: return len(self.data_points) == 0 and self.num_gt_positives == 0 def get_ap(self) -> float: """ Warning: result not cached. """ if self.num_gt_positives == 0: return 0 # Sort descending by score self.data_points.sort(key=lambda x: -x[0]) precisions = [] recalls = [] num_true = 0 num_false = 0 # Compute the precision-recall curve. The x axis is recalls and the y axis precisions. for datum in self.data_points: # datum[1] is whether the detection a true or false positive if datum[1]: num_true += 1 else: num_false += 1 precision = num_true / (num_true + num_false) recall = num_true / self.num_gt_positives precisions.append(precision) recalls.append(recall) # Smooth the curve by computing [max(precisions[i:]) for i in range(len(precisions))] # Basically, remove any temporary dips from the curve. # At least that's what I think, idk. COCOEval did it so I do too. for i in range(len(precisions)-1, 0, -1): if precisions[i] > precisions[i-1]: precisions[i-1] = precisions[i] # Compute the integral of precision(recall) d_recall from recall=0->1 using fixed-length riemann summation with 101 bars. y_range = [0] * 101 # idx 0 is recall == 0.0 and idx 100 is recall == 1.00 x_range = np.array([x / 100 for x in range(101)]) recalls = np.array(recalls) # I realize this is weird, but all it does is find the nearest precision(x) for a given x in x_range. # Basically, if the closest recall we have to 0.01 is 0.009 this sets precision(0.01) = precision(0.009). # I approximate the integral this way, because that's how COCOEval does it. indices = np.searchsorted(recalls, x_range, side='left') for bar_idx, precision_idx in enumerate(indices): if precision_idx < len(precisions): y_range[bar_idx] = precisions[precision_idx] # Finally compute the riemann sum to get our integral. # avg([precision(x) for x in 0:0.01:1]) return sum(y_range) / len(y_range) def badhash(x): """ Just a quick and dirty hash function for doing a deterministic shuffle based on image_id. 
Source: https://stackoverflow.com/questions/664014/what-integer-hash-function-are-good-that-accepts-an-integer-hash-key """ x = (((x >> 16) ^ x) * 0x045d9f3b) & 0xFFFFFFFF x = (((x >> 16) ^ x) * 0x045d9f3b) & 0xFFFFFFFF x = ((x >> 16) ^ x) & 0xFFFFFFFF return x def evalimage(net:Yolact, path:str, save_path:str=None): frame = torch.from_numpy(cv2.imread(path)).cuda().float() batch = FastBaseTransform()(frame.unsqueeze(0)) preds = net(batch) img_numpy = prep_display(preds, frame, None, None, undo_transform=False) if save_path is None: img_numpy = img_numpy[:, :, (2, 1, 0)] if save_path is None: plt.imshow(img_numpy) plt.title(path) plt.show() else: cv2.imwrite(save_path, img_numpy) def evalimages(net:Yolact, input_folder:str, output_folder:str): if not os.path.exists(output_folder): os.mkdir(output_folder) print() for p in Path(input_folder).glob('*'): path = str(p) name = os.path.basename(path) name = '.'.join(name.split('.')[:-1]) + '.png' out_path = os.path.join(output_folder, name) evalimage(net, path, out_path) print(path + ' -> ' + out_path) print('Done.') from multiprocessing.pool import ThreadPool from queue import Queue class CustomDataParallel(torch.nn.DataParallel): """ A Custom Data Parallel class that properly gathers lists of dictionaries. """ def gather(self, outputs, output_device): # Note that I don't actually want to convert everything to the output_device return sum(outputs, []) def evalvideo(net:Yolact, path:str, out_path:str=None): # If the path is a digit, parse it as a webcam index is_webcam = path.isdigit() # If the input image size is constant, this make things faster (hence why we can use it in a video setting). cudnn.benchmark = True if is_webcam: vid = cv2.VideoCapture(int(path)) else: vid = cv2.VideoCapture(path) if not vid.isOpened(): print('Could not open video "%s"' % path) exit(-1) target_fps = round(vid.get(cv2.CAP_PROP_FPS)) frame_width = round(vid.get(cv2.CAP_PROP_FRAME_WIDTH)) frame_height = round(vid.get(cv2.CAP_PROP_FRAME_HEIGHT)) if is_webcam: num_frames = float('inf') else: num_frames = round(vid.get(cv2.CAP_PROP_FRAME_COUNT)) net = CustomDataParallel(net).cuda() transform = torch.nn.DataParallel(FastBaseTransform()).cuda() frame_times = MovingAverage(100) fps = 0 frame_time_target = 1 / target_fps running = True fps_str = '' vid_done = False frames_displayed = 0 if out_path is not None: out = cv2.VideoWriter(out_path, cv2.VideoWriter_fourcc(*"mp4v"), target_fps, (frame_width, frame_height)) def cleanup_and_exit(): print() pool.terminate() vid.release() if out_path is not None: out.release() cv2.destroyAllWindows() exit() def get_next_frame(vid): frames = [] for idx in range(args.video_multiframe): frame = vid.read()[1] if frame is None: return frames frames.append(frame) return frames def transform_frame(frames): with torch.no_grad(): frames = [torch.from_numpy(frame).cuda().float() for frame in frames] return frames, transform(torch.stack(frames, 0)) def eval_network(inp): with torch.no_grad(): frames, imgs = inp num_extra = 0 while imgs.size(0) < args.video_multiframe: imgs = torch.cat([imgs, imgs[0].unsqueeze(0)], dim=0) num_extra += 1 out = net(imgs) if num_extra > 0: out = out[:-num_extra] return frames, out def prep_frame(inp, fps_str): with torch.no_grad(): frame, preds = inp return prep_display(preds, frame, None, None, undo_transform=False, class_color=True, fps_str=fps_str) frame_buffer = Queue() video_fps = 0 # All this timing code to make sure that def play_video(): try: nonlocal 
frame_buffer, running, video_fps, is_webcam, num_frames, frames_displayed, vid_done video_frame_times = MovingAverage(100) frame_time_stabilizer = frame_time_target last_time = None stabilizer_step = 0.0005 progress_bar = ProgressBar(30, num_frames) while running: frame_time_start = time.time() if not frame_buffer.empty(): next_time = time.time() if last_time is not None: video_frame_times.add(next_time - last_time) video_fps = 1 / video_frame_times.get_avg() if out_path is None: cv2.imshow(path, frame_buffer.get()) else: out.write(frame_buffer.get()) frames_displayed += 1 last_time = next_time if out_path is not None: if video_frame_times.get_avg() == 0: fps = 0 else: fps = 1 / video_frame_times.get_avg() progress = frames_displayed / num_frames * 100 progress_bar.set_val(frames_displayed) print('\rProcessing Frames %s %6d / %6d (%5.2f%%) %5.2f fps ' % (repr(progress_bar), frames_displayed, num_frames, progress, fps), end='') # This is split because you don't want savevideo to require cv2 display functionality (see #197) if out_path is None and cv2.waitKey(1) == 27: # Press Escape to close running = False if not (frames_displayed < num_frames): running = False if not vid_done: buffer_size = frame_buffer.qsize() if buffer_size < args.video_multiframe: frame_time_stabilizer += stabilizer_step elif buffer_size > args.video_multiframe: frame_time_stabilizer -= stabilizer_step if frame_time_stabilizer < 0: frame_time_stabilizer = 0 new_target = frame_time_stabilizer if is_webcam else max(frame_time_stabilizer, frame_time_target) else: new_target = frame_time_target next_frame_target = max(2 * new_target - video_frame_times.get_avg(), 0) target_time = frame_time_start + next_frame_target - 0.001 # Let's just subtract a millisecond to be safe if out_path is None or args.emulate_playback: # This gives more accurate timing than if sleeping the whole amount at once while time.time() < target_time: time.sleep(0.001) else: # Let's not starve the main thread, now time.sleep(0.001) except: # See issue #197 for why this is necessary import traceback traceback.print_exc() extract_frame = lambda x, i: (x[0][i] if x[1][i]['detection'] is None else x[0][i].to(x[1][i]['detection']['box'].device), [x[1][i]]) # Prime the network on the first frame because I do some thread unsafe things otherwise print('Initializing model... 
', end='') first_batch = eval_network(transform_frame(get_next_frame(vid))) print('Done.') # For each frame the sequence of functions it needs to go through to be processed (in reversed order) sequence = [prep_frame, eval_network, transform_frame] pool = ThreadPool(processes=len(sequence) + args.video_multiframe + 2) pool.apply_async(play_video) active_frames = [{'value': extract_frame(first_batch, i), 'idx': 0} for i in range(len(first_batch[0]))] print() if out_path is None: print('Press Escape to close.') try: while vid.isOpened() and running: # Hard limit on frames in buffer so we don't run out of memory >.> while frame_buffer.qsize() > 100: time.sleep(0.001) start_time = time.time() # Start loading the next frames from the disk if not vid_done: next_frames = pool.apply_async(get_next_frame, args=(vid,)) else: next_frames = None if not (vid_done and len(active_frames) == 0): # For each frame in our active processing queue, dispatch a job # for that frame using the current function in the sequence for frame in active_frames: _args = [frame['value']] if frame['idx'] == 0: _args.append(fps_str) frame['value'] = pool.apply_async(sequence[frame['idx']], args=_args) # For each frame whose job was the last in the sequence (i.e. for all final outputs) for frame in active_frames: if frame['idx'] == 0: frame_buffer.put(frame['value'].get()) # Remove the finished frames from the processing queue active_frames = [x for x in active_frames if x['idx'] > 0] # Finish evaluating every frame in the processing queue and advanced their position in the sequence for frame in list(reversed(active_frames)): frame['value'] = frame['value'].get() frame['idx'] -= 1 if frame['idx'] == 0: # Split this up into individual threads for prep_frame since it doesn't support batch size active_frames += [{'value': extract_frame(frame['value'], i), 'idx': 0} for i in range(1, len(frame['value'][0]))] frame['value'] = extract_frame(frame['value'], 0) # Finish loading in the next frames and add them to the processing queue if next_frames is not None: frames = next_frames.get() if len(frames) == 0: vid_done = True else: active_frames.append({'value': frames, 'idx': len(sequence)-1}) # Compute FPS frame_times.add(time.time() - start_time) fps = args.video_multiframe / frame_times.get_avg() else: fps = 0 fps_str = 'Processing FPS: %.2f | Video Playback FPS: %.2f | Frames in Buffer: %d' % (fps, video_fps, frame_buffer.qsize()) if not args.display_fps: print('\r' + fps_str + ' ', end='') except KeyboardInterrupt: print('\nStopping...') cleanup_and_exit() def evaluate(net:Yolact, dataset, train_mode=False): net.detect.use_fast_nms = args.fast_nms net.detect.use_cross_class_nms = args.cross_class_nms cfg.mask_proto_debug = args.mask_proto_debug # TODO Currently we do not support Fast Mask Re-scroing in evalimage, evalimages, and evalvideo if args.image is not None: if ':' in args.image: inp, out = args.image.split(':') evalimage(net, inp, out) else: evalimage(net, args.image) return elif args.images is not None: inp, out = args.images.split('E:/yolact-master/coco/images/train2017: E:/yolact-master/results/output') evalimages(net, inp, out) return elif args.video is not None: if ':' in args.video: inp, out = args.video.split(':') evalvideo(net, inp, out) else: evalvideo(net, args.video) return frame_times = MovingAverage() dataset_size = len(dataset) if args.max_images < 0 else min(args.max_images, len(dataset)) progress_bar = ProgressBar(30, dataset_size) print() if not args.display and not args.benchmark: # For each class and iou, 
stores tuples (score, isPositive) # Index ap_data[type][iouIdx][classIdx] ap_data = { 'box' : [[APDataObject() for _ in cfg.dataset.class_names] for _ in iou_thresholds], 'mask': [[APDataObject() for _ in cfg.dataset.class_names] for _ in iou_thresholds] } detections = Detections() else: timer.disable('Load Data') dataset_indices = list(range(len(dataset))) if args.shuffle: random.shuffle(dataset_indices) elif not args.no_sort: # Do a deterministic shuffle based on the image ids # # I do this because on python 3.5 dictionary key order is *random*, while in 3.6 it's # the order of insertion. That means on python 3.6, the images come in the order they are in # in the annotations file. For some reason, the first images in the annotations file are # the hardest. To combat this, I use a hard-coded hash function based on the image ids # to shuffle the indices we use. That way, no matter what python version or how pycocotools # handles the data, we get the same result every time. hashed = [badhash(x) for x in dataset.ids] dataset_indices.sort(key=lambda x: hashed[x]) dataset_indices = dataset_indices[:dataset_size] try: # Main eval loop for it, image_idx in enumerate(dataset_indices): timer.reset() with timer.env('Load Data'): img, gt, gt_masks, h, w, num_crowd = dataset.pull_item(image_idx) # Test flag, do not upvote if cfg.mask_proto_debug: with open('scripts/info.txt', 'w') as f: f.write(str(dataset.ids[image_idx])) np.save('scripts/gt.npy', gt_masks) batch = Variable(img.unsqueeze(0)) if args.cuda: batch = batch.cuda() with timer.env('Network Extra'): preds = net(batch) # Perform the meat of the operation here depending on our mode. if args.display: img_numpy = prep_display(preds, img, h, w) elif args.benchmark: prep_benchmark(preds, h, w) else: prep_metrics(ap_data, preds, img, gt, gt_masks, h, w, num_crowd, dataset.ids[image_idx], detections) # First couple of images take longer because we're constructing the graph. # Since that's technically initialization, don't include those in the FPS calculations. 
if it > 1: frame_times.add(timer.total_time()) if args.display: if it > 1: print('Avg FPS: %.4f' % (1 / frame_times.get_avg())) plt.imshow(img_numpy) plt.title(str(dataset.ids[image_idx])) plt.show() elif not args.no_bar: if it > 1: fps = 1 / frame_times.get_avg() else: fps = 0 progress = (it+1) / dataset_size * 100 progress_bar.set_val(it+1) print('\rProcessing Images %s %6d / %6d (%5.2f%%) %5.2f fps ' % (repr(progress_bar), it+1, dataset_size, progress, fps), end='') if not args.display and not args.benchmark: print() if args.output_coco_json: print('Dumping detections...') if args.output_web_json: detections.dump_web() else: detections.dump() else: if not train_mode: print('Saving data...') with open(args.ap_data_file, 'wb') as f: pickle.dump(ap_data, f) return calc_map(ap_data) elif args.benchmark: print() print() print('Stats for the last frame:') timer.print_stats() avg_seconds = frame_times.get_avg() print('Average: %5.2f fps, %5.2f ms' % (1 / frame_times.get_avg(), 1000*avg_seconds)) except KeyboardInterrupt: print('Stopping...') def calc_map(ap_data): print('Calculating mAP...') aps = [{'box': [], 'mask': []} for _ in iou_thresholds] for _class in range(len(cfg.dataset.class_names)): for iou_idx in range(len(iou_thresholds)): for iou_type in ('box', 'mask'): ap_obj = ap_data[iou_type][iou_idx][_class] if not ap_obj.is_empty(): aps[iou_idx][iou_type].append(ap_obj.get_ap()) all_maps = {'box': OrderedDict(), 'mask': OrderedDict()} # Looking back at it, this code is really hard to read :/ for iou_type in ('box', 'mask'): all_maps[iou_type]['all'] = 0 # Make this first in the ordereddict for i, threshold in enumerate(iou_thresholds): mAP = sum(aps[i][iou_type]) / len(aps[i][iou_type]) * 100 if len(aps[i][iou_type]) > 0 else 0 all_maps[iou_type][int(threshold*100)] = mAP all_maps[iou_type]['all'] = (sum(all_maps[iou_type].values()) / (len(all_maps[iou_type].values())-1)) print_maps(all_maps) # Put in a prettier format so we can serialize it to json during training all_maps = {k: {j: round(u, 2) for j, u in v.items()} for k, v in all_maps.items()} return all_maps def print_maps(all_maps): # Warning: hacky make_row = lambda vals: (' %5s |' * len(vals)) % tuple(vals) make_sep = lambda n: ('-------+' * n) print() print(make_row([''] + [('.%d ' % x if isinstance(x, int) else x + ' ') for x in all_maps['box'].keys()])) print(make_sep(len(all_maps['box']) + 1)) for iou_type in ('box', 'mask'): print(make_row([iou_type] + ['%.2f' % x if x < 100 else '%.1f' % x for x in all_maps[iou_type].values()])) print(make_sep(len(all_maps['box']) + 1)) print() if __name__ == '__main__': parse_args() if args.config is not None: set_cfg(args.config) if args.trained_model == 'interrupt': args.trained_model = SavePath.get_interrupt('weights/') elif args.trained_model == 'latest': args.trained_model = SavePath.get_latest('weights/', cfg.name) if args.config is None: model_path = SavePath.from_str(args.trained_model) # TODO: Bad practice? Probably want to do a name lookup instead. args.config = model_path.model_name + '_config' print('Config not specified. 
Parsed %s from the file name.\n' % args.config)
        set_cfg(args.config)

    if args.detect:
        cfg.eval_mask_branch = False

    if args.dataset is not None:
        set_dataset(args.dataset)

    with torch.no_grad():
        if not os.path.exists('results'):
            os.makedirs('results')

        if args.cuda:
            cudnn.fastest = True
            torch.set_default_tensor_type('torch.cuda.FloatTensor')
        else:
            torch.set_default_tensor_type('torch.FloatTensor')

        if args.resume and not args.display:
            with open(args.ap_data_file, 'rb') as f:
                ap_data = pickle.load(f)
            calc_map(ap_data)
            exit()

        if args.image is None and args.video is None and args.images is None:
            dataset = COCODetection(cfg.dataset.valid_images, cfg.dataset.valid_info,
                                    transform=BaseTransform(), has_gt=cfg.dataset.has_gt)
            prep_coco_cats()
        else:
            dataset = None

        print('Loading model...', end='')
        net = Yolact()
        net.load_weights(args.trained_model)
        net.eval()
        print(' Done.')

        if args.cuda:
            net = net.cuda()

        evaluate(net, dataset)

Traceback (most recent call last):
  File "eval.py", line 1105, in <module>
    evaluate(net, dataset)
  File "eval.py", line 884, in evaluate
    inp, out = args.images.split(':')    # args.images was 'E:/yolact-master/coco/images/train2017: E:/yolact-master/results/output'
ValueError: not enough values to unpack (expected 2, got 1)
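The traceback points at argument parsing, not at the model: YOLACT's `eval.py` expects `--images` to be a single `input:output` pair and splits it on `':'`, and Windows paths such as `E:/...` contain their own colons (the space after the separator colon in the pasted command also interferes), so the unpack fails. Below is a sketch of a drive-letter-tolerant parse; `split_images_arg` is a hypothetical helper you would patch in around line 884, not YOLACT's own code.

```python
def split_images_arg(arg: str):
    """Split an 'input:output' argument while tolerating Windows drive letters.

    'E:/in:E:/out' -> ('E:/in', 'E:/out'). A colon is treated as a drive spec
    (and skipped) when it follows a single letter at a path start and precedes
    a slash; the first remaining colon is taken as the separator.
    """
    for i, ch in enumerate(arg):
        is_drive = (i >= 1 and arg[i - 1].isalpha()
                    and (i == 1 or arg[i - 2] in ' \t')
                    and i + 1 < len(arg) and arg[i + 1] in '/\\')
        if ch == ':' and not is_drive:
            return arg[:i], arg[i + 1:].strip()
    raise ValueError(f"expected 'input:output', got {arg!r}")

# inp, out = split_images_arg(args.images)   # would replace args.images.split(':')
print(split_images_arg('E:/yolact-master/coco/images/train2017: E:/yolact-master/results/output'))
```

The simpler workaround is to avoid the parse entirely: run from a working directory so both paths are relative (no drive colon), and leave no space after the separator colon, since the stock code expects exactly one `':'` in the argument.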

I trained a tomato-leaf classification model with MobileNetV2 that can distinguish four leaf types, but the accuracy is not great. If the leaf is a bit too far from or too close to the camera, the model fails to recognize it; recognition only works when the camera frames the leaf edges exactly. How can I make it less sensitive to distance? Give me the most practical approach. Below is my training script — please modify the code, explain it in detail, and add thorough comments; I am a beginner and need to be able to follow it.

"""
@File : MobileNetTrain.py
@Author : GiperHsiue
@Time : 2023/5/29 18:18
"""
import tensorflow as tf
import matplotlib.pyplot as plt
from time import *
import os


# Dataset loading: point at the dataset directories, resize every image to
# img_height x img_width, and set the batch size
def data_load(data_dir, test_data_dir, img_height, img_width, batch_size):
    # Load the training set
    train_ds = tf.keras.preprocessing.image_dataset_from_directory(
        data_dir,
        label_mode='categorical',
        seed=123,
        image_size=(img_height, img_width),
        batch_size=batch_size)
    # Load the test set
    val_ds = tf.keras.preprocessing.image_dataset_from_directory(
        test_data_dir,
        label_mode='categorical',
        seed=123,
        image_size=(img_height, img_width),
        batch_size=batch_size)
    class_names = train_ds.class_names
    # Return the processed training set, validation set, and class names
    return train_ds, val_ds, class_names


# Build the MobileNet model
# Load the model, specifying the input size and whether to use transfer learning
def model_load(IMG_SHAPE=(128, 128, 3), class_num=3, alpha=0.35):
    # No separate normalization preprocessing is needed for fine-tuning
    # Load the pretrained MobileNet backbone
    base_model = tf.keras.applications.MobileNetV2(input_shape=IMG_SHAPE,
                                                   include_top=False,
                                                   weights='imagenet',
                                                   alpha=alpha)
    # Freeze the backbone parameters
    base_model.trainable = False
    model = tf.keras.models.Sequential([
        # Rescale pixel values to [-1, 1]
        tf.keras.layers.experimental.preprocessing.Rescaling(1. / 127.5, offset=-1, input_shape=IMG_SHAPE),
        # The backbone model
        base_model,
        # Global average pooling over the backbone output
        tf.keras.layers.GlobalAveragePooling2D(),
        # Fully connected layer mapping to the final number of classes
        tf.keras.layers.Dense(class_num, activation='softmax')
    ])
    model.summary()
    # Adam optimizer with categorical cross-entropy loss
    model.compile(optimizer='adam', loss='categorical_crossentropy', metrics=['accuracy'])
    return model


# Plot the training curves
def showAccuracyAndLoss(history):
    # Extract train/validation accuracy and loss from the history object
    acc = history.history['accuracy']
    val_acc = history.history['val_accuracy']
    loss = history.history['loss']
    val_loss = history.history['val_loss']
    # Draw the two plots stacked vertically
    plt.figure(figsize=(8, 8))
    plt.subplot(2, 1, 1)
    plt.plot(acc, label='Training Accuracy')
    plt.plot(val_acc, label='Validation Accuracy')
    plt.legend(loc='lower right')
    plt.ylabel('Accuracy')
    plt.ylim([min(plt.ylim()), 1])
    plt.title('Training and Validation Accuracy')
    plt.subplot(2, 1, 2)
    plt.plot(loss, label='Training Loss')
    plt.plot(val_loss, label='Validation Loss')
    plt.legend(loc='upper right')
    plt.ylabel('Cross Entropy')
    plt.title('Training and Validation Loss')
    plt.xlabel('epoch')
    # plt.savefig('results/results_mobilenet.png', dpi=100)
    filename = 'results_mobilenet.png'
    index = 1
    while os.path.isfile(os.path.join('resultsPng', filename)):
        filename = 'results_mobilenet' + str(index) + '.png'
        index += 1
    filename = 'resultsPng/' + filename
    plt.savefig(filename, dpi=100)


def train(epochs):
    # Start the timer
    begin_time = time()
    train_ds, val_ds, class_names = data_load("G:/tomatoMobilenet/split_data/train",
                                              "G:/tomatoMobilenet/split_data/test", 128, 128, 16)
    print(class_names)
    # Load the model
    model = model_load(class_num=len(class_names))
    # Train for the specified number of epochs
    history = model.fit(train_ds, validation_data=val_ds, epochs=epochs)
    model.save("G:/tomatoMobilenet/TomatoRecognition/models/mobilenet_fv.h5")
    # Record the end time
    end_time = time()
    run_time = end_time - begin_time
    print('Training run time:', run_time, "s")  # e.g. 1.4201874732 s
    showAccuracyAndLoss(history)


if __name__ == '__main__':
    train(epochs=30)
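The most practical fix is not a different network but training-time scale and position augmentation, plus more varied photos: if every training image shows a tightly framed leaf, the model has never seen a small, off-center, or partially cropped one. Below is a minimal sketch of where augmentation layers would go in the script above; on the older TensorFlow this script targets the layers live under `tf.keras.layers.experimental.preprocessing` (newer versions expose them directly as `tf.keras.layers.RandomZoom` etc.), and the factor values are starting points to tune, not definitive settings.

```python
import tensorflow as tf

# Augmentation block: simulates the leaf appearing nearer/farther and off-center
data_augmentation = tf.keras.Sequential([
    tf.keras.layers.experimental.preprocessing.RandomFlip("horizontal"),
    tf.keras.layers.experimental.preprocessing.RandomRotation(0.1),                    # about +/-36 degrees
    tf.keras.layers.experimental.preprocessing.RandomZoom(height_factor=(-0.4, 0.4)),  # zoom in/out up to 40%
    tf.keras.layers.experimental.preprocessing.RandomTranslation(0.15, 0.15),          # shift up to 15% in x/y
])


def model_load(IMG_SHAPE=(128, 128, 3), class_num=4, alpha=0.35):
    base_model = tf.keras.applications.MobileNetV2(
        input_shape=IMG_SHAPE, include_top=False, weights='imagenet', alpha=alpha)
    base_model.trainable = False
    model = tf.keras.models.Sequential([
        tf.keras.layers.experimental.preprocessing.Rescaling(1. / 127.5, offset=-1, input_shape=IMG_SHAPE),
        data_augmentation,  # active only during training; a no-op at inference
        base_model,
        tf.keras.layers.GlobalAveragePooling2D(),
        tf.keras.layers.Dense(class_num, activation='softmax'),
    ])
    model.compile(optimizer='adam', loss='categorical_crossentropy', metrics=['accuracy'])
    return model
```

Beyond augmentation, collecting a few hundred extra photos per class at different distances, and unfreezing the last few backbone layers for a low-learning-rate fine-tune after the head converges, typically helps more than any single hyperparameter change.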

import json import torch from typing import Dict, List, Optional, Tuple from torch.utils.data import Dataset from collections import defaultdict import transformers from peft import LoraConfig, TaskType, get_peft_model from torch.utils.data import DataLoader from transformers import Trainer, TrainingArguments from lora_plus import LoraPlusTrainer from swanlab.integration.transformers import SwanLabCallback import swanlab import numpy as np import pandas as pd import re from tqdm import tqdm from transformers import PreTrainedTokenizer, AutoTokenizer import torch.nn as nn from transformers import PreTrainedModel from torch.nn import CrossEntropyLoss, MSELoss # 分子公式解析函数 def parse_chem_formula(formula): pattern = r'([A-Z][a-z]?)(\d*)' matches = re.findall(pattern, formula) element_counts = defaultdict(int) for (element, count) in matches: count = int(count) if count else 1 element_counts[element] += count return element_counts def generate_element_list(formula): element_counts = parse_chem_formula(formula) elements = [] for element, count in element_counts.items(): if element != "H": elements.extend([element] * count) return ''.join(elements) # 初始化SwanLab swanlab.init("Finetune-Llama3.2-with-Encoder") swanlab_callback = SwanLabCallback( project="Finetune-Llama3.2-with-Encoder", experiment_name="Finetune-Llama3.2-with-Encoder" ) # 常量定义 CHEM_FORMULA_SIZE = r"([A-Z][a-z]*)([0-9]*)" VALID_ELEMENTS = ["C", "N", "P", "O", "S", "Si", "I", "H", "Cl", "F", "Br", "B", "Se", "Fe", "Co", "As", "K", "Na"] element_to_idx = {elem: idx for idx, elem in enumerate(VALID_ELEMENTS)} # 化学式转密集向量 def formula_to_dense(chem_formula: str) -> torch.Tensor: dense_vec = torch.zeros(len(VALID_ELEMENTS), dtype=torch.float32) matches = re.findall(CHEM_FORMULA_SIZE, chem_formula) for chem_symbol, num_str in matches: num = 1 if num_str == "" else int(num_str) if chem_symbol in element_to_idx: idx = element_to_idx[chem_symbol] dense_vec[idx] += num return dense_vec # 位置编码生成 def positional_encoding(max_position: int, d_model: int, min_freq: float = 1e-4) -> torch.Tensor: position = torch.arange(max_position).unsqueeze(1) div_term = torch.exp(torch.arange(0, d_model, 2) * (-torch.log(torch.tensor(min_freq)) / d_model)) pos_enc = torch.zeros(max_position, d_model) pos_enc[:, 0::2] = torch.sin(position * div_term) pos_enc[:, 1::2] = torch.cos(position * div_term) return pos_enc # 初始化位置编码矩阵 P = positional_encoding(2000000, 254) dimn = 254 # 与位置编码维度一致 # 质谱数据编码 def encode_spectra(rag_tensor: list, P: torch.Tensor, dimn: int) -> torch.Tensor: encoded_list = [] for sample in rag_tensor: mz_list, intensity_list = sample base_features = torch.tensor([mz_list, intensity_list], dtype=torch.float32).T pos_enc = torch.stack([P[min(int(mz), P.size(0)-1)] for mz in mz_list]) features = torch.cat([base_features, pos_enc], dim=1) if features.size(0) < 501: padding = torch.zeros(501 - features.size(0), features.size(1)) features = torch.cat([features, padding], dim=0) else: features = features[:501] encoded_list.append(features) return torch.stack(encoded_list) # 质谱数据预处理 def preprocess_spectra(df: pd.DataFrame) -> list: spectra_list = [] for idx, row in tqdm(df.iterrows(), total=len(df)): spectrum_str = row['Spectrum'] total_mass = row['Total Exact Mass'] pairs = spectrum_str.split() mz_list, intensity_list = [], [] for pair in pairs: mz, intensity = pair.split(':') mz_list.append(float(mz)) intensity_list.append(float(intensity)) mz_list.append(total_mass) intensity_list.append(0.0) mz_list = [round(mz, 2) for mz in mz_list] intensity_list = 
[round(intensity, 2) for intensity in intensity_list] spectra_list.append([mz_list, intensity_list]) return spectra_list class MolecularDataset(Dataset): def __init__(self, csv_path: str, tokenizer: AutoTokenizer, max_seq_len: int = 512): self.df = pd.read_csv(csv_path) self.tokenizer = tokenizer self.max_seq_len = max_seq_len self.pad_token_id = tokenizer.pad_token_id self.mask_token_id = tokenizer.mask_token_id if tokenizer.mask_token_id is not None else tokenizer.convert_tokens_to_ids("<mask>") spectra_data = preprocess_spectra(self.df) self.spec_encoded = encode_spectra(spectra_data, P, dimn) self.element_lists = [generate_element_list(formula) for formula in self.df['Molecular Formula']] self.element_lengths = [] for elem_list in self.element_lists: elem_tokens = self.tokenizer(elem_list, add_special_tokens=False)['input_ids'] self.element_lengths.append(len(elem_tokens)) def __len__(self): return len(self.df) def __getitem__(self, idx) -> dict: formula = self.df.iloc[idx]['Molecular Formula'] formula_vec = formula_to_dense(formula).squeeze(0) # 压缩为1D向量 spec_matrix = self.spec_encoded[idx] element_list = self.element_lists[idx] element_text = f"<|Spectrum|>{element_list}" selfies_str = self.df.iloc[idx]['SELFIES'] selfies_text = f"{selfies_str}" input_text = f"{element_text}{selfies_text}" encoding = self.tokenizer( input_text, add_special_tokens=False, padding='max_length', truncation=True, max_length=self.max_seq_len, return_tensors='pt' ) input_ids = encoding['input_ids'].squeeze(0) attention_mask = encoding['attention_mask'].squeeze(0) labels = input_ids.clone() labels[labels == self.pad_token_id] = -100 element_len = self.element_lengths[idx] element_end = 3 + element_len # , <|Spectrum|>, 元素列表 if element_end < len(labels): labels[:element_end] = -100 return { 'encoder1_inputs': formula_vec, # 注意:现在是1D向量 'encoder2_inputs': spec_matrix, 'input_ids': input_ids, 'attention_mask': attention_mask, 'labels': labels, 'formula_labels': formula_vec, # 添加元素计数标签 } # 加载tokenizer tokenizer = AutoTokenizer.from_pretrained('/root/workspace/d21lv5s7v38s73b4ddlg/checkpoint-2500') if tokenizer.mask_token is None: tokenizer.add_special_tokens({"mask_token": "<mask>"}) # 创建数据集 dataset = MolecularDataset('/root/workspace/d21lv5s7v38s73b4ddlg/SELFIES-SFT.csv', tokenizer) def custom_collator(features: List[Dict]) -> Dict: batch = { 'encoder1_inputs': torch.stack([f['encoder1_inputs'] for f in features]), # 形状: (batch_size, 18) 'encoder2_inputs': torch.stack([f['encoder2_inputs'] for f in features]), 'input_ids': torch.stack([f['input_ids'] for f in features]), 'attention_mask': torch.stack([f['attention_mask'] for f in features]), 'labels': torch.stack([f['labels'] for f in features]), 'formula_labels': torch.stack([f['formula_labels'] for f in features]), # 形状: (batch_size, 18) } return batch class ElementPredictionHead(nn.Module): """化学元素计数预测头部""" def __init__(self, hidden_size, output_size=18): super().__init__() self.dense = nn.Linear(hidden_size, hidden_size) self.activation = nn.ReLU() self.layer_norm = nn.LayerNorm(hidden_size) self.out_proj = nn.Linear(hidden_size, output_size) def forward(self, hidden_states): x = self.dense(hidden_states) x = self.activation(x) x = self.layer_norm(x) x = self.out_proj(x) return x class LlamaWithEncoder(PreTrainedModel): def __init__(self, base_model, encoder1_dim=18, encoder2_dim=256, hidden_dim=512): self.config = base_model.config super().__init__(self.config) self.model = base_model # 分子式编码器 encoder1_layer = nn.TransformerEncoderLayer( 
d_model=encoder1_dim, nhead=3, dim_feedforward=hidden_dim, batch_first=True ) self.encoder1 = nn.TransformerEncoder(encoder1_layer, num_layers=2) # 质谱编码器 encoder2_layer = nn.TransformerEncoderLayer( d_model=encoder2_dim, nhead=8, dim_feedforward=hidden_dim, batch_first=True ) self.encoder2 = nn.TransformerEncoder(encoder2_layer, num_layers=2) # 投影层 self.proj1 = nn.Linear(encoder1_dim, base_model.config.hidden_size) self.proj2 = nn.Linear(encoder2_dim, base_model.config.hidden_size) # 嵌入层 self.embed_tokens = nn.Embedding( num_embeddings=base_model.config.vocab_size, embedding_dim=base_model.config.hidden_size, padding_idx=base_model.config.pad_token_id ) self.embed_tokens.weight.data = base_model.get_input_embeddings().weight.data.clone() # 添加元素计数预测头 self.element_head = ElementPredictionHead(base_model.config.hidden_size) # PEFT所需方法 def get_input_embeddings(self): return self.embed_tokens def set_input_embeddings(self, value): self.embed_tokens = value def get_output_embeddings(self): return self.model.get_output_embeddings() def set_output_embeddings(self, new_embeddings): self.model.set_output_embeddings(new_embeddings) def get_base_model(self): return self.model def forward( self, input_ids: Optional[torch.LongTensor] = None, attention_mask: Optional[torch.FloatTensor] = None, encoder1_inputs: Optional[torch.FloatTensor] = None, encoder2_inputs: Optional[torch.FloatTensor] = None, labels: Optional[torch.LongTensor] = None, formula_labels: Optional[torch.FloatTensor] = None, # 新增:元素计数标签 past_key_values: Optional[Tuple[Tuple[torch.FloatTensor]]] = None, output_attentions: Optional[bool] = None, output_hidden_states: Optional[bool] = None, return_dict: Optional[bool] = None, **kwargs ): return_dict = return_dict if return_dict is not None else self.config.use_return_dict # 1. 编码器处理 enc1_out = self.encoder1(encoder1_inputs.unsqueeze(1)) # 添加序列维度 enc1_out = enc1_out.mean(dim=1) # (batch_size, encoder1_dim) enc1_proj = self.proj1(enc1_out) # (batch_size, hidden_size) enc2_out = self.encoder2(encoder2_inputs) # (batch_size, 501, encoder2_dim) enc2_out = enc2_out.mean(dim=1) # (batch_size, encoder2_dim) enc2_proj = self.proj2(enc2_out) # (batch_size, hidden_size) # 合并编码器输出 mask_replacement = (enc1_proj + enc2_proj) / 2 # (batch_size, hidden_size) # 2. 获取原始嵌入 embeddings = self.embed_tokens(input_ids) # (batch_size, seq_len, hidden_size) batch_size, seq_len, hidden_size = embeddings.size() # 3. 替换<mask> token if seq_len > 2: mask_embed = mask_replacement.unsqueeze(1) # (batch_size, 1, hidden_size) part1 = embeddings[:, :2, :] # (batch_size, 2, hidden_size) part2 = mask_embed # (batch_size, 1, hidden_size) part3 = embeddings[:, 3:, :] # (batch_size, seq_len-3, hidden_size) new_embeddings = torch.cat([part1, part2, part3], dim=1) # (batch_size, seq_len, hidden_size) else: new_embeddings = embeddings # 4. 调用基础模型 model_output = self.model( inputs_embeds=new_embeddings, attention_mask=attention_mask, labels=labels, past_key_values=past_key_values, output_attentions=output_attentions, output_hidden_states=True, # 必须返回隐藏状态用于元素预测 return_dict=return_dict, **kwargs ) # 5. 
        # Element-count prediction
        element_pred = None
        element_loss = None
        if formula_labels is not None:
            # Hidden state of the last non-padding token
            seq_lengths = attention_mask.sum(dim=1) - 1  # index of the last valid token
            batch_indices = torch.arange(batch_size, device=model_output.hidden_states[-1].device)
            last_token_hidden = model_output.hidden_states[-1][batch_indices, seq_lengths]  # (batch_size, hidden_size)

            # Predict the element counts
            element_pred = self.element_head(last_token_hidden)  # (batch_size, 18)

            # Element-count loss (MSE)
            element_loss = MSELoss()(element_pred, formula_labels)

            # Combined total loss: language-model loss + element-count loss
            total_loss = model_output.loss + 0.5 * element_loss
        else:
            total_loss = model_output.loss

        # Return the results
        if not return_dict:
            output = (model_output.logits,)
            if element_pred is not None:
                output += (element_pred,)
            return (total_loss,) + output if total_loss is not None else output

        return {
            'loss': total_loss,
            'logits': model_output.logits,
            'element_pred': element_pred,
            'element_loss': element_loss,
            'hidden_states': model_output.hidden_states,
            'past_key_values': model_output.past_key_values,
            'attentions': model_output.attentions
        }


# Load the pretrained model
base_model = transformers.AutoModelForCausalLM.from_pretrained(
    "/root/workspace/d21lv5s7v38s73b4ddlg/checkpoint-2500",
    trust_remote_code=True,
    torch_dtype=torch.bfloat16,
)
model = LlamaWithEncoder(base_model)

# Configure LoRA
lora_config = LoraConfig(
    r=8,
    lora_alpha=16,
    target_modules="all-linear",
    lora_dropout=0.0,
    bias="none",
    task_type="CAUSAL_LM"
)
model = get_peft_model(model, lora_config)
model.print_trainable_parameters()

# Training arguments
training_args = TrainingArguments(
    output_dir="./llama3.2-SELFIES-SFT",
    per_device_train_batch_size=24,
    gradient_accumulation_steps=24,
    num_train_epochs=12,
    learning_rate=5.0e-05,
    optim="adamw_torch",
    logging_steps=10,
    bf16=True,
    save_strategy="steps",
    lr_scheduler_type='cosine',
    max_grad_norm=1.0,
    save_steps=2000,
    warmup_steps=0
)

class CustomTrainer(LoraPlusTrainer):
    def get_train_dataloader(self) -> DataLoader:
        return DataLoader(
            self.train_dataset,
            batch_size=self.args.train_batch_size,
            shuffle=True,
            collate_fn=self.data_collator,
            drop_last=False,
        )

# Train the model
lp_trainer = CustomTrainer(
    model,
    training_args,
    train_dataset=dataset,
    tokenizer=tokenizer,
    data_collator=custom_collator,
    callbacks=[swanlab_callback],
)
lp_trainer.train()
lp_trainer.save_model(output_dir='./llama3.2-SELFIES-SFT')

# Merge the LoRA weights and drop the element-prediction head
model = model.merge_and_unload()
model.element_head = None  # remove the element-prediction head

# Save the model (without the element-prediction head)
save_directory = './llama3.2-SELFIES'
model.save_pretrained(save_directory, safe_serialization=True)
tokenizer.save_pretrained(save_directory)

That's not right. It needs to be changed correspondingly to:

element_text = f"<|User|><mask>{element_list}"
# SELFIES target sequence with its marker added
selfies_str = self.df.iloc[idx]['SELFIES']
selfies_text = f"<|Assistant|>{selfies_str}"

and, at the same time, the element-count prediction head should take as its input the tokens after the <|Assistant|> token. Write out the complete modified code.
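A minimal sketch of the two requested changes, not a drop-in rewrite: it assumes `<|User|>` and `<|Assistant|>` already exist in the tokenizer vocabulary as single special tokens (otherwise add them with `tokenizer.add_special_tokens` and resize the embeddings). On the dataset side, the prompt becomes `<|User|><mask>{element_list}<|Assistant|>{SELFIES}` and the label mask is computed from the actual `<|Assistant|>` position rather than a hard-coded offset; on the model side, the element head pools hidden states strictly after that token.

```python
import torch

ASSISTANT = "<|Assistant|>"

# --- Dataset side (inside MolecularDataset.__getitem__), sketch only ---
# element_text = f"<|User|><mask>{element_list}"
# selfies_text = f"<|Assistant|>{selfies_str}"
# input_text = element_text + selfies_text
# assistant_id = tokenizer.convert_tokens_to_ids(ASSISTANT)
# pos = (input_ids == assistant_id).nonzero(as_tuple=True)[0]
# if pos.numel():
#     labels[: pos[0] + 1] = -100  # train only on the SELFIES continuation

# --- Model side: pool hidden states after <|Assistant|> for the element head ---
def pool_after_assistant(hidden: torch.Tensor, input_ids: torch.Tensor,
                         attention_mask: torch.Tensor, assistant_id: int) -> torch.Tensor:
    """Mean-pool last-layer hidden states strictly after the <|Assistant|> token.

    hidden: (B, L, H); input_ids/attention_mask: (B, L). Right padding is
    assumed, matching the dataset above; falls back to the last valid token
    if the marker is absent.
    """
    pooled = []
    for b in range(hidden.size(0)):
        pos = (input_ids[b] == assistant_id).nonzero(as_tuple=True)[0]
        n_valid = int(attention_mask[b].sum())
        start = int(pos[0]) + 1 if pos.numel() else n_valid - 1
        span = hidden[b, start:n_valid]
        pooled.append(span.mean(dim=0) if span.size(0) else hidden[b, n_valid - 1])
    return torch.stack(pooled)  # (B, H) -> feed to self.element_head
```

In `forward`, `element_pred = self.element_head(pool_after_assistant(model_output.hidden_states[-1], input_ids, attention_mask, assistant_id))` would replace the last-token lookup. Note that the hard-coded `<mask>` replacement at index 2 in step 3 of `forward` must also be updated to the new `<mask>` position (now right after `<|User|>`), which this sketch does not cover.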

import py5
import random
import math
from shapely.geometry import Polygon
import geopandas as gpd
import momepy


class Building:
    # The original Building class is unchanged
    def __init__(self, x, y, length, width, angle):
        self.x = x
        self.y = y
        self.length = length
        self.width = width
        self.angle = angle
        self.update_polygon()

    def update_polygon(self):
        """Update the building footprint geometry (with rotation applied)"""
        half_l = self.length / 2
        half_w = self.width / 2
        corners = [
            (-half_l, -half_w),
            (half_l, -half_w),
            (half_l, half_w),
            (-half_l, half_w)
        ]
        rotated = []
        cos_a = math.cos(self.angle)
        sin_a = math.sin(self.angle)
        for x, y in corners:
            # Apply the rotation matrix
            rx = x * cos_a - y * sin_a
            ry = x * sin_a + y * cos_a
            rotated.append((rx + self.x, ry + self.y))
        self.polygon = Polygon(rotated)

    def draw(self):
        """Draw the building"""
        py5.push_matrix()
        py5.translate(self.x, self.y)
        py5.rotate(self.angle)
        py5.fill(128)
        py5.rect(0, 0, self.length, self.width)
        py5.pop_matrix()


buildings = []
tessellation_gdf = None  # holds the tessellation polygons


def setup():
    py5.size(800, 800)
    py5.rect_mode(py5.CENTER)
    generate_buildings()


def generate_buildings():
    global buildings, tessellation_gdf
    # ... original building-generation code ...

    # Build the GeoDataFrame
    gdf = gpd.GeoDataFrame(
        geometry=[b.polygon for b in buildings],
        crs="EPSG:3857"  # adjust to the actual coordinate system
    )

    # Generate the morphological tessellation
    tessellation = momepy.morphological_tessellation(
        gdf,
        threshold=80,    # adjust this threshold to control the tessellation extent
        segmentize=0.5,  # segmentization precision
        buffer_dist=5    # buffer distance
    ).tessellation
    tessellation_gdf = tessellation.to_crs(gdf.crs)  # keep coordinate systems consistent


def draw():
    py5.background(255)
    # Draw the tessellation (Voronoi-like) polygons
    if tessellation_gdf is not None:
        py5.no_fill()
        py5.stroke(200, 100, 0, 150)  # translucent orange
        for idx, row in tessellation_gdf.iterrows():
            geom = row.geometry
            if geom.geom_type == 'Polygon':
                py5.begin_shape()
                for x, y in geom.exterior.coords:
                    py5.vertex(x, y)
                py5.end_shape(py5.CLOSE)
    # Draw the buildings
    for b in buildings:
        b.draw()


def mouse_clicked():
    if py5.mouse_button == py5.LEFT:
        # ... original angle-update code ...
        # Rebuild the tessellation
        global tessellation_gdf
        gdf = gpd.GeoDataFrame(
            geometry=[b.polygon for b in buildings],
            crs="EPSG:3857"
        )
        tessellation = momepy.morphological_tessellation(
            gdf,
            threshold=80,
            segmentize=0.5,
            buffer_dist=5
        ).tessellation
        tessellation_gdf = tessellation.to_crs(gdf.crs)
        py5.redraw()


py5.run_sketch()

The code errors out with:

  File "C:\Python project\3-19\csdn2.py", line 54, in _py5_faux_setup
       51   def setup():
       52       py5.size(800, 800)
       53       py5.rect_mode(py5.CENTER)
  -->  54       generate_buildings()

  File "C:\Python project\3-19\csdn2.py", line 68, in generate_buildings
       57   def generate_buildings():
       (...)
       64           crs="EPSG:3857"  # adjust to the actual coordinate system
       65       )
       66
       67       # Generate the morphological tessellation
  -->  68       tessellation = momepy.morphological_tessellation(
       69           gdf,
       ..................................................
        momepy.morphological_tessellation = <function 'morphological_tessellation' _elements.py:30>
       ..................................................

TypeError: morphological_tessellation() got an unexpected keyword argument 'threshold'
forrtl: error (200): program aborting due to control-C event
Stack trace terminated abnormally.
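The error is an API version mismatch: the installed momepy exposes the newer functional `morphological_tessellation`, which no longer accepts `threshold`, `segmentize`, or `buffer_dist`, and it returns a GeoDataFrame directly rather than an object with a `.tessellation` attribute. A sketch of both call styles follows; the parameter names `clip`, `segment`, and `shrink` are from my recollection of the momepy ≥ 0.7 functional API, so verify them against your installed version with `help(momepy.morphological_tessellation)`. `momepy.buffered_limit` plays the role the old buffer/threshold arguments did, bounding how far cells may grow from the buildings.

```python
import geopandas as gpd
import momepy

gdf = gpd.GeoDataFrame(geometry=[b.polygon for b in buildings], crs="EPSG:3857")

# Bound the tessellation extent (the role of the old buffer/threshold args)
limit = momepy.buffered_limit(gdf, buffer=80)

# momepy >= 0.7 functional API: no 'threshold'/'segmentize'/'buffer_dist',
# and the result is the tessellation GeoDataFrame itself
tessellation_gdf = momepy.morphological_tessellation(
    gdf, clip=limit, segment=0.5, shrink=0.4
)

# momepy < 0.7 class-based equivalent:
# gdf["uID"] = range(len(gdf))
# tessellation_gdf = momepy.Tessellation(gdf, unique_id="uID", limit=limit).tessellation
```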

class Detect(nn.Module):
    stride = None  # strides computed during build
    onnx_dynamic = False  # ONNX export parameter

    def __init__(self, nc=80, anchors=(), ch=(), inplace=True):  # detection layer
        super(Detect, self).__init__()
        self.nc = nc  # number of classes
        self.no = nc + 5  # number of outputs per anchor
        self.nl = len(anchors)  # number of detection layers
        self.na = len(anchors[0]) // 2  # number of anchors
        self.grid = [torch.zeros(1)] * self.nl  # init grid
        a = torch.tensor(anchors).float().view(self.nl, -1, 2)
        self.register_buffer('anchors', a)  # shape(nl,na,2)
        self.register_buffer('anchor_grid', a.clone().view(self.nl, 1, -1, 1, 1, 2))  # shape(nl,1,na,1,1,2)
        self.m = nn.ModuleList(nn.Conv2d(x, self.no * self.na, 1) for x in ch)  # output conv
        self.inplace = inplace  # use in-place ops (e.g. slice assignment)

    def forward(self, x):
        # x = x.copy()  # for profiling
        z = []  # inference output
        for i in range(self.nl):
            x[i] = self.m[i](x[i])  # conv
            bs, _, ny, nx = x[i].shape  # x(bs,255,20,20) to x(bs,3,20,20,85)
            x[i] = x[i].view(bs, self.na, self.no, ny, nx).permute(0, 1, 3, 4, 2).contiguous()

            if not self.training:  # inference
                if self.grid[i].shape[2:4] != x[i].shape[2:4] or self.onnx_dynamic:
                    self.grid[i] = self._make_grid(nx, ny).to(x[i].device)

                y = x[i].sigmoid()
                if self.inplace:
                    y[..., 0:2] = (y[..., 0:2] * 2. - 0.5 + self.grid[i]) * self.stride[i]  # xy
                    y[..., 2:4] = (y[..., 2:4] * 2) ** 2 * self.anchor_grid[i]  # wh
                else:  # for YOLOv5 on AWS Inferentia https://github.com/ultralytics/yolov5/pull/2953
                    xy = (y[..., 0:2] * 2. - 0.5 + self.grid[i]) * self.stride[i]  # xy
                    wh = (y[..., 2:4] * 2) ** 2 * self.anchor_grid[i].view(1, self.na, 1, 1, 2)  # wh
                    y = torch.cat((xy, wh, y[..., 4:]), -1)
                z.append(y.view(bs, -1, self.no))

        return x if self.training else (torch.cat(z, 1), x)

    @staticmethod
    def _make_grid(nx=20, ny=20):
        yv, xv = torch.meshgrid([torch.arange(ny), torch.arange(nx)])
        return torch.stack((xv, yv), 2).view((1, 1, ny, nx, 2)).float()

Explain what each part of this code does.
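In brief (this reading follows the public YOLOv5 source the snippet comes from): `__init__` stores the class count (`nc`), the per-anchor output width (`no = nc + 5`: box xywh + objectness + class scores), the number of detection levels (`nl`) and anchors per level (`na`), registers the anchor tables as buffers so they move with the model across devices, and builds one 1×1 output convolution per input feature level. `forward` runs each level's conv, reshapes `(bs, na*no, ny, nx)` into `(bs, na, ny, nx, no)`, and, in inference only, applies a sigmoid and decodes boxes: centers are `(2·σ(t_xy) − 0.5 + grid) · stride` and sizes are `(2·σ(t_wh))² · anchor`, after which all levels are flattened and concatenated to `(bs, N, no)`. `_make_grid` simply builds the per-cell (x, y) index grid. A minimal numeric sketch of the decode step, using made-up stride, grid, and anchor values:

```python
import torch

# Made-up example values for one prediction on the stride-8 feature map
stride = 8.0
grid_xy = torch.tensor([4.0, 7.0])          # cell indices (cx, cy) from _make_grid
anchor_wh = torch.tensor([10.0, 13.0])      # anchor size in pixels at this level
raw = torch.tensor([0.2, -0.1, 0.3, 0.5])   # network outputs tx, ty, tw, th

y = raw.sigmoid()
xy = (y[0:2] * 2.0 - 0.5 + grid_xy) * stride  # center in pixels; offset range (-0.5, 1.5) per cell
wh = (y[2:4] * 2.0) ** 2 * anchor_wh          # size in pixels; 0..4x the anchor
print(xy, wh)
```

The `2·σ − 0.5` and `(2·σ)²` forms bound each prediction near its grid cell and anchor, which stabilizes training compared with the unbounded exponential used in earlier YOLO versions.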

import tkinter as tk
from tkinter import filedialog, messagebox
from PIL import Image, ImageTk
import pytesseract

# Configure the Tesseract path (needed on Windows; adjust to your install)
pytesseract.pytesseract.tesseract_cmd = r'D:\Tesseract\tesseract.exe'


class OCRApp:
    def __init__(self, master):
        self.master = master
        master.title("OCR Image Recognition Tool")
        # Build the UI widgets
        self.create_widgets()
        self.image_path = None

    def create_widgets(self):
        # Image display area
        self.image_label = tk.Label(self.master, borderwidth=2, relief="groove")
        self.image_label.pack(pady=10, padx=10, fill=tk.BOTH, expand=True)

        # Button row
        button_frame = tk.Frame(self.master)
        button_frame.pack(pady=5)

        # Upload button
        self.upload_btn = tk.Button(
            button_frame,
            text="Upload Image",
            command=self.upload_image,
            width=15
        )
        self.upload_btn.pack(side=tk.LEFT, padx=5)

        # Recognize button
        self.ocr_btn = tk.Button(
            button_frame,
            text="Recognize Text",
            command=self.perform_ocr,
            width=15
        )
        self.ocr_btn.pack(side=tk.LEFT, padx=5)

        # Result display area
        self.result_text = tk.Text(
            self.master,
            height=10,
            wrap=tk.WORD,
            font=("Arial", 10)
        )
        self.result_text.pack(pady=10, padx=10, fill=tk.BOTH, expand=True)

    def upload_image(self):
        file_path = filedialog.askopenfilename(
            filetypes=[("Image files", "*.png;*.jpg;*.jpeg;*.bmp")]
        )
        if file_path:
            self.image_path = file_path
            self.show_image(file_path)

    def show_image(self, path):
        try:
            image = Image.open(path)
            # Resize the image to fit the window
            max_size = (800, 600)
            image.thumbnail(max_size)
            photo = ImageTk.PhotoImage(image)
            self.image_label.config(image=photo)
            self.image_label.image = photo  # keep a reference
        except Exception as e:
            messagebox.showerror("Error", f"Failed to load image: {str(e)}")

    def perform_ocr(self):
        if not self.image_path:
            messagebox.showwarning("Warning", "Please upload an image first")
            return
        try:
            # Open the image with PIL
            image = Image.open(self.image_path)
            # Run OCR
            text = pytesseract.image_to_string(
                image,
                lang='eng',       # English language pack
                config='--psm 6'  # assume a single uniform block of text (not a single line; that is --psm 7)
            )
            # Show the result
            self.result_text.delete(1.0, tk.END)
            self.result_text.insert(tk.END, text)
        except Exception as e:
            messagebox.showerror("OCR Error", f"OCR failed: {str(e)}")


if __name__ == "__main__":
    root = tk.Tk()
    app = OCRApp(root)
    root.geometry("800x600")
    root.mainloop()

Question: why is the recognition success rate low? Give me corrected code.
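Low accuracy here is usually not a code bug but an input mismatch: `lang='eng'` cannot read Chinese text, `--psm 6` assumes one uniform block of text, and raw photos are often too small, low-contrast, or noisy for Tesseract, which prefers clean, roughly 300-DPI black-on-white input. A minimal preprocessing sketch using only PIL follows; the threshold value is a tunable assumption, and `chi_sim` requires the corresponding traineddata file to be installed in Tesseract's tessdata folder.

```python
from PIL import Image, ImageOps, ImageFilter
import pytesseract

def preprocess_for_ocr(path, target_width=1600):
    """Cheap preprocessing before Tesseract: grayscale, upscale,
    autocontrast, light sharpening, then a crude global binarization."""
    img = Image.open(path)
    img = ImageOps.grayscale(img)
    if img.width < target_width:  # Tesseract likes ~300 DPI; upscale small images
        ratio = target_width / img.width
        img = img.resize((target_width, int(img.height * ratio)), Image.LANCZOS)
    img = ImageOps.autocontrast(img)
    img = img.filter(ImageFilter.SHARPEN)
    img = img.point(lambda p: 255 if p > 150 else 0)  # threshold; tune per source
    return img

def ocr(path):
    img = preprocess_for_ocr(path)
    # lang must match the text: 'chi_sim' for Simplified Chinese, 'chi_sim+eng' for mixed.
    # --psm 6 = one uniform block; try 3 (auto), 7 (single line), or 11 (sparse) as needed.
    return pytesseract.image_to_string(img, lang='chi_sim+eng', config='--oem 3 --psm 6')
```

Dropping `preprocess_for_ocr` into `perform_ocr` in place of the plain `Image.open` call, and switching `lang`/`--psm` to match the material being scanned, typically improves results far more than any change to the tkinter code.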

# Ultralytics 🚀 AGPL-3.0 License - https://ultralytics.com/license import contextlib import pickle import re import types from copy import deepcopy from pathlib import Path from .AddModules import * import thop import torch import torch.nn as nn from ultralytics.nn.modules import ( AIFI, C1, C2, C2PSA, C3, C3TR, ELAN1, OBB, PSA, SPP, SPPELAN, SPPF, AConv, ADown, Bottleneck, BottleneckCSP, C2f, C2fAttn, C2fCIB, C2fPSA, C3Ghost, C3k2, C3x, CBFuse, CBLinear, Classify, Concat, Conv, Conv2, ConvTranspose, Detect, DWConv, DWConvTranspose2d, Focus, GhostBottleneck, GhostConv, HGBlock, HGStem, ImagePoolingAttn, Index, Pose, RepC3, RepConv, RepNCSPELAN4, RepVGGDW, ResNetLayer, RTDETRDecoder, SCDown, Segment, TorchVision, WorldDetect, v10Detect, A2C2f, ) from ultralytics.utils import DEFAULT_CFG_DICT, DEFAULT_CFG_KEYS, LOGGER, colorstr, emojis, yaml_load from ultralytics.utils.checks import check_requirements, check_suffix, check_yaml from ultralytics.utils.loss import ( E2EDetectLoss, v8ClassificationLoss, v8DetectionLoss, v8OBBLoss, v8PoseLoss, v8SegmentationLoss, ) from ultralytics.utils.ops import make_divisible from ultralytics.utils.plotting import feature_visualization from ultralytics.utils.torch_utils import ( fuse_conv_and_bn, fuse_deconv_and_bn, initialize_weights, intersect_dicts, model_info, scale_img, time_sync, ) from .AddModules import * class BaseModel(nn.Module): """The BaseModel class serves as a base class for all the models in the Ultralytics YOLO family.""" def forward(self, x, *args, **kwargs): """ Perform forward pass of the model for either training or inference. If x is a dict, calculates and returns the loss for training. Otherwise, returns predictions for inference. Args: x (torch.Tensor | dict): Input tensor for inference, or dict with image tensor and labels for training. *args (Any): Variable length argument list. **kwargs (Any): Arbitrary keyword arguments. Returns: (torch.Tensor): Loss if x is a dict (training), or network predictions (inference). """ if isinstance(x, dict): # for cases of training and validating while training. return self.loss(x, *args, **kwargs) return self.predict(x, *args, **kwargs) def predict(self, x, profile=False, visualize=False, augment=False, embed=None): """ Perform a forward pass through the network. Args: x (torch.Tensor): The input tensor to the model. profile (bool): Print the computation time of each layer if True, defaults to False. visualize (bool): Save the feature maps of the model if True, defaults to False. augment (bool): Augment image during prediction, defaults to False. embed (list, optional): A list of feature vectors/embeddings to return. Returns: (torch.Tensor): The last output of the model. 
""" if augment: return self._predict_augment(x) return self._predict_once(x, profile, visualize, embed) def _predict_once(self, x, profile=False, visualize=False, embed=None): y, dt, embeddings = [], [], [] # outputs for m in self.model: if m.f != -1: # if not from previous layer x = y[m.f] if isinstance(m.f, int) else [x if j == -1 else y[j] for j in m.f] # from earlier layers if profile: self._profile_one_layer(m, x, dt) if hasattr(m, 'backbone'): x = m(x) if len(x) != 5: # 0 - 5 x.insert(0, None) for index, i in enumerate(x): if index in self.save: y.append(i) else: y.append(None) x = x[-1] # 最后一个输出传给下一层 else: x = m(x) # run y.append(x if m.i in self.save else None) # save output if visualize: feature_visualization(x, m.type, m.i, save_dir=visualize) if embed and m.i in embed: embeddings.append(nn.functional.adaptive_avg_pool2d(x, (1, 1)).squeeze(-1).squeeze(-1)) # flatten if m.i == max(embed): return torch.unbind(torch.cat(embeddings, 1), dim=0) return x def _predict_augment(self, x): """Perform augmentations on input image x and return augmented inference.""" LOGGER.warning( f"WARNING ⚠️ {self.__class__.__name__} does not support 'augment=True' prediction. " f"Reverting to single-scale prediction." ) return self._predict_once(x) def _profile_one_layer(self, m, x, dt): """ Profile the computation time and FLOPs of a single layer of the model on a given input. Appends the results to the provided list. Args: m (nn.Module): The layer to be profiled. x (torch.Tensor): The input data to the layer. dt (list): A list to store the computation time of the layer. Returns: None """ c = m == self.model[-1] and isinstance(x, list) # is final layer list, copy input as inplace fix flops = thop.profile(m, inputs=[x.copy() if c else x], verbose=False)[0] / 1e9 * 2 if thop else 0 # GFLOPs t = time_sync() for _ in range(10): m(x.copy() if c else x) dt.append((time_sync() - t) * 100) if m == self.model[0]: LOGGER.info(f"{'time (ms)':>10s} {'GFLOPs':>10s} {'params':>10s} module") LOGGER.info(f"{dt[-1]:10.2f} {flops:10.2f} {m.np:10.0f} {m.type}") if c: LOGGER.info(f"{sum(dt):10.2f} {'-':>10s} {'-':>10s} Total") def fuse(self, verbose=True): """ Fuse the Conv2d() and BatchNorm2d() layers of the model into a single layer, in order to improve the computation efficiency. Returns: (nn.Module): The fused model is returned. """ if not self.is_fused(): for m in self.model.modules(): if isinstance(m, (Conv, Conv2, DWConv)) and hasattr(m, "bn"): if isinstance(m, Conv2): m.fuse_convs() m.conv = fuse_conv_and_bn(m.conv, m.bn) # update conv delattr(m, "bn") # remove batchnorm m.forward = m.forward_fuse # update forward if isinstance(m, ConvTranspose) and hasattr(m, "bn"): m.conv_transpose = fuse_deconv_and_bn(m.conv_transpose, m.bn) delattr(m, "bn") # remove batchnorm m.forward = m.forward_fuse # update forward if isinstance(m, RepConv): m.fuse_convs() m.forward = m.forward_fuse # update forward if isinstance(m, RepVGGDW): m.fuse() m.forward = m.forward_fuse self.info(verbose=verbose) return self def is_fused(self, thresh=10): """ Check if the model has less than a certain threshold of BatchNorm layers. Args: thresh (int, optional): The threshold number of BatchNorm layers. Default is 10. Returns: (bool): True if the number of BatchNorm layers in the model is less than the threshold, False otherwise. """ bn = tuple(v for k, v in nn.__dict__.items() if "Norm" in k) # normalization layers, i.e. 
BatchNorm2d() return sum(isinstance(v, bn) for v in self.modules()) < thresh # True if < 'thresh' BatchNorm layers in model def info(self, detailed=False, verbose=True, imgsz=640): """ Prints model information. Args: detailed (bool): if True, prints out detailed information about the model. Defaults to False verbose (bool): if True, prints out the model information. Defaults to False imgsz (int): the size of the image that the model will be trained on. Defaults to 640 """ return model_info(self, detailed=detailed, verbose=verbose, imgsz=imgsz) def _apply(self, fn): """ Applies a function to all the tensors in the model that are not parameters or registered buffers. Args: fn (function): the function to apply to the model Returns: (BaseModel): An updated BaseModel object. """ self = super()._apply(fn) m = self.model[-1] # Detect() if isinstance(m, Detect): # includes all Detect subclasses like Segment, Pose, OBB, WorldDetect m.stride = fn(m.stride) m.anchors = fn(m.anchors) m.strides = fn(m.strides) return self def load(self, weights, verbose=True): """ Load the weights into the model. Args: weights (dict | torch.nn.Module): The pre-trained weights to be loaded. verbose (bool, optional): Whether to log the transfer progress. Defaults to True. """ model = weights["model"] if isinstance(weights, dict) else weights # torchvision models are not dicts csd = model.float().state_dict() # checkpoint state_dict as FP32 csd = intersect_dicts(csd, self.state_dict()) # intersect self.load_state_dict(csd, strict=False) # load if verbose: LOGGER.info(f"Transferred {len(csd)}/{len(self.model.state_dict())} items from pretrained weights") def loss(self, batch, preds=None): """ Compute loss. Args: batch (dict): Batch to compute loss on preds (torch.Tensor | List[torch.Tensor]): Predictions. """ if getattr(self, "criterion", None) is None: self.criterion = self.init_criterion() preds = self.forward(batch["img"]) if preds is None else preds return self.criterion(preds, batch) def init_criterion(self): """Initialize the loss criterion for the BaseModel.""" raise NotImplementedError("compute_loss() needs to be implemented by task heads") class DetectionModel(BaseModel): """YOLOv8 detection model.""" def __init__(self, cfg="yolov8n.yaml", ch=3, nc=None, verbose=True): # model, input channels, number of classes """Initialize the YOLOv8 detection model with the given config and parameters.""" super().__init__() self.yaml = cfg if isinstance(cfg, dict) else yaml_model_load(cfg) # cfg dict if self.yaml["backbone"][0][2] == "Silence": LOGGER.warning( "WARNING ⚠️ YOLOv9 Silence module is deprecated in favor of nn.Identity. " "Please delete local *.pt file and re-download the latest model checkpoint." 
) self.yaml["backbone"][0][2] = "nn.Identity" # Define model ch = self.yaml["ch"] = self.yaml.get("ch", ch) # input channels if nc and nc != self.yaml["nc"]: LOGGER.info(f"Overriding model.yaml nc={self.yaml['nc']} with nc={nc}") self.yaml["nc"] = nc # override YAML value self.model, self.save = parse_model(deepcopy(self.yaml), ch=ch, verbose=verbose) # model, savelist self.names = {i: f"{i}" for i in range(self.yaml["nc"])} # default names dict self.inplace = self.yaml.get("inplace", True) self.end2end = getattr(self.model[-1], "end2end", False) # Build strides m = self.model[-1] # Detect() if isinstance(m, Detect): # includes all Detect subclasses like Segment, Pose, OBB, WorldDetect s = 256 # 2x min stride m.inplace = self.inplace def _forward(x): """Performs a forward pass through the model, handling different Detect subclass types accordingly.""" if self.end2end: return self.forward(x)["one2many"] return self.forward(x)[0] if isinstance(m, (Segment, Pose, OBB)) else self.forward(x) m.stride = torch.tensor([s / x.shape[-2] for x in _forward(torch.zeros(1, ch, s, s))]) # forward self.stride = m.stride m.bias_init() # only run once else: self.stride = torch.Tensor([32]) # default stride for i.e. RTDETR # Init weights, biases initialize_weights(self) if verbose: self.info() LOGGER.info("") def _predict_augment(self, x): """Perform augmentations on input image x and return augmented inference and train outputs.""" if getattr(self, "end2end", False) or self.__class__.__name__ != "DetectionModel": LOGGER.warning("WARNING ⚠️ Model does not support 'augment=True', reverting to single-scale prediction.") return self._predict_once(x) img_size = x.shape[-2:] # height, width s = [1, 0.83, 0.67] # scales f = [None, 3, None] # flips (2-ud, 3-lr) y = [] # outputs for si, fi in zip(s, f): xi = scale_img(x.flip(fi) if fi else x, si, gs=int(self.stride.max())) yi = super().predict(xi)[0] # forward yi = self._descale_pred(yi, fi, si, img_size) y.append(yi) y = self._clip_augmented(y) # clip augmented tails return torch.cat(y, -1), None # augmented inference, train @staticmethod def _descale_pred(p, flips, scale, img_size, dim=1): """De-scale predictions following augmented inference (inverse operation).""" p[:, :4] /= scale # de-scale x, y, wh, cls = p.split((1, 1, 2, p.shape[dim] - 4), dim) if flips == 2: y = img_size[0] - y # de-flip ud elif flips == 3: x = img_size[1] - x # de-flip lr return torch.cat((x, y, wh, cls), dim) def _clip_augmented(self, y): """Clip YOLO augmented inference tails.""" nl = self.model[-1].nl # number of detection layers (P3-P5) g = sum(4**x for x in range(nl)) # grid points e = 1 # exclude layer count i = (y[0].shape[-1] // g) * sum(4**x for x in range(e)) # indices y[0] = y[0][..., :-i] # large i = (y[-1].shape[-1] // g) * sum(4 ** (nl - 1 - x) for x in range(e)) # indices y[-1] = y[-1][..., i:] # small return y def init_criterion(self): """Initialize the loss criterion for the DetectionModel.""" return E2EDetectLoss(self) if getattr(self, "end2end", False) else v8DetectionLoss(self) class OBBModel(DetectionModel): """YOLOv8 Oriented Bounding Box (OBB) model.""" def __init__(self, cfg="yolov8n-obb.yaml", ch=3, nc=None, verbose=True): """Initialize YOLOv8 OBB model with given config and parameters.""" super().__init__(cfg=cfg, ch=ch, nc=nc, verbose=verbose) def init_criterion(self): """Initialize the loss criterion for the model.""" return v8OBBLoss(self) class SegmentationModel(DetectionModel): """YOLOv8 segmentation model.""" def __init__(self, cfg="yolov8n-seg.yaml", ch=3, 
nc=None, verbose=True): """Initialize YOLOv8 segmentation model with given config and parameters.""" super().__init__(cfg=cfg, ch=ch, nc=nc, verbose=verbose) def init_criterion(self): """Initialize the loss criterion for the SegmentationModel.""" return v8SegmentationLoss(self) class PoseModel(DetectionModel): """YOLOv8 pose model.""" def __init__(self, cfg="yolov8n-pose.yaml", ch=3, nc=None, data_kpt_shape=(None, None), verbose=True): """Initialize YOLOv8 Pose model.""" if not isinstance(cfg, dict): cfg = yaml_model_load(cfg) # load model YAML if any(data_kpt_shape) and list(data_kpt_shape) != list(cfg["kpt_shape"]): LOGGER.info(f"Overriding model.yaml kpt_shape={cfg['kpt_shape']} with kpt_shape={data_kpt_shape}") cfg["kpt_shape"] = data_kpt_shape super().__init__(cfg=cfg, ch=ch, nc=nc, verbose=verbose) def init_criterion(self): """Initialize the loss criterion for the PoseModel.""" return v8PoseLoss(self) class ClassificationModel(BaseModel): """YOLOv8 classification model.""" def __init__(self, cfg="yolov8n-cls.yaml", ch=3, nc=None, verbose=True): """Init ClassificationModel with YAML, channels, number of classes, verbose flag.""" super().__init__() self._from_yaml(cfg, ch, nc, verbose) def _from_yaml(self, cfg, ch, nc, verbose): """Set YOLOv8 model configurations and define the model architecture.""" self.yaml = cfg if isinstance(cfg, dict) else yaml_model_load(cfg) # cfg dict # Define model ch = self.yaml["ch"] = self.yaml.get("ch", ch) # input channels if nc and nc != self.yaml["nc"]: LOGGER.info(f"Overriding model.yaml nc={self.yaml['nc']} with nc={nc}") self.yaml["nc"] = nc # override YAML value elif not nc and not self.yaml.get("nc", None): raise ValueError("nc not specified. Must specify nc in model.yaml or function arguments.") self.model, self.save = parse_model(deepcopy(self.yaml), ch=ch, verbose=verbose) # model, savelist self.stride = torch.Tensor([1]) # no stride constraints self.names = {i: f"{i}" for i in range(self.yaml["nc"])} # default names dict self.info() @staticmethod def reshape_outputs(model, nc): """Update a TorchVision classification model to class count 'n' if required.""" name, m = list((model.model if hasattr(model, "model") else model).named_children())[-1] # last module if isinstance(m, Classify): # YOLO Classify() head if m.linear.out_features != nc: m.linear = nn.Linear(m.linear.in_features, nc) elif isinstance(m, nn.Linear): # ResNet, EfficientNet if m.out_features != nc: setattr(model, name, nn.Linear(m.in_features, nc)) elif isinstance(m, nn.Sequential): types = [type(x) for x in m] if nn.Linear in types: i = len(types) - 1 - types[::-1].index(nn.Linear) # last nn.Linear index if m[i].out_features != nc: m[i] = nn.Linear(m[i].in_features, nc) elif nn.Conv2d in types: i = len(types) - 1 - types[::-1].index(nn.Conv2d) # last nn.Conv2d index if m[i].out_channels != nc: m[i] = nn.Conv2d(m[i].in_channels, nc, m[i].kernel_size, m[i].stride, bias=m[i].bias is not None) def init_criterion(self): """Initialize the loss criterion for the ClassificationModel.""" return v8ClassificationLoss() class RTDETRDetectionModel(DetectionModel): """ RTDETR (Real-time DEtection and Tracking using Transformers) Detection Model class. This class is responsible for constructing the RTDETR architecture, defining loss functions, and facilitating both the training and inference processes. RTDETR is an object detection and tracking model that extends from the DetectionModel base class. Attributes: cfg (str): The configuration file path or preset string. Default is 'rtdetr-l.yaml'. 
ch (int): Number of input channels. Default is 3 (RGB). nc (int, optional): Number of classes for object detection. Default is None. verbose (bool): Specifies if summary statistics are shown during initialization. Default is True. Methods: init_criterion: Initializes the criterion used for loss calculation. loss: Computes and returns the loss during training. predict: Performs a forward pass through the network and returns the output. """ def __init__(self, cfg="rtdetr-l.yaml", ch=3, nc=None, verbose=True): """ Initialize the RTDETRDetectionModel. Args: cfg (str): Configuration file name or path. ch (int): Number of input channels. nc (int, optional): Number of classes. Defaults to None. verbose (bool, optional): Print additional information during initialization. Defaults to True. """ super().__init__(cfg=cfg, ch=ch, nc=nc, verbose=verbose) def init_criterion(self): """Initialize the loss criterion for the RTDETRDetectionModel.""" from ultralytics.models.utils.loss import RTDETRDetectionLoss return RTDETRDetectionLoss(nc=self.nc, use_vfl=True) def loss(self, batch, preds=None): """ Compute the loss for the given batch of data. Args: batch (dict): Dictionary containing image and label data. preds (torch.Tensor, optional): Precomputed model predictions. Defaults to None. Returns: (tuple): A tuple containing the total loss and main three losses in a tensor. """ if not hasattr(self, "criterion"): self.criterion = self.init_criterion() img = batch["img"] # NOTE: preprocess gt_bbox and gt_labels to list. bs = len(img) batch_idx = batch["batch_idx"] gt_groups = [(batch_idx == i).sum().item() for i in range(bs)] targets = { "cls": batch["cls"].to(img.device, dtype=torch.long).view(-1), "bboxes": batch["bboxes"].to(device=img.device), "batch_idx": batch_idx.to(img.device, dtype=torch.long).view(-1), "gt_groups": gt_groups, } preds = self.predict(img, batch=targets) if preds is None else preds dec_bboxes, dec_scores, enc_bboxes, enc_scores, dn_meta = preds if self.training else preds[1] if dn_meta is None: dn_bboxes, dn_scores = None, None else: dn_bboxes, dec_bboxes = torch.split(dec_bboxes, dn_meta["dn_num_split"], dim=2) dn_scores, dec_scores = torch.split(dec_scores, dn_meta["dn_num_split"], dim=2) dec_bboxes = torch.cat([enc_bboxes.unsqueeze(0), dec_bboxes]) # (7, bs, 300, 4) dec_scores = torch.cat([enc_scores.unsqueeze(0), dec_scores]) loss = self.criterion( (dec_bboxes, dec_scores), targets, dn_bboxes=dn_bboxes, dn_scores=dn_scores, dn_meta=dn_meta ) # NOTE: There are like 12 losses in RTDETR, backward with all losses but only show the main three losses. return sum(loss.values()), torch.as_tensor( [loss[k].detach() for k in ["loss_giou", "loss_class", "loss_bbox"]], device=img.device ) def predict(self, x, profile=False, visualize=False, batch=None, augment=False, embed=None): """ Perform a forward pass through the model. Args: x (torch.Tensor): The input tensor. profile (bool, optional): If True, profile the computation time for each layer. Defaults to False. visualize (bool, optional): If True, save feature maps for visualization. Defaults to False. batch (dict, optional): Ground truth data for evaluation. Defaults to None. augment (bool, optional): If True, perform data augmentation during inference. Defaults to False. embed (list, optional): A list of feature vectors/embeddings to return. Returns: (torch.Tensor): Model's output tensor. 
""" y, dt, embeddings = [], [], [] # outputs for m in self.model[:-1]: # except the head part if m.f != -1: # if not from previous layer x = y[m.f] if isinstance(m.f, int) else [x if j == -1 else y[j] for j in m.f] # from earlier layers if profile: self._profile_one_layer(m, x, dt) x = m(x) # run y.append(x if m.i in self.save else None) # save output if visualize: feature_visualization(x, m.type, m.i, save_dir=visualize) if embed and m.i in embed: embeddings.append(nn.functional.adaptive_avg_pool2d(x, (1, 1)).squeeze(-1).squeeze(-1)) # flatten if m.i == max(embed): return torch.unbind(torch.cat(embeddings, 1), dim=0) head = self.model[-1] x = head([y[j] for j in head.f], batch) # head inference return x class WorldModel(DetectionModel): """YOLOv8 World Model.""" def __init__(self, cfg="yolov8s-world.yaml", ch=3, nc=None, verbose=True): """Initialize YOLOv8 world model with given config and parameters.""" self.txt_feats = torch.randn(1, nc or 80, 512) # features placeholder self.clip_model = None # CLIP model placeholder super().__init__(cfg=cfg, ch=ch, nc=nc, verbose=verbose) def set_classes(self, text, batch=80, cache_clip_model=True): """Set classes in advance so that model could do offline-inference without clip model.""" try: import clip except ImportError: check_requirements("git+https://github.com/ultralytics/CLIP.git") import clip if ( not getattr(self, "clip_model", None) and cache_clip_model ): # for backwards compatibility of models lacking clip_model attribute self.clip_model = clip.load("ViT-B/32")[0] model = self.clip_model if cache_clip_model else clip.load("ViT-B/32")[0] device = next(model.parameters()).device text_token = clip.tokenize(text).to(device) txt_feats = [model.encode_text(token).detach() for token in text_token.split(batch)] txt_feats = txt_feats[0] if len(txt_feats) == 1 else torch.cat(txt_feats, dim=0) txt_feats = txt_feats / txt_feats.norm(p=2, dim=-1, keepdim=True) self.txt_feats = txt_feats.reshape(-1, len(text), txt_feats.shape[-1]) self.model[-1].nc = len(text) def predict(self, x, profile=False, visualize=False, txt_feats=None, augment=False, embed=None): """ Perform a forward pass through the model. Args: x (torch.Tensor): The input tensor. profile (bool, optional): If True, profile the computation time for each layer. Defaults to False. visualize (bool, optional): If True, save feature maps for visualization. Defaults to False. txt_feats (torch.Tensor): The text features, use it if it's given. Defaults to None. augment (bool, optional): If True, perform data augmentation during inference. Defaults to False. embed (list, optional): A list of feature vectors/embeddings to return. Returns: (torch.Tensor): Model's output tensor. 
""" txt_feats = (self.txt_feats if txt_feats is None else txt_feats).to(device=x.device, dtype=x.dtype) if len(txt_feats) != len(x): txt_feats = txt_feats.repeat(len(x), 1, 1) ori_txt_feats = txt_feats.clone() y, dt, embeddings = [], [], [] # outputs for m in self.model: # except the head part if m.f != -1: # if not from previous layer x = y[m.f] if isinstance(m.f, int) else [x if j == -1 else y[j] for j in m.f] # from earlier layers if profile: self._profile_one_layer(m, x, dt) if isinstance(m, C2fAttn): x = m(x, txt_feats) elif isinstance(m, WorldDetect): x = m(x, ori_txt_feats) elif isinstance(m, ImagePoolingAttn): txt_feats = m(x, txt_feats) else: x = m(x) # run y.append(x if m.i in self.save else None) # save output if visualize: feature_visualization(x, m.type, m.i, save_dir=visualize) if embed and m.i in embed: embeddings.append(nn.functional.adaptive_avg_pool2d(x, (1, 1)).squeeze(-1).squeeze(-1)) # flatten if m.i == max(embed): return torch.unbind(torch.cat(embeddings, 1), dim=0) return x def loss(self, batch, preds=None): """ Compute loss. Args: batch (dict): Batch to compute loss on. preds (torch.Tensor | List[torch.Tensor]): Predictions. """ if not hasattr(self, "criterion"): self.criterion = self.init_criterion() if preds is None: preds = self.forward(batch["img"], txt_feats=batch["txt_feats"]) return self.criterion(preds, batch) class Ensemble(nn.ModuleList): """Ensemble of models.""" def __init__(self): """Initialize an ensemble of models.""" super().__init__() def forward(self, x, augment=False, profile=False, visualize=False): """Function generates the YOLO network's final layer.""" y = [module(x, augment, profile, visualize)[0] for module in self] # y = torch.stack(y).max(0)[0] # max ensemble # y = torch.stack(y).mean(0) # mean ensemble y = torch.cat(y, 2) # nms ensemble, y shape(B, HW, C) return y, None # inference, train output # Functions ------------------------------------------------------------------------------------------------------------ @contextlib.contextmanager def temporary_modules(modules=None, attributes=None): """ Context manager for temporarily adding or modifying modules in Python's module cache (sys.modules). This function can be used to change the module paths during runtime. It's useful when refactoring code, where you've moved a module from one location to another, but you still want to support the old import paths for backwards compatibility. Args: modules (dict, optional): A dictionary mapping old module paths to new module paths. attributes (dict, optional): A dictionary mapping old module attributes to new module attributes. Example: python with temporary_modules({"old.module": "new.module"}, {"old.module.attribute": "new.module.attribute"}): import old.module # this will now import new.module from old.module import attribute # this will now import new.module.attribute Note: The changes are only in effect inside the context manager and are undone once the context manager exits. Be aware that directly manipulating sys.modules can lead to unpredictable results, especially in larger applications or libraries. Use this function with caution. 
""" if modules is None: modules = {} if attributes is None: attributes = {} import sys from importlib import import_module try: # Set attributes in sys.modules under their old name for old, new in attributes.items(): old_module, old_attr = old.rsplit(".", 1) new_module, new_attr = new.rsplit(".", 1) setattr(import_module(old_module), old_attr, getattr(import_module(new_module), new_attr)) # Set modules in sys.modules under their old name for old, new in modules.items(): sys.modules[old] = import_module(new) yield finally: # Remove the temporary module paths for old in modules: if old in sys.modules: del sys.modules[old] class SafeClass: """A placeholder class to replace unknown classes during unpickling.""" def __init__(self, *args, **kwargs): """Initialize SafeClass instance, ignoring all arguments.""" pass def __call__(self, *args, **kwargs): """Run SafeClass instance, ignoring all arguments.""" pass class SafeUnpickler(pickle.Unpickler): """Custom Unpickler that replaces unknown classes with SafeClass.""" def find_class(self, module, name): """Attempt to find a class, returning SafeClass if not among safe modules.""" safe_modules = ( "torch", "collections", "collections.abc", "builtins", "math", "numpy", # Add other modules considered safe ) if module in safe_modules: return super().find_class(module, name) else: return SafeClass def torch_safe_load(weight, safe_only=False): """ Attempts to load a PyTorch model with the torch.load() function. If a ModuleNotFoundError is raised, it catches the error, logs a warning message, and attempts to install the missing module via the check_requirements() function. After installation, the function again attempts to load the model using torch.load(). Args: weight (str): The file path of the PyTorch model. safe_only (bool): If True, replace unknown classes with SafeClass during loading. Example: python from ultralytics.nn.tasks import torch_safe_load ckpt, file = torch_safe_load("path/to/best.pt", safe_only=True) Returns: ckpt (dict): The loaded model checkpoint. file (str): The loaded filename """ from ultralytics.utils.downloads import attempt_download_asset check_suffix(file=weight, suffix=".pt") file = attempt_download_asset(weight) # search online if missing locally try: with temporary_modules( modules={ "ultralytics.yolo.utils": "ultralytics.utils", "ultralytics.yolo.v8": "ultralytics.models.yolo", "ultralytics.yolo.data": "ultralytics.data", }, attributes={ "ultralytics.nn.modules.block.Silence": "torch.nn.Identity", # YOLOv9e "ultralytics.nn.tasks.YOLOv10DetectionModel": "ultralytics.nn.tasks.DetectionModel", # YOLOv10 "ultralytics.utils.loss.v10DetectLoss": "ultralytics.utils.loss.E2EDetectLoss", # YOLOv10 }, ): if safe_only: # Load via custom pickle module safe_pickle = types.ModuleType("safe_pickle") safe_pickle.Unpickler = SafeUnpickler safe_pickle.load = lambda file_obj: SafeUnpickler(file_obj).load() with open(file, "rb") as f: ckpt = torch.load(f, pickle_module=safe_pickle) else: ckpt = torch.load(file, map_location="cpu") except ModuleNotFoundError as e: # e.name is missing module name if e.name == "models": raise TypeError( emojis( f"ERROR ❌️ {weight} appears to be an Ultralytics YOLOv5 model originally trained " f"with https://github.com/ultralytics/yolov5.\nThis model is NOT forwards compatible with " f"YOLOv8 at https://github.com/ultralytics/ultralytics." 
f"\nRecommend fixes are to train a new model using the latest 'ultralytics' package or to " f"run a command with an official Ultralytics model, i.e. 'yolo predict model=yolov8n.pt'" ) ) from e LOGGER.warning( f"WARNING ⚠️ {weight} appears to require '{e.name}', which is not in Ultralytics requirements." f"\nAutoInstall will run now for '{e.name}' but this feature will be removed in the future." f"\nRecommend fixes are to train a new model using the latest 'ultralytics' package or to " f"run a command with an official Ultralytics model, i.e. 'yolo predict model=yolov8n.pt'" ) check_requirements(e.name) # install missing module ckpt = torch.load(file, map_location="cpu") if not isinstance(ckpt, dict): # File is likely a YOLO instance saved with i.e. torch.save(model, "saved_model.pt") LOGGER.warning( f"WARNING ⚠️ The file '{weight}' appears to be improperly saved or formatted. " f"For optimal results, use model.save('filename.pt') to correctly save YOLO models." ) ckpt = {"model": ckpt.model} return ckpt, file def attempt_load_weights(weights, device=None, inplace=True, fuse=False): """Loads an ensemble of models weights=[a,b,c] or a single model weights=[a] or weights=a.""" ensemble = Ensemble() for w in weights if isinstance(weights, list) else [weights]: ckpt, w = torch_safe_load(w) # load ckpt args = {**DEFAULT_CFG_DICT, **ckpt["train_args"]} if "train_args" in ckpt else None # combined args model = (ckpt.get("ema") or ckpt["model"]).to(device).float() # FP32 model # Model compatibility updates model.args = args # attach args to model model.pt_path = w # attach *.pt file path to model model.task = guess_model_task(model) if not hasattr(model, "stride"): model.stride = torch.tensor([32.0]) # Append ensemble.append(model.fuse().eval() if fuse and hasattr(model, "fuse") else model.eval()) # model in eval mode # Module updates for m in ensemble.modules(): if hasattr(m, "inplace"): m.inplace = inplace elif isinstance(m, nn.Upsample) and not hasattr(m, "recompute_scale_factor"): m.recompute_scale_factor = None # torch 1.11.0 compatibility # Return model if len(ensemble) == 1: return ensemble[-1] # Return ensemble LOGGER.info(f"Ensemble created with {weights}\n") for k in "names", "nc", "yaml": setattr(ensemble, k, getattr(ensemble[0], k)) ensemble.stride = ensemble[int(torch.argmax(torch.tensor([m.stride.max() for m in ensemble])))].stride assert all(ensemble[0].nc == m.nc for m in ensemble), f"Models differ in class counts {[m.nc for m in ensemble]}" return ensemble def attempt_load_one_weight(weight, device=None, inplace=True, fuse=False): """Loads a single model weights.""" ckpt, weight = torch_safe_load(weight) # load ckpt args = {**DEFAULT_CFG_DICT, **(ckpt.get("train_args", {}))} # combine model and default args, preferring model args model = (ckpt.get("ema") or ckpt["model"]).to(device).float() # FP32 model # Model compatibility updates model.args = {k: v for k, v in args.items() if k in DEFAULT_CFG_KEYS} # attach args to model model.pt_path = weight # attach *.pt file path to model model.task = guess_model_task(model) if not hasattr(model, "stride"): model.stride = torch.tensor([32.0]) model = model.fuse().eval() if fuse and hasattr(model, "fuse") else model.eval() # model in eval mode # Module updates for m in model.modules(): if hasattr(m, "inplace"): m.inplace = inplace elif isinstance(m, nn.Upsample) and not hasattr(m, "recompute_scale_factor"): m.recompute_scale_factor = None # torch 1.11.0 compatibility # Return model and ckpt return model, ckpt def parse_model(d, ch, 
```python
verbose=True):  # model_dict, input_channels(3)
    """Parse a YOLO model.yaml dictionary into a PyTorch model."""
    import ast

    # Args
    legacy = True  # backward compatibility for v3/v5/v8/v9 models
    max_channels = float("inf")
    nc, act, scales = (d.get(x) for x in ("nc", "activation", "scales"))
    depth, width, kpt_shape = (d.get(x, 1.0) for x in ("depth_multiple", "width_multiple", "kpt_shape"))
    if scales:
        scale = d.get("scale")
        if not scale:
            scale = tuple(scales.keys())[0]
            LOGGER.warning(f"WARNING ⚠️ no model scale passed. Assuming scale='{scale}'.")
        depth, width, max_channels = scales[scale]

    if act:
        Conv.default_act = eval(act)  # redefine default activation, i.e. Conv.default_act = nn.SiLU()
        if verbose:
            LOGGER.info(f"{colorstr('activation:')} {act}")  # print

    if verbose:
        LOGGER.info(f"\n{'':>3}{'from':>20}{'n':>3}{'params':>10}  {'module':<45}{'arguments':<30}")
    ch = [ch]
    layers, save, c2 = [], [], ch[-1]  # layers, savelist, ch out
    backbone = False
    for i, (f, n, m, args) in enumerate(d["backbone"] + d["head"]):  # from, number, module, args
        t = m
        m = getattr(torch.nn, m[3:]) if "nn." in m else globals()[m]  # get module
        for j, a in enumerate(args):
            if isinstance(a, str):
                with contextlib.suppress(ValueError):
                    args[j] = locals()[a] if a in locals() else ast.literal_eval(a)
        n = n_ = max(round(n * depth), 1) if n > 1 else n  # depth gain
        if m in {
            Classify, Conv, ConvTranspose, GhostConv, Bottleneck, GhostBottleneck, SPP, SPPF, C2fPSA, C2PSA,
            DWConv, Focus, BottleneckCSP, C1, C2, C2f, C3k2, RepNCSPELAN4, ELAN1, ADown, AConv, SPPELAN,
            C2fAttn, C3, C3TR, C3Ghost, nn.ConvTranspose2d, DWConvTranspose2d, C3x, RepC3, PSA, SCDown,
            C2fCIB, A2C2f,
        }:
            c1, c2 = ch[f], args[0]
            if c2 != nc:  # if c2 not equal to number of classes (i.e. for Classify() output)
                c2 = make_divisible(min(c2, max_channels) * width, 8)
            if m is C2fAttn:
                args[1] = make_divisible(min(args[1], max_channels // 2) * width, 8)  # embed channels
                args[2] = int(
                    max(round(min(args[2], max_channels // 2 // 32)) * width, 1) if args[2] > 1 else args[2]
                )  # num heads
            args = [c1, c2, *args[1:]]
            if m in {
                BottleneckCSP, C1, C2, C2f, C3k2, C2fAttn, C3, C3TR, C3Ghost, C3x, RepC3, C2fPSA, C2fCIB,
                C2PSA, A2C2f,
            }:
                args.insert(2, n)  # number of repeats
                n = 1
            if m is C3k2:  # for M/L/X sizes
                legacy = False
                if scale in "mlx":
                    args[3] = True
            if m is A2C2f:
                legacy = False
                if scale in "lx":  # for L/X sizes
                    args.append(True)
                    args.append(1.2)
        elif m is AIFI:
            args = [ch[f], *args]
        elif m in {HGStem, HGBlock}:
            c1, cm, c2 = ch[f], args[0], args[1]
            args = [c1, cm, c2, *args[2:]]
            if m is HGBlock:
                args.insert(4, n)  # number of repeats
                n = 1
        elif m is ResNetLayer:
            c2 = args[1] if args[3] else args[1] * 4
        elif m is nn.BatchNorm2d:
            args = [ch[f]]
        # LSKNet
        elif m in {LSKNET_T, LSKNET_S}:
            m = m(*args)
            c2 = m.width_list
            backbone = True
        # BiFPN
        elif m is BiFPN:
            length = len([ch[x] for x in f])
            args = [length]
        elif m is Concat:
            c2 = sum(ch[x] for x in f)
        elif m in {Detect, WorldDetect, Segment, Pose, OBB, ImagePoolingAttn, v10Detect}:
            args.append([ch[x] for x in f])
            if m is Segment:
                args[2] = make_divisible(min(args[2], max_channels) * width, 8)
            if m in {Detect, Segment, Pose, OBB}:
                m.legacy = legacy
        elif m is RTDETRDecoder:  # special case, channels arg must be passed in index 1
            args.insert(1, [ch[x] for x in f])
        elif m in {CBLinear, TorchVision, Index}:
            c2 = args[0]
            c1 = ch[f]
            args = [c1, c2, *args[1:]]
        elif m is CBFuse:
            c2 = ch[f[-1]]
        else:
            c2 = ch[f]

        # m_ = nn.Sequential(*(m(*args) for _ in range(n))) if n > 1 else m(*args)  # module
        # t = str(m)[8:-2].replace("__main__.", "")  # module type
        # m_.np = sum(x.numel() for x in m_.parameters())  # number params
        # m_.i, m_.f, m_.type = i, f, t  # attach index, 'from' index, type
        # if verbose:
        #     LOGGER.info(f"{i:>3}{str(f):>20}{n_:>3}{m_.np:10.0f}  {t:<45}{str(args):<30}")  # print
        # save.extend(x % i for x in ([f] if isinstance(f, int) else f) if x != -1)  # append to savelist
        # layers.append(m_)
        # if i == 0:
        #     ch = []
        # ch.append(c2)

        # Replaces the commented-out block above: backbone modules such as LSKNet
        # return a list of stage channels instead of a single channel count.
        if isinstance(c2, list):
            backbone = True
            m_ = m
            m_.backbone = True
        else:
            m_ = nn.Sequential(*(m(*args) for _ in range(n))) if n > 1 else m(*args)  # module
            t = str(m)[8:-2].replace('__main__.', '')  # module type
        m_.np = sum(x.numel() for x in m_.parameters())  # number params
        m_.i, m_.f, m_.type = i + 4 if backbone else i, f, t  # attach index, 'from' index, type
        if verbose:
            LOGGER.info(f'{i:>3}{str(f):>20}{n_:>3}{m_.np:10.0f}  {t:<45}{str(args):<30}')  # print
        save.extend(x % (i + 4 if backbone else i) for x in ([f] if isinstance(f, int) else f) if x != -1)  # append to savelist
        layers.append(m_)
        if i == 0:
            ch = []
        if isinstance(c2, list):
            ch.extend(c2)
            for _ in range(5 - len(ch)):
                ch.insert(0, 0)
        else:
            ch.append(c2)
    return nn.Sequential(*layers), sorted(save)


def yaml_model_load(path):
    """Load a YOLOv8 model from a YAML file."""
    path = Path(path)
    if path.stem in (f"yolov{d}{x}6" for x in "nsmlx" for d in (5, 8)):
        new_stem = re.sub(r"(\d+)([nslmx])6(.+)?$", r"\1\2-p6\3", path.stem)
        LOGGER.warning(f"WARNING ⚠️ Ultralytics YOLO P6 models now use -p6 suffix. Renaming {path.stem} to {new_stem}.")
        path = path.with_name(new_stem + path.suffix)

    unified_path = re.sub(r"(\d+)([nslmx])(.+)?$", r"\1\3", str(path))  # i.e. yolov8x.yaml -> yolov8.yaml
    yaml_file = check_yaml(unified_path, hard=False) or check_yaml(path)
    d = yaml_load(yaml_file)  # model dict
    d["scale"] = guess_model_scale(path)
    d["yaml_file"] = str(path)
    return d


def guess_model_scale(model_path):
    """
    Takes a path to a YOLO model's YAML file as input and extracts the size character of the model's scale. The
    function uses regular expression matching to find the pattern of the model scale in the YAML file name, which is
    denoted by n, s, m, l, or x. The function returns the size character of the model scale as a string.

    Args:
        model_path (str | Path): The path to the YOLO model's YAML file.

    Returns:
        (str): The size character of the model's scale, which can be n, s, m, l, or x.
    """
    try:
        return re.search(r"yolo[v]?\d+([nslmx])", Path(model_path).stem).group(1)  # noqa, returns n, s, m, l, or x
    except AttributeError:
        return ""


def guess_model_task(model):
    """
    Guess the task of a PyTorch model from its architecture or configuration.

    Args:
        model (nn.Module | dict): PyTorch model or model configuration in YAML format.

    Returns:
        (str): Task of the model ('detect', 'segment', 'classify', 'pose').

    Raises:
        SyntaxError: If the task of the model could not be determined.
    """

    def cfg2task(cfg):
        """Guess from YAML dictionary."""
        m = cfg["head"][-1][-2].lower()  # output module name
        if m in {"classify", "classifier", "cls", "fc"}:
            return "classify"
        if "detect" in m:
            return "detect"
        if m == "segment":
            return "segment"
        if m == "pose":
            return "pose"
        if m == "obb":
            return "obb"

    # Guess from model cfg
    if isinstance(model, dict):
        with contextlib.suppress(Exception):
            return cfg2task(model)

    # Guess from PyTorch model
    if isinstance(model, nn.Module):  # PyTorch model
        for x in "model.args", "model.model.args", "model.model.model.args":
            with contextlib.suppress(Exception):
                return eval(x)["task"]
        for x in "model.yaml", "model.model.yaml", "model.model.model.yaml":
            with contextlib.suppress(Exception):
                return cfg2task(eval(x))
        for m in model.modules():
            if isinstance(m, Segment):
                return "segment"
            elif isinstance(m, Classify):
                return "classify"
            elif isinstance(m, Pose):
                return "pose"
            elif isinstance(m, OBB):
                return "obb"
            elif isinstance(m, (Detect, WorldDetect, v10Detect)):
                return "detect"

    # Guess from model filename
    if isinstance(model, (str, Path)):
        model = Path(model)
        if "-seg" in model.stem or "segment" in model.parts:
            return "segment"
        elif "-cls" in model.stem or "classify" in model.parts:
            return "classify"
        elif "-pose" in model.stem or "pose" in model.parts:
            return "pose"
        elif "-obb" in model.stem or "obb" in model.parts:
            return "obb"
        elif "detect" in model.parts:
            return "detect"

    # Unable to determine task from model
    LOGGER.warning(
        "WARNING ⚠️ Unable to automatically guess model task, assuming 'task=detect'. "
        "Explicitly define task for your model, i.e. 'task=detect', 'segment', 'classify','pose' or 'obb'."
    )
    return "detect"  # assume detect
```

The above is the content of the task.py file. Below is the content of the LSKNet.py file.

```python
import torch
import torch.nn as nn
from torch.nn.modules.utils import _pair as to_2tuple
from timm.models.layers import DropPath, to_2tuple
from functools import partial
import warnings

__all__ = ['LSKNET_T', 'LSKNET_S']


class Mlp(nn.Module):
    def __init__(self, in_features, hidden_features=None, out_features=None, act_layer=nn.GELU, drop=0.):
        super().__init__()
        out_features = out_features or in_features
        hidden_features = hidden_features or in_features
        self.fc1 = nn.Conv2d(in_features, hidden_features, 1)
        self.dwconv = DWConv(hidden_features)
        self.act = act_layer()
        self.fc2 = nn.Conv2d(hidden_features, out_features, 1)
        self.drop = nn.Dropout(drop)

    def forward(self, x):
        x = self.fc1(x)
        x = self.dwconv(x)
        x = self.act(x)
        x = self.drop(x)
        x = self.fc2(x)
        x = self.drop(x)
        return x


class LSKblock(nn.Module):
    def __init__(self, dim):
        super().__init__()
        self.conv0 = nn.Conv2d(dim, dim, 5, padding=2, groups=dim)
        self.conv_spatial = nn.Conv2d(dim, dim, 7, stride=1, padding=9, groups=dim, dilation=3)
        self.conv1 = nn.Conv2d(dim, dim // 2, 1)
        self.conv2 = nn.Conv2d(dim, dim // 2, 1)
        self.conv_squeeze = nn.Conv2d(2, 2, 7, padding=3)
        self.conv = nn.Conv2d(dim // 2, dim, 1)

    def forward(self, x):
        attn1 = self.conv0(x)
        attn2 = self.conv_spatial(attn1)
        attn1 = self.conv1(attn1)
        attn2 = self.conv2(attn2)
        attn = torch.cat([attn1, attn2], dim=1)
        avg_attn = torch.mean(attn, dim=1, keepdim=True)
        max_attn, _ = torch.max(attn, dim=1, keepdim=True)
        agg = torch.cat([avg_attn, max_attn], dim=1)
        sig = self.conv_squeeze(agg).sigmoid()
        attn = attn1 * sig[:, 0, :, :].unsqueeze(1) + attn2 * sig[:, 1, :, :].unsqueeze(1)
        attn = self.conv(attn)
        return x * attn


class Attention(nn.Module):
    def __init__(self, d_model):
        super().__init__()
        self.proj_1 = nn.Conv2d(d_model, d_model, 1)
        self.activation = nn.GELU()
        self.spatial_gating_unit = LSKblock(d_model)
        self.proj_2 = nn.Conv2d(d_model, d_model, 1)

    def forward(self, x):
        shorcut = x.clone()
        x = self.proj_1(x)
        x = self.activation(x)
        x = self.spatial_gating_unit(x)
        x = self.proj_2(x)
        x = x + shorcut
        return x


class Block(nn.Module):
    def __init__(self, dim, mlp_ratio=4., drop=0., drop_path=0., act_layer=nn.GELU, norm_cfg=None):
        super().__init__()
        if norm_cfg:
            self.norm1 = nn.BatchNorm2d(norm_cfg, dim)
            self.norm2 = nn.BatchNorm2d(norm_cfg, dim)
        else:
            self.norm1 = nn.BatchNorm2d(dim)
            self.norm2 = nn.BatchNorm2d(dim)
        self.attn = Attention(dim)
        self.drop_path = DropPath(drop_path) if drop_path > 0. else nn.Identity()
        mlp_hidden_dim = int(dim * mlp_ratio)
        self.mlp = Mlp(in_features=dim, hidden_features=mlp_hidden_dim, act_layer=act_layer, drop=drop)
        layer_scale_init_value = 1e-2
        self.layer_scale_1 = nn.Parameter(
            layer_scale_init_value * torch.ones((dim)), requires_grad=True)
        self.layer_scale_2 = nn.Parameter(
            layer_scale_init_value * torch.ones((dim)), requires_grad=True)

    def forward(self, x):
        x = x + self.drop_path(self.layer_scale_1.unsqueeze(-1).unsqueeze(-1) * self.attn(self.norm1(x)))
        x = x + self.drop_path(self.layer_scale_2.unsqueeze(-1).unsqueeze(-1) * self.mlp(self.norm2(x)))
        return x


class OverlapPatchEmbed(nn.Module):
    """Image to Patch Embedding"""

    def __init__(self, img_size=224, patch_size=7, stride=4, in_chans=3, embed_dim=768, norm_cfg=None):
        super().__init__()
        patch_size = to_2tuple(patch_size)
        self.proj = nn.Conv2d(in_chans, embed_dim, kernel_size=patch_size, stride=stride,
                              padding=(patch_size[0] // 2, patch_size[1] // 2))
        if norm_cfg:
            self.norm = nn.BatchNorm2d(norm_cfg, embed_dim)
        else:
            self.norm = nn.BatchNorm2d(embed_dim)

    def forward(self, x):
        x = self.proj(x)
        _, _, H, W = x.shape
        x = self.norm(x)
        return x, H, W


class LSKNet(nn.Module):
    def __init__(self, img_size=224, in_chans=3, dim=None, embed_dims=[64, 128, 256, 512],
                 mlp_ratios=[8, 8, 4, 4], drop_rate=0., drop_path_rate=0.,
                 norm_layer=partial(nn.LayerNorm, eps=1e-6), depths=[3, 4, 6, 3], num_stages=4,
                 pretrained=None, init_cfg=None, norm_cfg=None):
        super().__init__()

        assert not (init_cfg and pretrained), \
            'init_cfg and pretrained cannot be set at the same time'
        if isinstance(pretrained, str):
            warnings.warn('DeprecationWarning: pretrained is deprecated, '
                          'please use "init_cfg" instead')
            self.init_cfg = dict(type='Pretrained', checkpoint=pretrained)
        elif pretrained is not None:
            raise TypeError('pretrained must be a str or None')

        self.depths = depths
        self.num_stages = num_stages

        dpr = [x.item() for x in torch.linspace(0, drop_path_rate, sum(depths))]  # stochastic depth decay rule
        cur = 0

        for i in range(num_stages):
            patch_embed = OverlapPatchEmbed(img_size=img_size if i == 0 else img_size // (2 ** (i + 1)),
                                            patch_size=7 if i == 0 else 3,
                                            stride=4 if i == 0 else 2,
                                            in_chans=in_chans if i == 0 else embed_dims[i - 1],
                                            embed_dim=embed_dims[i], norm_cfg=norm_cfg)
            block = nn.ModuleList([Block(dim=embed_dims[i], mlp_ratio=mlp_ratios[i], drop=drop_rate,
                                         drop_path=dpr[cur + j], norm_cfg=norm_cfg)
                                   for j in range(depths[i])])
            norm = norm_layer(embed_dims[i])
            cur += depths[i]

            setattr(self, f"patch_embed{i + 1}", patch_embed)
            setattr(self, f"block{i + 1}", block)
            setattr(self, f"norm{i + 1}", norm)

        self.width_list = [i.size(1) for i in self.forward(torch.randn(1, 3, 640, 640))]

    def freeze_patch_emb(self):
        self.patch_embed1.requires_grad = False

    @torch.jit.ignore
    def no_weight_decay(self):
        return {'pos_embed1', 'pos_embed2', 'pos_embed3', 'pos_embed4', 'cls_token'}  # has pos_embed may be better

    def get_classifier(self):
        return self.head

    def reset_classifier(self, num_classes, global_pool=''):
        self.num_classes = num_classes
        self.head = nn.Linear(self.embed_dim, num_classes) if num_classes > 0 else nn.Identity()

    def forward_features(self, x):
        B = x.shape[0]
        outs = []
        for i in range(self.num_stages):
            patch_embed = getattr(self, f"patch_embed{i + 1}")
            block = getattr(self, f"block{i + 1}")
            norm = getattr(self, f"norm{i + 1}")
            x, H, W = patch_embed(x)
            for blk in block:
                x = blk(x)
            x = x.flatten(2).transpose(1, 2)
            x = norm(x)
            x = x.reshape(B, H, W, -1).permute(0, 3, 1, 2).contiguous()
            outs.append(x)
        return outs

    def forward(self, x):
        x = self.forward_features(x)
        # x = self.head(x)
        return x


class DWConv(nn.Module):
    def __init__(self, dim=768):
        super(DWConv, self).__init__()
        self.dwconv = nn.Conv2d(dim, dim, 3, 1, 1, bias=True, groups=dim)

    def forward(self, x):
        x = self.dwconv(x)
        return x


def _conv_filter(state_dict, patch_size=16):
    """Convert patch embedding weight from manual patchify + linear proj to conv."""
    out_dict = {}
    for k, v in state_dict.items():
        if 'patch_embed.proj.weight' in k:
            v = v.reshape((v.shape[0], 3, patch_size, patch_size))
        out_dict[k] = v
    return out_dict


def LSKNET_T():
    model = LSKNet(depths=[2, 2, 2, 2])
    return model


def LSKNET_S():
    model = LSKNet()
    return model


if __name__ == '__main__':
    model = LSKNet()
    inputs = torch.randn((1, 3, 640, 640))
    for i in model(inputs):
        print(i.size())
```

At the very bottom is the BiFPN.py file. Please use these three files together with the runtime error shown just above to solve this problem.

```python
import torch.nn as nn
import torch


class swish(nn.Module):
    def forward(self, x):
        return x * torch.sigmoid(x)


class BiFPN(nn.Module):
    def __init__(self, length):
        super().__init__()
        self.weight = nn.Parameter(torch.ones(length, dtype=torch.float32), requires_grad=True)
        self.swish = swish()
        self.epsilon = 0.0001

    def forward(self, x):
        weights = self.weight / (torch.sum(self.swish(self.weight), dim=0) + self.epsilon)
        weighted_feature_maps = [weights[i] * x[i] for i in range(len(x))]
        stacked_feature_maps = torch.stack(weighted_feature_maps, dim=0)
        result = torch.sum(stacked_feature_maps, dim=0)
        return result
```
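
As a quick sanity check of the fusion logic, here is a minimal sketch (my own test scaffold, not part of the original files) that exercises the `BiFPN` module on three same-shaped feature maps. Note that the module only reweights and sums its inputs, so the YAML wiring must guarantee that every feature map routed into it has an identical channel count and spatial size; a mismatch there is one plausible source of the runtime error described above.

```python
import torch

fuse = BiFPN(length=3)                                  # one learnable scalar weight per input
feats = [torch.randn(1, 256, 40, 40) for _ in range(3)]  # three same-shaped feature maps
out = fuse(feats)   # each input scaled by weight[i] / (sum(swish(weight)) + eps), then summed
print(out.shape)    # torch.Size([1, 256, 40, 40])
```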

```python
import re
import logging
import tkinter as tk
from tkinter import scrolledtext, ttk, messagebox
from datetime import datetime
import traceback


class SimpleCLexer:
    def __init__(self):
        self.tokens = []

    def tokenize(self, input_str):
        tokens = []
        pos = 0
        line = 1
        column = 0
        length = len(input_str)

        # C keywords and built-in types
        keywords = {
            'void', 'int', 'char', 'float', 'double', 'short', 'long',
            'signed', 'unsigned', 'struct', 'union', 'enum', 'typedef',
            'static', 'extern', 'auto', 'register', 'const', 'volatile',
            'return', 'if', 'else', 'switch', 'case', 'default', 'for',
            'while', 'do', 'break', 'continue', 'goto', 'sizeof'
        }

        # Extended type-alias recognition
        types = {'U1', 'U2', 'U4', 'S1', 'S2', 'S4', 'BOOL', 'BYTE', 'WORD', 'DWORD'}

        while pos < length:
            char = input_str[pos]

            # Skip whitespace
            if char in ' \t':
                pos += 1
                column += 1
                continue

            # Handle newlines
            if char == '\n':
                line += 1
                column = 0
                pos += 1
                continue

            # Single-line comments
            if pos + 1 < length and input_str[pos:pos+2] == '//':
                end = input_str.find('\n', pos)
                if end == -1:
                    end = length
                pos = end
                continue

            # Multi-line comments
            if pos + 1 < length and input_str[pos:pos+2] == '/*':
                end = input_str.find('*/', pos + 2)
                if end == -1:
                    end = length
                else:
                    end += 2
                pos = end
                continue

            # Identifiers
            if char.isalpha() or char == '_':
                start = pos
                pos += 1
                while pos < length and (input_str[pos].isalnum() or input_str[pos] == '_'):
                    pos += 1
                token_text = input_str[start:pos]
                token_type = 'IDENTIFIER'
                # Check whether it is a keyword or a type
                if token_text in keywords:
                    token_type = 'KEYWORD'
                elif token_text in types:
                    token_type = 'TYPE'
                tokens.append({'type': token_type, 'text': token_text, 'line': line, 'column': column})
                column += (pos - start)
                continue

            # Numbers
            if char.isdigit():
                start = pos
                pos += 1
                while pos < length and (input_str[pos].isdigit() or input_str[pos] in '.xXabcdefABCDEF'):
                    pos += 1
                tokens.append({'type': 'NUMBER', 'text': input_str[start:pos], 'line': line, 'column': column})
                column += (pos - start)
                continue

            # String literals
            if char == '"':
                start = pos
                pos += 1
                while pos < length and input_str[pos] != '"':
                    if input_str[pos] == '\\' and pos + 1 < length:
                        pos += 2
                    else:
                        pos += 1
                if pos < length and input_str[pos] == '"':
                    pos += 1
                tokens.append({'type': 'STRING', 'text': input_str[start:pos], 'line': line, 'column': column})
                column += (pos - start)
                continue

            # Character literals
            if char == "'":
                start = pos
                pos += 1
                while pos < length and input_str[pos] != "'":
                    if input_str[pos] == '\\' and pos + 1 < length:
                        pos += 2
                    else:
                        pos += 1
                if pos < length and input_str[pos] == "'":
                    pos += 1
                tokens.append({'type': 'CHAR', 'text': input_str[start:pos], 'line': line, 'column': column})
                column += (pos - start)
                continue

            # Operators and punctuation
            operators = {
                '(', ')', '{', '}', '[', ']', ';', ',', '.', '->',
                '++', '--', '&', '*', '+', '-', '~', '!', '/', '%',
                '<<', '>>', '<', '>', '<=', '>=', '==', '!=', '^', '|',
                '&&', '||', '?', ':', '=', '+=', '-=', '*=', '/=', '%=',
                '<<=', '>>=', '&=', '^=', '|='
            }

            # Try to match the longest operator first
            matched = False
            for op_len in range(3, 0, -1):
                if pos + op_len <= length and input_str[pos:pos+op_len] in operators:
                    tokens.append({'type': 'OPERATOR', 'text': input_str[pos:pos+op_len],
                                   'line': line, 'column': column})
                    pos += op_len
                    column += op_len
                    matched = True
                    break
            if matched:
                continue

            # Unrecognized characters
            tokens.append({'type': 'UNKNOWN', 'text': char, 'line': line, 'column': column})
            pos += 1
            column += 1

        return tokens


class EnhancedFunctionAnalyzer:
    def __init__(self):
        self.function_name = ""
        self.parameters = []
        self.global_vars = []
        self.function_calls = []
        self.current_function = None
        self.in_function_body = False
        self.brace_depth = 0
        self.variable_declarations = {}
        self.macro_definitions = set()
        self.storage_classes = {"static", "extern", "auto", "register"}
        self.local_vars = []
        self.structs = []
        self.arrays = []
        self.struct_tags = set()
        self.recorded_locals = set()
        self.recorded_globals = set()
        self.recorded_params = set()
        self.local_scope_stack = [set()]

        # Basic types and type aliases
        self.basic_types = {'void', 'int', 'char', 'float', 'double', 'short', 'long', 'signed', 'unsigned'}
        self.type_aliases = {"U1", "U2", "U4", "S1", "S2", "S4", "BOOL", "BYTE", "WORD", "DWORD"}
        self.allowed_types = self.basic_types | self.type_aliases
        self.allowed_types.add('struct')

    def analyze(self, tokens):
        self.tokens = tokens
        self.pos = 0
        self.current_line = 0
        self.brace_depth = 0
        self.local_scope_stack = [set()]

        # Step 1: identify macro definitions (all-uppercase identifiers)
        self._identify_macros()
        # Step 2: locate the function definition
        self._find_function_definition()
        # Step 3: analyze the contents of the function body
        if self.function_name:
            self._analyze_function_body()
        return self

    def _identify_macros(self):
        """Identify macro definitions (all-uppercase identifiers)."""
        for token in self.tokens:
            if token['type'] == 'IDENTIFIER' and token['text'].isupper():
                self.macro_definitions.add(token['text'])

    def _find_function_definition(self):
        """Find the function definition and extract its name and parameters."""
        self.pos = 0
        while self.pos < len(self.tokens):
            token = self.tokens[self.pos]
            self.current_line = token['line']
            # Skip tokens that cannot start a declaration
            if token['text'] not in self.allowed_types and token['text'] not in self.storage_classes:
                self.pos += 1
                continue
            # Try to recognize a function definition
            if self._is_function_definition():
                self._extract_function_signature()
                return
            self.pos += 1

    def _is_function_definition(self):
        """Check whether the current position starts a function definition."""
        start_pos = self.pos
        found_function_name = False
        found_paren = False

        # Skip storage-class specifiers
        if self.tokens[start_pos]['text'] in self.storage_classes:
            start_pos += 1

        # Check the return type
        if start_pos >= len(self.tokens) or self.tokens[start_pos]['text'] not in self.allowed_types:
            return False

        # Look for the function name
        pos = start_pos + 1
        while pos < len(self.tokens):
            token = self.tokens[pos]
            # A left parenthesis means this is a function definition
            if token['text'] == '(':
                found_paren = True
                break
            # An identifier may be the function name
            if token['type'] == 'IDENTIFIER' and not found_function_name:
                found_function_name = True
            pos += 1

        return found_function_name and found_paren

    def _extract_function_signature(self):
        """Extract the function signature (name and parameters)."""
        # Storage class (if any)
        storage_class = None
        if self.tokens[self.pos]['text'] in self.storage_classes:
            storage_class = self.tokens[self.pos]['text']
            self.pos += 1

        # Return type
        return_type = self.tokens[self.pos]['text']
        self.pos += 1

        # Pointer return types
        if self.pos < len(self.tokens) and self.tokens[self.pos]['text'] == '*':
            return_type += '*'
            self.pos += 1

        # The function name is the last identifier before '('
        func_name = None
        while self.pos < len(self.tokens) and self.tokens[self.pos]['text'] != '(':
            if self.tokens[self.pos]['type'] == 'IDENTIFIER':
                func_name = self.tokens[self.pos]['text']
            self.pos += 1

        if not func_name:
            return

        self.function_name = func_name
        self.current_function = func_name

        # Skip '('
        if self.pos < len(self.tokens) and self.tokens[self.pos]['text'] == '(':
            self.pos += 1

        # Extract the parameters
        params = []
        current_param = []
        depth = 1
        param_line = self.current_line

        while self.pos < len(self.tokens) and depth > 0:
            token = self.tokens[self.pos]
            if token['text'] == '(':
                depth += 1
            elif token['text'] == ')':
                depth -= 1
                if depth == 0:
                    break
            elif token['text'] == ',' and depth == 1:
                # Extract the parameter's type and name
                param_type, param_name = self._extract_param_info(current_param)
                if param_type and param_name:
                    params.append({'type': param_type, 'name': param_name, 'line': param_line})
                    self.variable_declarations[param_name] = True
                current_param = []
                param_line = token['line']
                self.pos += 1
                continue
            current_param.append(token)
            self.pos += 1

        # Handle the last parameter
        if current_param:
            param_type, param_name = self._extract_param_info(current_param)
            if param_type and param_name:
                params.append({'type': param_type, 'name': param_name, 'line': param_line})
                self.variable_declarations[param_name] = True

        # Record the parameters
        self.parameters = params
        for param in params:
            self.recorded_params.add(param['name'])

        # Find the '{' that opens the function body
        while self.pos < len(self.tokens) and self.tokens[self.pos]['text'] != '{':
            self.pos += 1
        if self.pos < len(self.tokens) and self.tokens[self.pos]['text'] == '{':
            self.in_function_body = True
            self.brace_depth = 1
            self.pos += 1

    def _extract_param_info(self, tokens):
        """Extract the type and name from a parameter token list."""
        param_type = []
        param_name = None

        # First collect all type tokens
        type_end_index = -1
        for i, token in enumerate(tokens):
            if token['type'] in ('KEYWORD', 'TYPE') or token['text'] in self.allowed_types:
                param_type.append(token['text'])
                type_end_index = i
            else:
                break

        # Then look for the parameter name
        for i in range(type_end_index + 1, len(tokens)):
            token = tokens[i]
            if token['type'] == 'IDENTIFIER' and not token['text'].isupper():
                param_name = token['text']
                break

        return ' '.join(param_type), param_name

    def _analyze_function_body(self):
        """Analyze the contents of the function body."""
        while self.pos < len(self.tokens) and self.brace_depth > 0:
            token = self.tokens[self.pos]
            self.current_line = token['line']

            # Scope entry
            if token['text'] == '{':
                self.brace_depth += 1
                self.local_scope_stack.append(set())
                self.pos += 1
                continue

            # Scope exit
            if token['text'] == '}':
                self.brace_depth -= 1
                if self.local_scope_stack:
                    self.local_scope_stack.pop()
                self.pos += 1
                continue

            # Variable declarations
            if token['text'] in self.allowed_types or token['text'] in self.storage_classes:
                self._handle_variable_declaration()
                continue

            # Struct declarations
            if token['text'] == 'struct':
                self._handle_struct_declaration()
                continue

            # Function calls
            if token['type'] == 'IDENTIFIER' and self.pos + 1 < len(self.tokens):
                next_token = self.tokens[self.pos + 1]
                if next_token['text'] == '(':
                    self._handle_function_call()
                    continue

            # Struct member access
            if token['type'] == 'IDENTIFIER' and self.pos + 1 < len(self.tokens):
                next_token = self.tokens[self.pos + 1]
                if next_token['text'] == '.':
                    # Record the struct variable
                    struct_var = token['text']
                    if struct_var not in self.structs:
                        self.structs.append({
                            'name': struct_var,
                            'line': token['line'],
                            'scope': 'local' if struct_var in self.recorded_locals else 'global'
                        })
                    # Skip the member access
                    self.pos += 2
                    continue

            # Global variable usage
            if token['type'] == 'IDENTIFIER' and not token['text'].isupper():
                var_name = token['text']
                if (var_name not in self.variable_declarations and
                        var_name not in self.macro_definitions and
                        var_name != self.function_name and
                        var_name not in self.struct_tags):
                    if var_name not in self.recorded_globals:
                        self.global_vars.append({'name': var_name, 'line': token['line'], 'scope': 'global'})
                        self.recorded_globals.add(var_name)
                        self.variable_declarations[var_name] = True

            self.pos += 1

    def _handle_struct_declaration(self):
        """Handle a struct declaration."""
        start_pos = self.pos
        current_line = self.current_line
        is_struct = True

        # Skip the 'struct' keyword
        self.pos += 1

        # Struct tag name
        struct_type = None
        if self.pos < len(self.tokens) and self.tokens[self.pos]['type'] == 'IDENTIFIER':
            struct_type = self.tokens[self.pos]['text']
            self.struct_tags.add(struct_type)
            self.allowed_types.add(struct_type)
            self.pos += 1

        # Skip the struct definition body (if any)
        if self.pos < len(self.tokens) and self.tokens[self.pos]['text'] == '{':
            depth = 1
            self.pos += 1
            while self.pos < len(self.tokens) and depth > 0:
                if self.tokens[self.pos]['text'] == '{':
                    depth += 1
                elif self.tokens[self.pos]['text'] == '}':
                    depth -= 1
                self.pos += 1

        # Struct variable name
        var_name = None
        if self.pos < len(self.tokens) and self.tokens[self.pos]['type'] == 'IDENTIFIER':
            var_name = self.tokens[self.pos]['text']
            self.pos += 1

        # Array declarations
        array_dims = []
        while self.pos < len(self.tokens) and self.tokens[self.pos]['text'] == '[':
            self.pos += 1
            dim = ""
            while self.pos < len(self.tokens) and self.tokens[self.pos]['text'] != ']':
                dim += self.tokens[self.pos]['text']
                self.pos += 1
            if self.pos < len(self.tokens) and self.tokens[self.pos]['text'] == ']':
                self.pos += 1
            array_dims.append(dim)

        # Build the variable record
        if var_name:
            var_type = f"struct {struct_type}" if struct_type else "struct"
            for dim in array_dims:
                var_type += f"[{dim}]"
            var_info = {
                'type': var_type,
                'name': var_name,
                'line': current_line,
                'is_struct': True,
                'struct_type': struct_type,
                'is_array': bool(array_dims),
                'array_dims': array_dims
            }
            # Record as a local variable
            self.local_vars.append(var_info)
            self.recorded_locals.add(var_name)
            self.variable_declarations[var_name] = True
            # Arrays are also recorded separately
            if array_dims:
                self.arrays.append(var_info)
            # Record the struct
            self.structs.append(var_info)

    def _handle_variable_declaration(self):
        """Handle a variable declaration."""
        start_pos = self.pos
        current_line = self.current_line

        # Variable type
        var_type = self.tokens[self.pos]['text']
        self.pos += 1

        # Pointer declarations
        while self.pos < len(self.tokens) and self.tokens[self.pos]['text'] == '*':
            var_type += '*'
            self.pos += 1

        # Variable name
        var_name = None
        if self.pos < len(self.tokens) and self.tokens[self.pos]['type'] == 'IDENTIFIER':
            var_name = self.tokens[self.pos]['text']
            self.pos += 1

        # Bail out if this is not a variable declaration (e.g. a function call)
        if not var_name or var_name.isupper():
            self.pos = start_pos
            return

        # Array declarations
        array_dims = []
        while self.pos < len(self.tokens) and self.tokens[self.pos]['text'] == '[':
            self.pos += 1
            dim = ""
            while self.pos < len(self.tokens) and self.tokens[self.pos]['text'] != ']':
                dim += self.tokens[self.pos]['text']
                self.pos += 1
            if self.pos < len(self.tokens) and self.tokens[self.pos]['text'] == ']':
                self.pos += 1
            array_dims.append(dim)

        # Initializers
        if self.pos < len(self.tokens) and self.tokens[self.pos]['text'] == '=':
            self.pos += 1
            depth = 0
            while self.pos < len(self.tokens):
                t = self.tokens[self.pos]
                if t['text'] in {'(', '['}:
                    depth += 1
                elif t['text'] in {')', ']'}:
                    depth -= 1
                elif t['text'] in {',', ';'} and depth == 0:
                    break
                self.pos += 1

        # Build the variable record
        for dim in array_dims:
            var_type += f"[{dim}]"
        var_info = {
            'type': var_type,
            'name': var_name,
            'line': current_line,
            'is_struct': False,
            'is_array': bool(array_dims),
            'array_dims': array_dims
        }

        # Record as a local variable
        self.local_vars.append(var_info)
        self.recorded_locals.add(var_name)
        self.variable_declarations[var_name] = True

        # Arrays are also recorded separately
        if array_dims:
            self.arrays.append(var_info)

    def _handle_function_call(self):
        """Handle a function call."""
        # Function name
        func_name = self.tokens[self.pos]['text']
        line = self.current_line

        # Skip the name and '('
        self.pos += 2

        # Extract the arguments
        params = []
        current_param = []
        depth = 1
        param_line = line

        while self.pos < len(self.tokens) and depth > 0:
            token = self.tokens[self.pos]
            if token['text'] == '(':
                depth += 1
            elif token['text'] == ')':
                depth -= 1
                if depth == 0:
                    break
            elif token['text'] == ',' and depth == 1:
                # Extract the argument expression
                param_text = ''.join([t['text'] for t in current_param]).strip()
                params.append(param_text)
                current_param = []
                param_line = token['line']
                self.pos += 1
                continue
            current_param.append(token)
            self.pos += 1

        if current_param:
            param_text = ''.join([t['text'] for t in current_param]).strip()
            params.append(param_text)

        # Skip ')'
        if self.pos < len(self.tokens) and self.tokens[self.pos]['text'] == ')':
            self.pos += 1

        # Infer the return type from the naming convention
        return_type = "unknown"
        if func_name.startswith("vd_"):
            return_type = "void"
        elif func_name.startswith(("u1_", "u2_", "u4_", "s1_", "s2_", "s4_")):
            prefix = func_name.split("_")[0]
            return_type = prefix.upper()

        # Record the call
        self.function_calls.append({
            'name': func_name,
            'return_type': return_type,
            'params': ", ".join(params),
            'line': line
        })


class FunctionParserApp:
    def __init__(self, root):
        self.root = root
        self.root.title("C Function Parser")
        self.root.geometry("1000x800")
        self.root.configure(bg="#f0f0f0")
        self.setup_logging()

        # Styles
        style = ttk.Style()
        style.configure("TFrame", background="#f0f0f0")
        style.configure("TLabelFrame", background="#f0f0f0", font=("Arial", 10, "bold"))
        style.configure("TButton", font=("Arial", 10), padding=5)
        style.configure("TProgressbar", thickness=10)

        # Main frame
        main_frame = ttk.Frame(root)
        main_frame.pack(fill="both", expand=True, padx=15, pady=15)

        # Input area
        input_frame = ttk.LabelFrame(main_frame, text="Input C function body")
        input_frame.pack(fill="both", expand=True, padx=5, pady=5)
        self.input_text = scrolledtext.ScrolledText(input_frame, width=100, height=15,
                                                    font=("Consolas", 11), bg="#ffffff")
        self.input_text.pack(fill="both", expand=True, padx=10, pady=10)

        # Button area
        btn_frame = ttk.Frame(main_frame)
        btn_frame.pack(fill="x", padx=5, pady=5)

        # Parse button
        parse_btn = ttk.Button(btn_frame, text="Parse function", command=self.parse_function)
        parse_btn.pack(side="left", padx=5)

        # Progress bar
        self.progress = ttk.Progressbar(btn_frame, orient="horizontal", length=300, mode="determinate")
        self.progress.pack(side="left", padx=10, fill="x", expand=True)

        # Example button
        example_btn = ttk.Button(btn_frame, text="Load example", command=self.load_example)
        example_btn.pack(side="right", padx=5)

        # Output area
        output_frame = ttk.LabelFrame(main_frame, text="Parse results")
        output_frame.pack(fill="both", expand=True, padx=5, pady=5)
        self.output_text = scrolledtext.ScrolledText(output_frame, width=100, height=15,
                                                     font=("Consolas", 11), bg="#ffffff")
        self.output_text.pack(fill="both", expand=True, padx=10, pady=10)
        self.output_text.config(state=tk.DISABLED)

        # Log area
        log_frame = ttk.LabelFrame(main_frame, text="Log")
        log_frame.pack(fill="both", expand=True, padx=5, pady=5)
        self.log_text = scrolledtext.ScrolledText(log_frame, width=100, height=6,
                                                  font=("Consolas", 10), bg="#f8f8f8")
        self.log_text.pack(fill="both", expand=True, padx=10, pady=10)
        self.log_text.config(state=tk.DISABLED)

        # Example function body
        self.example_code = """static void Diag21_PID_C9(U1 u1_a_num)
{
    U1 u1_t_cmplt;
    U1 u1_t_cnt;
    struct SensorData sensor;
    U2 u2_array[10][20];

    if((U1)DIAG_CNT_ZERO == u1_t_swrstcnt) /* Determine if a software reset is in progress */
    {
        for(u1_t_cnt = (U1)DIAG21_ZERO; u1_t_cnt < (U1)DIAG21_PIDC9_FLAG; u1_t_cnt ++)
        {
            u1_t_cmplt = u1_g_InspSoftwareVersion(u4_g_cmd, &u4_g_data, (U1)TRUE);
        }
        vd_s_Diag21_U2ToU1(u2_g_buf, u1_g_data, (U1)DIAG21_PIDC9_FLAG);
    }
    else
    {
        /* Do Nothing */
    }
}"""

        # Load the example
        self.load_example()

    def setup_logging(self):
        """Configure the logging system."""
        self.log_filename = f"parser_{datetime.now().strftime('%Y%m%d_%H%M%S')}.log"
        # File handler
        file_handler = logging.FileHandler(self.log_filename, encoding='utf-8')
        file_handler.setLevel(logging.INFO)
        file_handler.setFormatter(logging.Formatter("%(asctime)s - %(levelname)s - %(message)s"))
        # Root logger
        root_logger = logging.getLogger()
        root_logger.setLevel(logging.INFO)
        root_logger.addHandler(file_handler)

    def log_to_gui(self, message, level="info"):
        """Show a log message in the GUI."""
        try:
            self.log_text.config(state=tk.NORMAL)
            timestamp = datetime.now().strftime("%H:%M:%S")
            self.log_text.insert(tk.END, f"[{timestamp}] {message}\n")
            self.log_text.see(tk.END)
            self.log_text.config(state=tk.DISABLED)
            if level == "info":
                logging.info(message)
            elif level == "warning":
                logging.warning(message)
            elif level == "error":
                logging.error(message)
        except Exception as e:
            logging.error(f"GUI logging error: {str(e)}")

    def update_progress(self, value):
        """Update the progress bar."""
        self.progress['value'] = value
        self.root.update_idletasks()

    def load_example(self):
        """Load the example function body."""
        self.input_text.delete(1.0, tk.END)
        self.input_text.insert(tk.END, self.example_code)
        self.log_to_gui("Example function body loaded")

    def parse_function(self):
        try:
            code = self.input_text.get(1.0, tk.END)
            if not code.strip():
                self.log_to_gui("Error: no function body given", "error")
                messagebox.showerror("Error", "Please enter a C function body to parse")
                return

            self.log_to_gui("Starting to parse the function body...")
            self.output_text.config(state=tk.NORMAL)
            self.output_text.delete(1.0, tk.END)
            self.update_progress(0)

            # Improved lexer
            self.log_to_gui("Running lexical analysis...")
            lexer = SimpleCLexer()
            tokens = lexer.tokenize(code)
            self.update_progress(30)

            # Improved analyzer
            self.log_to_gui("Running syntactic analysis...")
            analyzer = EnhancedFunctionAnalyzer()
            analyzer.analyze(tokens)

            # Display the results
            self.log_to_gui("Generating the parse report...")
            self.display_results(
                analyzer.local_vars,
                analyzer.global_vars,
                analyzer.function_calls,
                analyzer.function_name,
                analyzer.parameters
            )
            self.update_progress(100)
            self.output_text.config(state=tk.DISABLED)
            self.log_to_gui("Parsing finished!")
            messagebox.showinfo("Done", "Function body parsed successfully!")
        except Exception as e:
            self.log_to_gui(f"Parse error: {str(e)}", "error")
            self.log_to_gui(f"Details: {traceback.format_exc()}", "error")
            messagebox.showerror("Parse error", f"An error occurred:\n{str(e)}")
            self.update_progress(0)

    def display_results(self, local_vars, global_vars, function_calls, func_name, func_params):
        """Enhanced result display."""
        # Function signature
        self.output_text.insert(tk.END, "=== Function signature ===\n", "header")
        if func_name:
            self.output_text.insert(tk.END, f"Function name: {func_name}\n")
            if func_params:
                param_list = []
                for param in func_params:
                    param_list.append(f"{param['type']} {param['name']}")
                self.output_text.insert(tk.END, f"Parameters: {', '.join(param_list)}\n\n")
            else:
                self.output_text.insert(tk.END, "Parameters: none\n\n")
        else:
            self.output_text.insert(tk.END, "Warning: could not recognize a function signature\n\n")

        # All variables found
        self.output_text.insert(tk.END, "=== Variable analysis ===\n", "header")
        self.output_text.insert(tk.END, "Kind | Name | Scope | Line | Category\n", "subheader")
        self.output_text.insert(tk.END, "-" * 60 + "\n")

        # Parameters
        for param in func_params:
            self.output_text.insert(tk.END,
                                    f"Parameter | {param['name']} | parameter | {param['line']} | basic type\n")

        # Local variables
        for var in local_vars:
            category = "struct" if var.get('is_struct', False) else "basic type"
            struct_type = var.get('struct_type', '')
            if struct_type:
                category = f"struct({struct_type})"
            self.output_text.insert(tk.END, f"Variable | {var['name']} | local | {var['line']} | {category}\n")

        # Global variables
        for var in global_vars:
            category = "struct" if var.get('is_struct', False) else "basic type"
            struct_type = var.get('struct_type', '')
            if struct_type:
                category = f"struct({struct_type})"
            self.output_text.insert(tk.END, f"Variable | {var['name']} | global | {var['line']} | {category}\n")

        # Function calls
        for func in function_calls:
            self.output_text.insert(tk.END, f"Call | {func['name']} | call | {func['line']} | function\n")
        self.output_text.insert(tk.END, "\n")

        # Local variable details
        if local_vars:
            self.output_text.insert(tk.END, "=== Local variable details ===\n", "header")
            # Basic-type locals
            basic_locals = [v for v in local_vars if not v.get('is_struct', False)]
            if basic_locals:
                self.output_text.insert(tk.END, "Basic-type variables:\n")
                for var in basic_locals:
                    self.output_text.insert(tk.END, f"{var['type']} {var['name']} (line: {var['line']})\n")
            # Struct locals
            struct_locals = [v for v in local_vars if v.get('is_struct', False)]
            if struct_locals:
                self.output_text.insert(tk.END, "\nStruct variables:\n")
                for var in struct_locals:
                    struct_type = var.get('struct_type', 'unknown struct')
                    self.output_text.insert(tk.END,
                                            f"{var['type']} {var['name']} (type: {struct_type}, line: {var['line']})\n")
        else:
            self.output_text.insert(tk.END, "No local variables found\n\n")

        # Global variables used
        if global_vars:
            self.output_text.insert(tk.END, "=== Global variables used ===\n", "header")
            # Basic-type globals
            basic_globals = [v for v in global_vars if not v.get('is_struct', False)]
            if basic_globals:
                self.output_text.insert(tk.END, "Basic-type variables:\n")
                for var in basic_globals:
                    self.output_text.insert(tk.END, f"{var['name']} (line: {var['line']})\n")
            # Struct globals
            struct_globals = [v for v in global_vars if v.get('is_struct', False)]
            if struct_globals:
                self.output_text.insert(tk.END, "\nStruct variables:\n")
                for var in struct_globals:
                    struct_type = var.get('struct_type', 'unknown struct')
                    self.output_text.insert(tk.END,
                                            f"{var['name']} (type: {struct_type}, line: {var['line']})\n")
            self.output_text.insert(tk.END, "\n")
        else:
            self.output_text.insert(tk.END, "No global variables used\n\n")

        # Function call details
        if function_calls:
            self.output_text.insert(tk.END, "=== Function call details ===\n", "header")
            for func in function_calls:
                self.output_text.insert(tk.END, f"Name: {func['name']} (line: {func['line']})\n")
                self.output_text.insert(tk.END, f"Return type: {func['return_type']}\n")
                self.output_text.insert(tk.END, f"Arguments: {func['params']}\n")
                self.output_text.insert(tk.END, "-" * 50 + "\n")
        else:
            self.output_text.insert(tk.END, "No function calls\n\n")

        # Parse statistics
        self.output_text.insert(tk.END, "=== Parse statistics ===\n", "header")
        self.output_text.insert(tk.END, f"Parameters: {len(func_params)}\n")
        self.output_text.insert(tk.END, f"Local variables: {len(local_vars)}\n")
        self.output_text.insert(tk.END, f"Global variables: {len(global_vars)}\n")
        self.output_text.insert(tk.END, f"Function calls: {len(function_calls)}\n")
        # Struct statistics
        struct_locals = [v for v in local_vars if v.get('is_struct', False)]
        struct_globals = [v for v in global_vars if v.get('is_struct', False)]
        self.output_text.insert(tk.END, f"Struct variables: {len(struct_locals) + len(struct_globals)}\n")
        self.output_text.insert(
            tk.END,
            f"Total: {len(func_params) + len(local_vars) + len(global_vars) + len(function_calls)}\n")

        # Tag styles
        self.output_text.tag_config("header", font=("Arial", 12, "bold"), foreground="#2c3e50")
        self.output_text.tag_config("subheader", font=("Arial", 10, "bold"), foreground="#34495e")


if __name__ == "__main__":
    root = tk.Tk()
    app = FunctionParserApp(root)
    root.mainloop()
```

The program hangs as soon as it starts lexical analysis; please fix this.
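
A likely cause, judging from the code above (an assumption from reading it, not a verified diagnosis): the hang is not in the lexer itself but in the analysis phase that follows it. With the bundled example, the condition `(U1)DIAG_CNT_ZERO == u1_t_swrstcnt` puts the type token `U1` at statement level, so `_analyze_function_body` dispatches to `_handle_variable_declaration`; that method finds no identifier after the type (the next token is `)`), rewinds `self.pos` to `start_pos`, and returns, and the outer loop then re-reads the same token forever. A minimal sketch of a guard that guarantees forward progress, written as a hypothetical subclass so the original class stays untouched:

```python
class PatchedFunctionAnalyzer(EnhancedFunctionAnalyzer):
    """Sketch: same analyzer, but the declaration handler can never stall."""

    def _handle_variable_declaration(self):
        start_pos = self.pos
        super()._handle_variable_declaration()
        if self.pos == start_pos:       # the bail-out path rewound the position
            self.pos = start_pos + 1    # step past the type token so the caller's loop advances
```

To try it, swap `EnhancedFunctionAnalyzer()` for `PatchedFunctionAnalyzer()` in `parse_function`. An equivalent in-place fix would be to change the bail-out line itself to `self.pos = start_pos + 1`.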

```
nstall/yolov8/lib/libYolov8_linker_server.so
[component_container-1] [INFO] [1748290289.190410638] [camera.camera_container]: Found class: rclcpp_components::NodeFactoryTemplate
[component_container-1] [INFO] [1748290289.190478748] [camera.camera_container]: Instantiate class: rclcpp_components::NodeFactoryTemplate
[component_container-1] terminate called after throwing an instance of 'cv::Exception'
[component_container-1]   what():  OpenCV(4.9.0) /home/jack/Opencv/opencv-4.9.0/modules/core/src/matrix.cpp:246: error: (-215:Assertion failed) s >= 0 in function 'setSize'
[component_container-1]
[component_container-1] Received signal: 6
[component_container-1] Log crash stack trace to Log/camera_crash_stack_trace_2025_05_27_04_11_29.log
[component_container-1] terminate called recursively
[component_container-1] Received signal: 6
[component_container-1] Log crash stack trace to Log/camera_crash_stack_trace_2025_05_27_04_11_29.log
[component_container-1] Received signal: 11
[component_container-1] Log crash stack trace to Log/camera_crash_stack_trace_2025_05_27_04_11_29.log
[component_container-1] Received signal: 6
[component_container-1] malloc(): unsorted double linked list corrupted
[component_container-1] terminate called recursively
[component_container-1] Received signal: 6
[component_container-1] Log crash stack trace to Log/camera_crash_stack_trace_2025_05_27_04_11_29.log
[ERROR] [component_container-1]: process has died [pid 101715, exit code 6, cmd '/opt/ros/humble/lib/rclcpp_components/component_container --ros-args -r __node:=camera_container -r __ns:=/camera'].
```

```cpp
void Yolov8LinkerServer::color_depth_callback(const sensor_msgs::msg::Image::ConstSharedPtr &color_msg,
                                              const sensor_msgs::msg::Image::ConstSharedPtr &depth_msg)
{
    // cv_bridge::CvImagePtr cv_ptr;
    try
    {
        // cv_ptr = cv_bridge::toCvCopy(color_msg, "bgr8");
        // RCLCPP_INFO(this->get_logger(),
        //             "Received image: %dx%d",
        //             cv_ptr->image.cols, cv_ptr->image.rows);  // logging
        // RCLCPP_INFO(this->get_logger(), "%s loaded", "hhhhhhhhhhhhh");

        // Convert the image format
        cv::Mat color_image = cv_bridge::toCvShare(color_msg, "bgr8")->image;
        // cv::Mat depth_image = cv_bridge::toCvShare(depth_msg, "bgr8")->image;

        // Create a window named "beautiful"
        namedWindow("beautiful", WINDOW_NORMAL);
        // Show the image in the window
        imshow("beautiful", color_image);
        // Wait 5 ms so the window can refresh
        waitKey(5);

        // Run model inference
        // auto results = model.predict_once(color_image);  // needs adapting to the existing inference code

        // Extract keypoints and convert them to 3D coordinates
        // current_points_.clear();
        // for (const auto &kp : results[0].keypoints)
        // {  // assume the first detection result
        //     int u = static_cast<int>(kp[0]);
        //     int v = static_cast<int>(kp[1]);
        //     float depth = depth_image.at<uint16_t>(v, u) / 1000.0f;  // assume depth in millimeters
        //     if (depth > 0)
        //     {
        //         cv::Point3f pt = pixel_to_3d(u, v, depth);
        //         current_points_.push_back(pt);
        //     }
        // }
    }
    catch (cv_bridge::Exception &e)
    {
        RCLCPP_ERROR(rclcpp::get_logger("object_3d_detector"),
                     "Image conversion error: %s", e.what());
    }
}
```
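
The trace alone does not pinpoint the faulty call, but two hedged observations. First, OpenCV's `Mat::setSize` asserting `s >= 0` means some `cv::Mat` was created with a negative dimension, and since it fires while the component is still being instantiated, constructor-side allocations (model input sizes, preallocated buffers, values read from parameters) are the first place to look. Second, the commented-out depth conversion requests `"bgr8"` for what is normally a 16-bit single-channel depth image; that encoding mismatch will throw once re-enabled. A minimal sketch of the usual depth handling, shown with the Python `cv_bridge` API for brevity (the C++ analogue would be `cv_bridge::toCvShare(depth_msg, "16UC1")`):

```python
from cv_bridge import CvBridge

bridge = CvBridge()

def depth_to_cv(depth_msg):
    # Depth topics are typically 16UC1 (millimeters) or 32FC1 (meters);
    # "passthrough" keeps the native encoding instead of forcing "bgr8".
    depth = bridge.imgmsg_to_cv2(depth_msg, desired_encoding="passthrough")
    if depth.size == 0:  # guard before deriving any Mat sizes from it
        return None
    return depth
```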
