xTuring: How to Add a New Model to an Open-Source Large Language Model Library



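Preface

xTuring is an open-source large language model library whose core value is giving researchers and developers a convenient way to fine-tune models and run inference. This article explains how to contribute support for a new model to the xTuring project, so that developers can extend its model ecosystem.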

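Prerequisites

Before adding a new model, make sure you have the following background:

Familiarity with the PyTorch deep learning framework
Working knowledge of the Hugging Face Transformers library
An understanding of xTuring's basic architecture, in particular the layout of the models and engines directories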

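Standard Workflow for Adding a Model

1. Create the Engine

The engine is the core xTuring component responsible for low-level model operations. Adding a new model starts with creating the corresponding engine class: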

from xturing.engines.causal import CausalEngine

class MyEngine(CausalEngine):
    config_name: str = "my_engine"

    def __init__(self, model_name: str, weights_path=None):
        super().__init__(model_name, weights_path)

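Key points:

The class must inherit from a suitable base class (such as CausalEngine)
config_name is the engine's unique identifier
The initializer must accept the model name and the weights path as parameters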

2. Create the Model

The model class is the interface layer that users interact with directly:

from xturing.models.causal import CausalModel
from xturing.engines.my_engine import MyEngine

class MyModel(CausalModel):
    config_name: str = "my_model"

    def __init__(self, weights_path=None):
        super().__init__(MyEngine.config_name, weights_path)

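The model class needs to:

Inherit from the corresponding base model class
Set config_name as the model's identifier
Pass the engine's config name to the base class during initialization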

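3. Register the Model and Engine

Once the classes and files exist, register them in the project:

Add the model import and registration to xturing/models/__init__.py
Add the engine import and registration to xturing/engines/__init__.py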
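
The registration step can be pictured as a name-to-class lookup table. The sketch below is a self-contained toy, not xTuring's actual code: the `Registry` class and its `add_to_registry` and `create` methods are hypothetical names chosen for illustration, and `MyEngine` and `MyModel` here are plain stand-ins rather than subclasses of the real base classes. The real registration lines are best copied from an existing model's entries in the two `__init__.py` files.

```python
from typing import Callable, Dict, Optional


class Registry:
    """Toy name -> class lookup table (hypothetical, for illustration only)."""

    def __init__(self) -> None:
        self._entries: Dict[str, Callable] = {}

    def add_to_registry(self, name: str, cls: Callable) -> None:
        # Reject duplicates so that every config_name stays unique.
        if name in self._entries:
            raise ValueError(f"'{name}' is already registered")
        self._entries[name] = cls

    def create(self, name: str, *args, **kwargs):
        # Look the class up by its config_name and instantiate it.
        if name not in self._entries:
            raise KeyError(f"unknown config_name: '{name}'")
        return self._entries[name](*args, **kwargs)


engines = Registry()
models = Registry()


class MyEngine:
    config_name = "my_engine"

    def __init__(self, weights_path: Optional[str] = None) -> None:
        self.weights_path = weights_path


class MyModel:
    config_name = "my_model"

    def __init__(self, weights_path: Optional[str] = None) -> None:
        # The model only knows the engine's name; the registry resolves it.
        self.engine = engines.create(MyEngine.config_name, weights_path)


# Registration: what the two __init__.py edits accomplish conceptually.
engines.add_to_registry(MyEngine.config_name, MyEngine)
models.add_to_registry(MyModel.config_name, MyModel)

model = models.create("my_model", weights_path="./weights")
print(type(model.engine).__name__, model.engine.weights_path)
```

Looking classes up by name is what lets a model refer to its engine through a string such as `MyEngine.config_name`, and it is also why a forgotten registration typically surfaces as an "unknown name" error at creation time.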

4. Configure Hyperparameters

Add the model's default parameters to the configuration file in the config directory:

# finetuning_config.yaml
my_model:
    learning_rate: 1e-4
    weight_decay: 0.01
    num_train_epochs: 3
    batch_size: 8
    max_length: 256
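
One way to picture how these defaults are consumed: the YAML entry becomes a dictionary, and any values the user supplies override it. The sketch below assumes that behavior rather than reproducing xTuring's loader. `DEFAULTS`, `FIELD_TYPES`, and `resolve_config` are hypothetical names, and the dictionary is written inline so the example runs without a YAML parser. The learning rate is stored as a string on purpose: some YAML loaders read `1e-4` (no decimal point) as a string, so the sketch coerces every field to its expected type.

```python
from typing import Any, Dict, Optional

# Stand-in for the parsed `my_model` entry of finetuning_config.yaml.
DEFAULTS: Dict[str, Any] = {
    "learning_rate": "1e-4",  # kept as a string to mimic a lenient YAML load
    "weight_decay": 0.01,
    "num_train_epochs": 3,
    "batch_size": 8,
    "max_length": 256,
}

# Expected type of each hyperparameter, used for explicit coercion.
FIELD_TYPES = {
    "learning_rate": float,
    "weight_decay": float,
    "num_train_epochs": int,
    "batch_size": int,
    "max_length": int,
}


def resolve_config(overrides: Optional[Dict[str, Any]] = None) -> Dict[str, Any]:
    """Merge user overrides onto the defaults and coerce each value's type."""
    merged = {**DEFAULTS, **(overrides or {})}
    unknown = set(merged) - set(FIELD_TYPES)
    if unknown:
        raise KeyError(f"unknown hyperparameters: {sorted(unknown)}")
    return {key: FIELD_TYPES[key](value) for key, value in merged.items()}


config = resolve_config({"batch_size": 4})
print(config)
```

Rejecting unknown keys catches typos such as `learning_rte` early, instead of silently training with the default value.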

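5. Test and Validate

Adding the following tests is recommended:

Model loading test
Forward pass test
Fine-tuning workflow test
Inference and generation test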
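
The four checks above can be laid out as a small test module. The sketch below uses a stand-in `FakeModel` so that it runs without downloading any weights. `FakeModel` and its `forward`, `finetune`, and `generate` methods are hypothetical placeholders; a real test would instantiate the new model class and call whatever methods it actually exposes.

```python
import unittest
from typing import List


class FakeModel:
    """Stand-in for the new model so the test layout runs without weights."""

    def __init__(self) -> None:
        self.trained_examples = 0

    def forward(self, token_ids: List[int]) -> List[float]:
        # Pretend output: one score per input token.
        return [float(t) for t in token_ids]

    def finetune(self, dataset: List[str]) -> None:
        # Pretend training: just count the examples seen.
        self.trained_examples += len(dataset)

    def generate(self, texts: List[str]) -> List[str]:
        # Pretend generation: echo each prompt with a suffix.
        return [text + " ..." for text in texts]


class MyModelTests(unittest.TestCase):
    def setUp(self) -> None:
        # Loading: constructing the model must not raise.
        self.model = FakeModel()

    def test_loading(self) -> None:
        self.assertIsNotNone(self.model)

    def test_forward_pass(self) -> None:
        outputs = self.model.forward([1, 2, 3])
        self.assertEqual(len(outputs), 3)

    def test_finetuning(self) -> None:
        self.model.finetune(["example one", "example two"])
        self.assertEqual(self.model.trained_examples, 2)

    def test_generation(self) -> None:
        outputs = self.model.generate(["Hello"])
        self.assertEqual(len(outputs), 1)
        self.assertTrue(outputs[0].startswith("Hello"))


def run_tests() -> unittest.TestResult:
    """Run the suite programmatically and return the result object."""
    suite = unittest.defaultTestLoader.loadTestsFromTestCase(MyModelTests)
    return unittest.TextTestRunner(verbosity=0).run(suite)


if __name__ == "__main__":
    run_tests()
```

For a real model, the fine-tuning and generation tests are usually the slow ones, so it can help to run them on a tiny dataset and a short maximum length.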

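Guide to Adding a LoRA Model

LoRA (Low-Rank Adaptation) is a parameter-efficient fine-tuning technique, and xTuring provides dedicated support for it:

1. Create the LoRA Engine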

from xturing.engines.causal import CausalLoraEngine

class MyLoraEngine(CausalLoraEngine):
    config_name: str = "my_engine_lora"

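    def __init__(self, model_name: str, weights_path=None):
        # model_name: identifier of the base model, passed through to the base engine
        super().__init__(
            model_name,
            weights_path,
            target_modules=["q_proj", "v_proj"],  # model-specific names of the attention projection layers
        )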
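
To see why adapting only a few projection layers is cheap, recall that LoRA freezes a weight matrix of shape d_out x d_in and trains two low-rank factors, B (d_out x r) and A (r x d_in). The arithmetic below is a sketch under assumed sizes: a hidden size of 4096, rank 8, and 32 layers are illustrative values, not taken from any particular model or from xTuring's defaults, and the helper function names are hypothetical.

```python
def full_params(d_in: int, d_out: int) -> int:
    """Number of weights in a dense d_out x d_in projection matrix."""
    return d_in * d_out


def lora_params(d_in: int, d_out: int, rank: int) -> int:
    """Trainable weights LoRA adds: B is d_out x rank, A is rank x d_in."""
    return rank * (d_in + d_out)


# Illustrative sizes (assumptions, not xTuring defaults).
hidden_size = 4096
rank = 8
num_layers = 32
target_modules = ["q_proj", "v_proj"]

adapted_matrices = num_layers * len(target_modules)
total_full = adapted_matrices * full_params(hidden_size, hidden_size)
total_lora = adapted_matrices * lora_params(hidden_size, hidden_size, rank)
ratio = total_lora / total_full

print(f"frozen weights in adapted layers: {total_full:,}")
print(f"trainable LoRA weights:           {total_lora:,}")
print(f"trainable fraction:               {ratio:.2%}")
```

Under these assumptions the adapters hold about 4.2 million trainable weights against roughly 1.07 billion frozen ones in the same layers, which is under half a percent.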

2. Create the LoRA Model

from xturing.models.causal import CausalLoraModel
from xturing.engines.my_engine import MyLoraEngine

class MyModelLora(CausalLoraModel):
    config_name: str = "my_model_lora"

    def __init__(self, weights_path=None):
        super().__init__(MyLoraEngine.config_name, weights_path)

Key differences: