A unified, load-balanced calling system for LLM model APIs, based on Python Flask; API resources for any LLM model can be added quickly

16 files in total: 13 py, 2 md, 1 license.

Uploaded 2025-08-25 18:59:51 · 22 KB ZIP


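Archive: "A unified, load-balanced calling system for LLM model APIs based on Python Flask. You can quickly add the API of any LLM model and set its load; the system balances the load automatically so that free LLM APIs can be used efficiently.zip" (16 files)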

- LLM-API-BalancedCall-main
  - app
    - __init__.py 488B
    - utils
      - __init__.py 0B
      - load_balancer.py 1KB
      - llm_manager.py 3KB
      - load_strategies.py 3KB
      - models
        - __init__.py 141B
        - base_model.py 3KB
        - zhipu_model.py 1KB
    - llm
      - __init__.py 0B
      - routes.py 5KB
  - LICENSE 1KB
  - wsgi.py 113B
  - README_EN.md 19KB
  - run.py 204B
  - README.md 18KB
  - config.py 321B

# LLM-API-BalancedCall

[[English Readme]](https://github.com/LeoLeeYM/LLM-API-BalancedCall/blob/main/README_EN.md)

A unified, load-balanced calling system for LLM model APIs, built on Python Flask. You can quickly add the API of any LLM model and set its load; the system balances the load automatically so that free LLM APIs can be used efficiently.

With this project you can quickly build an aggregated endpoint for LLM model APIs. The modular design lets you add any model API and set its load and weight with little effort.

Project structure:

```
├── app/
│   ├── __init__.py
│   ├── llm/
│   │   ├── __init__.py
│   │   └── routes.py            // built-in API endpoints
│   └── utils/
│       ├── __init__.py
│       ├── llm_manager.py       // model invocation class
│       ├── load_balancer.py     // model load calculation
│       ├── load_strategies.py   // load calculation strategy types
│       └── models/
│           ├── __init__.py
│           ├── base_model.py    // base model class
│           └── zhipu_model.py   // Zhipu GLM-4-Flash model
├── config.py
├── wsgi.py
├── README.md
├── requirements.txt
└── LICENSE
```

## Contents

1. [Quick start guide](#1-quick-start-guide)
2. [Adding a new model](#2-adding-a-new-model)
3. [Advanced configuration: custom load strategies](#3-advanced-configuration-custom-load-strategies)
4. [Production deployment advice](#4-production-deployment-advice)
5. [Built-in API reference](#5-built-in-api-reference)

---

## 1. Quick start guide

### 1.1 Install and run

The steps below install and run the project:

```bash
# Clone the repository
git clone https://github.com/LeoLeeYM/LLM-API-BalancedCall.git
cd LLM-API-BalancedCall

# Create a virtual environment
python -m venv .venv
source .venv/bin/activate   # Linux/Mac
# .venv\Scripts\activate    # Windows

# Install dependencies
pip install -r requirements.txt

# Configure the keys (edit config.py)
nano config.py
```

`config.py` has the following format:

```python
class Config:
    # Zhipu AI configuration (example with weights)
    ZHIPU_CONFIG = {  # Zhipu model configuration
        'api_keys': [  # set of API keys
            # `weight` biases which API key is picked once this model is selected
            {'key': 'your api key', 'weight': 1.0}
        ],
        'model_weight': 1,       # overall model weight; biases model selection
        'max_concurrency': 200   # model load parameter
    }

    # System configuration
    DEBUG = True
    FLASK_ENV = 'development'
    ENABLED_MODELS = ['zhipu']  # registered models
```

```bash
# Start the service
python run.py
```

### 1.2 Verify the service

Once the service is running, you can check that it works with the following commands:

```bash
# Check system status
curl http://localhost:9000/llm/system-capacity

# Send a sample request
curl -X POST http://localhost:9000/llm/chat \
  -H "Content-Type: application/json" \
  -d '{"messages":[{"role":"user","content":"你好"}]}'
```

------

## 2. Adding a new model

### 2.1 Create the model class

Create a new file under `app/utils/models/` (for example `baidu_model.py`) and implement the model interface:

```python
import requests
from typing import Generator

from .base_model import BaseModel
from app.utils.load_strategies import ConcurrencyStrategy


class BaiduModel(BaseModel):
    """
    Baidu ERNIE (Wenxin) model implementation.
    Documentation: https://cloud.baidu.com/doc/WENXINWORKSHOP/
    """

    # Required settings -----------
    STRATEGY_CLASS = ConcurrencyStrategy  # load strategy; here the built-in max-concurrency strategy
    CONFIG_SECTION = 'BAIDU_CONFIG'       # name of the matching config section
    supports_tools = False                # whether function calling is supported

    # API constants ----------
    API_BASE = "https://aip.baidubce.com/rpc/2.0/ai_custom/v1/wenxinworkshop/chat"
    DEFAULT_MODEL = "eb-instant"  # default model version

    def _get_strategy_params(self):
        """Read the strategy parameters from the configuration."""
        return {
            'max_concurrency': self.config['max_concurrency']
        }

    def chat_completion(self, messages, tools, api_key, stream=False):
        """
        Core API call.

        :param messages: list of messages
        :param tools: list of tools (not supported in this example)
        :param api_key: the API key selected for this call
        :param stream: whether to stream the response
        :return: a string for synchronous calls, a generator for streaming calls
        """
        # Build the request headers
        headers = {
            "Content-Type": "application/json",
            "Authorization": f"Bearer {api_key}"
        }

        # Build the request body
        payload = {
            "messages": messages,
            "stream": stream
        }

        try:
            if stream:
                # Returns a generator; errors raised while iterating it
                # surface in the caller, not in this try block.
                return self._handle_stream_request(headers, payload)
            return self._handle_sync_request(headers, payload)
        except Exception as e:
            self._handle_error(e)

    # Private methods -----------
    def _handle_sync_request(self, headers, payload):
        """Handle a synchronous request."""
        response = requests.post(
            self.API_BASE,
            headers=headers,
            json=payload,
            timeout=30
        )
        response.raise_for_status()
        return response.json()['result']

    def _handle_stream_request(self, headers, payload) -> Generator[str, None, None]:
        """Handle a streaming request."""
        with requests.post(
            self.API_BASE,
            headers=headers,
            json=payload,
            stream=True,
            timeout=60
        ) as response:
            response.raise_for_status()
            for chunk in response.iter_content(chunk_size=None):
                if chunk:
                    yield chunk.decode('utf-8')

    def _handle_error(self, error):
        """Unified error handling."""
        error_map = {
            requests.HTTPError: "API server returned an error",
            requests.Timeout: "Request timed out",
            KeyError: "Response format was not as expected"
        }
        # Match with isinstance so subclasses such as ReadTimeout are covered.
        message = next(
            (msg for exc, msg in error_map.items() if isinstance(error, exc)),
            "Unknown error"
        )
        # Raise a real exception; a bare string cannot be raised.
        raise RuntimeError(message) from error
```

### 2.2 Configure the parameters

Edit `config.py` to configure the API keys and model parameters:

```python
class Config:
    # Baidu ERNIE (Wenxin) configuration
    BAIDU_CONFIG = {
        # API key configuration (weights supported)
        'api_keys': [
            {'key': 'your_api_key_1', 'weight': 2.0},  # high-weight key
            {'key': 'your_api_key_2', 'weight': 1.0}   # regular key
        ],

        # Model parameters
        'model_weight': 1.5,      # global model weight
        'max_concurrency': 100,   # maximum concurrency per key

        # Optional advanced parameters
        'default_temperature': 0.7,  # default sampling temperature
        'max_retries': 3             # number of request retries
    }

    # Enabled models
    ENABLED_MODELS = ['zhipu', 'baidu']
```

#### Configuration options

| Parameter           | Type  | Required | Description                                              |
| ------------------- | ----- | -------- | -------------------------------------------------------- |
| api_keys            | list  | Yes      | List of API keys; dict entries may carry a weight        |
| model_weight        | float | No       | Global model weight (values >= 1 raise priority)         |
| max_concurrency     | int   | No       | Maximum number of concurrent requests per key            |
| default_temperature | float | No       | Default sampling temperature (0 to 1)                    |
| max_retries         | int   | No       | Number of retries after a failed request                 |

### 2.3 Register the model

#### 2.3.1 Edit the registry file

Edit `app/utils/models/__init__.py` to register the model:

```python
from .baidu_model import BaiduModel

MODEL_CLASSES = {
    'baidu': BaiduModel,  # the key matches the name used in ENABLED_MODELS
    # other models...
}
```

#### 2.3.2 Verify the registration

Check that the registration succeeded:

```python
# Check that the registration succeeded
from app.utils.models import MODEL_CLASSES

print(MODEL_CLASSES['baidu'])  # should print <class 'app.utils.models.baidu_model.BaiduModel'>
```

### 2.4 Implement the API call

#### 2.4.1 Building the request body

To match the model's requirements, adjust `_build
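
As a side note on the `weight` and `max_concurrency` settings from section 2.2: this excerpt does not show how the project actually picks a key, so the following is only a minimal sketch of one plausible approach, assuming a weighted random choice among keys that are still below their concurrency limit. The function `pick_api_key` and the `in_flight` counter dict are hypothetical names introduced for illustration; they are not part of the project.

```python
import random


def pick_api_key(api_keys, in_flight, max_concurrency, rng=random):
    """Hypothetical helper: pick one key, favouring higher weights.

    api_keys        -- list of {'key': str, 'weight': float} dicts (as in config.py)
    in_flight       -- assumed dict mapping key -> number of running requests
    max_concurrency -- per-key limit; saturated keys are skipped
    """
    available = [
        entry for entry in api_keys
        if in_flight.get(entry['key'], 0) < max_concurrency
    ]
    if not available:
        return None  # every key is at its concurrency limit

    # A key with weight 2.0 is drawn about twice as often as one with 1.0.
    weights = [entry.get('weight', 1.0) for entry in available]
    return rng.choices(available, weights=weights, k=1)[0]['key']


keys = [
    {'key': 'your_api_key_1', 'weight': 2.0},
    {'key': 'your_api_key_2', 'weight': 1.0},
]
print(pick_api_key(keys, in_flight={}, max_concurrency=100))
```

Passing a seeded `random.Random` instance as `rng` makes the choice reproducible in tests. A real load balancer would also need to update `in_flight` safely when requests start and finish, which this sketch leaves out.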
