The mainstream tool for deploying a large model locally (private, on-premises deployment) is Ollama.
Deploy it with the following command:
curl -fsSL https://ollama.com/install.sh | sh
However, the script failed on my machine, so I checked GitHub and downloaded manually instead:
curl -L https://ollama.com/download/ollama-linux-amd64-rocm.tgz -o ollama-linux-amd64-rocm.tgz
sudo tar -C /usr -xzf ollama-linux-amd64-rocm.tgz
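If the manual install completes, a quick sanity check before pulling any model is to confirm the binary is reachable and start the server. A minimal sketch, assuming the tarball was extracted under /usr so the binary ends up at /usr/bin/ollama:

ollama --version   # should print the installed version
ollama serve       # starts the API server, listening on 127.0.0.1:11434 by default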
Alternatively, use a [third-party mirror](https://docker.aityp.com/image/docker.io/ollama/ollama:rocm); the `rocm` tag is the build with AMD GPU (ROCm) support.
docker pull swr.cn-north-4.myhuaweicloud.com/ddn-k8s/docker.io/ollama/ollama:rocm
docker tag swr.cn-north-4.myhuaweicloud.com/ddn-k8s/docker.io/ollama/ollama:rocm docker.io/ollama/ollama:rocm
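With the image in place, the container needs the AMD GPU devices passed through for ROCm to see the card. The run command below follows the pattern in Ollama's Docker documentation; the container name and volume name `ollama` are just conventions and can be changed:

docker run -d --device /dev/kfd --device /dev/dri -v ollama:/root/.ollama -p 11434:11434 --name ollama ollama/ollama:rocm

If you go the container route, subsequent `ollama` commands are run inside it, e.g. `docker exec -it ollama ollama run glm4`.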
After the installation finishes, run:
sh-4.2# ollama run glm4
Error: llama runner process has terminated: exit status 127
The error message on its own says very little, so the Docker container logs are the next place to look.
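A command like the following pulls them (the container name `ollama` is an assumption; use whatever name the container was started with):

docker logs -f --tail 100 ollama

The relevant portion of the output: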
[GIN] 2024/11/15 - 07:41:49 | 200 | 283.749µs | 127.0.0.1 | HEAD "/"
[GIN] 2024/11/15 - 07:41:49 | 200 | 17.617406ms | 127.0.0.1 | POST "/api/show"
time=2024-11-15T07:41:49.430Z level=INFO source=sched.go:715 msg="new model will fit in available VRAM in single GPU, loading" model=/root/.ollama/models/blobs/sha256-b506a070d1152798d435ec4e7687336567ae653b3106f73b7b4ac7be1cbc4449 gpu
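As a general note, exit status 127 from the llama runner usually means a binary or one of its shared libraries (for this image, typically the ROCm libraries) could not be found. A rough first check from inside the container; the container name is an assumption and the exact runner path varies by Ollama version:

docker exec ollama sh -c 'ldd $(command -v ollama) | grep "not found"'   # list unresolved shared libraries, if any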