First, download llama.cpp:
git clone https://github.com/siminzheng/RAG-LangChain.git
Then install the dependencies:
pip install -r requirements.txt
Convert a Hugging Face-format model stored in fp16 precision to a GGUF-format model (here `--outtype q8_0` quantizes the weights to 8-bit during conversion):
python convert_hf_to_gguf.py /root/autodl-tmp/llama3/LLM-Research/Meta-Llama-3-8B-Instruct-merged --outtype q8_0 --verbose --outfile /root/autodl-tmp/llama3/LLM-Research/Meta-Llama-3-8B-Instruct-merged-gguf8.gguf
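As a rough sanity check on the output file size, you can estimate bytes per weight for the common `--outtype` choices. This is a minimal sketch, not part of the repo: the figures below are block-format approximations (Q8_0 stores 32 quantized bytes plus a 2-byte scale per 32-weight block), and real GGUF files add metadata on top.

```python
# Approximate bytes per weight for common convert_hf_to_gguf.py --outtype values.
# Q8_0 packs 32 weights into 34 bytes (32 int8 values + fp16 scale) -> 1.0625 B/weight.
BYTES_PER_WEIGHT = {"f32": 4.0, "f16": 2.0, "bf16": 2.0, "q8_0": 34 / 32}

def estimated_gib(n_params: float, outtype: str) -> float:
    """Estimate GGUF file size in GiB for a model with n_params weights."""
    return n_params * BYTES_PER_WEIGHT[outtype] / 2**30

# An 8B-parameter model quantized to q8_0 lands near 8 GiB,
# roughly half of its ~15 GiB fp16 footprint.
print(f"{estimated_gib(8e9, 'q8_0'):.1f} GiB")
```

If the resulting `.gguf` file is far from this estimate, the conversion likely failed or used a different `--outtype` than intended.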
Appendix: