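While recently building Graph RAG on top of the Nebula graph database, I ran into a case where the program tried to download the tokenizer model from openaipublic.blob.core.windows.net. Because the server blocks the port, the connection was terminated and the test program failed.
Below is the troubleshooting log, kept for future reference.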
mv cl100k_base.tiktoken cab1ac2de16cad507ac799535b7bc69471851e24
## Rename the file as shown above so that tiktoken can load it locally
## Reference: https://stackoverflow.com/questions/76106366/how-to-use-tiktoken-in-offline-mode-computer
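For reference, here is a minimal sketch of where the target file name in the rename step comes from. As far as I understand tiktoken's loader, it caches each downloaded file under the SHA-1 hex digest of the download URL, in the directory named by the TIKTOKEN_CACHE_DIR environment variable. The URL below is what I assume tiktoken requests for cl100k_base, and the helper names `tiktoken_cache_key` and `install_offline` are made up for this example. Check the printed value against the URL your tiktoken version actually uses.

```python
import hashlib
import os
import shutil

# Assumption: the URL tiktoken requests for the cl100k_base encoding.
CL100K_URL = "https://openaipublic.blob.core.windows.net/encodings/cl100k_base.tiktoken"


def tiktoken_cache_key(blob_url: str) -> str:
    """Return the cache file name: the SHA-1 hex digest of the download URL."""
    return hashlib.sha1(blob_url.encode()).hexdigest()


def install_offline(src_file: str, blob_url: str, cache_dir: str) -> str:
    """Copy a manually downloaded .tiktoken file into cache_dir under its cache key."""
    os.makedirs(cache_dir, exist_ok=True)
    dest = os.path.join(cache_dir, tiktoken_cache_key(blob_url))
    shutil.copyfile(src_file, dest)
    return dest


if __name__ == "__main__":
    # Print the file name the cache is expected to use for this URL.
    print(tiktoken_cache_key(CL100K_URL))
```

To use it, call `install_offline("cl100k_base.tiktoken", CL100K_URL, cache_dir)` and set `TIKTOKEN_CACHE_DIR` to the same `cache_dir` before importing tiktoken, so the library looks in that directory instead of going to the network.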
Python verification log:
(/data/conda_env/nebula) root@ubuntu:/data/ghf/KG/nebula/examples/A_get_started# python
Python 3.10.14 | packaged by conda-forge | (main, Mar 20 2024, 12:45:18) [GCC 12.3.0] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> import tiktoken_ext.openai_public
>>> import inspect
>>> print(dir(tiktoken_ext.openai_public))
['ENCODING_CONSTRUCTORS', 'ENDOFPROMPT', 'ENDOFTEXT', 'FIM_MIDDLE', 'FIM_PREFIX', 'FIM_SUFFIX', '__builtins__', '__cached__', '__doc__', '__file__', '__loader__', '__name__', '__package__