/usr/local/lib/python3.11/site-packages/_distutils_hack/__init__.py:53: UserWarning: Reliance on distutils from stdlib is deprecated. Users must rely on setuptools to provide the distutils module. Avoid importing distutils or import setuptools first, and avoid setting SETUPTOOLS_USE_DISTUTILS=stdlib. Register concerns at https://github.com/pypa/setuptools/issues/new?template=distutils-deprecation.yml
  warnings.warn(
2025-07-04 10:45:43.819174: I tensorflow/core/util/port.cc:113] oneDNN custom operations are on. You may see slightly different numerical results due to floating-point round-off errors from different computation orders. To turn them off, set the environment variable `TF_ENABLE_ONEDNN_OPTS=0`.
2025-07-04 10:45:43.859491: I tensorflow/core/platform/cpu_feature_guard.cc:210] This TensorFlow binary is optimized to use available CPU instructions in performance-critical operations.
To enable the following instructions: AVX2 AVX512F AVX512_VNNI FMA, in other operations, rebuild TensorFlow with the appropriate compiler flags.
2025-07-04 10:45:44.671052: W tensorflow/compiler/tf2tensorrt/utils/py_utils.cc:38] TF-TRT Warning: Could not find TensorRT
/usr/local/lib/python3.11/site-packages/jieba/_compat.py:18: UserWarning: pkg_resources is deprecated as an API. See https://setuptools.pypa.io/en/latest/pkg_resources.html. The pkg_resources package is slated for removal as early as 2025-11-30. Refrain from using this package or pin to Setuptools<81.
  import pkg_resources
Converting format of dataset (num_proc=16): 100%|█████████████████████████████████████████████████████████████████████████| 91/91 [00:00<00:00, 495.30 examples/s]
Generating train split: 1000 examples [00:00, 42466.65 examples/s]
Converting format of dataset (num_proc=16): 100%|████████████████████████████████████████████████████████████████████| 1000/1000 [00:00<00:00, 5587.08 examples/s]
Running tokenizer on dataset (num_proc=16): 100%|█████████████████████████████████████████████████████████████████████| 1091/1091 [00:02<00:00, 434.44 examples/s]
Traceback (most recent call last):
  File "/usr/local/bin/llamafactory-cli", line 8, in <module>
    sys.exit(main())
             ^^^^^^
  File "/mnt/workspace/LLaMA-Factory/src/llamafactory/cli.py", line 151, in main
    COMMAND_MAP[command]()
  File "/mnt/workspace/LLaMA-Factory/src/llamafactory/train/tuner.py", line 110, in run_exp
    _training_function(config={"args": args, "callbacks": callbacks})
  File "/mnt/workspace/LLaMA-Factory/src/llamafactory/train/tuner.py", line 72, in _training_function
    run_sft(model_args, data_args, training_args, finetuning_args, generating_args, callbacks)
  File "/mnt/workspace/LLaMA-Factory/src/llamafactory/train/sft/workflow.py", line 52, in run_sft
    model = load_model(tokenizer, model_args, finetuning_args, training_args.do_train)
            ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/mnt/workspace/LLaMA-Factory/src/llamafactory/model/loader.py", line 180, in load_model
    model = load_class.from_pretrained(**init_kwargs)
            ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.11/site-packages/transformers/models/auto/auto_factory.py", line 571, in from_pretrained
    return model_class.from_pretrained(
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.11/site-packages/transformers/modeling_utils.py", line 309, in _wrapper
    return func(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.11/site-packages/transformers/modeling_utils.py", line 4508, in from_pretrained
    model = cls(config, *model_args, **model_kwargs)
            ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.11/site-packages/transformers/models/qwen2/modeling_qwen2.py", line 633, in __init__
    self.model = Qwen2Model(config)
                 ^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.11/site-packages/transformers/models/qwen2/modeling_qwen2.py", line 362, in __init__
    self.post_init()
  File "/usr/local/lib/python3.11/site-packages/transformers/modeling_utils.py", line 1969, in post_init
    if v not in ALL_PARALLEL_STYLES:
       ^^^^^^^^^^^^^^^^^^^^^^^^^^^^
TypeError: argument of type 'NoneType' is not iterable