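Exporting the PyTorch model to vLLM format with file_utils.py: def export_cosyvoice2_vllm(model, model_path, device): completes successfully and produces three files in the vllm directory: a safetensors file, config.json, and generation_config.json. However, loading this vLLM model afterwards fails with: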
self.llm.vllm = LLMEngine.from_engine_args(engine_args)
File "/usr/local/python3/lib/python3.12/site-packages/vllm/engine/llm_engine.py", line 447, in from_vllm_config
return cls(
File "/usr/local/python3/lib/python3.12/site-packages/vllm/engine/llm_engine.py", line 265, in __init__
self.model_executor = executor_class(vllm_config=vllm_config)
File "/usr/local/python3/lib/python3.12/site-packages/vllm/executor/executor_base.py", line 52, in __init__
self._init_executor()
File "/usr/local/python3/lib/python3.12/site-packages/vllm/executor/uniproc_executor.py", line 47, in _init_executor
self.collective_rpc("load_model")
File "/usr/local/python3/lib/python3.12/site-packages/vllm/executor/uniproc_executor.py", line 56, in collective_rpc
answer = run_method(self.driver_worker, method, args, kwargs)
File "/usr/local/python3/lib/python3.12/site-packages/vllm/utils.py", line 2605, in run_method
return func(*args, **kwargs)
File "/usr/local/python3/lib/python3.12/site-packages/vllm/worker/worker.py", line 207, in load_model
self.model_runner.load_model()
File "/usr/local/python3/lib/python3.12/site-packages/vllm/worker/model_runner.py", line 1173, in load_model
self.model = get_model(vllm_config=self.vllm_config)
File "/usr/local/python3/lib/python3.12/site-packages/vllm/model_executor/model_loader/__init__.py", line 58, in get_model
return loader.load_model(vllm_config=vllm_config,
File "/usr/local/python3/lib/python3.12/site-packages/vllm/model_executor/model_loader/default_loader.py", line 277, in load_model
loaded_weights = model.load_weights(
File "/usr/local/python3/lib/python3.12/site-packages/vllm/model_executor/models/qwen2.py", line 501, in load_weights
return loader.load_weights(weights)
File "/usr/local/python3/lib/python3.12/site-packages/vllm/model_executor/models/utils.py", line 277, in load_weights
autoloaded_weights = set(self._load_module("", self.module, weights))
File "/usr/local/python3/lib/python3.12/site-packages/vllm/model_executor/models/utils.py", line 235, in _load_module
yield from self._load_module(prefix,
File "/usr/local/python3/lib/python3.12/site-packages/vllm/model_executor/models/utils.py", line 244, in _load_module
yield from self._load_param(prefix, child_params[child_prefix],
File "/usr/local/python3/lib/python3.12/site-packages/vllm/model_executor/models/utils.py", line 167, in _load_param
weight_loader(param, weight_data)
File "/usr/local/python3/lib/python3.12/site-packages/vllm/model_executor/layers/vocab_parallel_embedding.py", line 386, in weight_loader
assert loaded_weight.shape[output_dim] == self.org_vocab_size
AssertionError
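Environment: vLLM 0.9.0, CosyVoice2. What I changed: my model does not use eos_token or fill_token, so the total token count is vocab_size + 1 instead of vocab_size + 3. Where should I make changes so that this model can be loaded by vLLM?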
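For context, the failing assertion compares the vocabulary dimension of a loaded tensor against the size the vLLM layer was built with, which for Qwen2 comes, as far as I can tell, from vocab_size in config.json. Below is a minimal sketch for checking which exported tensors disagree with that value. It reads only the safetensors header, so it needs nothing beyond the standard library. The function names, the default file name model.safetensors, and the assumption that the relevant tensors have embed_tokens or lm_head in their names are my own guesses, not something taken from the CosyVoice export code.

```python
import json
import struct
from pathlib import Path


def read_safetensors_shapes(path):
    """Return {tensor_name: shape} from a .safetensors header without loading weights."""
    with open(path, "rb") as f:
        # The file starts with an 8-byte little-endian header length,
        # followed by a JSON header describing every tensor.
        (header_len,) = struct.unpack("<Q", f.read(8))
        header = json.loads(f.read(header_len))
    return {
        name: tuple(info["shape"])
        for name, info in header.items()
        if name != "__metadata__"
    }


def find_vocab_mismatches(export_dir, weight_file="model.safetensors"):
    """List tensors whose first dimension differs from config.json's vocab_size.

    Assumption: vocabulary-sized tensors have "embed_tokens" or "lm_head"
    in their names and carry the vocabulary on dimension 0.
    """
    export_dir = Path(export_dir)
    config = json.loads((export_dir / "config.json").read_text())
    vocab_size = config["vocab_size"]
    shapes = read_safetensors_shapes(export_dir / weight_file)
    return {
        name: shape
        for name, shape in shapes.items()
        if ("embed_tokens" in name or "lm_head" in name)
        and shape
        and shape[0] != vocab_size
    }
```

Calling find_vocab_mismatches on the exported vllm directory should return an empty dict when the export and config.json agree. Any entry it does return is a tensor whose row count differs from vocab_size, which is presumably where the +1 versus +3 change needs to be reflected, either in the export step or in the config.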