CosyVoice2 vLLM load error #1906

Description

@JohnHerry

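I used file_utils.py: def export_cosyvoice2_vllm(model, model_path, device): to export the PyTorch model for vLLM. The export completed and produced three files in the vllm directory: a safetensors file, config.json, and generation_config.json. However, loading this vLLM model afterwards fails with the following error: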

        self.llm.vllm = LLMEngine.from_engine_args(engine_args)
      File "/usr/local/python3/lib/python3.12/site-packages/vllm/engine/llm_engine.py", line 447, in from_vllm_config
        return cls(
      File "/usr/local/python3/lib/python3.12/site-packages/vllm/engine/llm_engine.py", line 265, in __init__
        self.model_executor = executor_class(vllm_config=vllm_config)
      File "/usr/local/python3/lib/python3.12/site-packages/vllm/executor/executor_base.py", line 52, in __init__
        self._init_executor()
      File "/usr/local/python3/lib/python3.12/site-packages/vllm/executor/uniproc_executor.py", line 47, in _init_executor
        self.collective_rpc("load_model")
      File "/usr/local/python3/lib/python3.12/site-packages/vllm/executor/uniproc_executor.py", line 56, in collective_rpc
        answer = run_method(self.driver_worker, method, args, kwargs)
      File "/usr/local/python3/lib/python3.12/site-packages/vllm/utils.py", line 2605, in run_method
        return func(*args, **kwargs)
      File "/usr/local/python3/lib/python3.12/site-packages/vllm/worker/worker.py", line 207, in load_model
        self.model_runner.load_model()
      File "/usr/local/python3/lib/python3.12/site-packages/vllm/worker/model_runner.py", line 1173, in load_model
        self.model = get_model(vllm_config=self.vllm_config)
      File "/usr/local/python3/lib/python3.12/site-packages/vllm/model_executor/model_loader/__init__.py", line 58, in get_model
        return loader.load_model(vllm_config=vllm_config,
      File "/usr/local/python3/lib/python3.12/site-packages/vllm/model_executor/model_loader/default_loader.py", line 277, in load_model
        loaded_weights = model.load_weights(
      File "/usr/local/python3/lib/python3.12/site-packages/vllm/model_executor/model_loader/models/qwen2.py", line 501, in load_weights
        return loader.load_weights(weights)
      File "/usr/local/python3/lib/python3.12/site-packages/vllm/model_executor/model_loader/models/utils.py", line 277, in load_weights
        autoloaded_weights = set(self._load_module("", self.module, weights))
      File "/usr/local/python3/lib/python3.12/site-packages/vllm/model_executor/model_loader/models/utils.py", line 235
        yield from self._load_module(prefix,
      File "/usr/local/python3/lib/python3.12/site-packages/vllm/model_executor/model_loader/models/utils.py", line 244, in _load_module
        yield from self._load_param(prefix, child_params[child_prefix],
      File "/usr/local/python3/lib/python3.12/site-packages/vllm/model_executor/model_loader/models/utils.py", line 167, in _load_param
        weight_loader(param, weight_data)
      File "/usr/local/python3/lib/python3.12/site-packages/vllm/model_executor/layers/vocab_parallel_embedding.py", line 386, in weight_loader
        assert loaded_weight.shape[output_dim] == self.org_vocab_size
    AssertionError

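Environment: vLLM 0.9.0, CosyVoice2. My modification: I do not use eos_token and file_token, so the total token size is vocab_size + 1 instead of vocab_size + 3. Where should I make changes so that this model can be adapted to vLLM?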
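
For anyone debugging the same assertion, here is a minimal diagnostic sketch. The failing line compares the first dimension of a loaded weight with the vocabulary size the model was built with, which presumably comes from vocab_size in the exported config.json. The sketch checks vocab-sized tensors against that value. The tensor names (standard Hugging Face Qwen2 naming), the helper name find_vocab_mismatches, and the numbers are assumptions for illustration, not taken from the CosyVoice export code. The shapes are passed in as a plain dict, so reading them from the safetensors file is left out.

```python
from typing import Dict, List, Tuple

# Assumed tensor names (standard Hugging Face Qwen2 naming); the exported
# CosyVoice2 checkpoint may use different keys.
VOCAB_SIZED_TENSORS = ("model.embed_tokens.weight", "lm_head.weight")


def find_vocab_mismatches(
    config_vocab_size: int,
    tensor_shapes: Dict[str, Tuple[int, ...]],
) -> List[str]:
    """Return one message per vocab-sized tensor whose row count differs from the config."""
    problems = []
    for name in VOCAB_SIZED_TENSORS:
        shape = tensor_shapes.get(name)
        if shape is None:
            continue  # tensor not present in this checkpoint
        if shape[0] != config_vocab_size:
            problems.append(
                f"{name}: {shape[0]} rows, but config vocab_size is {config_vocab_size}"
            )
    return problems


# Hypothetical numbers for illustration only: the weights were exported with
# +1 extra token while the config still assumes +3.
speech_token_size = 6561
shapes = {
    "model.embed_tokens.weight": (speech_token_size + 1, 896),
    "lm_head.weight": (speech_token_size + 1, 896),
}
for line in find_vocab_mismatches(speech_token_size + 3, shapes):
    print(line)
```

If a mismatch like this shows up, the value written to config.json and the shape of the exported weights would need to agree, whichever of the two is changed.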
