
Starting an already-cached SenseVoiceSmall without network access fails with "fsmn-vad is not registered" #2629

Closed
1 of 3 tasks
kimi360 opened this issue Dec 5, 2024 · 10 comments · Fixed by #2654

@kimi360

kimi360 commented Dec 5, 2024

System Info / 系統信息

OS: CentOS 7.9
Docker image: xinference:latest-cpu
Container started in CPU-only mode

Running Xinference with Docker? / 是否使用 Docker 运行 Xinference?

  • [x] docker
  • [ ] pip install
  • [ ] installation from source

Version info / 版本信息

Xinference:1.0.1

The command used to start Xinference / 用以启动 xinference 的命令

xinference launch --model-name SenseVoiceSmall --model-engine cpu --model-type audio

Reproduction / 复现过程

With the model already cached, launching SenseVoiceSmall in an environment without network access appears to contact modelscope during startup, and the model ultimately fails to load.

[root@localhost xinference]# docker compose up 
[+] Running 2/2
 ✔ Network xinference_default  Created                                                                                                                                                   0.1s 
 ✔ Container xinference        Created                                                                                                                                                   0.1s 
Attaching to xinference
xinference  | 2024-12-03 08:56:59,018 xinference.core.supervisor 148 INFO     Xinference supervisor 0.0.0.0:53366 started
xinference  | 2024-12-03 08:56:59,038 xinference.core.worker 148 INFO     Starting metrics export server at 0.0.0.0:None
xinference  | 2024-12-03 08:56:59,040 xinference.core.worker 148 INFO     Checking metrics export server...
xinference  | 2024-12-03 08:57:00,635 xinference.core.worker 148 INFO     Metrics server is started at: http://0.0.0.0:43408
xinference  | 2024-12-03 08:57:00,635 xinference.core.worker 148 INFO     Purge cache directory: /root/.xinference/cache
xinference  | 2024-12-03 08:57:00,636 xinference.core.worker 148 INFO     Connected to supervisor as a fresh worker
xinference  | 2024-12-03 08:57:00,651 xinference.core.worker 148 INFO     Xinference worker 0.0.0.0:53366 started
xinference  | 2024-12-03 08:57:05,419 xinference.api.restful_api 7 INFO     Starting Xinference at endpoint: http://0.0.0.0:9997
xinference  | 2024-12-03 08:57:05,555 uvicorn.error 7 INFO     Uvicorn running on http://0.0.0.0:9997 (Press CTRL+C to quit)
xinference  | Launch model name: SenseVoiceSmall with kwargs: {}
xinference  | 2024-12-03 08:57:06,806 xinference.core.worker 148 INFO     [request 92aea694-b154-11ef-99b1-0242ac120002] Enter launch_builtin_model, args: <xinference.core.worker.WorkerActor
 object at 0x7f41cca367b0>, kwargs: model_uid=SenseVoiceSmall-0,model_name=SenseVoiceSmall,model_size_in_billions=None,model_format=None,quantization=None,model_engine=cpu,model_type=audio,n
_gpu=auto,request_limits=None,peft_model_config=None,gpu_idx=None,download_hub=None,model_path=None,trust_remote_code=True
xinference  | 2024-12-03 08:57:10,309 xinference.core.model 315 INFO     Start requests handler.
xinference  | WARNING:root:Key Conformer already exists in model_classes, re-register
xinference  | WARNING:root:Key Linear already exists in adaptor_classes, re-register
xinference  | WARNING:root:Key TransformerDecoder already exists in decoder_classes, re-register
xinference  | WARNING:root:Key LightweightConvolutionTransformerDecoder already exists in decoder_classes, re-register
xinference  | WARNING:root:Key LightweightConvolution2DTransformerDecoder already exists in decoder_classes, re-register
xinference  | WARNING:root:Key DynamicConvolutionTransformerDecoder already exists in decoder_classes, re-register
xinference  | WARNING:root:Key DynamicConvolution2DTransformerDecoder already exists in decoder_classes, re-register
xinference  | /opt/conda/lib/python3.11/site-packages/funasr/train_utils/load_pretrained_model.py:39: FutureWarning: You are using `torch.load` with `weights_only=False` (the current default
 value), which uses the default pickle module implicitly. It is possible to construct malicious pickle data which will execute arbitrary code during unpickling (See https://github.com/pytorc
h/pytorch/blob/main/SECURITY.md#untrusted-models for more details). In a future release, the default value for `weights_only` will be flipped to `True`. This limits the functions that could 
be executed during unpickling. Arbitrary objects will no longer be allowed to be loaded via this mode unless they are explicitly allowlisted by the user via `torch.serialization.add_safe_glo
bals`. We recommend you start setting `weights_only=True` for any use case where you don't have full control of the loaded file. Please open an issue on GitHub for any issues related to this
 experimental feature.
xinference  |   ori_state = torch.load(path, map_location=map_location)
xinference  | WARNING:urllib3.connectionpool:Retrying (Retry(total=1, connect=1, read=2, redirect=None, status=None)) after connection broken by 'NewConnectionError('<urllib3.connection.HTTP
SConnection object at 0x7f9cf16c3510>: Failed to establish a new connection: [Errno -3] Temporary failure in name resolution')': /api/v1/models/iic/speech_fsmn_vad_zh-cn-16k-common-pytorch/r
evisions
xinference  | WARNING:urllib3.connectionpool:Retrying (Retry(total=0, connect=0, read=2, redirect=None, status=None)) after connection broken by 'NewConnectionError('<urllib3.connection.HTTP
SConnection object at 0x7f9cf16c3d90>: Failed to establish a new connection: [Errno -3] Temporary failure in name resolution')': /api/v1/models/iic/speech_fsmn_vad_zh-cn-16k-common-pytorch/r
evisions
xinference  | 2024-12-03 08:58:47,089 xinference.core.worker 148 ERROR    Failed to load model SenseVoiceSmall-0
xinference  | Traceback (most recent call last):
xinference  |   File "/opt/conda/lib/python3.11/site-packages/xinference/core/worker.py", line 897, in launch_builtin_model
xinference  |     await model_ref.load()
xinference  |   File "/opt/conda/lib/python3.11/site-packages/xoscar/backends/context.py", line 231, in send
xinference  |     return self._process_result_message(result)
xinference  |            ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
xinference  |   File "/opt/conda/lib/python3.11/site-packages/xoscar/backends/context.py", line 102, in _process_result_message
xinference  |     raise message.as_instanceof_cause()
xinference  |   File "/opt/conda/lib/python3.11/site-packages/xoscar/backends/pool.py", line 667, in send
xinference  |     result = await self._run_coro(message.message_id, coro)
xinference  |     ^^^^^^^^^^^^^^^^^
xinference  |   File "/opt/conda/lib/python3.11/site-packages/xoscar/backends/pool.py", line 370, in _run_coro
xinference  |     return await coro
xinference  |   File "/opt/conda/lib/python3.11/site-packages/xoscar/api.py", line 384, in __on_receive__
xinference  |     return await super().__on_receive__(message)  # type: ignore
xinference  |     ^^^^^^^^^^^^^^^^^
xinference  |   File "xoscar/core.pyx", line 558, in __on_receive__
xinference  |     raise ex
xinference  |   File "xoscar/core.pyx", line 520, in xoscar.core._BaseActor.__on_receive__
xinference  |     async with self._lock:
xinference  |     ^^^^^^^^^^^^^^^^^
xinference  |   File "xoscar/core.pyx", line 521, in xoscar.core._BaseActor.__on_receive__
xinference  |     with debug_async_timeout('actor_lock_timeout',
xinference  |     ^^^^^^^^^^^^^^^^^
xinference  |   File "xoscar/core.pyx", line 526, in xoscar.core._BaseActor.__on_receive__
xinference  |     result = await result
xinference  |     ^^^^^^^^^^^^^^^^^
xinference  |   File "/opt/conda/lib/python3.11/site-packages/xinference/core/model.py", line 409, in load
xinference  |     self._model.load()
xinference  |     ^^^^^^^^^^^^^^^^^
xinference  |   File "/opt/conda/lib/python3.11/site-packages/xinference/model/audio/funasr.py", line 68, in load
xinference  |     self._model = AutoModel(model=self._model_path, device=self._device, **kwargs)
xinference  |     ^^^^^^^^^^^^^^^^^
xinference  |   File "/opt/conda/lib/python3.11/site-packages/funasr/auto/auto_model.py", line 135, in __init__
xinference  |     vad_model, vad_kwargs = self.build_model(**vad_kwargs)
xinference  |     ^^^^^^^^^^^^^^^^^
xinference  |   File "/opt/conda/lib/python3.11/site-packages/funasr/auto/auto_model.py", line 259, in build_model
xinference  |     assert model_class is not None, f'{kwargs["model"]} is not registered'
xinference  |     ^^^^^^^^^^^^^^^^^
xinference  | AssertionError: [address=0.0.0.0:46195, pid=315] fsmn-vad is not registered

Expected behavior / 期待表现

When the model is already cached and there is no network, prefer the locally cached files when starting the model.
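The expected behavior amounts to an offline-first lookup: check the on-disk cache first, and only fall back to a download when nothing is cached. A minimal sketch of that policy (the function and the `downloader` hook are hypothetical, not Xinference's actual API):

```python
from pathlib import Path


def resolve_model(model_id: str,
                  cache_root: str = "~/.cache/modelscope/hub",
                  downloader=None) -> str:
    """Return a local path for model_id, preferring the on-disk cache.

    Only when no cached copy exists do we fall back to `downloader`,
    so a machine without network access can still start cached models.
    """
    local = Path(cache_root).expanduser() / model_id
    if local.is_dir() and any(local.iterdir()):
        return str(local)  # cache hit: no network needed
    if downloader is None:
        raise FileNotFoundError(
            f"{model_id} is not cached and no downloader was given")
    return downloader(model_id)  # cache miss: go to the hub
```

With this ordering, the network is never touched for a cached model, which is exactly what the offline Docker setup above needs.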

@XprobeBot XprobeBot added the gpu label Dec 5, 2024
@XprobeBot XprobeBot added this to the v1.x milestone Dec 5, 2024
@DankerMu
DankerMu commented Dec 5, 2024

I hit the same problem: I downloaded the model, copied it into an offline environment, and after restarting xinf the model was not treated as cached; launching also failed with "fsmn-vad is not registered".

@qinxuye
Contributor

qinxuye commented Dec 5, 2024

We'll fix this; the vad model itself is quite small anyway.

@kimi360
Author

kimi360 commented Dec 5, 2024

> I hit the same problem: I downloaded the model, copied it into an offline environment, and after restarting xinf the model was not treated as cached; launching also failed with "fsmn-vad is not registered".

As a temporary workaround, edit /opt/conda/lib/python3.11/site-packages/xinference/model/audio/funasr.py inside the container. Around line 68, pass the local model path when fsmn-vad is loaded:

    def load(self):
        try:
            from funasr import AutoModel
        except ImportError:
            error_message = "Failed to import module 'funasr'"
            installation_guide = [
                "Please make sure 'funasr' is installed. ",
                "You can install it by `pip install funasr`\n",
            ]

            raise ImportError(f"{error_message}\n\n{''.join(installation_guide)}")

        if self._device is None:
            self._device = get_available_device()
        else:
            if not is_device_available(self._device):
                raise ValueError(f"Device {self._device} is not available!")

        kwargs = self._model_spec.default_model_config.copy()
        kwargs.update(self._kwargs)
        logger.info("Loading FunASR model with kwargs: %s", kwargs)
        # insert start: point vad_model at the locally cached copy so that
        # AutoModel does not try to resolve "fsmn-vad" over the network
        if kwargs.get("vad_model") == "fsmn-vad":
            kwargs["vad_model"] = "/root/.cache/modelscope/hub/iic/speech_fsmn_vad_zh-cn-16k-common-pytorch"
        # insert end
        self._model = AutoModel(model=self._model_path, device=self._device, **kwargs)
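Hardcoding /root/.cache only works for this particular container. A slightly more portable variant of the same patch derives the cache root from the environment and only rewrites `vad_model` when a cached copy actually exists, so online environments keep the original auto-download behavior. A hedged sketch (the helper name is made up, and treating `MODELSCOPE_CACHE` as the cache-root override is an assumption about how modelscope is configured):

```python
import os
from pathlib import Path


def local_vad_path(
    model_id: str = "iic/speech_fsmn_vad_zh-cn-16k-common-pytorch",
):
    """Return the locally cached fsmn-vad directory, or None if absent.

    Assumes the modelscope layout <cache_root>/hub/<model_id>, with the
    cache root overridable via the MODELSCOPE_CACHE environment variable.
    """
    root = os.environ.get("MODELSCOPE_CACHE",
                          os.path.expanduser("~/.cache/modelscope"))
    candidate = Path(root) / "hub" / model_id
    return str(candidate) if candidate.is_dir() else None


# In load(), the insert block would then become:
# if kwargs.get("vad_model") == "fsmn-vad":
#     cached = local_vad_path()
#     if cached:
#         kwargs["vad_model"] = cached
```

Guarding on `candidate.is_dir()` means a box that still has network access falls through to the normal "fsmn-vad" resolution instead of pointing AutoModel at a nonexistent path.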

@DankerMu

DankerMu commented Dec 5, 2024

> As a temporary workaround, edit /opt/conda/lib/python3.11/site-packages/xinference/model/audio/funasr.py inside the container. Around line 68, pass the local model path when fsmn-vad is loaded: […]

After applying that change I get `torchaudio has no attribute "lib"`, and launch still fails.

@DankerMu
DankerMu commented Dec 5, 2024

Is this a torchaudio version problem, or a circular import?

@kimi360
Author

kimi360 commented Dec 5, 2024

I'm not sure; post a log and we can dig further.

@DankerMu
DankerMu commented Dec 5, 2024

> I'm not sure; post a log and we can dig further.

It's hard to paste logs out of the offline environment, but I traced it: the final error is in torchaudio/_extension/utils.py, raised while checking the CUDA version at
version = torchaudio.lib._torchaudio.cuda_version()
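A failure at that call usually points to a torch/torchaudio mismatch: a CUDA build of torchaudio paired with a CPU build of torch, or mismatched minor versions. Since the 2.x series the two packages track each other's major.minor release, so a quick offline sanity check is just comparing version strings. A small sketch of that check (the pairing rule is inferred from the release pattern, not an official API):

```python
def versions_match(torch_version: str, torchaudio_version: str) -> bool:
    """Heuristic: torch and torchaudio are expected to share major.minor.

    For example torch 2.1.2 pairs with torchaudio 2.1.x. A local-version
    suffix such as "+cpu" or "+cu121" is ignored for the comparison.
    """
    def major_minor(version: str):
        # drop the "+cpu"/"+cu121" suffix, keep only major.minor
        return tuple(version.split("+", 1)[0].split(".")[:2])

    return major_minor(torch_version) == major_minor(torchaudio_version)
```

If the versions do match but torch is a CPU build while torchaudio was installed with CUDA support, reinstalling both from the same index (all-CPU or all-CUDA) is the usual fix, which is in line with the suggestion in the next comment.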

@kimi360
Author

kimi360 commented Dec 5, 2024

Download the CPU-only builds and try launching on CPU.

@qinxuye
Contributor

qinxuye commented Dec 11, 2024

https://inference--2654.org.readthedocs.build/zh-cn/2654/models/model_abilities/audio.html#sensevoicesmall-offline-usage

Actually no source change is needed. I wrote up a doc; give it a try. (The doc link will break once #2654 is merged.)

@kimi360
Author

kimi360 commented Dec 11, 2024

> https://inference--2654.org.readthedocs.build/zh-cn/2654/models/model_abilities/audio.html#sensevoicesmall-offline-usage
> Actually no source change is needed. I wrote up a doc; give it a try. (The doc link will break once #2654 is merged.)

Thanks, I'll try it tomorrow.

@kimi360 kimi360 closed this as completed Dec 12, 2024