|
| 1 | +--- |
| 2 | +title: "Moonshine" |
| 3 | +description: "Speech-to-text service implementation using locally-downloaded Moonshine ONNX models" |
| 4 | +--- |
| 5 | + |
| 6 | +## Overview |
| 7 | + |
| 8 | +`MoonshineSTTService` provides offline speech recognition using Moonshine's small, fast ASR models running locally on the CPU via ONNX Runtime. No GPU or API key is required: models download once on first use and are cached locally, so transcription runs entirely on your machine. |
| 9 | + |
| 10 | +<CardGroup cols={2}> |
| 11 | + <Card |
| 12 | + title="Moonshine STT API Reference" |
| 13 | + icon="code" |
| 14 | + href="https://reference-server.pipecat.ai/en/latest/api/pipecat.services.moonshine.stt.html" |
| 15 | + > |
| 16 | + Pipecat's API methods for Moonshine STT integration |
| 17 | + </Card> |
| 18 | + <Card |
| 19 | + title="Moonshine Example" |
| 20 | + icon="play" |
| 21 | + href="https://github.com/pipecat-ai/pipecat/blob/main/examples/voice/voice-moonshine.py" |
| 22 | + > |
| 23 | + Complete example with Moonshine STT |
| 24 | + </Card> |
| 25 | + <Card |
| 26 | + title="Moonshine Documentation" |
| 27 | + icon="book" |
| 28 | + href="https://github.com/moonshine-ai/moonshine" |
| 29 | + > |
| 30 | + Moonshine ASR model details and research |
| 31 | + </Card> |
| 32 | + <Card |
| 33 | + title="Moonshine Voice Package" |
| 34 | + icon="microphone" |
| 35 | + href="https://pypi.org/project/moonshine-voice/" |
| 36 | + > |
| 37 | + Python package for Moonshine models |
| 38 | + </Card> |
| 39 | +</CardGroup> |
| 40 | + |
| 41 | +## Installation |
| 42 | + |
| 43 | +```bash |
| 44 | +uv add "pipecat-ai[moonshine]" |
| 45 | +``` |
| 46 | + |
| 47 | +## Prerequisites |
| 48 | + |
| 49 | +### Local Model Setup |
| 50 | + |
| 51 | +Before using the Moonshine STT service, you need: |
| 52 | + |
| 53 | +1. **Model Selection**: Choose an appropriate Moonshine model (for example tiny, base, small-streaming, or medium-streaming) |
| 54 | +2. **Storage Space**: Ensure sufficient disk space for model downloads (models are cached after first use) |
| 55 | +3. **CPU Resources**: Moonshine runs efficiently on CPU via ONNX Runtime |
| 56 | + |
| 57 | +### Configuration Options |
| 58 | + |
| 59 | +- **Model Size**: Balance between accuracy and performance based on your needs |
| 60 | +- **Language Support**: Moonshine supports English, Spanish, and other languages |
| 61 | +- **No API Key**: Runs entirely locally for complete privacy |
| 62 | + |
| 63 | +<Tip> |
| 64 | + No API key or GPU is required. Moonshine runs efficiently on CPU and entirely locally. |
| 65 | +</Tip> |
| 66 | + |
| 67 | +## Configuration |
| 68 | + |
| 69 | +<ParamField path="settings" type="MoonshineSTTService.Settings" default="None"> |
| 70 | + Runtime-configurable settings for the STT service. See [MoonshineSTTSettings](#moonshinesttsettings) below. |
| 71 | +</ParamField> |
| 72 | + |
| 73 | +## MoonshineSTTSettings |
| 74 | + |
| 75 | +Runtime-configurable settings passed via the `settings` constructor argument using `MoonshineSTTService.Settings(...)`. These can be updated mid-conversation with `STTUpdateSettingsFrame`. See [Service Settings](/pipecat/fundamentals/service-settings) for details. |
| 76 | + |
| 77 | +| Parameter | Type | Default | Description | |
| 78 | +| ---------- | ----------------- | ------------------------ | --------------------------------------------------------------------------------------------------------------------------------------------------------------------- | |
| 79 | +| `model` | `str \| Model` | `Model.SMALL_STREAMING` | Moonshine model architecture. Available models: `TINY`, `BASE`, `TINY_STREAMING`, `BASE_STREAMING`, `SMALL_STREAMING` (default), `MEDIUM_STREAMING`. | |
| 80 | +| `language` | `Language \| str` | `Language.EN` | Language for transcription. Moonshine supports English, Spanish, and other languages. The base language code is used (e.g., "en" from "en-US"). _(Inherited from base STT settings.)_ | |
| 81 | + |
| 82 | +## Usage |
| 83 | + |
| 84 | +### Basic Setup |
| 85 | + |
| 86 | +```python |
| 87 | +from pipecat.services.moonshine.stt import MoonshineSTTService |
| 88 | + |
| 89 | +stt = MoonshineSTTService() |
| 90 | +``` |
| 91 | + |
| 92 | +### With Custom Model |
| 93 | + |
| 94 | +```python |
| 95 | +from pipecat.services.moonshine.stt import MoonshineSTTService, Model |
| 96 | + |
| 97 | +stt = MoonshineSTTService( |
| 98 | + settings=MoonshineSTTService.Settings( |
| 99 | + model=Model.MEDIUM_STREAMING, |
| 100 | + ), |
| 101 | +) |
| 102 | +``` |
| 103 | + |
| 104 | +### With Custom Language |
| 105 | + |
| 106 | +```python |
| 107 | +from pipecat.services.moonshine.stt import MoonshineSTTService, Model |
| 108 | +from pipecat.transcriptions.language import Language |
| 109 | + |
| 110 | +stt = MoonshineSTTService( |
| 111 | + settings=MoonshineSTTService.Settings( |
| 112 | + model=Model.SMALL_STREAMING, |
| 113 | + language=Language.ES, |
| 114 | + ), |
| 115 | +) |
| 116 | +``` |
| 117 | + |
| 118 | +### With Model as String |
| 119 | + |
| 120 | +```python |
| 121 | +from pipecat.services.moonshine.stt import MoonshineSTTService |
| 122 | + |
| 123 | +stt = MoonshineSTTService( |
| 124 | + settings=MoonshineSTTService.Settings( |
| 125 | + model="base", |
| 126 | + ), |
| 127 | +) |
| 128 | +``` |
| 129 | + |
| 130 | +## Notes |
| 131 | + |
| 132 | +- **First run downloads**: The selected model downloads from the Moonshine model hub on first use and is cached locally. Later runs load it from the cache. |
| 133 | +- **Segmented transcription**: `MoonshineSTTService` extends `SegmentedSTTService`, meaning it processes complete audio segments after VAD detects the user has stopped speaking. |
| 134 | +- **CPU-only**: Moonshine runs efficiently on CPU via ONNX Runtime, so no GPU is required. This makes it ideal for resource-constrained environments. |
| 135 | +- **Audio format**: Expects 16-bit mono PCM audio at a 16 kHz sample rate. |
| 136 | +- **Model variants**: The streaming-capable models (`TINY_STREAMING`, `BASE_STREAMING`, `SMALL_STREAMING`, `MEDIUM_STREAMING`) can be run in batch mode just like the non-streaming variants. The larger models (`SMALL_STREAMING`, `MEDIUM_STREAMING`) are only available in streaming form. |
| 137 | +- **Language support**: Moonshine supports multiple languages (English, Spanish, and others). The service uses the base language code (e.g., "en" from "en-US"). |
| 138 | +- **No external services**: Unlike API-based STT services, Moonshine requires no API keys or network connectivity after the initial model download. |
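
The audio-format and language notes above can be checked before audio reaches the service. The following is a minimal sketch using only the Python standard library. The helper names (`base_language_code`, `wav_matches_expected_format`, `make_silent_wav`) are hypothetical and are not part of Pipecat or Moonshine. The sketch also assumes the input arrives as a WAV payload, which the page does not state.

```python
import io
import wave

# Values taken from the "Audio format" note above.
EXPECTED_SAMPLE_RATE = 16_000  # Hz
EXPECTED_SAMPLE_WIDTH = 2      # bytes per sample (16-bit)
EXPECTED_CHANNELS = 1          # mono


def base_language_code(tag: str) -> str:
    """Hypothetical helper: reduce a tag such as "en-US" to its base code "en"."""
    return tag.replace("_", "-").split("-")[0].lower()


def wav_matches_expected_format(wav_bytes: bytes) -> bool:
    """Hypothetical helper: compare a WAV header against the expected PCM format."""
    with wave.open(io.BytesIO(wav_bytes), "rb") as wav:
        return (
            wav.getframerate() == EXPECTED_SAMPLE_RATE
            and wav.getsampwidth() == EXPECTED_SAMPLE_WIDTH
            and wav.getnchannels() == EXPECTED_CHANNELS
        )


def make_silent_wav(sample_rate: int, seconds: float = 0.1) -> bytes:
    """Build a short silent 16-bit mono WAV in memory, for demonstration only."""
    buffer = io.BytesIO()
    with wave.open(buffer, "wb") as wav:
        wav.setnchannels(1)
        wav.setsampwidth(2)
        wav.setframerate(sample_rate)
        wav.writeframes(b"\x00\x00" * int(sample_rate * seconds))
    return buffer.getvalue()
```

Calling `wav_matches_expected_format(make_silent_wav(16_000))` passes the check, while a 44.1 kHz payload fails it and would need resampling first. How Pipecat itself derives the base language code or validates audio may differ from this sketch.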