Commit 97bb695

Merge pull request #884 from pipecat-ai/docs/pr-4683
docs: update for pipecat PR #4683
2 parents f627789 + bb872b0 commit 97bb695

3 files changed

Lines changed: 140 additions & 0 deletions

Lines changed: 138 additions & 0 deletions
@@ -0,0 +1,138 @@
1+
---
2+
title: "Moonshine"
3+
description: "Speech-to-text service implementation using locally-downloaded Moonshine ONNX models"
4+
---
5+
6+
## Overview
7+
8+
`MoonshineSTTService` provides offline speech recognition using Moonshine's small, fast ASR models, which run locally on the CPU via ONNX Runtime. No GPU or API key is required. Models download once on first use and are cached locally, and audio is transcribed on the local machine, which suits privacy-sensitive applications.
9+
10+
<CardGroup cols={2}>
11+
<Card
12+
title="Moonshine STT API Reference"
13+
icon="code"
14+
href="https://reference-server.pipecat.ai/en/latest/api/pipecat.services.moonshine.stt.html"
15+
>
16+
Pipecat's API methods for Moonshine STT integration
17+
</Card>
18+
<Card
19+
title="Moonshine Example"
20+
icon="play"
21+
href="https://github.com/pipecat-ai/pipecat/blob/main/examples/voice/voice-moonshine.py"
22+
>
23+
Complete example with Moonshine STT
24+
</Card>
25+
<Card
26+
title="Moonshine Documentation"
27+
icon="book"
28+
href="https://github.com/moonshine-ai/moonshine"
29+
>
30+
Moonshine ASR model details and research
31+
</Card>
32+
<Card
33+
title="Moonshine Voice Package"
34+
icon="microphone"
35+
href="https://pypi.org/project/moonshine-voice/"
36+
>
37+
Python package for Moonshine models
38+
</Card>
39+
</CardGroup>
40+
41+
## Installation
42+
43+
```bash
uv add "pipecat-ai[moonshine]"
```
46+
47+
## Prerequisites
48+
49+
### Local Model Setup
50+
51+
Before using the Moonshine STT service, you need:
52+
53+
1. **Model Selection**: Choose an appropriate Moonshine model (`TINY`, `BASE`, `TINY_STREAMING`, `BASE_STREAMING`, `SMALL_STREAMING`, or `MEDIUM_STREAMING`)
54+
2. **Storage Space**: Ensure sufficient disk space for model downloads (models are cached after first use)
55+
3. **CPU Resources**: Moonshine runs efficiently on CPU via ONNX Runtime
56+
57+
### Configuration Options
58+
59+
- **Model Size**: Balance between accuracy and performance based on your needs
60+
- **Language Support**: Moonshine supports English, Spanish, and other languages
61+
- **No API Key**: Runs entirely locally for complete privacy
62+
63+
<Tip>
64+
No API keys or GPU required - Moonshine runs efficiently on CPU for complete privacy.
65+
</Tip>
66+
67+
## Configuration
68+
69+
<ParamField path="settings" type="MoonshineSTTService.Settings" default="None">
70+
Runtime-configurable settings for the STT service. See [MoonshineSTTSettings](#moonshinesttsettings) below.
71+
</ParamField>
72+
73+
## MoonshineSTTSettings
74+
75+
Runtime-configurable settings passed via the `settings` constructor argument using `MoonshineSTTService.Settings(...)`. These can be updated mid-conversation with `STTUpdateSettingsFrame`. See [Service Settings](/pipecat/fundamentals/service-settings) for details.
76+
77+
| Parameter  | Type              | Default                 | Description |
| ---------- | ----------------- | ----------------------- | ----------- |
| `model`    | `str \| Model`    | `Model.SMALL_STREAMING` | Moonshine model architecture. Available models: `TINY`, `BASE`, `TINY_STREAMING`, `BASE_STREAMING`, `SMALL_STREAMING` (default), `MEDIUM_STREAMING`. |
| `language` | `Language \| str` | `Language.EN`           | Language for transcription. Moonshine supports English, Spanish, and other languages. The base language code is used (e.g., "en" from "en-US"). _(Inherited from base STT settings.)_ |
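
The base-language rule in the `language` row can be illustrated with a small standalone sketch. `base_language_code` is a hypothetical helper written for this example, not Pipecat's implementation; it only shows the kind of reduction the description refers to.

```python
def base_language_code(tag: str) -> str:
    """Return the primary subtag of a language tag, lowercased.

    Hypothetical helper for illustration; not part of Pipecat.
    """
    # Treat "_" like "-" so both "en-US" and "en_US" reduce to "en".
    primary = tag.replace("_", "-").split("-", 1)[0]
    return primary.strip().lower()


print(base_language_code("en-US"))  # en
print(base_language_code("es"))     # es
```

Under this assumption, regional variants such as "es-MX" and "es-ES" would both select the same Spanish model.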
81+
82+
## Usage
83+
84+
### Basic Setup
85+
86+
```python
from pipecat.services.moonshine.stt import MoonshineSTTService

# Uses the default settings (Model.SMALL_STREAMING, Language.EN).
stt = MoonshineSTTService()
```
91+
92+
### With Custom Model
93+
94+
```python
from pipecat.services.moonshine.stt import MoonshineSTTService, Model

# Override the default model.
stt = MoonshineSTTService(
    settings=MoonshineSTTService.Settings(
        model=Model.MEDIUM_STREAMING,
    ),
)
```
103+
104+
### With Custom Language
105+
106+
```python
from pipecat.services.moonshine.stt import MoonshineSTTService, Model
from pipecat.transcriptions.language import Language

# Transcribe Spanish with the default model.
stt = MoonshineSTTService(
    settings=MoonshineSTTService.Settings(
        model=Model.SMALL_STREAMING,
        language=Language.ES,
    ),
)
```
117+
118+
### With Model as String
119+
120+
```python
from pipecat.services.moonshine.stt import MoonshineSTTService

# The model can also be given as a string instead of a Model member.
stt = MoonshineSTTService(
    settings=MoonshineSTTService.Settings(
        model="base",
    ),
)
```
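
When the model name comes from a config file or an environment variable, it can help to validate it before constructing the service. The following is a minimal sketch under stated assumptions: `normalize_model_name` and `KNOWN_MODELS` are hypothetical names introduced here, the set only mirrors the six models listed in the settings table, and the exact string forms Pipecat accepts may differ.

```python
# Model names copied from the settings table; not read from Pipecat itself.
KNOWN_MODELS = frozenset(
    {
        "TINY",
        "BASE",
        "TINY_STREAMING",
        "BASE_STREAMING",
        "SMALL_STREAMING",
        "MEDIUM_STREAMING",
    }
)


def normalize_model_name(name: str) -> str:
    """Map a user-supplied name such as "small-streaming" to "SMALL_STREAMING".

    Hypothetical helper for illustration; raises ValueError for unknown names.
    """
    candidate = name.strip().upper().replace("-", "_")
    if candidate not in KNOWN_MODELS:
        raise ValueError(
            f"Unknown Moonshine model {name!r}; expected one of {sorted(KNOWN_MODELS)}"
        )
    return candidate


print(normalize_model_name("base"))             # BASE
print(normalize_model_name("small-streaming"))  # SMALL_STREAMING
```

Failing early on a typo like "medum" avoids discovering the problem only when the service tries to load the model.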
129+
130+
## Notes
131+
132+
- **First run downloads**: The selected model downloads from the Moonshine model hub on first use and is cached locally. Later runs load it from the cache.
133+
- **Segmented transcription**: `MoonshineSTTService` extends `SegmentedSTTService`, meaning it processes complete audio segments after VAD detects the user has stopped speaking.
134+
- **CPU-only**: Moonshine runs efficiently on CPU via ONNX Runtime, so no GPU is required. This makes it ideal for resource-constrained environments.
135+
- **Audio format**: Expects 16-bit mono PCM audio at a 16 kHz sample rate.
136+
- **Model variants**: The streaming-capable models (`TINY_STREAMING`, `BASE_STREAMING`, `SMALL_STREAMING`, `MEDIUM_STREAMING`) can be run in batch mode just like the non-streaming variants. The larger models (`SMALL_STREAMING`, `MEDIUM_STREAMING`) are only available in streaming form.
137+
- **Language support**: Moonshine supports multiple languages (English, Spanish, and others). The service uses the base language code (e.g., "en" from "en-US").
138+
- **No external services**: Unlike API-based STT services, Moonshine requires no API keys and no network connectivity after the initial model download.
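
The segmented-transcription and audio-format notes can be tied together with a small standalone sketch. It assumes 16-bit mono PCM at 16 kHz as stated in the notes; `SegmentBuffer` is a hypothetical class written for this example and is not how Pipecat's `SegmentedSTTService` is implemented. It only shows the idea of collecting audio while the user speaks and handing over one complete segment when speech ends.

```python
SAMPLE_RATE_HZ = 16_000  # 16 kHz, per the audio format note
BYTES_PER_SAMPLE = 2     # 16-bit PCM
CHANNELS = 1             # mono


class SegmentBuffer:
    """Collects PCM chunks during speech and releases them as one segment."""

    def __init__(self) -> None:
        self._chunks: list[bytes] = []

    def add_chunk(self, pcm: bytes) -> None:
        # Called for each audio chunk received while the user is speaking.
        self._chunks.append(pcm)

    def finish(self) -> tuple[bytes, float]:
        """Return (audio, duration in seconds) and reset the buffer.

        Intended to be called once voice activity detection reports
        that the user has stopped speaking.
        """
        audio = b"".join(self._chunks)
        self._chunks.clear()
        duration = len(audio) / (SAMPLE_RATE_HZ * BYTES_PER_SAMPLE * CHANNELS)
        return audio, duration


buffer = SegmentBuffer()
for _ in range(50):                      # fifty 20 ms chunks of silence
    buffer.add_chunk(b"\x00\x00" * 320)  # 320 samples = 20 ms at 16 kHz
audio, seconds = buffer.finish()
print(len(audio), seconds)  # 32000 1.0
```

At this format one second of audio is 32,000 bytes, which gives a quick way to sanity-check that incoming audio matches what the service expects.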

api-reference/server/services/supported-services.mdx

Lines changed: 1 addition & 0 deletions
@@ -51,6 +51,7 @@ Speech-to-Text services receive and audio input and output transcriptions.
51 51
| [Gradium](/api-reference/server/services/stt/gradium) | `uv add "pipecat-ai[gradium]"` |
52 52
| [Groq (Whisper)](/api-reference/server/services/stt/groq) | `uv add "pipecat-ai[groq]"` |
53 53
| [Mistral](/api-reference/server/services/stt/mistral) | `uv add "pipecat-ai[mistral]"` |
54+
| [Moonshine](/api-reference/server/services/stt/moonshine) | `uv add "pipecat-ai[moonshine]"` |
54 55
| [NVIDIA](/api-reference/server/services/stt/nvidia) | `uv add "pipecat-ai[nvidia]"` |
55 56
| [OpenAI](/api-reference/server/services/stt/openai) | `uv add "pipecat-ai[openai]"` |
56 57
| [Sarvam](/api-reference/server/services/stt/sarvam) | `uv add "pipecat-ai[sarvam]"` |

docs.json

Lines changed: 1 addition & 0 deletions
@@ -358,6 +358,7 @@
358 358
"api-reference/server/services/stt/gradium",
359 359
"api-reference/server/services/stt/groq",
360 360
"api-reference/server/services/stt/mistral",
361+
"api-reference/server/services/stt/moonshine",
361 362
"api-reference/server/services/stt/nvidia",
362 363
"api-reference/server/services/stt/openai",
363 364
"api-reference/server/services/stt/sarvam",
