Commit 5752c50

Merge pull request #939 from pipecat-ai/docs/pr-4054
docs: update for pipecat PR #4054
2 parents d99559e + 6a6ea6c commit 5752c50

5 files changed

Lines changed: 306 additions & 2 deletions

api-reference/server/services/llm/together.mdx

Lines changed: 2 additions & 2 deletions
@@ -95,7 +95,7 @@ from pipecat.services.together import TogetherLLMService
9595

9696
llm = TogetherLLMService(
9797
    api_key=os.getenv("TOGETHER_API_KEY"),
98-
    model="meta-llama/Meta-Llama-3.1-8B-Instruct-Turbo",
98+
    model="zai-org/GLM-5.1",
9999
)
100100
```
101101

@@ -107,7 +107,7 @@ from pipecat.services.together import TogetherLLMService
107107
llm = TogetherLLMService(
108108
    api_key=os.getenv("TOGETHER_API_KEY"),
109109
    settings=TogetherLLMService.Settings(
110-
        model="meta-llama/Meta-Llama-3.1-8B-Instruct-Turbo",
110+
        model="zai-org/GLM-5.1",
111111
        temperature=0.7,
112112
        top_p=0.9,
113113
        max_completion_tokens=1024,
api-reference/server/services/stt/together.mdx

Lines changed: 148 additions & 0 deletions
@@ -0,0 +1,148 @@
1+
---
2+
title: "Together AI"
3+
description: "Speech-to-text service using Together AI's real-time transcription API"
4+
---
5+
6+
## Overview
7+
8+
`TogetherSTTService` provides real-time speech recognition using Together AI's WebSocket API with OpenAI-compatible speech-to-text endpoints. It supports streaming transcription with interim results and automatic reconnection.
9+
10+
<CardGroup cols={2}>
11+
<Card
12+
title="Together AI STT API Reference"
13+
icon="code"
14+
href="https://reference-server.pipecat.ai/en/latest/api/pipecat.services.together.stt.html"
15+
>
16+
Pipecat's API methods for Together AI STT
17+
</Card>
18+
<Card
19+
title="Example Implementation"
20+
icon="play"
21+
href="https://github.com/pipecat-ai/pipecat/blob/main/examples/transcription/transcription-together.py"
22+
>
23+
Complete transcription example
24+
</Card>
25+
<Card
26+
title="Together AI Documentation"
27+
icon="book"
28+
href="https://docs.together.ai/reference/audio-transcriptions-realtime"
29+
>
30+
Official Together AI Realtime API documentation
31+
</Card>
32+
<Card
33+
title="Together AI Platform"
34+
icon="microphone"
35+
href="https://together.ai/"
36+
>
37+
Access models and manage API keys
38+
</Card>
39+
</CardGroup>
40+
41+
## Installation
42+
43+
To use Together AI STT services, install the required dependencies:
44+
45+
```bash
46+
uv add "pipecat-ai[together]"
47+
```
48+
49+
## Prerequisites
50+
51+
### Together AI Account Setup
52+
53+
Before using Together AI STT services, you need:
54+
55+
1. **Together AI Account**: Sign up at [Together AI](https://together.ai/)
56+
2. **API Key**: Generate an API key from your account dashboard
57+
3. **Model Selection**: Choose from available transcription models
58+
59+
### Required Environment Variables
60+
61+
- `TOGETHER_API_KEY`: Your Together AI API key for authentication
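
As a minimal sketch (the helper name `require_api_key` is hypothetical and not part of Pipecat), the key can be read once at startup so that a missing variable fails fast instead of passing `None` to the service:

```python
import os


def require_api_key(name: str = "TOGETHER_API_KEY") -> str:
    """Return the named environment variable or raise a clear error."""
    value = os.getenv(name)
    if not value:
        # os.getenv returns None when the variable is unset; treat empty as unset too.
        raise RuntimeError(f"Environment variable {name} is not set")
    return value
```

The returned string can then be passed as `api_key` when constructing the service.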
62+
63+
## Configuration
64+
65+
<ParamField path="api_key" type="str" required>
66+
Together AI API key for authentication.
67+
</ParamField>
68+
69+
<ParamField path="sample_rate" type="int" default="None">
70+
Audio sample rate in Hz. When `None`, uses the pipeline's configured sample
71+
rate.
72+
</ParamField>
73+
74+
<ParamField path="base_url" type="str" default="wss://api.together.ai/v1">
75+
WebSocket base URL for Together AI API.
76+
</ParamField>
77+
78+
<ParamField path="settings" type="TogetherSTTService.Settings" default="None">
79+
Runtime-configurable settings. See [Settings](#settings) below.
80+
</ParamField>
81+
82+
<ParamField path="ttfs_p99_latency" type="float" default="1.00">
83+
P99 latency from speech end to final transcript in seconds. Override for your
84+
deployment. See
85+
[https://github.com/pipecat-ai/stt-benchmark](https://github.com/pipecat-ai/stt-benchmark).
86+
</ParamField>
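
One way to choose a value for `ttfs_p99_latency` is to measure the speech-end-to-final-transcript latency in your own deployment and take the 99th percentile. The sketch below uses the nearest-rank method; the function name is illustrative only and is not part of Pipecat or the benchmark tool.

```python
def p99_latency(samples: list[float]) -> float:
    """Nearest-rank 99th percentile of latency samples, in seconds."""
    if not samples:
        raise ValueError("need at least one latency sample")
    ordered = sorted(samples)
    # Nearest-rank: smallest value with at least 99% of samples at or below it.
    # Integer ceiling of 0.99 * n, written to avoid floating-point rounding.
    rank = -(-99 * len(ordered) // 100)
    return ordered[rank - 1]
```

The result, in seconds, is what you would pass as `ttfs_p99_latency`.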
87+
88+
### Settings
89+
90+
Runtime-configurable settings passed via the `settings` constructor argument using `TogetherSTTService.Settings(...)`. These can be updated mid-conversation with `STTUpdateSettingsFrame`. See [Service Settings](/pipecat/fundamentals/service-settings) for details.
91+
92+
| Parameter | Type | Default | Description |
93+
| ---------- | ----------------- | --------------------------- | ----------------------------------------- |
94+
| `model` | `str` | `"openai/whisper-large-v3"` | Model identifier. _(Inherited.)_ |
95+
| `language` | `Language \| str` | `Language.EN` | Language for transcription. _(Inherited.)_ |
96+
97+
## Usage
98+
99+
### Basic Setup
100+
101+
```python
102+
import os
103+
from pipecat.services.together import TogetherSTTService
104+
105+
stt = TogetherSTTService(
106+
    api_key=os.getenv("TOGETHER_API_KEY"),
107+
)
108+
```
109+
110+
### With Custom Settings
111+
112+
```python
113+
import os
from pipecat.services.together import TogetherSTTService
114+
from pipecat.transcriptions.language import Language
115+
116+
stt = TogetherSTTService(
117+
    api_key=os.getenv("TOGETHER_API_KEY"),
118+
    settings=TogetherSTTService.Settings(
119+
        model="openai/whisper-large-v3",
120+
        language=Language.EN,
121+
    ),
122+
)
123+
```
124+
125+
### In a Voice Pipeline
126+
127+
```python
128+
import os
from pipecat.audio.vad.silero import SileroVADAnalyzer
129+
from pipecat.pipeline.pipeline import Pipeline
130+
from pipecat.processors.audio.vad_processor import VADProcessor
131+
from pipecat.services.together import TogetherSTTService
132+
133+
stt = TogetherSTTService(api_key=os.getenv("TOGETHER_API_KEY"))
134+
vad_processor = VADProcessor(vad_analyzer=SileroVADAnalyzer())
135+
136+
pipeline = Pipeline([
137+
    transport.input(),  # `transport` is created elsewhere in your application
138+
    vad_processor,
139+
    stt,
140+
    # ... rest of pipeline
141+
])
142+
```
143+
144+
## Notes
145+
146+
- Together AI's STT service uses an OpenAI-compatible WebSocket protocol for real-time transcription.
147+
- The service automatically handles reconnection on connection errors.
148+
- Transcription is committed when `VADUserStoppedSpeakingFrame` is received.
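
To make the commit behaviour concrete, the sketch below builds the two client messages that an OpenAI-style realtime transcription session typically sends: audio appended as base64, then a commit once the user stops speaking. The event names follow OpenAI's Realtime API (`input_audio_buffer.append`, `input_audio_buffer.commit`). That Together AI's endpoint accepts exactly these events is an assumption here, and the helper names are hypothetical rather than Pipecat internals.

```python
import base64
import json


def append_message(pcm_chunk: bytes) -> str:
    """Serialize one chunk of raw PCM audio as an append event."""
    return json.dumps({
        "type": "input_audio_buffer.append",
        "audio": base64.b64encode(pcm_chunk).decode("ascii"),
    })


def commit_message() -> str:
    """Serialize the event asking the server to finalize the buffered audio."""
    return json.dumps({"type": "input_audio_buffer.commit"})
```

In this model, an append message would be sent for each audio frame, and the commit message would be sent when `VADUserStoppedSpeakingFrame` arrives.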

api-reference/server/services/supported-services.mdx

Lines changed: 2 additions & 0 deletions
@@ -155,6 +155,7 @@ Speech-to-Text services receive audio input and output transcriptions.
155155
| [Smallest](/api-reference/server/services/stt/smallest) | `uv add "pipecat-ai[smallest]"` | Pipecat |
156156
| [Soniox](/api-reference/server/services/stt/soniox) | `uv add "pipecat-ai[soniox]"` | Pipecat |
157157
| [Speechmatics](/api-reference/server/services/stt/speechmatics) | `uv add "pipecat-ai[speechmatics]"` | Pipecat |
158+
| [Together AI](/api-reference/server/services/stt/together) | `uv add "pipecat-ai[together]"` | Pipecat |
158159
| [Uplift AI](/api-reference/server/services/stt/upliftai) | `uv pip install git+https://github.com/havkerboi123/pipecat-upliftai-stt.git` | Community |
159160
| [Whisper](/api-reference/server/services/stt/whisper) | `uv add "pipecat-ai[whisper]"` | Pipecat |
160161
| [xAI](/api-reference/server/services/stt/xai) | `uv add "pipecat-ai[xai]"` | Pipecat |
@@ -201,6 +202,7 @@ Text-to-Speech services receive text input and output audio streams or chunks.
201202
| [Soniox](/api-reference/server/services/tts/soniox) | `uv add "pipecat-ai[soniox]"` | Pipecat |
202203
| [Speechmatics](/api-reference/server/services/tts/speechmatics) | `uv add "pipecat-ai[speechmatics]"` | Pipecat |
203204
| [Supertonic](/api-reference/server/services/tts/supertonic) | `uv add pipecat-supertonic` | Community |
205+
| [Together AI](/api-reference/server/services/tts/together) | `uv add "pipecat-ai[together]"` | Pipecat |
204206
| [Typecast](/api-reference/server/services/tts/typecast) | `uv add pipecat-ai-typecast` | Community |
205207
| [Uplift AI](/api-reference/server/services/tts/upliftai) | `uv pip install git+https://github.com/havkerboi123/pipecat-upliftai-tts.git` | Community |
206208
| [Voice.ai](/api-reference/server/services/tts/voiceai) | `uv pip install git+https://github.com/voice-ai/voice-ai-pipecat-tts.git` | Community |
api-reference/server/services/tts/together.mdx

Lines changed: 152 additions & 0 deletions
@@ -0,0 +1,152 @@
1+
---
2+
title: "Together AI"
3+
description: "Text-to-speech service using Together AI's real-time WebSocket API"
4+
---
5+
6+
## Overview
7+
8+
`TogetherTTSService` provides real-time text-to-speech using Together AI's WebSocket API. It supports streaming synthesis with configurable voice and model options, interruption handling, and automatic reconnection.
9+
10+
<CardGroup cols={2}>
11+
<Card
12+
title="Together AI TTS API Reference"
13+
icon="code"
14+
href="https://reference-server.pipecat.ai/en/latest/api/pipecat.services.together.tts.html"
15+
>
16+
Pipecat's API methods for Together AI TTS
17+
</Card>
18+
<Card
19+
title="Example Implementation"
20+
icon="play"
21+
href="https://github.com/pipecat-ai/pipecat/blob/main/examples/voice/voice-together.py"
22+
>
23+
Complete voice bot example
24+
</Card>
25+
<Card
26+
title="Together AI Documentation"
27+
icon="book"
28+
href="https://docs.together.ai/reference/audio-speech-websocket"
29+
>
30+
Official Together AI TTS WebSocket API documentation
31+
</Card>
32+
<Card
33+
title="Together AI Platform"
34+
icon="microphone"
35+
href="https://together.ai/"
36+
>
37+
Access models and manage API keys
38+
</Card>
39+
</CardGroup>
40+
41+
## Installation
42+
43+
To use Together AI TTS services, install the required dependencies:
44+
45+
```bash
46+
uv add "pipecat-ai[together]"
47+
```
48+
49+
## Prerequisites
50+
51+
### Together AI Account Setup
52+
53+
Before using Together AI TTS services, you need:
54+
55+
1. **Together AI Account**: Sign up at [Together AI](https://together.ai/)
56+
2. **API Key**: Generate an API key from your account dashboard
57+
3. **Model Selection**: Choose from available TTS models and voices
58+
59+
### Required Environment Variables
60+
61+
- `TOGETHER_API_KEY`: Your Together AI API key for authentication
62+
63+
## Configuration
64+
65+
<ParamField path="api_key" type="str" required>
66+
Together AI API key for authentication.
67+
</ParamField>
68+
69+
<ParamField
70+
path="url"
71+
type="str"
72+
default="wss://api.together.ai/v1/audio/speech/websocket"
73+
>
74+
WebSocket URL for Together AI TTS API.
75+
</ParamField>
76+
77+
<ParamField path="sample_rate" type="int" default="24000">
78+
Output sample rate for emitted PCM frames. Together AI streams at 24 kHz and
79+
does not support other rates.
80+
</ParamField>
81+
82+
<ParamField path="settings" type="TogetherTTSService.Settings" default="None">
83+
Runtime-configurable settings. See [Settings](#settings) below.
84+
</ParamField>
85+
86+
### Settings
87+
88+
Runtime-configurable settings passed via the `settings` constructor argument using `TogetherTTSService.Settings(...)`. These can be updated mid-conversation with `TTSUpdateSettingsFrame`. See [Service Settings](/pipecat/fundamentals/service-settings) for details.
89+
90+
| Parameter | Type | Default | Description |
91+
| -------------------- | ----------------- | ---------------------- | ------------------------------------------------------------- |
92+
| `model` | `str` | `"hexgrad/Kokoro-82M"` | Model identifier. _(Inherited.)_ |
93+
| `voice` | `str` | `"af_heart"` | Voice identifier. _(Inherited.)_ |
94+
| `language` | `Language \| str` | `Language.EN` | Language for synthesis. _(Inherited.)_ |
95+
| `max_partial_length` | `int \| None` | `None` | Maximum partial text length for streaming. `None` for no cap. |
96+
97+
## Usage
98+
99+
### Basic Setup
100+
101+
```python
102+
import os
103+
from pipecat.services.together import TogetherTTSService
104+
105+
tts = TogetherTTSService(
106+
    api_key=os.getenv("TOGETHER_API_KEY"),
107+
)
108+
```
109+
110+
### With Custom Settings
111+
112+
```python
113+
import os
from pipecat.services.together import TogetherTTSService
114+
from pipecat.transcriptions.language import Language
115+
116+
tts = TogetherTTSService(
117+
    api_key=os.getenv("TOGETHER_API_KEY"),
118+
    settings=TogetherTTSService.Settings(
119+
        model="hexgrad/Kokoro-82M",
120+
        voice="af_heart",
121+
        language=Language.EN,
122+
    ),
123+
)
124+
```
125+
126+
### In a Voice Pipeline
127+
128+
```python
129+
import os
from pipecat.pipeline.pipeline import Pipeline
130+
from pipecat.services.together import TogetherTTSService
131+
132+
tts = TogetherTTSService(
133+
    api_key=os.getenv("TOGETHER_API_KEY"),
134+
    settings=TogetherTTSService.Settings(
135+
        voice="af_heart",
136+
        model="hexgrad/Kokoro-82M",
137+
    ),
138+
)
139+
140+
pipeline = Pipeline([
141+
    # ... upstream processors
142+
    llm,
143+
    tts,
144+
    transport.output(),  # `llm` and `transport` are created elsewhere in your application
145+
])
146+
```
147+
148+
## Notes
149+
150+
- Together AI TTS streams audio at 24 kHz. The service outputs 24 kHz signed 16-bit mono PCM; the transport layer resamples to the pipeline's configured rate if needed.
151+
- The service supports interruption handling and automatically clears the text buffer when interrupted.
152+
- Audio is streamed incrementally via WebSocket deltas for low-latency synthesis.
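
Because the output format is fixed (24 kHz, signed 16-bit, mono), the playback duration of any audio chunk follows directly from its byte length. A small sketch, with a hypothetical helper name:

```python
SAMPLE_RATE_HZ = 24_000  # Together AI TTS output rate, per the notes above
BYTES_PER_SAMPLE = 2     # signed 16-bit PCM
CHANNELS = 1             # mono


def pcm_duration_seconds(num_bytes: int) -> float:
    """Duration of a raw PCM buffer in the 24 kHz s16 mono format."""
    bytes_per_second = SAMPLE_RATE_HZ * BYTES_PER_SAMPLE * CHANNELS
    return num_bytes / bytes_per_second
```

One second of audio is therefore 48,000 bytes, which can be useful when sizing buffers or estimating how much speech has been received so far.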

docs.json

Lines changed: 2 additions & 0 deletions
@@ -374,6 +374,7 @@
374374
"api-reference/server/services/stt/smallest",
375375
"api-reference/server/services/stt/soniox",
376376
"api-reference/server/services/stt/speechmatics",
377+
"api-reference/server/services/stt/together",
377378
"api-reference/server/services/stt/upliftai",
378379
"api-reference/server/services/stt/whisper",
379380
"api-reference/server/services/stt/xai"
@@ -448,6 +449,7 @@
448449
"api-reference/server/services/tts/soniox",
449450
"api-reference/server/services/tts/speechmatics",
450451
"api-reference/server/services/tts/supertonic",
452+
"api-reference/server/services/tts/together",
451453
"api-reference/server/services/tts/tts-cache",
452454
"api-reference/server/services/tts/typecast",
453455
"api-reference/server/services/tts/upliftai",
