Commit a3fb63f

Merge pull request #888 from Vonage/feature/VIDPA-1417/update_docs_with_captions_and_ind
Adding docs for new Vonage WebRTC transport features in PR #4686
2 parents 020e4a1 + 5e7fec1 commit a3fb63f

1 file changed

Lines changed: 42 additions & 3 deletions

api-reference/server/services/transport/vonage.mdx

@@ -83,8 +83,10 @@ Before using `VonageVideoConnectorTransport`, you need:
 
 - **Vonage Video API**: Integrate with Vonage's managed WebRTC infrastructure
 - **Audio and Video I/O**: Bidirectional audio and video streaming
+- **Captions**: Receive real-time transcription frames from session participants
+- **Individual Audio Streams**: Subscribe to per-participant audio in addition to the session-level mix
 - **Participant Management**: Stream subscription and participant lifecycle events
-- **Auto-subscription**: Optionally auto-subscribe to incoming audio and video streams
+- **Auto-subscription**: Optionally auto-subscribe to incoming audio, video, and captions streams
 - **Interruption Handling**: Automatic media buffer clearing on pipeline interruptions
 
 ## Configuration
@@ -143,6 +145,16 @@ Inherits all parameters from [TransportParams](/api-reference/server/services/tr
 participants.
 </ParamField>
 
+<ParamField path="captions_in_enabled" type="bool" default="False">
+Whether to enable captions input. When enabled, the transport will process
+incoming transcription frames from subscribers.
+</ParamField>
+
+<ParamField path="captions_in_auto_subscribe" type="bool" default="False">
+Whether to automatically subscribe to incoming captions streams from session
+participants. Requires `captions_in_enabled` to be `True`.
+</ParamField>
+
 <ParamField
 path="video_in_preferred_resolution"
 type="tuple[int, int]"
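
The hunk above documents that `captions_in_auto_subscribe` requires `captions_in_enabled` to be `True`. The diff does not say whether the transport itself rejects the invalid combination, so a caller may want to check it before constructing the params. A minimal sketch, assuming a hypothetical stand-in dataclass (`CaptionParams`) and helper (`caption_config_problems`) that are not part of Pipecat:

```python
from dataclasses import dataclass


@dataclass
class CaptionParams:
    """Hypothetical stand-in holding only the two caption flags from the diff.

    This is NOT the real VonageVideoConnectorTransportParams class.
    """

    captions_in_enabled: bool = False
    captions_in_auto_subscribe: bool = False


def caption_config_problems(params: CaptionParams) -> list[str]:
    """Return human-readable problems with the caption flags (empty if none)."""
    problems: list[str] = []
    # Auto-subscribe is documented as requiring captions input to be enabled.
    if params.captions_in_auto_subscribe and not params.captions_in_enabled:
        problems.append(
            "captions_in_auto_subscribe=True requires captions_in_enabled=True"
        )
    return problems
```

Returning a list rather than raising keeps the check usable both in startup validation and in tests; raising `ValueError` on a non-empty list would be an equally reasonable choice.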
@@ -171,7 +183,7 @@ Inherits all parameters from [TransportParams](/api-reference/server/services/tr
 
 ### SubscribeSettings
 
-Used with `subscribe_to_stream()` to control per-stream subscription quality when `audio_in_auto_subscribe` or `video_in_auto_subscribe` are disabled.
+Used with `subscribe_to_stream()` to control per-stream subscription quality when `audio_in_auto_subscribe`, `video_in_auto_subscribe`, or `captions_in_auto_subscribe` are disabled.
 
 <ParamField path="subscribe_to_audio" type="bool" default="True">
 Whether to subscribe to the stream's audio track.
@@ -181,6 +193,10 @@ Used with `subscribe_to_stream()` to control per-stream subscription quality whe
 Whether to subscribe to the stream's video track.
 </ParamField>
 
+<ParamField path="subscribe_to_captions" type="bool" default="False">
+Whether to subscribe to the stream's captions track.
+</ParamField>
+
 <ParamField path="preferred_resolution" type="tuple[int, int]" default="None">
 Preferred `(width, height)` resolution for the subscribed video track. The
 server provides the closest available quality if the exact resolution is
@@ -227,7 +243,7 @@ See the [complete example](https://github.com/pipecat-ai/pipecat/blob/main/examp
 
 ### Subscribing to streams manually
 
-When `audio_in_auto_subscribe` or `video_in_auto_subscribe` is disabled, subscribe to a specific participant's stream with `subscribe_to_stream()`, passing [SubscribeSettings](#subscribesettings) to control which tracks are received and at what quality. The `streamId` is available from the `on_participant_joined` event data.
+When `audio_in_auto_subscribe`, `video_in_auto_subscribe`, or `captions_in_auto_subscribe` is disabled, subscribe to a specific participant's stream with `subscribe_to_stream()`, passing [SubscribeSettings](#subscribesettings) to control which tracks are received and at what quality. The `streamId` is available from the `on_participant_joined` event data.
 
 ```python
 from pipecat.transports.vonage.video_connector import SubscribeSettings
@@ -237,12 +253,35 @@ await transport.subscribe_to_stream(
     params=SubscribeSettings(
         subscribe_to_audio=True,
         subscribe_to_video=True,
+        subscribe_to_captions=True,
         preferred_resolution=(1280, 720),
         preferred_framerate=30,
     ),
 )
 ```
 
+### Receiving captions
+
+Enable captions to receive real-time `TranscriptionFrame` and `InterimTranscriptionFrame` from participants. Each frame includes the `user_id` (stream ID) of the speaker.
+
+```python
+transport = VonageVideoConnectorTransport(
+    application_id,
+    session_id,
+    token,
+    VonageVideoConnectorTransportParams(
+        audio_in_enabled=True,
+        audio_out_enabled=True,
+        captions_in_enabled=True,
+        captions_in_auto_subscribe=True,
+    ),
+)
+```
+
+### Individual audio streams
+
+By default, audio input is received as a session-level mix of all participants. When you subscribe to a stream (either manually or via auto-subscribe), the transport also delivers per-subscriber `UserAudioRawFrame` frames with a `user_id` field identifying the source participant. This enables use cases like speaker diarization or per-participant processing.
+
 ## Event Handlers
 
 `VonageVideoConnectorTransport` provides event handlers for session lifecycle, participant stream management, and subscriber connectivity. Register handlers using the `@event_handler` decorator on the transport instance.
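
The "Receiving captions" and "Individual audio streams" sections added in this diff both rely on the `user_id` (stream ID) carried by each frame. The sketch below illustrates one way downstream code might group captions and audio per participant. It uses hypothetical stand-in classes (`CaptionFrame`, `UserAudioFrame`, `ParticipantRouter`) rather than the real Pipecat frame types, and it models interim versus final captions with a boolean flag, whereas the diff describes them as two separate frame classes.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass
class CaptionFrame:
    """Hypothetical stand-in for TranscriptionFrame / InterimTranscriptionFrame."""

    user_id: str  # stream ID of the speaker
    text: str
    interim: bool = False  # assumption: a flag instead of a separate class


@dataclass
class UserAudioFrame:
    """Hypothetical stand-in for UserAudioRawFrame: raw audio plus its source."""

    user_id: str  # stream ID of the source participant
    audio: bytes


class ParticipantRouter:
    """Groups caption text and audio bytes by the stream ID in user_id."""

    def __init__(self) -> None:
        self.transcripts: dict[str, list[str]] = defaultdict(list)
        self.pending: dict[str, str] = {}
        self.audio: dict[str, bytearray] = defaultdict(bytearray)

    def handle(self, frame: object) -> None:
        if isinstance(frame, CaptionFrame):
            if frame.interim:
                # Interim text may still change, so keep only the latest version.
                self.pending[frame.user_id] = frame.text
            else:
                # A final caption supersedes any interim text for that speaker.
                self.pending.pop(frame.user_id, None)
                self.transcripts[frame.user_id].append(frame.text)
        elif isinstance(frame, UserAudioFrame):
            # Per-participant audio is appended to that participant's buffer.
            self.audio[frame.user_id].extend(frame.audio)


router = ParticipantRouter()
router.handle(CaptionFrame(user_id="stream-a", text="hel", interim=True))
router.handle(CaptionFrame(user_id="stream-a", text="hello"))
router.handle(UserAudioFrame(user_id="stream-b", audio=b"\x00\x01"))
```

In a real pipeline this logic would typically sit in a frame processor placed after the transport input; that wiring is omitted here because it depends on Pipecat APIs not shown in the diff.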