MediaSFU Translate

Live spoken translation for meetings and calls

Let speakers use their preferred language while listeners choose the voice language they hear. MediaSFU runs the real-time STT to LLM to TTS pipeline with curated language modes, optional AI notes, transcripts, and bring-your-own or system AI provider options.

50+ curated languages supported

Real-time voice translation (<1s latency)

Per-participant spoken & listen language selection

Three modes: any, allowlist, blocklist

Deepgram STT + OpenAI/Claude LLM + ElevenLabs TTS

System pool or bring-your-own AI credentials

Optional AI Notes and notes-only room mode

Voice overrides per language

Works with Meetings & Voice calls

Try Translation Before Setup

Use a meeting demo to understand the experience, then configure system or bring-your-own AI providers.

Meeting Translation Demo

Join a meeting-style demo and test spoken translation behavior.

Translation Guide

Configure STT, LLM, TTS, language modes, and AI notes behavior.

System Translation Settings

Use system translation configs when you do not want provider setup first.

Widget Preview

See how meeting and agent widgets can expose translated workflows.

Perfect For

Global Teams

Daily meetings where each participant speaks their native language and hears everyone else in their preferred language — automatically. Per-participant selection means no one-size-fits-all compromise.

EdTech

Make lectures and workshops accessible to international students in real time. Use allowlist mode to restrict rooms to class-relevant languages. Transcripts are included for study notes.

Healthcare

Telehealth consultations with patients who speak different languages. Allowlist mode ensures only medically relevant languages are available. System pool credentials simplify setup.

What You Can Do

Feature-rich tools designed for real-world workflows

Complete STT → LLM → TTS Pipeline

Three-stage real-time pipeline: speech is transcribed, translated by an LLM, then spoken back as natural-sounding audio.

STT (Speech-to-Text) — Deepgram for real-time transcription in the speaker's language
LLM (Translation) — OpenAI GPT-4, Claude, or other supported models for accurate translation
TTS (Text-to-Speech) — ElevenLabs for natural voice output in the listener's language
Sub-1-second end-to-end latency for conversational flow
Transcripts generated automatically as a side product of STT
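
The three stages above can be pictured as a chain of functions that fans one utterance out to each listener language. The sketch below is illustrative only: the stage callables are hypothetical stand-ins supplied by the caller, not MediaSFU or provider APIs, and skipping listeners who share the speaker's language is an assumption.

```python
from dataclasses import dataclass
from typing import Callable, Dict, Iterable

# Hypothetical stage signatures. Real providers (an STT service, an LLM,
# a TTS service) would sit behind these callables.
Transcribe = Callable[[bytes, str], str]    # (audio, spoken_lang) -> text
Translate = Callable[[str, str, str], str]  # (text, source, target) -> text
Synthesize = Callable[[str, str], bytes]    # (text, target_lang) -> audio


@dataclass
class PipelineResult:
    transcript: str                      # side product of the STT stage
    audio_by_language: Dict[str, bytes]  # one audio clip per listener language


def run_pipeline(audio: bytes, spoken_lang: str, listen_langs: Iterable[str],
                 stt: Transcribe, llm: Translate, tts: Synthesize) -> PipelineResult:
    """Run STT once, then translate and synthesize once per target language."""
    transcript = stt(audio, spoken_lang)
    audio_by_language: Dict[str, bytes] = {}
    for target in sorted(set(listen_langs)):
        if target == spoken_lang:
            continue  # assumption: same-language listeners hear the original audio
        translated = llm(transcript, spoken_lang, target)
        audio_by_language[target] = tts(translated, target)
    return PipelineResult(transcript, audio_by_language)
```

Running STT once and reusing the transcript for every target language is what makes the transcript available as a side product.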

Three Language Modes

Control exactly which languages are available in your rooms with flexible mode options.

Any mode — participants can speak or listen in any ISO 639-1 language (AI handles it)
Allowlist mode — restrict to specific languages (e.g., only en, es, fr, de)
Blocklist mode — allow everything except specific languages
Separate controls for spoken vs. listen languages
Per-participant language selection — each user picks their own
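
To make the three modes concrete, here is a minimal sketch of how a language picker could be filtered. The function name and the idea of a fixed catalog are assumptions for illustration; the platform's own resolution logic is not documented here.

```python
from typing import Iterable, List


def available_languages(mode: str, catalog: Iterable[str],
                        listed: Iterable[str] = ()) -> List[str]:
    """Return the language codes a participant may choose from.

    mode    -- "any", "allowlist", or "blocklist"
    catalog -- every code the picker could offer (assumed to be known up front)
    listed  -- the allowlist or blocklist entries; ignored in "any" mode
    """
    catalog_set = set(catalog)
    listed_set = set(listed)
    if mode == "any":
        allowed = catalog_set
    elif mode == "allowlist":
        allowed = catalog_set & listed_set   # only listed codes survive
    elif mode == "blocklist":
        allowed = catalog_set - listed_set   # everything except listed codes
    else:
        raise ValueError(f"unknown language mode: {mode!r}")
    return sorted(allowed)
```

Because spoken and listen languages have separate controls, a room would call a function like this twice, once per direction, with its own mode and list.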

Translation Config Management

Create named translation configurations with your preferred AI credentials and language settings.

Named configs with nicknames — create, update, and delete via REST API
Link your STT, LLM, and TTS credentials by nickname
Set translationOutputMode for audio or text-only room output
Add aiNoteTakerConfig defaults for reusable AI summaries and transcripts
Per-language voice overrides — choose specific TTS voices for specific languages
Extra config fields for provider-specific parameters
Attach configs to rooms by nickname at room creation
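
A config body along these lines could carry the settings listed above. Only translationOutputMode and aiNoteTakerConfig are names taken from this page; every other key (nickname, sttNickName, llmNickName, ttsNickName, voiceOverrides, extraConfig) and the "audio"/"text" values are assumptions, so check the API reference for the real schema.

```python
from typing import Any, Dict, Mapping, Optional


def build_translation_config(nickname: str, stt: str, llm: str, tts: str,
                             output_mode: str = "audio",
                             voice_overrides: Optional[Mapping[str, str]] = None,
                             note_taker: Optional[Mapping[str, Any]] = None,
                             extra: Optional[Mapping[str, Any]] = None) -> Dict[str, Any]:
    """Assemble a translation config payload (hypothetical field names)."""
    if output_mode not in ("audio", "text"):  # assumed value names
        raise ValueError(f"unsupported output mode: {output_mode!r}")
    config: Dict[str, Any] = {
        "nickname": nickname,        # hypothetical key
        "sttNickName": stt,          # hypothetical key: STT credential nickname
        "llmNickName": llm,          # hypothetical key: LLM credential nickname
        "ttsNickName": tts,          # hypothetical key: TTS credential nickname
        "translationOutputMode": output_mode,
    }
    if voice_overrides:
        config["voiceOverrides"] = dict(voice_overrides)  # language -> voice id
    if note_taker:
        config["aiNoteTakerConfig"] = dict(note_taker)
    if extra:
        config["extraConfig"] = dict(extra)               # provider-specific fields
    return config
```

Optional sections are left out of the payload entirely when unset, which keeps a minimal config small and makes defaults the server's responsibility.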

Optional AI Notes & Notes-Only Rooms

Use the same translation runtime to generate meeting summaries and transcripts, with or without translated audio playback.

enableAiNotes adds summaries and transcript artifacts to translated rooms
aiNotesOnly runs note capture without presenting translated audio playback to participants
Notes-only rooms normalize to text-only output while keeping the translation runtime available
Reusable note behavior belongs on the Translation Config via aiNoteTakerConfig
Public room and event-setting POST routes accept the optional room flags
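
The interaction between these flags can be summarized in a small resolver. This is a sketch under stated assumptions, not the platform's logic: it assumes aiNotesOnly implies notes are on, that notes require the translation runtime, and that output modes are named "audio" and "text".

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class EffectiveRoomMode:
    translation_runtime: bool  # whether the STT/LLM runtime is active
    notes: bool                # whether summaries and transcripts are produced
    output_mode: str           # "audio" or "text" (assumed value names)


def resolve_room_mode(support_translation: bool,
                      enable_ai_notes: bool = False,
                      ai_notes_only: bool = False,
                      config_output_mode: str = "audio") -> EffectiveRoomMode:
    """Combine the room flags into one effective mode."""
    if config_output_mode not in ("audio", "text"):
        raise ValueError(f"unsupported output mode: {config_output_mode!r}")
    if ai_notes_only:
        # Notes-only rooms keep the runtime but normalize to text-only output.
        return EffectiveRoomMode(True, True, "text")
    if not support_translation:
        # Assumption: without the runtime there is nothing to take notes from.
        return EffectiveRoomMode(False, False, config_output_mode)
    return EffectiveRoomMode(True, enable_ai_notes, config_output_mode)
```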

Flexible Credential Options

Use your own AI provider accounts or the MediaSFU system pool — no setup friction for getting started.

System pool — turn on "Use System Translation Configs" in Lite Dashboard Settings or Dashboard Settings.
Own credentials — bring your own Deepgram, OpenAI, and ElevenLabs accounts
System pool uses credits — STT, LLM, and TTS usage deducted from your balance
Own credentials — provider charges apply directly to your accounts
Top up system credits via the dashboard with credit packages

Room Integration & Constraints

Translation integrates directly into MediaSFU meeting and voice rooms.

Enable at room creation: set supportTranslation to true and attach your config
Add enableAiNotes for summaries, or aiNotesOnly for transcripts and notes without translated playback
Room capacity constraints: supportMaxRoom and supportFlexRoom are disabled when translation is active
Works with both video meetings and voice-only SIP calls
Translation settings persist for the room duration
System translation config uses reserved nicknames (MediaSFUSystemTranslationConfig)
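
A client can apply the capacity constraint before sending a room request. The sketch below uses the flag names from this page; whether the server rejects conflicting flags or silently disables them is not stated here, so forcing them off client-side is an assumption.

```python
from typing import Any, Dict

CAPACITY_FLAGS = ("supportMaxRoom", "supportFlexRoom")


def apply_translation_constraints(options: Dict[str, Any]) -> Dict[str, Any]:
    """Return a copy of the room options with translation constraints applied."""
    result = dict(options)  # never mutate the caller's dict
    if result.get("supportTranslation"):
        if not result.get("translationConfigNickName"):
            raise ValueError("translation rooms need translationConfigNickName")
        for flag in CAPACITY_FLAGS:
            result[flag] = False  # disabled while translation is active
    return result
```

Rooms without supportTranslation pass through unchanged, so the same helper can sit in front of every room creation call.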

Voice Quality & Customization

Natural-sounding output with per-language voice selection.

ElevenLabs voices — high-quality, natural-sounding speech synthesis
Per-language overrides — assign specific voices to specific languages
Voice stability and similarity settings for fine-tuning
Multiple voice options per language for gender and tone variety
50+ curated languages tested for accuracy — more available in "any" mode
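
Per-language overrides amount to a lookup with a fallback. In this sketch the VoiceChoice type, the voice ids, and the default stability and similarity values are all hypothetical placeholders; real values come from your TTS provider account.

```python
from dataclasses import dataclass
from typing import Mapping


@dataclass(frozen=True)
class VoiceChoice:
    voice_id: str
    stability: float = 0.5    # placeholder default, assumed range 0.0 to 1.0
    similarity: float = 0.75  # placeholder default, assumed range 0.0 to 1.0


def resolve_voice(language: str, overrides: Mapping[str, VoiceChoice],
                  default: VoiceChoice) -> VoiceChoice:
    """Pick the override for a language, falling back to the default voice."""
    choice = overrides.get(language.lower(), default)
    for name, value in (("stability", choice.stability),
                        ("similarity", choice.similarity)):
        if not 0.0 <= value <= 1.0:
            raise ValueError(f"{name} must be between 0.0 and 1.0, got {value}")
    return choice
```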

Usage Scenarios

Real-world workflows, step by step

Setting Up Translation for a Global Team Meeting

Create a translation config and host a meeting where each participant speaks their native language.

Create AI credentials — In the dashboard, add your Deepgram (STT), OpenAI (LLM), and ElevenLabs (TTS) credentials

Create a Translation Config — POST to /v1/translationconfigs with your credential nicknames, language mode ("any" for global teams), output mode, and optional AI note-taker defaults

Create a room with translation — Set supportTranslation: true and translationConfigNickName in the room creation request. Add enableAiNotes for summaries, or aiNotesOnly for notes without translated playback.

Participants join and select languages — Each user picks their spoken and listen language from the available options

Speak naturally — Speech is transcribed, translated, and spoken back in each listener's language in under 1 second
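
Steps two and three above could be issued as two JSON POST requests. This sketch only builds the requests and does not send them. The host, the Bearer auth header, the /v1/rooms path, and the credential field names are assumptions for illustration; /v1/translationconfigs, supportTranslation, translationConfigNickName, and enableAiNotes come from this page.

```python
import json
from typing import Any, Dict
from urllib.request import Request

BASE_URL = "https://api.example.invalid"  # placeholder host, not a real endpoint


def json_post(path: str, body: Dict[str, Any], api_key: str) -> Request:
    """Build (but do not send) a JSON POST request."""
    return Request(
        BASE_URL + path,
        data=json.dumps(body).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Bearer {api_key}",  # assumed auth scheme
        },
        method="POST",
    )


config_request = json_post(
    "/v1/translationconfigs",
    {
        "nickname": "global-team",         # hypothetical field name
        "sttNickName": "my-deepgram",      # hypothetical field name
        "llmNickName": "my-openai",        # hypothetical field name
        "ttsNickName": "my-elevenlabs",    # hypothetical field name
        "spokenLanguageMode": "any",
        "listenLanguageMode": "any",
        "translationOutputMode": "audio",  # assumed value name
    },
    api_key="YOUR_API_KEY",
)

room_request = json_post(
    "/v1/rooms",  # hypothetical path for room creation
    {
        "supportTranslation": True,
        "translationConfigNickName": "global-team",
        "enableAiNotes": True,
    },
    api_key="YOUR_API_KEY",
)
# To send: urllib.request.urlopen(config_request), then urlopen(room_request).
```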

Using the System Pool (No AI Setup Required)

Start translating immediately using MediaSFU's system AI credentials — no provider accounts needed.

Open Settings → System Credits — Turn on "Use System Translation Configs" in Lite Settings (or Dashboard Settings for developer accounts).

Top up credits — Add credits via the Top Up page — they cover STT, LLM, and TTS usage

Create a room — System translation config (MediaSFUSystemTranslationConfig) is automatically available

Translation is active — Participants select languages and speak — credits are deducted per usage

Monitor balance: Check your credit balance in Settings, where translation credits are tracked separately
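
For budgeting, per-usage deduction can be modeled with a simple ledger. This is a hypothetical client-side model, not MediaSFU's billing logic: the class, the unit of measurement, and any rates passed in are placeholders, and actual prices come from the pricing page.

```python
from dataclasses import dataclass, field
from typing import Dict


@dataclass
class CreditLedger:
    balance: float
    spent_by_stage: Dict[str, float] = field(
        default_factory=lambda: {"stt": 0.0, "llm": 0.0, "tts": 0.0})

    def top_up(self, credits: float) -> None:
        """Add purchased credits to the balance."""
        if credits <= 0:
            raise ValueError("top-up must be positive")
        self.balance += credits

    def deduct(self, stage: str, units: float, rate: float) -> float:
        """Charge units * rate against the balance and record it per stage."""
        if stage not in self.spent_by_stage:
            raise ValueError(f"unknown stage: {stage!r}")
        cost = units * rate
        if cost > self.balance:
            raise RuntimeError("insufficient translation credits")
        self.balance -= cost
        self.spent_by_stage[stage] += cost
        return cost
```

Tracking spend per stage makes it easier to see whether STT, LLM, or TTS usage dominates a room's cost.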

Restricting Languages for a Compliance Use Case

A healthcare provider needs to ensure that only medically validated languages are available in telehealth sessions.

Create a config with allowlist mode — Set listenLanguageMode: "allowlist" with only validated languages (e.g., en, es, fr)

Or use blocklist mode — Set spokenLanguageMode: "blocklist" with blockedSpokenLanguages to exclude specific languages

Attach to room — Rooms created with this config only offer the permitted languages to participants

Patients select from curated list — No risk of unsupported language selection — only validated options appear
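
The restriction fields for such a config might be assembled as follows. listenLanguageMode, spokenLanguageMode, and blockedSpokenLanguages appear on this page; allowedListenLanguages is a hypothetical key named by analogy, and the two-letter check only validates the shape of an ISO 639-1 code, not that the code is assigned.

```python
import re
from typing import Any, Dict, Iterable, List

_TWO_LETTER_CODE = re.compile(r"[a-z]{2}")


def _clean(codes: Iterable[str]) -> List[str]:
    """Lowercase, validate, and de-duplicate codes while keeping their order."""
    cleaned: List[str] = []
    for code in codes:
        code = code.strip().lower()
        if not _TWO_LETTER_CODE.fullmatch(code):
            raise ValueError(f"not a two-letter language code: {code!r}")
        if code not in cleaned:
            cleaned.append(code)
    return cleaned


def build_language_restrictions(allowed_listen: Iterable[str],
                                blocked_spoken: Iterable[str] = ()) -> Dict[str, Any]:
    """Build the restriction part of a translation config."""
    listen = _clean(allowed_listen)
    if not listen:
        raise ValueError("an allowlist needs at least one language")
    restrictions: Dict[str, Any] = {
        "listenLanguageMode": "allowlist",
        "allowedListenLanguages": listen,  # hypothetical key name
    }
    blocked = _clean(blocked_spoken)
    if blocked:
        restrictions["spokenLanguageMode"] = "blocklist"
        restrictions["blockedSpokenLanguages"] = blocked
    return restrictions
```

Validating codes before the config is saved catches typos such as "eng" early, so an invalid entry never reaches the participant's language picker.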

Simple Pricing

BYO AI provider costs stay direct

Use your own STT, LLM, and TTS credentials where supported, or use the MediaSFU system pool with credits.

50+ curated languages
Three language modes (any/allowlist/blocklist)
Natural ElevenLabs voice output
Per-participant language selection
Optional AI Notes and notes-only room mode
Transcripts included automatically

Works With

Deepgram

OpenAI

Anthropic

ElevenLabs

Google Cloud

Azure Cognitive

AWS Transcribe

Part of These Solutions

For Global Teams

For Learning

For Care Workflows

Ready to add live voice translation?

Try the meeting demo first, then configure translation or start free.
