Voice Mode (Talk Mode)

Have natural voice conversations with OpenClaw using ElevenLabs TTS.

🎙️ What is Talk Mode?

Talk Mode enables natural voice conversations with OpenClaw. Speak naturally, and your AI responds with lifelike speech powered by ElevenLabs.

How it works:

1. Listen for wake word or push-to-talk
2. Transcribe speech to text (Whisper)
3. Process with AI (Claude, GPT, etc.)
4. Convert response to speech (ElevenLabs)

Talk Mode requires an ElevenLabs API key for text-to-speech. Speech-to-text uses Whisper (OpenAI or local).
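The flow above can be pictured as three pluggable stages that run once audio has been captured. The following is only an illustrative sketch, not OpenClaw's actual code: `TalkPipeline`, `run_turn`, and the stub lambdas are hypothetical names, and the stubs stand in for Whisper, the AI model, and ElevenLabs so the example runs offline.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class TalkPipeline:
    """One Talk Mode turn after capture: audio in, audio out (hypothetical)."""

    transcribe: Callable[[bytes], str]   # speech-to-text, e.g. Whisper
    think: Callable[[str], str]          # AI model, e.g. Claude or GPT
    synthesize: Callable[[str], bytes]   # text-to-speech, e.g. ElevenLabs

    def run_turn(self, captured_audio: bytes) -> bytes:
        text = self.transcribe(captured_audio)
        reply = self.think(text)
        return self.synthesize(reply)


# Stub stages so the sketch runs without network access or API keys.
pipeline = TalkPipeline(
    transcribe=lambda audio: audio.decode("utf-8"),
    think=lambda prompt: f"You said: {prompt}",
    synthesize=lambda reply: reply.encode("utf-8"),
)

spoken = pipeline.run_turn(b"hello")
```

Keeping each stage behind a plain callable is one way to swap OpenAI Whisper for a local model without touching the rest of the turn.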

Requirements

• ElevenLabs API Key: Required
• Platform: macOS, iOS, Android (voice input requires native apps)
• Permissions: Microphone access (grant when prompted)

Setup Steps

Get ElevenLabs API Key

• Go to elevenlabs.io and create an account
• Navigate to Profile → API Key
• Copy your API key

Configure OpenClaw

Add ElevenLabs configuration to your openclaw.json:

{
  "talk": {
    "voiceId": "EXAVITQu4vr4xnSDxMaL",
    "modelId": "eleven_v3",
    "outputFormat": "mp3_44100_128",
    "apiKey": "${ELEVENLABS_API_KEY}",
    "interruptOnSpeech": true
  }
}

Set Environment Variable

The apiKey value above refers to the ELEVENLABS_API_KEY environment variable. Set it in your shell so the key itself stays out of openclaw.json:

export ELEVENLABS_API_KEY="your_api_key_here"
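To make the `${ELEVENLABS_API_KEY}` placeholder concrete, here is a rough sketch of how such a placeholder could be expanded from the environment after the JSON is parsed. How OpenClaw actually resolves it is not shown in this tutorial, so `expand_env` and its error behaviour are assumptions made for illustration.

```python
import json
import os
import re

# Matches ${NAME} where NAME looks like a shell variable name.
_PLACEHOLDER = re.compile(r"\$\{([A-Za-z_][A-Za-z0-9_]*)\}")


def expand_env(value, env=os.environ):
    """Recursively replace ${NAME} in every string of a parsed JSON value."""
    if isinstance(value, str):
        def lookup(match):
            name = match.group(1)
            if name not in env:
                raise KeyError(f"environment variable {name} is not set")
            return env[name]
        return _PLACEHOLDER.sub(lookup, value)
    if isinstance(value, dict):
        return {key: expand_env(item, env) for key, item in value.items()}
    if isinstance(value, list):
        return [expand_env(item, env) for item in value]
    return value  # numbers, booleans, and null pass through unchanged


raw = '{"talk": {"apiKey": "${ELEVENLABS_API_KEY}", "interruptOnSpeech": true}}'
config = expand_env(
    json.loads(raw),
    env={"ELEVENLABS_API_KEY": "your_api_key_here"},  # stand-in for os.environ
)
```

Failing loudly on a missing variable is a deliberate choice in this sketch: an empty API key would otherwise only surface later as an authentication error.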

Start Talk Mode

Start Talk Mode from the OpenClaw menu bar app or the CLI:

• Click the OpenClaw menu bar icon
• Select 'Start Talk Mode'
• Or run: openclaw talk

Full Configuration Options

All available voice configuration options:

{
  "talk": {
    "voiceId": "EXAVITQu4vr4xnSDxMaL",
    "modelId": "eleven_v3",
    "outputFormat": "mp3_44100_128",
    "apiKey": "${ELEVENLABS_API_KEY}",
    "interruptOnSpeech": true,
    "stability": 0.5,
    "similarityBoost": 0.75,
    "style": 0.5,
    "speakerBoost": true
  }
}

talk.apiKey — Your ElevenLabs API key
talk.voiceId — Voice ID to use (default: Rachel)
talk.modelId — Model to use (for example eleven_v3, eleven_multilingual_v2, eleven_monolingual_v1)
voice.wakeWord — Wake word to activate (default: 'Hey Claw')
voice.pushToTalk — Use push-to-talk instead of wake word
voice.silenceTimeout — Seconds of silence before stopping (default: 2)
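As a rough illustration of where these settings end up, the sketch below builds (but does not send) a request for the public ElevenLabs text-to-speech endpoint from a `talk` block. The mapping from OpenClaw's camelCase keys to ElevenLabs' `voice_settings` fields is an assumption, and `build_tts_request` is a hypothetical helper, not part of OpenClaw.

```python
import json
from urllib.parse import quote, urlencode

# Assumed mapping: OpenClaw config key -> ElevenLabs voice_settings field.
SETTING_FIELDS = {
    "stability": "stability",
    "similarityBoost": "similarity_boost",
    "style": "style",
    "speakerBoost": "use_speaker_boost",
}


def build_tts_request(talk: dict, text: str) -> dict:
    """Describe the HTTP request for one utterance without sending it."""
    query = urlencode({"output_format": talk["outputFormat"]})
    url = (
        "https://api.elevenlabs.io/v1/text-to-speech/"
        + quote(talk["voiceId"], safe="")
        + "?"
        + query
    )
    # Only forward the voice settings that are present in the config.
    voice_settings = {
        api_name: talk[key] for key, api_name in SETTING_FIELDS.items() if key in talk
    }
    body = {"text": text, "model_id": talk["modelId"], "voice_settings": voice_settings}
    headers = {"xi-api-key": talk["apiKey"], "Content-Type": "application/json"}
    return {"method": "POST", "url": url, "headers": headers, "body": json.dumps(body)}


talk = {
    "voiceId": "EXAVITQu4vr4xnSDxMaL",
    "modelId": "eleven_v3",
    "outputFormat": "mp3_44100_128",
    "apiKey": "your_api_key_here",
    "stability": 0.5,
    "similarityBoost": 0.75,
    "style": 0.5,
    "speakerBoost": True,
}
request = build_tts_request(talk, "Hello from Talk Mode")
```

Returning a plain dictionary keeps the example free of network calls; any HTTP client could send it as is.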

Voice Aliases

Switch between different voice personalities easily.

{
  "talk": {
    "voiceId": "default",
    "voices": {
      "default": "EXAVITQu4vr4xnSDxMaL",
      "professional": "21m00Tcm4TlvDq8ikWAM",
      "friendly": "AZnzlk1XvdvUeBnXmlld",
      "narrator": "pNInz6obpgDQGcFmaJgB"
    }
  }
}
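One plausible reading of this config is that `voiceId` is first looked up in `voices` and used as a raw ElevenLabs ID only when no alias matches. The sketch below implements that reading; `resolve_voice_id` is a hypothetical helper and the fallback behaviour is an assumption.

```python
def resolve_voice_id(talk: dict) -> str:
    """Map talk.voiceId through talk.voices, falling back to the raw value."""
    aliases = talk.get("voices", {})
    requested = talk["voiceId"]
    return aliases.get(requested, requested)


talk = {
    "voiceId": "default",
    "voices": {
        "default": "EXAVITQu4vr4xnSDxMaL",
        "professional": "21m00Tcm4TlvDq8ikWAM",
        "friendly": "AZnzlk1XvdvUeBnXmlld",
        "narrator": "pNInz6obpgDQGcFmaJgB",
    },
}

default_id = resolve_voice_id(talk)
narrator_id = resolve_voice_id({**talk, "voiceId": "narrator"})
```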

Available Voices

• Default (Rachel): Warm, natural female voice
• Professional (Adam): Clear, authoritative male voice
• Friendly (Bella): Casual, approachable female voice
• Narrator (Antoni): Deep, storytelling male voice

Switch voices by saying 'Use professional voice' or by setting voiceId to an alias in the config.

Platform Features

macOS

✓ Menu bar app with quick toggle
✓ Global hotkey for push-to-talk
✓ System audio integration
✓ Wake word detection

iOS & Android

✓ Voice input in companion app
✓ Background wake word detection
✓ Bluetooth headset support
✓ Haptic feedback

Voice Directives

Voice directives set how a single reply is spoken, and spoken commands control playback. A per-reply directive looks like this:

{
  "voice": "narrator",
  "speed": 1.1,
  "stability": 0.8
}

This response will be spoken in the narrator voice at slightly faster speed.
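If a directive arrives as a JSON object at the start of a reply, as the example suggests, it can be separated from the spoken text with the standard library's incremental JSON decoder. This is a sketch under that assumption; the tutorial does not specify the exact wire format, and `split_directive` is a hypothetical name.

```python
import json


def split_directive(reply: str):
    """Return (directive, spoken_text); directive is {} when the reply has none."""
    stripped = reply.lstrip()
    if not stripped.startswith("{"):
        return {}, reply.strip()
    try:
        # raw_decode parses one JSON value and reports where it ended.
        directive, end = json.JSONDecoder().raw_decode(stripped)
    except json.JSONDecodeError:
        return {}, reply.strip()  # treat malformed JSON as ordinary text
    return directive, stripped[end:].strip()


reply = (
    '{"voice": "narrator", "speed": 1.1, "stability": 0.8}\n\n'
    "This response will be spoken in the narrator voice at slightly faster speed."
)
directive, spoken_text = split_directive(reply)
```

Replies without a leading object fall through unchanged, so ordinary text is never swallowed by the parser.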

Available Commands

Stop — Stop current speech playback
Pause — Pause and wait for more input
Cancel — Cancel current request
Repeat — Repeat the last response
Slower/Faster — Adjust speech speed
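A simple way to picture command handling is to match the whole transcribed utterance against the command names, so that a sentence merely containing "stop" is still sent to the AI. The sketch below does that; the step size and the speed limits in `adjust_speed` are illustrative assumptions, not documented OpenClaw values.

```python
from typing import Optional

COMMANDS = ("stop", "pause", "cancel", "repeat", "slower", "faster")


def match_command(transcript: str) -> Optional[str]:
    """Return the command name if the whole utterance is a command, else None."""
    word = transcript.strip().strip(".!?").lower()
    return word if word in COMMANDS else None


def adjust_speed(current: float, command: str, step: float = 0.1,
                 lowest: float = 0.7, highest: float = 1.2) -> float:
    """Apply 'slower' or 'faster' and clamp to an assumed range."""
    delta = {"slower": -step, "faster": step}.get(command, 0.0)
    return round(min(highest, max(lowest, current + delta)), 2)


command = match_command(" Stop. ")
new_speed = adjust_speed(1.0, "faster")
```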

TTS for Messages

Configure text-to-speech for incoming messages:

{
  "tts": {
    "enabled": true,
    "mode": "tagged",
    "provider": "elevenlabs",
    "voiceId": "EXAVITQu4vr4xnSDxMaL"
  }
}

TTS Modes

• always: Read all messages aloud (best for hands-free operation)
• inbound: Read only incoming messages (best for when sending via other channels)
• tagged: Read messages tagged with @voice (best for selective voice output)
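The three modes amount to a small decision rule per message. The sketch below spells that rule out; the `direction` argument and the `should_speak` function are hypothetical and only mirror the descriptions above.

```python
def should_speak(mode: str, direction: str, text: str) -> bool:
    """Decide whether a message is read aloud under the given TTS mode."""
    if mode == "always":
        return True
    if mode == "inbound":
        return direction == "inbound"
    if mode == "tagged":
        return "@voice" in text
    raise ValueError(f"unknown TTS mode: {mode!r}")


speak_tagged = should_speak("tagged", "outbound", "@voice Dinner is at eight")
speak_untagged = should_speak("tagged", "inbound", "Dinner is at eight")
```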

Supported Providers

ElevenLabs — ElevenLabs TTS (highest quality)
OpenAI — OpenAI TTS (fast, good quality)

💡 Tips & Best Practices

• Quiet Environment — Voice recognition works best in quiet environments with minimal background noise.
• Speak Clearly — Speak at a normal pace. Pausing slightly between sentences helps transcription accuracy.
• Use Headphones — Headphones prevent echo and improve wake word detection.
• Check Credits — ElevenLabs has usage limits. Monitor your credits to avoid interruptions.

Voice Mode Ready!

Start talking to your AI assistant hands-free.
