Voice Mode (Talk Mode)

Have natural voice conversations with OpenClaw using ElevenLabs TTS.

🎙️ What is Talk Mode?

Talk Mode enables natural voice conversations with OpenClaw. Speak naturally, and your AI responds with lifelike speech powered by ElevenLabs.

How it works:

  1. Listen for the wake word or push-to-talk
  2. Transcribe speech to text (Whisper)
  3. Process with AI (Claude, GPT, etc.)
  4. Convert the response to speech (ElevenLabs)

Talk Mode requires an ElevenLabs API key for text-to-speech. Speech-to-text uses Whisper (OpenAI or local).
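
The four stages above can be sketched as a single turn of a loop. This is only an illustration of the data flow, not OpenClaw's implementation: `run_turn` and its four callables are hypothetical names, and the stand-in lambdas replace the real microphone, Whisper, model, and ElevenLabs calls.

```python
from typing import Callable


def run_turn(
    listen: Callable[[], bytes],
    transcribe: Callable[[bytes], str],
    respond: Callable[[str], str],
    speak: Callable[[str], bytes],
) -> bytes:
    """Run one voice turn and return the synthesized audio."""
    audio_in = listen()             # 1. wake word or push-to-talk capture
    text_in = transcribe(audio_in)  # 2. speech to text (for example Whisper)
    text_out = respond(text_in)     # 3. language model reply
    return speak(text_out)          # 4. text to speech (for example ElevenLabs)


# Stand-in stages so the sketch runs without a microphone or network access.
audio_out = run_turn(
    listen=lambda: b"hello",
    transcribe=lambda audio: audio.decode(),
    respond=lambda text: text.upper(),
    speak=lambda text: text.encode(),
)
```

Keeping each stage behind its own callable is one way to swap Whisper for a local model, or ElevenLabs for another provider, without touching the loop.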

Requirements

  • ElevenLabs API Key (required): Sign up at elevenlabs.io
  • Platform: macOS, iOS, or Android. Voice input requires the native apps
  • Permissions: Microphone access. Grant it when prompted

Setup Steps

1. Get ElevenLabs API Key

Sign up for ElevenLabs and get your API key:

  • Go to elevenlabs.io and create an account
  • Navigate to Profile → API Key
  • Copy your API key
2. Configure OpenClaw

Add ElevenLabs configuration to your openclaw.json:

{
  "talk": {
    "voiceId": "EXAVITQu4vr4xnSDxMaL",
    "modelId": "eleven_v3",
    "outputFormat": "mp3_44100_128",
    "apiKey": "${ELEVENLABS_API_KEY}",
    "interruptOnSpeech": true
  }
}
3. Set Environment Variable

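The apiKey value in the configuration is a ${ELEVENLABS_API_KEY} placeholder, so export the key as an environment variable instead of writing it into the file: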

export ELEVENLABS_API_KEY="your_api_key_here"
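
To make the `${ELEVENLABS_API_KEY}` placeholder concrete, here is a minimal sketch of how such placeholders could be expanded after the JSON is parsed. It assumes simple `${NAME}` substitution in string values; `expand_placeholders` is a hypothetical helper, and OpenClaw's own loader may behave differently.

```python
import json
import re

# Matches ${NAME}, where NAME is a typical environment variable name.
PLACEHOLDER = re.compile(r"\$\{([A-Za-z_][A-Za-z0-9_]*)\}")


def expand_placeholders(value, env):
    """Recursively replace ${NAME} in string values using the env mapping."""
    if isinstance(value, str):
        def substitute(match):
            name = match.group(1)
            if name not in env:
                raise KeyError(f"environment variable {name} is not set")
            return env[name]
        return PLACEHOLDER.sub(substitute, value)
    if isinstance(value, dict):
        return {key: expand_placeholders(item, env) for key, item in value.items()}
    if isinstance(value, list):
        return [expand_placeholders(item, env) for item in value]
    return value  # numbers, booleans, and null pass through unchanged


raw = json.loads('{"talk": {"apiKey": "${ELEVENLABS_API_KEY}", "interruptOnSpeech": true}}')
resolved = expand_placeholders(raw, {"ELEVENLABS_API_KEY": "your_api_key_here"})
```

In real use the mapping would be `os.environ`. Failing loudly on a missing variable surfaces a forgotten `export` at startup rather than as a rejected API request later.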
4. Start Talk Mode

Enable Talk Mode from the OpenClaw menu bar app or the CLI:

  • Click the OpenClaw menu bar icon
  • Select 'Start Talk Mode'
  • Or run: openclaw talk
Full Configuration Options

All available voice configuration options:

{
  "talk": {
    "voiceId": "EXAVITQu4vr4xnSDxMaL",
    "modelId": "eleven_v3",
    "outputFormat": "mp3_44100_128",
    "apiKey": "${ELEVENLABS_API_KEY}",
    "interruptOnSpeech": true,
    "stability": 0.5,
    "similarityBoost": 0.75,
    "style": 0.5,
    "speakerBoost": true
  }
}
  • elevenlabs.apiKey: Your ElevenLabs API key
  • elevenlabs.voiceId: Voice ID to use (default: Rachel)
  • elevenlabs.model: Model to use (eleven_monolingual_v1, eleven_multilingual_v2)
  • voice.wakeWord: Wake word to activate (default: 'Hey Claw')
  • voice.pushToTalk: Use push-to-talk instead of wake word
  • voice.silenceTimeout: Seconds of silence before stopping (default: 2)
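
To show where the tuning values end up, the sketch below maps the keys of the `talk` JSON block onto a request for the ElevenLabs text-to-speech endpoint. The endpoint path, the `xi-api-key` header, the `output_format` query parameter, and the `voice_settings` field names come from the public ElevenLabs API. The camelCase-to-snake_case mapping and the `build_tts_request` helper are assumptions made for illustration, not OpenClaw's code. Nothing is sent over the network.

```python
# Assumed mapping from config keys to ElevenLabs voice_settings fields.
SETTING_NAMES = {
    "stability": "stability",
    "similarityBoost": "similarity_boost",
    "style": "style",
    "speakerBoost": "use_speaker_boost",
}


def build_tts_request(talk: dict, text: str) -> dict:
    """Describe the HTTP request for one utterance without sending it."""
    for key in ("stability", "similarityBoost", "style"):
        if key in talk and not 0.0 <= talk[key] <= 1.0:
            raise ValueError(f"{key} must be between 0 and 1, got {talk[key]}")
    # Only forward the settings that are actually present in the config.
    voice_settings = {api: talk[key] for key, api in SETTING_NAMES.items() if key in talk}
    return {
        "method": "POST",
        "url": f"https://api.elevenlabs.io/v1/text-to-speech/{talk['voiceId']}",
        "params": {"output_format": talk["outputFormat"]},
        "headers": {"xi-api-key": talk["apiKey"]},
        "json": {"text": text, "model_id": talk["modelId"], "voice_settings": voice_settings},
    }


talk = {
    "voiceId": "EXAVITQu4vr4xnSDxMaL",
    "modelId": "eleven_v3",
    "outputFormat": "mp3_44100_128",
    "apiKey": "your_api_key_here",
    "stability": 0.5,
    "similarityBoost": 0.75,
    "style": 0.5,
    "speakerBoost": True,
}
request = build_tts_request(talk, "Hello from OpenClaw")
```

The returned dictionary can be passed to any HTTP client. Validating the 0 to 1 range locally catches a mistyped value before it costs an API call.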
Voice Aliases

Switch between different voice personalities easily.

{
  "talk": {
    "voiceId": "default",
    "voices": {
      "default": "EXAVITQu4vr4xnSDxMaL",
      "professional": "21m00Tcm4TlvDq8ikWAM",
      "friendly": "AZnzlk1XvdvUeBnXmlld",
      "narrator": "pNInz6obpgDQGcFmaJgB"
    }
  }
}

Available Voices

  • Default (Rachel): Warm, natural female voice
  • Professional (Adam): Clear, authoritative male voice
  • Friendly (Bella): Casual, approachable female voice
  • Narrator (Antoni): Deep, storytelling male voice

Switch voices by saying 'Use professional voice' or by setting voiceId to an alias in your config.
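
Alias lookup amounts to one dictionary access with a fallback. The sketch below assumes that a `voiceId` not found in `voices` is treated as a raw ElevenLabs voice ID; `resolve_voice` is a hypothetical helper, not part of OpenClaw.

```python
from typing import Optional


def resolve_voice(talk: dict, requested: Optional[str] = None) -> str:
    """Return the ElevenLabs voice ID for an alias, or the value itself if it is not an alias."""
    name = requested if requested is not None else talk["voiceId"]
    return talk.get("voices", {}).get(name, name)


talk = {
    "voiceId": "default",
    "voices": {
        "default": "EXAVITQu4vr4xnSDxMaL",
        "professional": "21m00Tcm4TlvDq8ikWAM",
        "friendly": "AZnzlk1XvdvUeBnXmlld",
        "narrator": "pNInz6obpgDQGcFmaJgB",
    },
}

configured = resolve_voice(talk)                 # alias taken from talk["voiceId"]
switched = resolve_voice(talk, "professional")   # alias requested at runtime
```

Falling back to the input keeps configs that store a raw voice ID in `voiceId` working unchanged.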

Platform Features

macOS
  • Menu bar app with quick toggle
  • Global hotkey for push-to-talk
  • System audio integration
  • Wake word detection
iOS & Android
  • Voice input in companion app
  • Background wake word detection
  • Bluetooth headset support
  • Haptic feedback
Voice Directives

Control how a reply is spoken with a per-reply directive, and control playback with spoken commands:

// Per-reply voice control
{
  "voice": "narrator",
  "speed": 1.1,
  "stability": 0.8
}

This response will be spoken in the narrator voice at slightly faster speed.
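
One way to think about a per-reply directive is as a set of overrides layered on top of the base `talk` settings for a single reply. The sketch below assumes exactly that: `settings_for_reply` is a hypothetical helper, and how OpenClaw actually parses and applies directives may differ.

```python
def settings_for_reply(talk: dict, directive: dict) -> dict:
    """Return the settings for one reply without modifying the base config."""
    settings = {key: value for key, value in talk.items() if key != "voices"}
    overrides = dict(directive)
    if "voice" in overrides:
        # Treat "voice" as an alias when one is defined, otherwise as a raw voice ID.
        alias = overrides.pop("voice")
        settings["voiceId"] = talk.get("voices", {}).get(alias, alias)
    settings.update(overrides)
    return settings


talk = {
    "voiceId": "EXAVITQu4vr4xnSDxMaL",
    "stability": 0.5,
    "voices": {"narrator": "pNInz6obpgDQGcFmaJgB"},
}
directive = {"voice": "narrator", "speed": 1.1, "stability": 0.8}
reply_settings = settings_for_reply(talk, directive)
```

Because the function builds a new dictionary, the next reply starts again from the configured defaults.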

Available Commands

  • Stop: Stop current speech playback
  • Pause: Pause and wait for more input
  • Cancel: Cancel current request
  • Repeat: Repeat the last response
  • Slower/Faster: Adjust speech speed
TTS for Messages

Configure text-to-speech for incoming messages:

{
  "tts": {
    "enabled": true,
    "mode": "tagged",
    "provider": "elevenlabs",
    "voiceId": "EXAVITQu4vr4xnSDxMaL"
  }
}

TTS Modes

  • always: Read all messages aloud. Best for: hands-free operation
  • inbound: Read only incoming messages. Best for: when sending via other channels
  • tagged: Read messages tagged with @voice. Best for: selective voice output
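
The three modes reduce to a small decision function. This sketch assumes a message carries its text and an inbound flag, and that "tagged" means the literal string `@voice` appears in the text; `Message` and `should_speak` are hypothetical names used only for illustration.

```python
from dataclasses import dataclass


@dataclass
class Message:
    text: str
    inbound: bool  # True for messages received, False for messages sent


def should_speak(tts: dict, message: Message) -> bool:
    """Decide whether a message is read aloud under the configured TTS mode."""
    if not tts.get("enabled", False):
        return False
    mode = tts.get("mode")
    if mode == "always":
        return True
    if mode == "inbound":
        return message.inbound
    if mode == "tagged":
        return "@voice" in message.text
    raise ValueError(f"unknown TTS mode: {mode!r}")


tts = {"enabled": True, "mode": "tagged", "provider": "elevenlabs"}
spoken = should_speak(tts, Message("@voice the build finished", inbound=True))
silent = should_speak(tts, Message("the build finished", inbound=True))
```

Rejecting unknown modes makes a typo in the config visible instead of silently muting every message.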

Supported Providers

  • ElevenLabs: Highest quality
  • OpenAI: OpenAI TTS (fast, good quality)
💡 Tips & Best Practices
  • Quiet Environment: Voice recognition works best in quiet environments with minimal background noise.
  • Speak Clearly: Speak at a normal pace. Pausing slightly between sentences helps transcription accuracy.
  • Use Headphones: Headphones prevent echo and improve wake word detection.
  • Check Credits: ElevenLabs has usage limits. Monitor your credits to avoid interruptions.

Voice Mode Ready!

Start talking to your AI assistant hands-free.