We're building a Czech-language outbound market research survey bot on ElevenLabs Conversational AI.
Current setup:
- STT: Scribe v2.2 Realtime model
- LLM: Qwen3.6-35B-A3B, which the ElevenLabs portal lists as the lowest-latency option
- TTS: V3 Conversational
- Use case: automated survey calls in Czech
Issues we're facing:
- Initial delay: there is a noticeable lag at the start of the call before the bot speaks. We want the user to open the conversation by saying hi/hello, after which the bot introduces itself, and this is where we see the delay. How do we reduce it?
- Overall latency: the response time between the end of the user's input and the bot starting to speak feels slow compared to competitors like Sesame.
- Robotic feel: even with Conversational AI (which should handle naturalness), the voice still sounds robotic.
We're already using Conversational AI for TTS, so I'm not sure what else we can tune to make it less robotic without sacrificing latency further.
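To put numbers on "feels slow", here is a minimal sketch of how per-turn latency could be computed from call timestamps. The `Turn` class, its field names, and the sample values are hypothetical and not part of any ElevenLabs API; the timestamps would have to come from our own call logs or from marking up a recording.

```python
from dataclasses import dataclass
from statistics import median


@dataclass
class Turn:
    # Both values are seconds since the start of the call (hypothetical fields).
    user_speech_end: float   # when the user stopped talking
    bot_audio_start: float   # when the bot's audio became audible


def turn_latencies(turns):
    """Return the gap, in seconds, between user speech end and bot audio start."""
    return [t.bot_audio_start - t.user_speech_end for t in turns]


def summarize(latencies):
    """Return the median and worst-case latency for a call."""
    return {"median": median(latencies), "max": max(latencies)}


# Illustrative values only, not measurements from a real call.
sample_call = [Turn(1.0, 2.5), Turn(10.0, 11.0), Turn(20.0, 22.0)]
print(summarize(turn_latencies(sample_call)))
```

Treating the first turn (greeting to introduction) separately from the rest would show whether the initial delay and the overall latency are the same problem or two different ones.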
Question: What else can we adjust (voice settings, LLM tier, prompt structure, or configuration) to reduce latency and make the bot sound more natural?
I'll attach a sample recording link so you can hear the exact issues.
We've been stuck on this for a while and could really use some outside perspective. If anyone has deployed something similar in Czech (or any non-English language), I'd love to hear what worked for you. Thanks so much.