
Vernier STT Engine

Converse uses a high-accuracy speech recognition service optimized for real-time telephony environments, including noisy backgrounds, strong accents, and multilingual conversations.

Language Support

Set the language in the Flow's Start node (for flows) or in the Agent's Language setting (for agents). The chosen language configures both speech recognition and voice synthesis for that call.

Supported languages include English, Hindi, Tamil, Telugu, Malayalam, Kannada, Marathi, Spanish, French, German, and Arabic. The language setting must match the expected caller language for best accuracy.

Multilingual callers

For callers who mix languages (e.g., English and Hindi), set the language to English (India). This variant handles code-switching (Hinglish) better than either language individually.
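The selection rule above can be written down as a small helper. This is only an illustrative sketch: `pick_language` and `SUPPORTED` are hypothetical names, not part of Converse, and the returned strings are the display names used in this article rather than confirmed configuration values.

```python
# Hypothetical helper: maps the languages you expect callers to speak
# to the language setting this article recommends. Not a Converse API.
SUPPORTED = {
    "English", "Hindi", "Tamil", "Telugu", "Malayalam", "Kannada",
    "Marathi", "Spanish", "French", "German", "Arabic",
}


def pick_language(expected: list[str]) -> str:
    """Return the language setting to use for a call."""
    langs = set(expected)
    unknown = langs - SUPPORTED
    if unknown:
        raise ValueError(f"unsupported language(s): {sorted(unknown)}")
    # Callers who mix English and Hindi (Hinglish): use the India variant.
    if langs == {"English", "Hindi"}:
        return "English (India)"
    if len(langs) == 1:
        return next(iter(langs))
    # This article documents no recommendation for other mixes.
    raise ValueError("no single setting documented for this language mix")
```

For example, `pick_language(["English", "Hindi"])` returns `"English (India)"`, while a single expected language is returned unchanged.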

How it works in a call

Speech recognition runs as a live stream during the call.

Accuracy considerations

Barge-in

When a caller speaks while the agent is talking, barge-in detection stops the current TTS output and starts processing the new input immediately. This makes conversations feel natural rather than strictly turn-by-turn. Barge-in sensitivity is configured per agent in the Advanced settings.
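The behavior described above can be pictured as a small state machine. The sketch below is a conceptual model only, not how Converse implements barge-in: the class name, the 0 to 1 `sensitivity` scale, the `speech_probability` input, and the event strings are all assumptions made for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class BargeInController:
    # Hypothetical scale: 0.0 = never interrupt, 1.0 = interrupt on any sound.
    sensitivity: float = 0.5
    agent_speaking: bool = False
    events: list[str] = field(default_factory=list)

    def start_tts(self, text: str) -> None:
        """Mark the agent as speaking and record the playback start."""
        self.agent_speaking = True
        self.events.append(f"tts_start:{text}")

    def on_caller_audio(self, speech_probability: float) -> bool:
        """Handle one caller audio frame; return True if it caused a barge-in."""
        # Higher sensitivity lowers the bar a frame must clear.
        threshold = 1.0 - self.sensitivity
        if self.agent_speaking and speech_probability >= threshold:
            self.agent_speaking = False
            self.events.append("tts_stop")           # cut the current TTS output
            self.events.append("stt_process_input")  # handle the new input at once
            return True
        return False
```

A short usage example: with `sensitivity=0.7`, a frame scored 0.1 is ignored, while a frame scored 0.9 during playback stops the TTS and hands the input to recognition. Once the agent is no longer speaking, further caller audio does not count as a barge-in.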

Testing STT

Go to Playground → STT tab to upload an audio file and see the raw transcript. Use the Flow tab with voice mode enabled to test speech recognition end-to-end in your actual flow.