AI Voice Agent
Learn how to enable voice-based interactions with your chatbot using the AI Voice Agent feature in OKChat AI.
Preview Feature: The AI Voice Agent is currently in preview. For access, please reach out to [email protected].
Introduction
The AI Voice Agent is an advanced feature in OKChat AI that enables voice-based interactions with your agent. This feature allows users to engage with the agent using natural language, making it ideal for hands-free operations and voice-activated devices.
Key Features
Voice Interaction
Users can speak to the chatbot, and the AI Voice Agent will respond with voice output.
Advanced Turn Detection
Detects when a user has finished speaking so the agent knows when to respond. Both English and multilingual turn detection are supported.
Customizable Voice Widget
Customize the appearance, voice, and behavior of the voice widget to match your brand.
Real-Time Speech Recognition
Uses advanced speech recognition to understand and process user commands.
Dynamic Responses
Provides real-time, context-aware responses based on user input.
Voice Activity Detection (VAD)
Uses voice activity detection to separate speech from silence, so that only spoken audio is processed.
Custom Greetings
Set custom greetings for the AI Voice Agent.
Default Tools
Web search, URL scraping, weather, and knowledge base search are available by default.
Providers
OpenAI Realtime
Uses OpenAI’s Realtime API with OKChat’s knowledge base and tools.
Features: Different voices, keyword detection, voice vibe.
Gemini Live
Uses Google’s Gemini API for a seamless voice experience.
Features: Different voices, keyword detection, voice vibe.
OKChat Provider (Recommended)
Uses OKChat’s own pipeline, optimized for the platform.
Features: Different voices, LLM model selection, TTS/STT model selection, keyword detection, voice vibe.
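The feature lists above can be compared in a small lookup table. The following is only an illustrative sketch: the dictionary and the providers_supporting helper are hypothetical names introduced here, not part of any OKChat API, and the feature strings are copied from the lists above.

```python
from typing import Dict, List, Set

# Feature lists as stated in the Providers section above.
# This table is illustrative only; it is not an OKChat API.
PROVIDER_FEATURES: Dict[str, Set[str]] = {
    "OpenAI Realtime": {"voices", "keyword detection", "voice vibe"},
    "Gemini Live": {"voices", "keyword detection", "voice vibe"},
    "OKChat Provider": {
        "voices",
        "LLM model selection",
        "TTS/STT model selection",
        "keyword detection",
        "voice vibe",
    },
}


def providers_supporting(feature: str) -> List[str]:
    """Return the providers (sorted by name) whose feature list includes `feature`."""
    return sorted(
        name for name, features in PROVIDER_FEATURES.items() if feature in features
    )


if __name__ == "__main__":
    print(providers_supporting("LLM model selection"))
    print(providers_supporting("voice vibe"))
```

Based on the lists above, LLM model selection and TTS/STT model selection appear only under the OKChat Provider, so choose that provider if you need to pick those models yourself.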
Getting Started with AI Voice Agent
Accessing the AI Voice Agent
Log in to your OKChat AI dashboard.
Navigate to the AI Voice Agent section under Integration.
If you do not see this option, contact [email protected] to request access.
Configuring the Voice Agent
System Prompt: Set the system prompt for the AI Voice Agent.
Provider: Choose the provider for the AI Voice Agent.
Model: Choose the model for the AI Voice Agent.
Turn Detection: Choose the turn detection method.
STT/TTS Model: Choose the STT/TTS model for the AI Voice Agent.
Language: Choose the language for the AI Voice Agent, or select Auto to detect the language automatically.
Voice: Select the voice type and tone from various providers.
Keyword Detection: Set keywords for the AI Voice Agent to listen for.
Voice Vibe: Choose the voice vibe.
Greetings: Set custom greetings.
Avatar, Text, Appearance: Customize the widget’s look and feel.
Voice Activity Detection (VAD) Settings
VAD Threshold: Adjust sensitivity (default: 0.5).
Prefix Padding (ms): Minimum speech duration to start a chunk (default: 500ms).
Silence Duration (ms): Minimum silence before ending a segment (default: 1000ms).
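To show how these three settings might interact, here is a minimal sketch that turns per-frame speech probabilities into speech segments. It follows the descriptions above literally (speech must last at least the Prefix Padding value before a segment starts, and a segment ends after the Silence Duration of continuous silence). The VadSettings class and segment_speech function are hypothetical names, and the actual OKChat pipeline may behave differently.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass
class VadSettings:
    # Defaults taken from the settings list above.
    threshold: float = 0.5           # frames with probability >= threshold count as speech
    prefix_padding_ms: int = 500     # minimum speech duration to start a chunk
    silence_duration_ms: int = 1000  # minimum silence before ending a segment


def segment_speech(
    probs: List[float], frame_ms: int, settings: VadSettings
) -> List[Tuple[int, int]]:
    """Return (start_ms, end_ms) segments from per-frame speech probabilities."""
    segments: List[Tuple[int, int]] = []
    speech_run = 0               # ms of consecutive speech while no segment is open
    silence_run = 0              # ms of consecutive silence inside an open segment
    start: Optional[int] = None  # start time of the open segment, if any

    for i, p in enumerate(probs):
        frame_end = (i + 1) * frame_ms
        is_speech = p >= settings.threshold
        if start is None:
            if is_speech:
                speech_run += frame_ms
                if speech_run >= settings.prefix_padding_ms:
                    # Open the segment at the beginning of the speech run.
                    start = frame_end - speech_run
            else:
                speech_run = 0
        elif is_speech:
            silence_run = 0
        else:
            silence_run += frame_ms
            if silence_run >= settings.silence_duration_ms:
                # Close the segment where the silence began.
                segments.append((start, frame_end - silence_run))
                start, speech_run, silence_run = None, 0, 0

    if start is not None:
        # Audio ended while a segment was still open.
        segments.append((start, len(probs) * frame_ms - silence_run))
    return segments


if __name__ == "__main__":
    # 100 ms frames: a 300 ms burst, a pause, 800 ms of speech, then silence.
    probs = [0.9] * 3 + [0.1] * 2 + [0.9] * 8 + [0.1] * 12
    print(segment_speech(probs, 100, VadSettings()))  # [(500, 1300)]
```

In this sketch, the 300 ms burst is ignored because it is shorter than the 500 ms minimum, while the 800 ms utterance becomes one segment. Raising the threshold makes the detector less sensitive to quiet or noisy audio, and a longer silence duration tolerates longer pauses within a single turn.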
Response Settings
Temperature: Adjust response randomness (default: 0.7).
Max Output Tokens: Limit response length (default: 2048 tokens).
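As a rough illustration of what "response randomness" means, the sketch below applies the standard temperature-scaled softmax used by language models when choosing the next token. How each provider applies the Temperature setting internally is not documented here, so treat this as a general illustration rather than OKChat's implementation. The function name and the example scores are made up.

```python
import math
from typing import List


def softmax_with_temperature(logits: List[float], temperature: float) -> List[float]:
    """Convert raw scores into probabilities, sharpened or flattened by temperature."""
    if temperature <= 0:
        raise ValueError("temperature must be greater than 0")
    scaled = [x / temperature for x in logits]
    peak = max(scaled)  # subtract the maximum for numerical stability
    exps = [math.exp(s - peak) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]


if __name__ == "__main__":
    scores = [2.0, 1.0, 0.0]  # made-up scores for three candidate tokens
    for t in (0.2, 0.7, 1.5):
        probs = softmax_with_temperature(scores, t)
        print(t, [round(p, 3) for p in probs])
```

Lower temperatures concentrate probability on the highest-scoring option, giving more predictable replies. Higher temperatures spread probability across alternatives, giving more varied replies.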
Embedding the Voice Widget
Copy the embed code from the widget configuration and paste it into your website’s HTML.
Use Cases
Customer Support
Provide voice-based customer support for users who prefer speaking over typing.
Hands-Free Interaction
Enable voice commands for users in environments where typing is inconvenient (e.g., driving, cooking).
Accessibility
Improve accessibility for users with disabilities by offering a voice-based interface.
Best Practices
Troubleshooting
Conclusion
The AI Voice Agent is a powerful preview feature that brings voice-based interaction to your OKChat AI chatbot. By customizing the voice widget and configuring the settings, you can create a seamless and engaging experience for your users. For access to this feature, reach out to [email protected].
For further assistance, refer to the OKChat AI support resources or contact our support team.