# AI Voice Agent

> Learn how to enable voice-based interactions with your chatbot using the AI Voice Agent feature in OKChat AI.

<Frame>
  <img src="https://mintcdn.com/dasomx/HWv_yUHwWK99k7RJ/images/logo/mascot-gradient.svg?fit=max&auto=format&n=HWv_yUHwWK99k7RJ&q=85&s=9d16e3e6945ca532464ee9a3bc8a6bd1" alt="OKChat AI Voice Agent Mascot" style={{ width: "120px", margin: "0 auto" }} width="162" height="171" data-path="images/logo/mascot-gradient.svg" />
</Frame>

<Info>
  <b>Preview Feature:</b> The AI Voice Agent is currently in preview. To
  request access, contact{" "}
  <a href="mailto:info@dasomx.com">info@dasomx.com</a>.
</Info>

## Introduction

The AI Voice Agent lets users talk to your OKChat AI agent and hear spoken replies. Because the interaction happens in natural spoken language, it is well suited to hands-free operation and voice-activated devices.

***

## Key Features

<CardGroup cols={3}>
  <Card title="Voice Interaction" icon="microphone">
    Users can speak to the chatbot, and the AI Voice Agent will respond with
    voice output.
  </Card>

  <Card title="Advanced Turn Detection" icon="language">
    Detects when a user is speaking and supports English, Multilingual, and
    Push-to-Talk detection.
  </Card>

  <Card title="Customizable Voice Widget" icon="sliders">
    Customize the appearance, text, voice, and behavior of the voice widget to
    match your brand.
  </Card>

  <Card title="Real-Time Speech Recognition" icon="waveform">
    Uses advanced speech recognition to understand and process user commands.
  </Card>

  <Card title="Flexible Provider Selection" icon="puzzle">
    Choose from a wide range of LLM, Text-to-Speech (TTS), and Speech-to-Text
    (STT) providers.
  </Card>

  <Card title="Voice Activity Detection (VAD)" icon="volume">
    Separates speech from silence so the agent processes only audio that
    contains user speech.
  </Card>

  <Card title="Telephony Integration" icon="phone">
    Integrate your voice agent with phone systems for seamless communication.
  </Card>

  <Card title="Default Tools" icon="toolbox">
    Web search, URL scraping, weather, and knowledge base search are available
    by default.
  </Card>
</CardGroup>

***

## Providers

OKChat AI's Voice Agent supports multiple providers to give you maximum flexibility.

<CardGroup cols={3}>
  <Card title="OpenAI Realtime" icon="robot">
    Uses OpenAI's realtime API with OKChat's knowledge base and tools for fast
    interactions.

    <br />

    <b>Features:</b> Various voices, keyword detection, voice vibe.
  </Card>

  <Card title="Gemini Live" icon="google">
    Uses Google's Gemini API for a seamless voice experience.

    <br />

    <b>Features:</b> Various voices, keyword detection, voice vibe.
  </Card>

  <Card title="OKCHAT Provider (Recommended)" icon="star">
    The most flexible option, using OKChat's own pipeline, optimized for the
    platform. It allows you to mix and match different providers for core
    functionalities.

    <br />

    <b>LLM Providers:</b> OpenAI, Google, Groq, and more.

    <br />

    <b>TTS Providers:</b> OpenAI, Google, ElevenLabs, Deepgram, Cartesia.

    <br />

    <b>STT Providers:</b> OpenAI, Deepgram, Speechmatics, Groq, Google.
  </Card>
</CardGroup>

***

## Getting Started with AI Voice Agent

### 1. Accessing the AI Voice Agent

Log in to your OKChat AI dashboard.

Navigate to the **AI Voice Agent** section under **Integration**.

If you do not see this option, contact [info@dasomx.com](mailto:info@dasomx.com) to request access.

### 2. General Configuration

In the **General** tab, you can configure the core behavior of your voice agent.

* **Prompt:** Set the system prompt to define the agent's personality and instructions.
* **Voice Provider:** Choose between **OKCHAT (Recommended)**, **OpenAI Realtime**, or **Gemini Live**. The OKCHAT provider offers the most customization options.
* **LLM Provider (OKCHAT only):** If you chose the OKCHAT provider, select an underlying Large Language Model (LLM) provider like OpenAI, Google, or Groq.
* **Model:** Choose the specific model for the selected provider (e.g., `gpt-4o`, `gemini-2.0-flash`).
* **Turn Detector:** Select how the agent detects user speech: **English**, **Multilingual**, or **Push To Talk** (deprecated).
* **TTS Provider (OKCHAT only):** Choose a Text-to-Speech provider like OpenAI, Google, ElevenLabs, Deepgram, or Cartesia to generate the agent's voice.
* **Choose AI Voice:** Select a specific voice from the chosen TTS provider's library.
* **STT Provider:** Select a Speech-to-Text provider like OpenAI, Deepgram, Speechmatics, Groq, or Google for transcription.
* **STT Model:** Choose the specific transcription model.
* **Language:** Enable **Auto-detect language** or manually specify language codes (e.g., `en` for ISO 639-1, `en-US` for BCP-47, depending on the STT provider).
* **Keywords:** Provide a comma-separated list of domain-specific keywords to improve speech recognition accuracy.
* **Greeting Message:** Set the initial message the agent speaks when a conversation starts.
* **Voice Vibe:** Describe the desired personality, tone, and speaking style for the AI voice, particularly for OpenAI voices.
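
The **Language** and **Keywords** fields expect specific text formats, so it can help to normalize values before pasting them into the dashboard. The following is a minimal TypeScript sketch, not part of OKChat AI: the helper names `toLanguageCode` and `toKeywordField` are hypothetical, only the simple `language` and `language-REGION` tag shapes are handled, and the `", "` separator for keywords is an assumption. Which code format a given STT provider expects still has to be checked in that provider's documentation.

```typescript
// Hypothetical helpers for preparing dashboard field values.
// They do not call any OKChat AI API.

type LanguageFormat = "iso639-1" | "bcp47";

// Converts tags such as "en", "en-us", or "en_US" into the requested format.
// Only two-letter language subtags with an optional two-letter region are
// supported in this sketch.
function toLanguageCode(tag: string, format: LanguageFormat): string {
  const parts = tag.trim().replace(/_/g, "-").split("-");
  const language = (parts[0] ?? "").toLowerCase();
  const region = parts[1];

  if (!/^[a-z]{2}$/.test(language)) {
    throw new Error(`Unsupported language subtag in "${tag}"`);
  }
  if (format === "iso639-1" || region === undefined) {
    return language;
  }
  if (!/^[A-Za-z]{2}$/.test(region)) {
    throw new Error(`Unsupported region subtag in "${tag}"`);
  }
  return `${language}-${region.toUpperCase()}`;
}

// Builds a comma-separated keyword string: trims entries, drops empty ones,
// and removes exact duplicates while keeping the original order.
function toKeywordField(keywords: string[]): string {
  const cleaned = keywords.map((k) => k.trim()).filter((k) => k.length > 0);
  return Array.from(new Set(cleaned)).join(", ");
}

console.log(toLanguageCode("en_us", "bcp47")); // en-US
console.log(toLanguageCode("en-US", "iso639-1")); // en
console.log(toKeywordField([" OKChat ", "", "Deepgram", "OKChat"])); // OKChat, Deepgram
```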

### 3. Appearance, Text, and Avatar

Customize the visual aspects of your voice widget.

* **Appearance Tab:** Adjust colors (background, text, button), radius for cards and buttons, and the widget's on-screen **position** (`bottom-right`, `top-left`, etc.). You can also choose to hide the "Powered by OKCHAT.AI" watermark.
* **Text Tab:** Customize the text for buttons and status indicators like "Start call" or "Listening...".
* **Avatar Tab:** Upload a custom image to be used as the agent's avatar.

### 4. Advanced Configuration

Fine-tune the agent's performance from the **Config** tab.

* **VAD Threshold:** Adjust VAD sensitivity (0-1). Lower is more sensitive. Default: 0.5.
* **Prefix Padding (ms):** Minimum speech duration to start a chunk. Default: 500ms.
* **Silence Duration (ms):** Minimum silence before ending a segment. Default: 1000ms.
* **Temperature:** Adjust response randomness (0-1). Higher is more random. Default: 0.7.
* **Max Output Tokens:** Limit response length. Default: 2048 tokens.
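
If you track these settings outside the dashboard, for example in a deployment checklist, a small check against the documented ranges can catch typos early. The sketch below is an illustration only: the `AdvancedConfig` shape, its field names, and `resolveAdvancedConfig` are hypothetical and do not correspond to an OKChat AI API. The defaults and the 0-1 ranges come from the list above; the rules that durations must be non-negative and that the token limit must be a positive integer are assumptions.

```typescript
// Hypothetical local representation of the Config tab values.
interface AdvancedConfig {
  vadThreshold: number;      // 0-1, lower is more sensitive
  prefixPaddingMs: number;   // minimum speech duration to start a chunk
  silenceDurationMs: number; // minimum silence before ending a segment
  temperature: number;       // 0-1, higher is more random
  maxOutputTokens: number;   // response length limit
}

// Defaults as documented for the Config tab.
const DEFAULTS: AdvancedConfig = {
  vadThreshold: 0.5,
  prefixPaddingMs: 500,
  silenceDurationMs: 1000,
  temperature: 0.7,
  maxOutputTokens: 2048,
};

// Merges overrides onto the defaults and throws if any value is out of range.
function resolveAdvancedConfig(
  overrides: Partial<AdvancedConfig> = {},
): AdvancedConfig {
  const config: AdvancedConfig = { ...DEFAULTS, ...overrides };
  const problems: string[] = [];

  if (!(config.vadThreshold >= 0 && config.vadThreshold <= 1)) {
    problems.push("vadThreshold must be between 0 and 1");
  }
  if (!(config.temperature >= 0 && config.temperature <= 1)) {
    problems.push("temperature must be between 0 and 1");
  }
  if (!(config.prefixPaddingMs >= 0)) {
    problems.push("prefixPaddingMs must be non-negative (assumed rule)");
  }
  if (!(config.silenceDurationMs >= 0)) {
    problems.push("silenceDurationMs must be non-negative (assumed rule)");
  }
  if (!(Number.isInteger(config.maxOutputTokens) && config.maxOutputTokens > 0)) {
    problems.push("maxOutputTokens must be a positive integer (assumed rule)");
  }

  if (problems.length > 0) {
    throw new Error(problems.join("; "));
  }
  return config;
}

// A more deterministic agent that waits less before ending a segment.
const tuned = resolveAdvancedConfig({ temperature: 0.2, silenceDurationMs: 600 });
console.log(tuned.temperature, tuned.silenceDurationMs, tuned.maxOutputTokens);
```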

### 5. Kiosk Mode

Configure a full-screen voice agent experience, ideal for public displays.

* Enable **Kiosk Mode** from the **Kiosk** tab.
* Customize the **Onboarding Screen** with a title, description, instructions, and brand colors.
* Upload a mascot image and set the button text.
* Use the provided **Kiosk URL** to launch the agent in a browser.

### 6. Telephony Integration

Connect your voice agent to phone lines from the **Telephony** tab. This allows users to call in and interact with your AI agent over the phone.

### 7. Embedding the Voice Widget

Copy the embed code from the bottom of the configuration page and paste it into your website's HTML. The `data-position` attribute can be configured in the Appearance tab.

```html
<script
  src="https://v2.okchat.ai/chatbot-voice-widget.js"
  data-chatbot-id="YOUR_CHATBOT_ID"
  data-position="bottom-right"
></script>
```
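
If your pages are rendered from server-side templates, you may prefer to generate the tag rather than paste it by hand. The following TypeScript sketch builds the same markup as the embed code above; `renderVoiceWidgetTag` and `escapeAttribute` are hypothetical helper names, and the function assumes the widget needs only the `src`, `data-chatbot-id`, and `data-position` attributes shown in the embed code.

```typescript
// Script URL taken from the embed code in this section.
const WIDGET_SRC = "https://v2.okchat.ai/chatbot-voice-widget.js";

// Escapes characters that would break out of a double-quoted HTML attribute.
function escapeAttribute(value: string): string {
  return value
    .replace(/&/g, "&amp;")
    .replace(/"/g, "&quot;")
    .replace(/</g, "&lt;")
    .replace(/>/g, "&gt;");
}

// Returns the embed tag as a string, ready to be written into a template.
// The position is passed through unchanged; use a value offered in the
// Appearance tab.
function renderVoiceWidgetTag(
  chatbotId: string,
  position: string = "bottom-right",
): string {
  if (chatbotId.trim() === "") {
    throw new Error("chatbotId must not be empty");
  }
  return [
    "<script",
    `  src="${WIDGET_SRC}"`,
    `  data-chatbot-id="${escapeAttribute(chatbotId)}"`,
    `  data-position="${escapeAttribute(position)}"`,
    "></script>",
  ].join("\n");
}

console.log(renderVoiceWidgetTag("YOUR_CHATBOT_ID", "top-left"));
```

Escaping the attribute values keeps a malformed ID or position from producing broken HTML.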

***

## Use Cases

<CardGroup cols={3}>
  <Card title="Customer Support" icon="headset">
    Provide voice-based customer support for users who prefer speaking over
    typing.
  </Card>

  <Card title="Hands-Free Interaction" icon="car">
    Enable voice commands for users in environments where typing is inconvenient
    (e.g., driving, cooking).
  </Card>

  <Card title="Accessibility" icon="universal-access">
    Improve accessibility for users with disabilities by offering a voice-based
    interface.
  </Card>
</CardGroup>

***

## Best Practices

<AccordionGroup>
  <Accordion title="Test Thoroughly">
    Test the voice widget in different environments to ensure accurate speech
    recognition.
  </Accordion>

  <Accordion title="Optimize VAD Settings">
    Adjust the VAD threshold, prefix padding, and silence duration to match your
    use case.
  </Accordion>

  <Accordion title="Monitor Performance">
    Regularly review the chatbot's responses and adjust the temperature and max
    output tokens as needed.
  </Accordion>

  <Accordion title="User Guidance">
    Provide clear instructions to users on how to interact with the voice
    widget.
  </Accordion>
</AccordionGroup>
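
To build intuition for how the VAD threshold, prefix padding, and silence duration interact when tuning, here is a minimal TypeScript sketch of threshold-based speech segmentation. It is not OKChat AI's implementation: the per-frame speech probabilities, the 100 ms frame size, and the names `VadSettings` and `segmentSpeech` are all assumptions. It follows this page's definition of prefix padding as the minimum speech duration needed to start a chunk.

```typescript
// Hypothetical settings object mirroring the three VAD options.
interface VadSettings {
  threshold: number;         // frames at or above this probability count as speech
  prefixPaddingMs: number;   // minimum continuous speech needed to open a segment
  silenceDurationMs: number; // continuous silence needed to close a segment
}

interface Segment {
  startMs: number;
  endMs: number;
}

// Splits a stream of per-frame speech probabilities into speech segments.
function segmentSpeech(
  probabilities: number[],
  frameMs: number,
  settings: VadSettings,
): Segment[] {
  const segments: Segment[] = [];
  let candidateStart: number | null = null; // start of a not-yet-confirmed run
  let open: Segment | null = null;          // segment currently being extended
  let silenceMs = 0;

  for (let i = 0; i < probabilities.length; i++) {
    const frameStart = i * frameMs;
    const frameEnd = frameStart + frameMs;
    const isSpeech = (probabilities[i] ?? 0) >= settings.threshold;

    if (open === null) {
      if (!isSpeech) {
        candidateStart = null; // run was too short, discard it
        continue;
      }
      if (candidateStart === null) candidateStart = frameStart;
      if (frameEnd - candidateStart >= settings.prefixPaddingMs) {
        open = { startMs: candidateStart, endMs: frameEnd };
        candidateStart = null;
        silenceMs = 0;
      }
    } else if (isSpeech) {
      open.endMs = frameEnd;
      silenceMs = 0;
    } else {
      silenceMs += frameMs;
      if (silenceMs >= settings.silenceDurationMs) {
        segments.push(open);
        open = null;
        silenceMs = 0;
      }
    }
  }
  if (open !== null) segments.push(open); // audio ended mid-segment
  return segments;
}

const frames = [0.1, 0.8, 0.9, 0.9, 0.2, 0.1, 0.9, 0.9, 0.9, 0.1, 0.1, 0.1];
const base = { threshold: 0.5, prefixPaddingMs: 200 };

// A 300 ms silence window bridges the short pause: one segment, 100-900 ms.
console.log(segmentSpeech(frames, 100, { ...base, silenceDurationMs: 300 }));
// A 200 ms window splits at the pause: 100-400 ms and 600-900 ms.
console.log(segmentSpeech(frames, 100, { ...base, silenceDurationMs: 200 }));
```

In this model, a longer silence duration tolerates pauses within a sentence at the cost of slower replies, while a shorter one responds faster but may cut users off mid-thought.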

***

## Troubleshooting

<AccordionGroup>
  <Accordion title="Voice Widget Not Responding">
    <ul>
      <li>Ensure the embed code is correctly placed in your website's HTML.</li>

      <li>
        Check the VAD settings to ensure the widget is sensitive enough to
        detect speech.
      </li>
    </ul>
  </Accordion>

  <Accordion title="Inaccurate Responses">
    <ul>
      <li>
        Adjust the temperature setting to control the randomness of responses.
      </li>

      <li>
        Review the max output tokens to ensure responses are not too long or too
        short.
      </li>
    </ul>
  </Accordion>

  <Accordion title="Access Issues">
    <ul>
      <li>
        If you do not see the AI Voice Agent option in your dashboard, contact{" "}
        <a href="mailto:info@dasomx.com">info@dasomx.com</a> to request access.
      </li>
    </ul>
  </Accordion>
</AccordionGroup>

***

## Conclusion

<Info>
  The AI Voice Agent is a preview feature that adds voice-based interaction to
  your OKChat AI chatbot. Customize the voice widget and tune the provider,
  VAD, and response settings to fit your use case. To request access,
  contact{" "}
  <a href="mailto:info@dasomx.com">info@dasomx.com</a>.
</Info>

For further assistance, refer to the OKChat AI support resources or contact our support team.
