---
title: AI Voice Chat
emoji: 🎙️
colorFrom: green
colorTo: blue
sdk: static
pinned: false
license: mit
short_description: 100% in-browser, hands-free AI voice chat
arxiv: "2503.23108"
---

# AI Voice Chat

**A 100% in-browser solution for hands-free AI voice chat.** No API keys, no server, no data leaves your device. Uses Silero VAD, Whisper STT, WebLLM (Qwen 1.5B), and Supertonic TTS - all running locally via WebGPU.

**Swap in your own LLM** - The built-in model is just a demo. The real value is the voice pipeline. Point it at Claude, GPT-4, Ollama, or any LLM in ~10 lines of code.

## How It Works

1. **Click the green phone button** to start a call
2. **Speak naturally** - it detects when you're talking
3. **Wait for the response** - the AI thinks and speaks back
4. **Click the red button** to end the call

## What's Running Locally

| Component | Model | Purpose |
|-----------|-------|---------|
| 🎤 Speech-to-Text | Whisper | Converts your voice to text |
| 🧠 LLM | Qwen 1.5B | Generates responses (swappable) |
| 🔊 Text-to-Speech | Supertonic | Speaks the response |
| 👂 Voice Detection | Silero VAD | Knows when you're talking |

All models download once (~1GB) and are cached in your browser.

## Requirements

- **Browser**: Chrome 113+ or Edge 113+ (needs WebGPU)
- **RAM**: ~4GB available
- **Microphone**: Click "Allow" when prompted

## Controls

- 🎤 **Mic button** - Mute/unmute your microphone
- 🔊 **Speaker button** - Mute/unmute the AI voice
- 📢 **Voice selector** - Choose from 10 voices (F1-F5, M1-M5)
- 📞 **Phone button** - Start/end the call

## Privacy

**100% local.** Your voice is processed in your browser. Nothing is sent to any server. The only network requests are to download the AI models (once, then cached).

## Source Code

Want to run this yourself or swap in a different LLM? See the sketch below.

👉 [GitHub Repository](https://github.com/iRelate-AI/voice-chat)

Built with Next.js, WebLLM, Transformers.js, and Supertonic TTS.
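
As a rough illustration of the "~10 lines of code" claim, here is a minimal sketch of swapping the built-in Qwen model for any OpenAI-compatible chat endpoint (Ollama shown). The function name `generateReply`, the endpoint URL, and the model name are placeholders, not names from this repo; wire the same shape into wherever the pipeline hands Whisper's transcript to the LLM.

```ts
// Sketch only: replace the WebLLM call with any OpenAI-compatible chat endpoint.
// `generateReply`, the URL, and the model name are illustrative placeholders.
async function generateReply(userText: string): Promise<string> {
  const response = await fetch("http://localhost:11434/v1/chat/completions", {
    method: "POST",
    headers: { "Content-Type": "application/json" },
    body: JSON.stringify({
      model: "llama3.2", // any model served by your endpoint
      messages: [
        { role: "system", content: "You are a concise voice assistant." },
        { role: "user", content: userText }, // text produced by Whisper STT
      ],
    }),
  });
  const data = await response.json();
  return data.choices[0].message.content; // hand this string to Supertonic TTS
}
```

Note that routing the LLM call to a remote API changes the privacy story: the transcript leaves your device, while STT, TTS, and voice detection stay local.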