# Multilingual Pipecat Voice Agent

A real-time multilingual voice agent built with **Pipecat**. It supports 10 languages with automatic language detection and uses function calling for image generation, video generation, web search, and shopping.

## Tech Stack

| Component | Service |
|-----------|---------|
| Transport | Daily.co (WebRTC) |
| STT | Soniox (multilingual, 60+ languages) |
| LLM | Google Gemini 2.5 Flash |
| TTS | ElevenLabs (multilingual) |
| Image Gen | Replicate (google/nano-banana-pro) |
| Video Gen | Replicate (wan-video/wan-2.2-i2v-fast) |
| Search | Gemini built-in Google Search grounding |
| Shopping | Gemini built-in Google Search grounding |

## Setup

### Prerequisites

- Python 3.10 or later
- `uv` package manager

### Install and run

```bash
# Enter the project directory (after cloning the repository)
cd agent_all

# Configure API keys
cp .env.example .env
# Fill in all API keys in .env

# Install dependencies
uv sync

# Run locally
uv run bot.py
```

Open `http://localhost:7860` in your browser and click Connect.

### Deploy to Pipecat Cloud

```bash
# Install CLI
uv tool install pipecat-ai-cli

# Login
pipecat cloud auth login

# Upload secrets
pipecat cloud secrets set agent-all-secrets --file .env

# Build and push Docker image
pipecat cloud docker build-push

# Deploy
pipecat cloud deploy
```

## Supported Languages

English, Hindi, Telugu, Kannada, Tamil, Bengali, Marathi, Gujarati, Malayalam, Punjabi.

The agent auto-detects the spoken language and responds in the same language.
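One way to keep the LLM's replies in the speaker's language is to state the rule in the system instruction. The sketch below is illustrative only: `SUPPORTED_LANGUAGES` and `build_language_instruction` are hypothetical names, not taken from `bot.py`, and the English fallback for unsupported languages is an assumption.

```python
# ISO 639-1 codes for the ten languages listed above.
SUPPORTED_LANGUAGES = {
    "en": "English",
    "hi": "Hindi",
    "te": "Telugu",
    "kn": "Kannada",
    "ta": "Tamil",
    "bn": "Bengali",
    "mr": "Marathi",
    "gu": "Gujarati",
    "ml": "Malayalam",
    "pa": "Punjabi",
}


def build_language_instruction(languages=None):
    """Build a system-instruction fragment that asks the LLM to mirror the user's language."""
    languages = SUPPORTED_LANGUAGES if languages is None else languages
    names = ", ".join(languages.values())
    return (
        "Detect the language the user is speaking and reply in that same language. "
        f"Supported languages: {names}. "
        # Assumption: fall back to English for anything outside the list.
        "If the user speaks any other language, reply in English."
    )
```

The returned string could be appended to whatever system prompt the agent already sends to Gemini.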

## Capabilities

- **Image Generation**: "Generate an image of a sunset over mountains"
- **Video Generation**: "Create a video of a cat playing piano"
- **Web Search**: "Search for the latest news about AI" (uses Gemini's built-in Google Search grounding)
- **Shopping**: "Find me a good laptop under 1000 dollars" (uses Gemini's built-in Google Search grounding)
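Image and video generation run as LLM function calls. As a rough sketch of how such calls might be declared and routed (assuming JSON-Schema-style tool definitions), the example below uses only the standard library. Every name in it (`TOOLS`, `HANDLERS`, `dispatch`, and the handler functions) is hypothetical, and the handlers are stubs that do not call Replicate. The real inputs of the Replicate models may differ from the single `prompt` argument shown here.

```python
import asyncio

# Hypothetical tool declarations; the actual schemas in bot.py may differ.
TOOLS = {
    "generate_image": {
        "description": "Generate an image from a text prompt.",
        "parameters": {
            "type": "object",
            "properties": {"prompt": {"type": "string"}},
            "required": ["prompt"],
        },
    },
    "generate_video": {
        "description": "Generate a short video from a text prompt.",
        "parameters": {
            "type": "object",
            "properties": {"prompt": {"type": "string"}},
            "required": ["prompt"],
        },
    },
}


async def generate_image(prompt: str) -> dict:
    # Stub: a real handler would call the image model on Replicate here.
    return {"tool": "generate_image", "prompt": prompt, "status": "stub"}


async def generate_video(prompt: str) -> dict:
    # Stub: a real handler would call the video model on Replicate here.
    return {"tool": "generate_video", "prompt": prompt, "status": "stub"}


HANDLERS = {"generate_image": generate_image, "generate_video": generate_video}


async def dispatch(name: str, arguments: dict) -> dict:
    """Validate a function call requested by the LLM and run its handler."""
    if name not in HANDLERS:
        return {"error": f"unknown tool: {name}"}
    required = TOOLS[name]["parameters"]["required"]
    missing = [key for key in required if key not in arguments]
    if missing:
        return {"error": "missing arguments: " + ", ".join(missing)}
    return await HANDLERS[name](**arguments)
```

For example, `asyncio.run(dispatch("generate_image", {"prompt": "a sunset over mountains"}))` returns the stub result. Returning an error dictionary instead of raising lets the LLM explain the failure to the user in speech.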

## API Keys Required

| Service | Env Variable | Where to get it |
|---------|-------------|---------|
| Daily.co | `DAILY_API_KEY` | [dashboard.daily.co](https://dashboard.daily.co/) |
| Soniox | `SONIOX_API_KEY` | [console.soniox.com](https://console.soniox.com/) |
| Google Gemini | `GOOGLE_API_KEY` | [aistudio.google.com](https://aistudio.google.com/) |
| ElevenLabs | `ELEVENLABS_API_KEY` | [elevenlabs.io](https://elevenlabs.io/) |
| ElevenLabs | `ELEVENLABS_VOICE_ID` | [Voice Library](https://elevenlabs.io/voice-library) |
| Replicate | `REPLICATE_API_TOKEN` | [replicate.com](https://replicate.com/) |
