# Chat Handler

Text-based chat pipeline for the DaviesTechLabs AI/ML platform.

## Overview

A NATS-based service that answers chat completion requests using retrieval-augmented generation (RAG).

**Pipeline:** Query → Embeddings → Milvus → Rerank → LLM → (optional TTS)

## Versions

| File | Description |
|------|-------------|
| `chat_handler.py` | Standalone implementation (v1) |
| `chat_handler_v2.py` | Implementation built on the handler-base library (v2, recommended) |
| `Dockerfile` | Image for the standalone handler (v1) |
| `Dockerfile.v2` | Image for the handler-base handler (v2) |

## Architecture

```
NATS (ai.chat.request)
        │
        ▼
┌───────────────────┐
│   Chat Handler    │
└───────────────────┘
        │
        ├──▶ BGE Embeddings (drizzt)
        │         │
        │         ▼
        ├──▶ Milvus Vector Search
        │         │
        │         ▼
        ├──▶ BGE Reranker (danilo)
        │         │
        │         ▼
        ├──▶ vLLM (khelben)
        │         │
        │         ▼ (optional)
        └──▶ XTTS TTS (elminster)
                  │
                  ▼
         NATS (ai.chat.response.{id})
```
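
The flow in the diagram can be sketched as a single function that calls each stage in order. This is a minimal illustration, not the code in `chat_handler.py` or `chat_handler_v2.py`: the `Stages` container, the `run_pipeline` function, and the stage signatures are all hypothetical, and each callable stands in for one of the backend services.

```python
from __future__ import annotations

from dataclasses import dataclass
from typing import Callable, Optional


@dataclass
class Stages:
    """Hypothetical container; each field stands in for one backend service."""

    embed: Callable[[str], list[float]]                # BGE Embeddings
    search: Callable[[list[float], str], list[str]]    # Milvus Vector Search
    rerank: Callable[[str, list[str]], list[dict]]     # BGE Reranker
    generate: Callable[[str, list[dict], Optional[str]], str]  # vLLM
    synthesize: Optional[Callable[[str], str]] = None  # XTTS TTS (base64 audio)


def run_pipeline(request: dict, stages: Stages) -> dict:
    """Run Query -> Embeddings -> Milvus -> Rerank -> LLM -> (optional TTS)."""
    query = request["query"]
    # Assumption: fall back to the documented default collection name.
    collection = request.get("collection", "knowledge_base")

    vector = stages.embed(query)
    candidates = stages.search(vector, collection)
    sources = stages.rerank(query, candidates)
    answer = stages.generate(query, sources, request.get("system_prompt"))

    response = {
        "request_id": request["request_id"],
        "response": answer,
        "sources": sources,
    }
    # TTS runs only when the request asks for it and a TTS stage is configured.
    if request.get("enable_tts") and stages.synthesize is not None:
        response["audio"] = stages.synthesize(answer)
    return response
```

Passing the stages in as callables keeps the ordering logic testable without any of the services running.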

## NATS Message Format

### Request (ai.chat.request)

```json
{
  "request_id": "uuid",
  "query": "What is the capital of France?",
  "collection": "knowledge_base",
  "enable_tts": false,
  "system_prompt": "Optional custom system prompt"
}
```
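
A client might build this payload as follows. This is a sketch using only the standard library; `build_request` and `response_subject` are hypothetical helper names, and omitting `system_prompt` when it is not set is an assumption based on the field being described as optional.

```python
import json
import uuid
from typing import Optional, Tuple

REQUEST_SUBJECT = "ai.chat.request"


def build_request(
    query: str,
    collection: str = "knowledge_base",
    enable_tts: bool = False,
    system_prompt: Optional[str] = None,
) -> Tuple[str, bytes]:
    """Return (request_id, encoded payload) for publishing on REQUEST_SUBJECT."""
    request_id = str(uuid.uuid4())
    payload = {
        "request_id": request_id,
        "query": query,
        "collection": collection,
        "enable_tts": enable_tts,
    }
    # Assumption: the optional field is left out rather than sent as null.
    if system_prompt is not None:
        payload["system_prompt"] = system_prompt
    return request_id, json.dumps(payload).encode("utf-8")


def response_subject(request_id: str) -> str:
    """Subject on which the reply for this request is published."""
    return f"ai.chat.response.{request_id}"
```

Because the reply arrives on a per-request subject, a client would typically subscribe to `response_subject(request_id)` before publishing the request, so the response is not missed.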

### Response (ai.chat.response.{request_id})

```json
{
  "request_id": "uuid",
  "response": "The capital of France is Paris.",
  "sources": [
    {"text": "Paris is the capital...", "score": 0.95}
  ],
  "audio": "base64-encoded-audio (if TTS enabled)"
}
```
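
On the receiving side, the response can be decoded along these lines. `parse_response` is a hypothetical helper, and the sketch assumes the `audio` field is absent or empty when TTS is disabled; the audio container format is not specified here, so the bytes are returned undecoded.

```python
import base64
import json
from typing import List, Optional, Tuple


def parse_response(data: bytes) -> Tuple[str, List[dict], Optional[bytes]]:
    """Split a response message into answer text, sources, and raw audio bytes."""
    msg = json.loads(data)
    audio_b64 = msg.get("audio")
    # Assumption: a missing or empty "audio" field means no audio was produced.
    audio = base64.b64decode(audio_b64) if audio_b64 else None
    return msg["response"], msg.get("sources", []), audio
```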

## Configuration

| Environment Variable | Default | Description |
|---------------------|---------|-------------|
| `NATS_URL` | `nats://nats.ai-ml.svc.cluster.local:4222` | NATS server |
| `EMBEDDINGS_URL` | `http://embeddings-predictor.ai-ml.svc.cluster.local` | BGE embeddings service |
| `RERANKER_URL` | `http://reranker-predictor.ai-ml.svc.cluster.local` | BGE reranker service |
| `VLLM_URL` | `http://llm-draft.ai-ml.svc.cluster.local:8000` | vLLM service (LLM) |
| `TTS_URL` | `http://tts-predictor.ai-ml.svc.cluster.local` | XTTS service (optional) |
| `MILVUS_HOST` | `milvus.ai-ml.svc.cluster.local` | Milvus vector database host |
| `COLLECTION_NAME` | `knowledge_base` | Default Milvus collection |
| `ENABLE_TTS` | `false` | Enable audio responses |
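
One way to read these variables with the defaults from the table is sketched below. The `Settings` class and `load_settings` function are hypothetical names, and the set of strings treated as true for `ENABLE_TTS` is an assumption; the handler's own parsing may differ.

```python
import os
from dataclasses import dataclass
from typing import Mapping, Optional


@dataclass(frozen=True)
class Settings:
    nats_url: str
    embeddings_url: str
    reranker_url: str
    vllm_url: str
    tts_url: str
    milvus_host: str
    collection_name: str
    enable_tts: bool


def load_settings(env: Optional[Mapping[str, str]] = None) -> Settings:
    """Read configuration from env, falling back to the documented defaults."""
    env = os.environ if env is None else env
    # Assumption: these strings count as true; anything else is false.
    enable_tts = env.get("ENABLE_TTS", "false").strip().lower() in {"1", "true", "yes"}
    return Settings(
        nats_url=env.get("NATS_URL", "nats://nats.ai-ml.svc.cluster.local:4222"),
        embeddings_url=env.get(
            "EMBEDDINGS_URL", "http://embeddings-predictor.ai-ml.svc.cluster.local"
        ),
        reranker_url=env.get(
            "RERANKER_URL", "http://reranker-predictor.ai-ml.svc.cluster.local"
        ),
        vllm_url=env.get("VLLM_URL", "http://llm-draft.ai-ml.svc.cluster.local:8000"),
        tts_url=env.get("TTS_URL", "http://tts-predictor.ai-ml.svc.cluster.local"),
        milvus_host=env.get("MILVUS_HOST", "milvus.ai-ml.svc.cluster.local"),
        collection_name=env.get("COLLECTION_NAME", "knowledge_base"),
        enable_tts=enable_tts,
    )
```

Accepting a mapping instead of reading `os.environ` directly makes the loader easy to exercise in tests, for example `load_settings({"ENABLE_TTS": "true"})`.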

## Building

```bash
# Standalone image (v1)
docker build -f Dockerfile -t chat-handler:latest .

# Handler-base image (v2 - recommended)
docker build -f Dockerfile.v2 -t chat-handler:v2 .
```

## Dependencies

The v2 handler depends on [handler-base](https://git.daviestechlabs.io/daviestechlabs/handler-base). Install it with pip:

```bash
pip install git+https://git.daviestechlabs.io/daviestechlabs/handler-base.git
```

## Related

- [handler-base](https://git.daviestechlabs.io/daviestechlabs/handler-base) - Base handler library
- [voice-assistant](https://git.daviestechlabs.io/daviestechlabs/voice-assistant) - Voice pipeline
- [homelab-design](https://git.daviestechlabs.io/daviestechlabs/homelab-design) - Architecture docs