feat: Add chat handler with RAG pipeline

- chat_handler.py: Standalone NATS handler with RAG
- chat_handler_v2.py: Handler-base implementation
- Dockerfiles for both versions

Pipeline: Embeddings → Milvus → Rerank → LLM → (optional TTS)
2026-02-01 20:37:34 -05:00
parent cf859ead4e
commit 6ef42b3d2c
7 changed files with 1290 additions and 1 deletions
--- a/README.md
+++ b/README.md
@@ -1,2 +1,110 @@
-# chat-handler
+# Chat Handler

+Text-based chat pipeline for the DaviesTechLabs AI/ML platform.
+
+## Overview
+
+A NATS-based service that handles chat completion requests with RAG (Retrieval Augmented Generation).
+
+**Pipeline:** Query → Embeddings → Milvus → Rerank → LLM → (optional TTS)
+
+## Versions
+
+| File | Description |
+|------|-------------|
+| `chat_handler.py` | Standalone implementation (v1) |
+| `chat_handler_v2.py` | Uses handler-base library (recommended) |
+| `Dockerfile` | Standalone image |
+| `Dockerfile.v2` | Handler-base image |
+
+## Architecture
+
+```
+NATS (ai.chat.request)
+        │
+        ▼
+┌───────────────────┐
+│   Chat Handler    │
+└───────────────────┘
+        │
+        ├──▶ BGE Embeddings (drizzt)
+        │         │
+        │         ▼
+        ├──▶ Milvus Vector Search
+        │         │
+        │         ▼
+        ├──▶ BGE Reranker (danilo)
+        │         │
+        │         ▼
+        ├──▶ vLLM (khelben)
+        │         │
+        │         ▼ (optional)
+        └──▶ XTTS TTS (elminster)
+                  │
+                  ▼
+         NATS (ai.chat.response.{id})
+```
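The stage order in the diagram above can be sketched as a plain function that chains the stages. This is a minimal illustration only, not the code in `chat_handler.py`: `run_chat_pipeline` and every callable passed to it are hypothetical stand-ins for the real service clients, and the result keys simply mirror the response fields documented in the README.

```python
from typing import Callable, Optional


def run_chat_pipeline(
    query: str,
    embed: Callable[[str], list],
    search: Callable[[list], list],
    rerank: Callable[[str, list], list],
    generate: Callable[[str, list], str],
    synthesize: Optional[Callable[[str], str]] = None,
) -> dict:
    """Run the stages in diagram order (all callables are hypothetical stand-ins)."""
    vector = embed(query)                # BGE Embeddings
    candidates = search(vector)          # Milvus vector search
    sources = rerank(query, candidates)  # BGE Reranker
    answer = generate(query, sources)    # vLLM
    result = {"response": answer, "sources": sources}
    if synthesize is not None:           # XTTS, only when TTS is requested
        result["audio"] = synthesize(answer)
    return result
```

Passing the stages in as callables keeps the ordering testable without any of the backing services running.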
+
+## NATS Message Format
+
+### Request (ai.chat.request)
+
+```json
+{
+  "request_id": "uuid",
+  "query": "What is the capital of France?",
+  "collection": "knowledge_base",
+  "enable_tts": false,
+  "system_prompt": "Optional custom system prompt"
+}
+```
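A client-side sketch of building the request payload shown above, using only the standard library. `build_chat_request` is a hypothetical helper, not part of this commit. Generating the `request_id` with `uuid4` and omitting `system_prompt` when no custom prompt is given are both assumptions; the README only labels that field "optional".

```python
import json
import uuid
from typing import Optional, Tuple

REQUEST_SUBJECT = "ai.chat.request"  # subject named in the README


def build_chat_request(
    query: str,
    collection: str = "knowledge_base",
    enable_tts: bool = False,
    system_prompt: Optional[str] = None,
) -> Tuple[str, bytes]:
    """Return (response_subject, payload) for one request (hypothetical helper)."""
    request_id = str(uuid.uuid4())
    message = {
        "request_id": request_id,
        "query": query,
        "collection": collection,
        "enable_tts": enable_tts,
    }
    # Assumption: the field can be left out when no custom prompt is wanted.
    if system_prompt is not None:
        message["system_prompt"] = system_prompt
    payload = json.dumps(message).encode("utf-8")
    # The reply is expected on ai.chat.response.{request_id}.
    return f"ai.chat.response.{request_id}", payload
```

A caller would subscribe to the returned response subject before publishing the payload to `REQUEST_SUBJECT`, so the reply is not missed.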
+
+### Response (ai.chat.response.{request_id})
+
+```json
+{
+  "request_id": "uuid",
+  "response": "The capital of France is Paris.",
+  "sources": [
+    {"text": "Paris is the capital...", "score": 0.95}
+  ],
+  "audio": "base64-encoded-audio (if TTS enabled)"
+}
+```
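A matching sketch for consuming the response shown above. `parse_chat_response` is a hypothetical helper. It assumes the `audio` field is absent or empty when TTS is disabled and that it carries standard base64 otherwise; the README does not specify either detail.

```python
import base64
import json
from typing import Optional, Tuple


def parse_chat_response(data: bytes) -> Tuple[str, list, Optional[bytes]]:
    """Decode a response payload into (text, sources, audio_bytes)."""
    message = json.loads(data)
    sources = message.get("sources", [])
    audio_field = message.get("audio")
    # Assumption: a missing or empty "audio" value means no audio was produced.
    audio = base64.b64decode(audio_field) if audio_field else None
    return message["response"], sources, audio
```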
+
+## Configuration
+
+| Environment Variable | Default | Description |
+|---------------------|---------|-------------|
+| `NATS_URL` | `nats://nats.ai-ml.svc.cluster.local:4222` | NATS server |
+| `EMBEDDINGS_URL` | `http://embeddings-predictor.ai-ml.svc.cluster.local` | Embeddings |
+| `RERANKER_URL` | `http://reranker-predictor.ai-ml.svc.cluster.local` | Reranker |
+| `VLLM_URL` | `http://llm-draft.ai-ml.svc.cluster.local:8000` | LLM service |
+| `TTS_URL` | `http://tts-predictor.ai-ml.svc.cluster.local` | TTS (optional) |
+| `MILVUS_HOST` | `milvus.ai-ml.svc.cluster.local` | Vector DB |
+| `COLLECTION_NAME` | `knowledge_base` | Default Milvus collection |
+| `ENABLE_TTS` | `false` | Enable audio responses |
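One way the table above could be loaded at startup, with the documented defaults as fallbacks. `ChatHandlerConfig` and `load_config` are hypothetical names, and treating `1`, `true`, and `yes` as truthy for `ENABLE_TTS` is an assumption; the handler in this commit may parse it differently.

```python
import os
from dataclasses import dataclass
from typing import Mapping, Optional

_SVC = "ai-ml.svc.cluster.local"


@dataclass(frozen=True)
class ChatHandlerConfig:
    nats_url: str
    embeddings_url: str
    reranker_url: str
    vllm_url: str
    tts_url: str
    milvus_host: str
    collection_name: str
    enable_tts: bool


def load_config(env: Optional[Mapping[str, str]] = None) -> ChatHandlerConfig:
    """Read settings from env (defaults to os.environ), using the README defaults."""
    env = os.environ if env is None else env
    # Assumption: these strings count as "enabled"; anything else is disabled.
    tts_flag = env.get("ENABLE_TTS", "false").strip().lower()
    return ChatHandlerConfig(
        nats_url=env.get("NATS_URL", f"nats://nats.{_SVC}:4222"),
        embeddings_url=env.get("EMBEDDINGS_URL", f"http://embeddings-predictor.{_SVC}"),
        reranker_url=env.get("RERANKER_URL", f"http://reranker-predictor.{_SVC}"),
        vllm_url=env.get("VLLM_URL", f"http://llm-draft.{_SVC}:8000"),
        tts_url=env.get("TTS_URL", f"http://tts-predictor.{_SVC}"),
        milvus_host=env.get("MILVUS_HOST", f"milvus.{_SVC}"),
        collection_name=env.get("COLLECTION_NAME", "knowledge_base"),
        enable_tts=tts_flag in {"1", "true", "yes"},
    )
```

Accepting a mapping instead of reading `os.environ` directly makes the defaults easy to check in isolation.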
+
+## Building
+
+```bash
+# Standalone image (v1)
+docker build -f Dockerfile -t chat-handler:latest .
+
+# Handler-base image (v2 - recommended)
+docker build -f Dockerfile.v2 -t chat-handler:v2 .
+```
+
+## Dependencies
+
+The v2 handler depends on [handler-base](https://git.daviestechlabs.io/daviestechlabs/handler-base):
+
+```bash
+pip install git+https://git.daviestechlabs.io/daviestechlabs/handler-base.git
+```
+
+## Related
+
+- [handler-base](https://git.daviestechlabs.io/daviestechlabs/handler-base) - Base handler library
+- [voice-assistant](https://git.daviestechlabs.io/daviestechlabs/voice-assistant) - Voice pipeline
+- [homelab-design](https://git.daviestechlabs.io/daviestechlabs/homelab-design) - Architecture docs