feat: Add chat handler with RAG pipeline

- chat_handler.py: Standalone NATS handler with RAG
- chat_handler_v2.py: Handler-base implementation
- Dockerfiles for both versions

Pipeline: Embeddings → Milvus → Rerank → LLM → (optional TTS)
110
README.md
@@ -1,2 +1,110 @@
# Chat Handler

Text-based chat pipeline for the DaviesTechLabs AI/ML platform.

## Overview

A NATS-based service that handles chat completion requests using retrieval-augmented generation (RAG).

**Pipeline:** Query → Embeddings → Milvus → Rerank → LLM → (optional TTS)

## Versions

| File | Description |
|------|-------------|
| `chat_handler.py` | Standalone implementation (v1) |
| `chat_handler_v2.py` | Uses handler-base library (recommended) |
| `Dockerfile` | Standalone image |
| `Dockerfile.v2` | Handler-base image |

## Architecture

```
NATS (ai.chat.request)
          │
          ▼
┌───────────────────┐
│   Chat Handler    │
└───────────────────┘
          │
          ├──▶ BGE Embeddings (drizzt)
          │         │
          │         ▼
          ├──▶ Milvus Vector Search
          │         │
          │         ▼
          ├──▶ BGE Reranker (danilo)
          │         │
          │         ▼
          ├──▶ vLLM (khelben)
          │         │
          │         ▼ (optional)
          └──▶ XTTS TTS (elminster)
                    │
                    ▼
          NATS (ai.chat.response.{id})
```

## NATS Message Format

### Request (ai.chat.request)

```json
{
  "request_id": "uuid",
  "query": "What is the capital of France?",
  "collection": "knowledge_base",
  "enable_tts": false,
  "system_prompt": "Optional custom system prompt"
}
```

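A request payload in this shape could be decoded along the following lines. This is a sketch under assumptions: the `ChatRequest` dataclass and `parse_request` helper are hypothetical names, and treating `request_id` and `query` as required while the other fields fall back to defaults is a guess, since the format above does not say which fields are mandatory.

```python
import json
from dataclasses import dataclass
from typing import Optional


@dataclass
class ChatRequest:
    request_id: str
    query: str
    collection: str = "knowledge_base"  # assumed fallback, mirrors COLLECTION_NAME
    enable_tts: bool = False
    system_prompt: Optional[str] = None


def parse_request(payload: bytes) -> ChatRequest:
    """Decode a JSON request body and apply the assumed defaults."""
    data = json.loads(payload)  # raises ValueError (JSONDecodeError) on bad JSON
    if not isinstance(data, dict):
        raise ValueError("request body must be a JSON object")
    # Assumption: only request_id and query are mandatory.
    missing = [key for key in ("request_id", "query") if not data.get(key)]
    if missing:
        raise ValueError("missing required field(s): " + ", ".join(missing))
    return ChatRequest(
        request_id=str(data["request_id"]),
        query=str(data["query"]),
        collection=data.get("collection") or "knowledge_base",
        enable_tts=bool(data.get("enable_tts", False)),
        system_prompt=data.get("system_prompt"),
    )
```

Validating at the boundary means a malformed message fails with one clear error before any embedding or LLM call is made.
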
### Response (ai.chat.response.{request_id})

```json
{
  "request_id": "uuid",
  "response": "The capital of France is Paris.",
  "sources": [
    {"text": "Paris is the capital...", "score": 0.95}
  ],
  "audio": "base64-encoded-audio (if TTS enabled)"
}
```

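As an illustration of the response format, the sketch below builds the reply subject and body with the standard library only. The helper names `response_subject` and `build_response` are hypothetical, and omitting the `audio` key entirely when TTS is off is an assumption; the actual handler may send an empty value or `null` instead.

```python
import base64
import json
from typing import Optional


def response_subject(request_id: str) -> str:
    """Subject the reply is published on, per the heading above."""
    return f"ai.chat.response.{request_id}"


def build_response(request_id: str, answer: str, sources: list,
                   audio: Optional[bytes] = None) -> bytes:
    """Serialize a response body; audio bytes are base64-encoded when present."""
    body = {
        "request_id": request_id,
        "response": answer,
        "sources": [
            {"text": s["text"], "score": float(s["score"])} for s in sources
        ],
    }
    if audio is not None:
        # Assumption: the key is left out when no audio was generated.
        body["audio"] = base64.b64encode(audio).decode("ascii")
    return json.dumps(body).encode("utf-8")
```

A client would subscribe to `response_subject(request_id)` before publishing its request so the reply is not missed.
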
## Configuration

| Environment Variable | Default | Description |
|---------------------|---------|-------------|
| `NATS_URL` | `nats://nats.ai-ml.svc.cluster.local:4222` | NATS server |
| `EMBEDDINGS_URL` | `http://embeddings-predictor.ai-ml.svc.cluster.local` | Embeddings |
| `RERANKER_URL` | `http://reranker-predictor.ai-ml.svc.cluster.local` | Reranker |
| `VLLM_URL` | `http://llm-draft.ai-ml.svc.cluster.local:8000` | LLM service |
| `TTS_URL` | `http://tts-predictor.ai-ml.svc.cluster.local` | TTS (optional) |
| `MILVUS_HOST` | `milvus.ai-ml.svc.cluster.local` | Vector DB |
| `COLLECTION_NAME` | `knowledge_base` | Default Milvus collection |
| `ENABLE_TTS` | `false` | Enable audio responses |

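One way to read these variables with their documented defaults is shown below. This is a sketch, not the handler's actual configuration code: the `Settings` dataclass and `load_settings` function are hypothetical, and the set of strings accepted as true for `ENABLE_TTS` is an assumption.

```python
import os
from dataclasses import dataclass
from typing import Mapping

# Defaults copied from the table above.
DEFAULTS = {
    "NATS_URL": "nats://nats.ai-ml.svc.cluster.local:4222",
    "EMBEDDINGS_URL": "http://embeddings-predictor.ai-ml.svc.cluster.local",
    "RERANKER_URL": "http://reranker-predictor.ai-ml.svc.cluster.local",
    "VLLM_URL": "http://llm-draft.ai-ml.svc.cluster.local:8000",
    "TTS_URL": "http://tts-predictor.ai-ml.svc.cluster.local",
    "MILVUS_HOST": "milvus.ai-ml.svc.cluster.local",
    "COLLECTION_NAME": "knowledge_base",
    "ENABLE_TTS": "false",
}


@dataclass(frozen=True)
class Settings:
    nats_url: str
    embeddings_url: str
    reranker_url: str
    vllm_url: str
    tts_url: str
    milvus_host: str
    collection_name: str
    enable_tts: bool


def load_settings(env: Mapping[str, str] = os.environ) -> Settings:
    """Build Settings from env, falling back to the documented defaults."""
    def get(name: str) -> str:
        return env.get(name, DEFAULTS[name])

    return Settings(
        nats_url=get("NATS_URL"),
        embeddings_url=get("EMBEDDINGS_URL"),
        reranker_url=get("RERANKER_URL"),
        vllm_url=get("VLLM_URL"),
        tts_url=get("TTS_URL"),
        milvus_host=get("MILVUS_HOST"),
        collection_name=get("COLLECTION_NAME"),
        # Assumption: "1", "true", and "yes" (any case) enable TTS.
        enable_tts=get("ENABLE_TTS").strip().lower() in {"1", "true", "yes"},
    )
```

Accepting a mapping instead of reading `os.environ` directly makes it easy to exercise overrides in tests, for example `load_settings({"ENABLE_TTS": "true"})`.
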
## Building

```bash
# Standalone image (v1)
docker build -f Dockerfile -t chat-handler:latest .

# Handler-base image (v2 - recommended)
docker build -f Dockerfile.v2 -t chat-handler:v2 .
```

## Dependencies

The v2 handler depends on [handler-base](https://git.daviestechlabs.io/daviestechlabs/handler-base):

```bash
pip install git+https://git.daviestechlabs.io/daviestechlabs/handler-base.git
```

## Related

- [handler-base](https://git.daviestechlabs.io/daviestechlabs/handler-base) - Base handler library
- [voice-assistant](https://git.daviestechlabs.io/daviestechlabs/voice-assistant) - Voice pipeline
- [homelab-design](https://git.daviestechlabs.io/daviestechlabs/homelab-design) - Architecture docs