Use handler-base StreamGenerate() to publish real token-by-token
ChatStreamChunk messages to NATS as they arrive from Ray Serve,
instead of calling Generate() and splitting into 4-word chunks.
Add 8 streaming tests: happy path, system prompt, RAG context,
nil callback, timeout, HTTP error, context canceled, fallback.
Replace Python chat handler with Go for smaller container images.
Uses handler-base Go module for NATS, health, telemetry, and service clients.
- RAG pipeline: embed → Milvus → rerank → LLM
- Streaming response chunks
- Optional TTS synthesis
- Custom response_subject support for companions-frontend