Commit Graph

5 Commits

Author SHA1 Message Date
e2176331c8 feat: migrate from msgpack to protobuf (handler-base v1.0.0)
Some checks failed
CI / Lint (push) Failing after 2m49s
CI / Test (push) Successful in 3m36s
CI / Notify (push) Has been cancelled
CI / Docker Build & Push (push) Has been cancelled
CI / Release (push) Has been cancelled
- Replace msgpack encoding with protobuf wire format
- Update field names to proto convention (UserId, RequestId, EnableRag, etc.)
- Use messages.EffectiveQuery() standalone function
- Cast TopK to int32 for proto compatibility
- Rewrite tests for proto round-trips
2026-02-21 15:30:04 -05:00
87d0545d2c feat: replace fake streaming with real SSE StreamGenerate
Some checks failed
CI / Lint (push) Successful in 3m0s
CI / Test (push) Successful in 3m23s
CI / Docker Build & Push (push) Failing after 4m55s
CI / Release (push) Successful in 1m4s
CI / Notify (push) Successful in 1s
Use handler-base StreamGenerate() to publish real token-by-token
ChatStreamChunk messages to NATS as they arrive from Ray Serve,
instead of calling Generate() and splitting into 4-word chunks.

Add 8 streaming tests: happy path, system prompt, RAG context,
nil callback, timeout, HTTP error, context canceled, fallback.
2026-02-21 09:23:57 -05:00
4175e2070c feat: migrate to typed messages, drop base64
Some checks failed
CI / Lint (pull_request) Failing after 58s
CI / Test (pull_request) Failing after 1m22s
CI / Notify (pull_request) Successful in 1s
CI / Release (pull_request) Has been skipped
CI / Docker Build & Push (pull_request) Has been skipped
- Switch OnMessage → OnTypedMessage with natsutil.Decode[messages.ChatRequest]
- Return *messages.ChatResponse / *messages.ChatStreamChunk (not map[string]any)
- Audio as raw []byte in msgpack (25% wire savings vs base64)
- Remove strVal/boolVal/intVal helpers
- Add .dockerignore, GOAMD64=v3 in Dockerfile
- Update tests for typed structs (9 tests pass)
2026-02-20 07:10:43 -05:00
609b44de83 feat: add e2e tests + benchmarks, fix config API
- e2e_test.go: full pipeline tests (LLM-only, RAG, TTS, timeout)
- main.go: fix config field->method references (EmbeddingsURL() etc.)
- Benchmarks: LLMOnly 136µs/op, RAGFlow 496µs/op
2026-02-20 06:45:21 -05:00
adcdb87b9a feat: rewrite chat-handler in Go
Replace Python chat handler with Go for smaller container images.
Uses handler-base Go module for NATS, health, telemetry, and service clients.

- RAG pipeline: embed → Milvus → rerank → LLM
- Streaming response chunks
- Optional TTS synthesis
- Custom response_subject support for companions-frontend
2026-02-19 17:58:52 -05:00