commit 3585d81ff5
feat: add StreamGenerate for real SSE streaming from LLM
- Add postJSONStream() for incremental reading of the response body
- Add LLMClient.StreamGenerate() with SSE parsing and an onToken callback
- Supports stream:true, parses "data:" lines, and handles the [DONE] sentinel (see the sketch after this list)
- Returns the accumulated partial text gracefully if the stream is interrupted
- 9 new tests covering the happy path, edge cases, and cancellation
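
A minimal sketch of the parsing loop this commit describes, assuming an OpenAI-style SSE stream where each "data:" line carries one chunk. The streamSSE name, its signature, and the plain-string payload handling are assumptions for illustration, not the repo's actual code (the real StreamGenerate would likely decode each chunk as JSON before invoking onToken):

```go
package llm

import (
	"bufio"
	"context"
	"io"
	"strings"
)

// streamSSE reads Server-Sent Events from an open response body, invoking
// onToken per "data:" payload until the [DONE] sentinel. On interruption
// it returns whatever text accumulated so far. (Sketch only; payloads are
// treated as plain text here rather than decoded as JSON chunks.)
func streamSSE(ctx context.Context, body io.Reader, onToken func(string)) (string, error) {
	var full strings.Builder
	scanner := bufio.NewScanner(body)
	for scanner.Scan() {
		select {
		case <-ctx.Done(): // caller cancelled: hand back partial text
			return full.String(), ctx.Err()
		default:
		}
		line := strings.TrimSpace(scanner.Text())
		if !strings.HasPrefix(line, "data:") {
			continue // skip blank keep-alive lines and other SSE fields
		}
		payload := strings.TrimSpace(strings.TrimPrefix(line, "data:"))
		if payload == "[DONE]" { // end-of-stream sentinel
			return full.String(), nil
		}
		full.WriteString(payload)
		if onToken != nil {
			onToken(payload)
		}
	}
	// Scanner error (e.g. connection dropped): partial text, not total loss.
	return full.String(), scanner.Err()
}
```

Returning the accumulated text alongside scanner.Err() is what makes the "graceful partial-text return" behavior work: callers keep everything that arrived before the interruption.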
2026-02-20 17:55:01 -05:00
commit 35912d5844
feat: add e2e tests, perf benchmarks, and infrastructure improvements
- messages/bench_test.go: serialization benchmarks (msgpack map vs struct vs protobuf; a benchmark sketch follows the baseline numbers below)
- clients/clients_test.go: HTTP client tests with pooling verification (20 tests)
- natsutil/natsutil_test.go: encode/decode roundtrip + binary-data tests
- handler/handler_test.go: handler dispatch tests + benchmark
- config/config.go: live reload via fsnotify + RWMutex getter methods (sketched after this entry)
- clients/clients.go: SharedTransport + sync.Pool buffer pooling (sketched after this entry)
- messages/messages.go: typed structs with msgpack+json tags (see the struct sketch after this list)
- messages/proto/: protobuf schema + generated code
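
A sketch of the dual-tag struct style named above; the field names are illustrative, not the repo's actual schema:

```go
package messages

// ChatRequest sketches a typed message carrying msgpack tags for the wire
// format and json tags for HTTP/debugging use. Fields are hypothetical.
type ChatRequest struct {
	SessionID string            `msgpack:"session_id" json:"session_id"`
	Text      string            `msgpack:"text" json:"text"`
	Metadata  map[string]string `msgpack:"metadata,omitempty" json:"metadata,omitempty"`
}
```

Encoding a typed struct rather than a map[string]interface{} avoids per-entry interface boxing and map iteration, which is consistent with the MsgpackMap vs MsgpackStruct gap in the baseline below.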
Benchmark baseline (ChatRequest roundtrip):
  MsgpackMap:    2949 ns/op, 36 allocs
  MsgpackStruct: 2030 ns/op, 13 allocs (31% faster, 64% fewer allocs)
  Protobuf:       793 ns/op,  8 allocs (73% faster, 78% fewer allocs)
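
A roundtrip benchmark in the shape that could produce numbers like these, assuming the vmihailenco/msgpack/v5 encoder; the library choice and the struct literal are assumptions:

```go
package messages

import (
	"testing"

	"github.com/vmihailenco/msgpack/v5"
)

// BenchmarkMsgpackStructRoundtrip measures one encode+decode cycle per
// iteration, matching the MsgpackStruct row above.
func BenchmarkMsgpackStructRoundtrip(b *testing.B) {
	req := ChatRequest{SessionID: "s1", Text: "hello"}
	b.ReportAllocs()
	for i := 0; i < b.N; i++ {
		data, err := msgpack.Marshal(&req)
		if err != nil {
			b.Fatal(err)
		}
		var out ChatRequest
		if err := msgpack.Unmarshal(data, &out); err != nil {
			b.Fatal(err)
		}
	}
}
```

Running with `go test -bench=. -benchmem` reports both the ns/op and allocs/op columns shown in the baseline.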
2026-02-20 06:44:37 -05:00
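
For the clients/clients.go item in this commit, a sketch of the shared-transport plus buffer-pool pattern; the tuning values and helper names are illustrative, not the repo's actual settings:

```go
package clients

import (
	"bytes"
	"net/http"
	"sync"
	"time"
)

// SharedTransport is a single http.Transport reused by every client, so
// keep-alive connections are pooled process-wide instead of per-client.
var SharedTransport = &http.Transport{
	MaxIdleConns:        100,
	MaxIdleConnsPerHost: 10,
	IdleConnTimeout:     90 * time.Second,
}

// bufPool recycles request/response buffers to cut per-call allocations.
var bufPool = sync.Pool{
	New: func() any { return new(bytes.Buffer) },
}

func getBuf() *bytes.Buffer {
	buf := bufPool.Get().(*bytes.Buffer)
	buf.Reset() // always hand out a clean buffer
	return buf
}

func putBuf(buf *bytes.Buffer) { bufPool.Put(buf) }
```

Sharing one Transport keeps connection pools warm across all clients, and the sync.Pool trims the steady-state buffer allocations that the pooling tests verify.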
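
And for the config/config.go live-reload item, a sketch of fsnotify wiring behind an RWMutex-guarded getter; the Config field and loader callback are hypothetical:

```go
package config

import (
	"log"
	"sync"

	"github.com/fsnotify/fsnotify"
)

// Config holds live-reloadable settings behind an RWMutex so readers
// never observe a half-applied reload. The llmURL field is illustrative.
type Config struct {
	mu     sync.RWMutex
	llmURL string
}

// LLMURL is the RWMutex-guarded getter style the commit describes.
func (c *Config) LLMURL() string {
	c.mu.RLock()
	defer c.mu.RUnlock()
	return c.llmURL
}

// Watch reloads the config whenever the file is written. loadFile is a
// hypothetical parser returning the new value.
func (c *Config) Watch(path string, loadFile func(string) (string, error)) error {
	w, err := fsnotify.NewWatcher()
	if err != nil {
		return err
	}
	if err := w.Add(path); err != nil {
		return err
	}
	go func() {
		for ev := range w.Events {
			// React to writes only; editors that save via atomic rename
			// may also need Create handling.
			if ev.Op&fsnotify.Write == 0 {
				continue
			}
			url, err := loadFile(path)
			if err != nil {
				log.Printf("config reload failed: %v", err)
				continue // keep serving the last good config
			}
			c.mu.Lock()
			c.llmURL = url
			c.mu.Unlock()
		}
	}()
	return nil
}
```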