Use MessagePack for NATS Messages
- Status: superseded by ADR-0061 (Protocol Buffers)
- Date: 2025-12-01
- Deciders: Billy Davies
- Technical Story: Selecting serialization format for NATS messages
Context and Problem Statement
NATS messages in the AI platform carry various payloads:
- Text chat messages (small)
- Voice audio data (potentially large, base64 or binary)
- Streaming response chunks
- Pipeline parameters
We need a serialization format that handles both text and binary efficiently.
Decision Drivers
- Efficient binary data handling (audio)
- Compact message size
- Fast serialization/deserialization
- Cross-language support (Python, Go)
- Ease of debugging
- Schema flexibility
Considered Options
- JSON
- Protocol Buffers (protobuf)
- MessagePack (msgpack)
- CBOR
- Avro
Decision Outcome
Chosen option: "MessagePack (msgpack)", because it provides binary efficiency with JSON-like simplicity and schema-less flexibility.
Positive Consequences
- Native binary support (no base64 overhead for audio)
- 20-50% smaller than JSON for typical messages
- Faster serialization than JSON
- No schema compilation step
- Easy debugging (decoded payloads can be pretty-printed like JSON)
- Excellent Python and Go libraries
Negative Consequences
- Not human-readable in raw form, unlike JSON
- No built-in schema validation
- Slightly less common than JSON
Pros and Cons of the Options
JSON
- Good, because human-readable
- Good, because universal support
- Good, because no setup required
- Bad, because binary data requires base64 encoding (about 33% size overhead)
- Bad, because larger message sizes
- Bad, because slower parsing
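The base64 overhead noted above follows from the encoding itself: every 3 input bytes become 4 ASCII characters. A minimal standard-library sketch, assuming a hypothetical 3000-byte audio chunk (the size and the `audio` field name are illustrative and not taken from the platform):

```python
import base64
import json


def base64_overhead(n_bytes: int) -> float:
    """Return the fractional size increase from base64-encoding n_bytes of data."""
    raw = bytes(n_bytes)  # zero-filled stand-in for real audio
    encoded = base64.b64encode(raw)
    return len(encoded) / len(raw) - 1.0


# Hypothetical audio chunk; real chunks will vary in size.
audio = bytes(3000)
encoded = base64.b64encode(audio)

# JSON cannot carry raw bytes, so the audio must travel as a base64 string.
json_payload = json.dumps({"audio": encoded.decode("ascii")}).encode("utf-8")

print(f"raw audio:    {len(audio)} bytes")
print(f"base64 audio: {len(encoded)} bytes")
print(f"JSON payload: {len(json_payload)} bytes")
print(f"overhead:     {base64_overhead(len(audio)):.1%}")
```

The overhead is paid on every message that carries audio, which is why it weighs more heavily for voice payloads than for short text chat messages.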
Protocol Buffers
- Good, because very compact
- Good, because fast
- Good, because schema validation
- Good, because cross-language
- Bad, because requires schema definition
- Bad, because compilation step
- Bad, because less flexible for evolving schemas
- Bad, because overkill for simple messages
MessagePack
- Good, because binary-efficient
- Good, because JSON-like simplicity
- Good, because no schema required
- Good, because excellent library support
- Good, because can include raw bytes
- Bad, because not human-readable raw
- Bad, because no schema validation
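To illustrate why raw bytes are cheap in MessagePack, the sketch below hand-encodes a small subset of the wire format (bool, str, bin, and map) using only the standard library. It is not the `msgpack` library and is not meant for production use; the `pack` helper and the 3000-byte audio size are assumptions made for illustration only.

```python
import base64
import json
import struct


def pack(obj) -> bytes:
    """Encode a subset of MessagePack: bool, str, bytes, and dict."""
    if obj is True:
        return b"\xc3"
    if obj is False:
        return b"\xc2"
    if isinstance(obj, str):
        raw = obj.encode("utf-8")
        n = len(raw)
        if n < 32:
            return bytes([0xA0 | n]) + raw  # fixstr: length lives in the type byte
        if n < 2**8:
            return b"\xd9" + struct.pack(">B", n) + raw  # str 8
        if n < 2**16:
            return b"\xda" + struct.pack(">H", n) + raw  # str 16
        return b"\xdb" + struct.pack(">I", n) + raw  # str 32
    if isinstance(obj, bytes):
        n = len(obj)
        if n < 2**8:
            return b"\xc4" + struct.pack(">B", n) + obj  # bin 8
        if n < 2**16:
            return b"\xc5" + struct.pack(">H", n) + obj  # bin 16
        return b"\xc6" + struct.pack(">I", n) + obj  # bin 32
    if isinstance(obj, dict):
        n = len(obj)
        if n < 16:
            header = bytes([0x80 | n])  # fixmap
        elif n < 2**16:
            header = b"\xde" + struct.pack(">H", n)  # map 16
        else:
            header = b"\xdf" + struct.pack(">I", n)  # map 32
        return header + b"".join(pack(k) + pack(v) for k, v in obj.items())
    raise TypeError(f"type not covered by this sketch: {type(obj).__name__}")


audio_bytes = bytes(3000)  # hypothetical audio chunk
message = {"user_id": "user-123", "audio": audio_bytes, "premium": True}

packed = pack(message)

# The JSON equivalent has to base64-encode the audio first.
as_json = json.dumps(
    {**message, "audio": base64.b64encode(audio_bytes).decode("ascii")}
).encode("utf-8")

print(f"MessagePack: {len(packed)} bytes")
print(f"JSON:        {len(as_json)} bytes")
```

The audio costs only a 3-byte header on top of its raw length here, whereas the JSON form grows by a third before any quoting or key overhead is counted.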
CBOR
- Good, because binary-efficient
- Good, because IETF standard
- Good, because schema-less
- Bad, because less common libraries
- Bad, because smaller community
- Bad, because similar to msgpack with less adoption
Avro
- Good, because schema evolution
- Good, because compact
- Good, because schema registry integration
- Bad, because requires schema
- Bad, because more complex setup
- Bad, because Java-centric ecosystem
Implementation Notes
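# Python usage
import msgpack
audio_bytes = b"\x00\x01\x02"  # Placeholder; real code passes captured audio
# Serialize
data = {
    "user_id": "user-123",
    "audio": audio_bytes,  # Raw bytes, no base64
    "premium": True,
}
payload = msgpack.packb(data)
# Deserialize (raw=False decodes str values as Python str, not bytes)
data = msgpack.unpackb(payload, raw=False)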
// Go usage
import "github.com/vmihailenco/msgpack/v5"
// Message mirrors the fields of the Python payload that Go consumers read.
type Message struct {
	UserID string `msgpack:"user_id"`
	Audio  []byte `msgpack:"audio"`
}
Links
- MessagePack Specification
- msgpack-python
- Related: ADR-0003 - Message bus choice
- See: BINARY_MESSAGES_AND_JETSTREAM.md