# Use MessagePack for NATS Messages

* Status: accepted
* Date: 2025-12-01
* Deciders: Billy Davies
* Technical Story: Selecting serialization format for NATS messages

## Context and Problem Statement

NATS messages in the AI platform carry various payloads:

- Text chat messages (small)
- Voice audio data (potentially large, base64 or binary)
- Streaming response chunks
- Pipeline parameters

We need a serialization format that handles both text and binary data efficiently.

## Decision Drivers

* Efficient binary data handling (audio)
* Compact message size
* Fast serialization/deserialization
* Cross-language support (Python, Go)
* Ease of debugging
* Schema flexibility

## Considered Options

* JSON
* Protocol Buffers (protobuf)
* MessagePack (msgpack)
* CBOR
* Avro

## Decision Outcome

Chosen option: "MessagePack (msgpack)", because it provides binary efficiency with JSON-like simplicity and schema-less flexibility.

### Positive Consequences

* Native binary support (no base64 overhead for audio)
* 20-50% smaller than JSON for typical messages
* Faster serialization than JSON
* No schema compilation step
* Easy debugging (decoded payloads can be pretty-printed like JSON)
* Excellent Python and Go libraries

### Negative Consequences

* Less human-readable than JSON in raw (encoded) form
* No built-in schema validation
* Slightly less common than JSON

## Pros and Cons of the Options

### JSON

* Good, because human-readable
* Good, because universal support
* Good, because no setup required
* Bad, because binary data requires base64 (33% overhead)
* Bad, because larger message sizes
* Bad, because slower parsing

### Protocol Buffers

* Good, because very compact
* Good, because fast
* Good, because schema validation
* Good, because cross-language
* Bad, because requires schema definition
* Bad, because compilation step
* Bad, because less flexible for evolving schemas
* Bad, because overkill for simple messages

### MessagePack

* Good, because binary-efficient
* Good, because JSON-like simplicity
* Good, because no schema required
* Good, because excellent library support
* Good, because can include raw bytes
* Bad, because not human-readable in raw form
* Bad, because no schema validation

### CBOR

* Good, because binary-efficient
* Good, because IETF standard
* Good, because schema-less
* Bad, because fewer widely used libraries
* Bad, because smaller community
* Bad, because similar to msgpack with less adoption

### Avro

* Good, because schema evolution
* Good, because compact
* Good, because schema registry integration
* Bad, because requires schema
* Bad, because more complex setup
* Bad, because Java-centric ecosystem

## Implementation Notes

```python
# Python usage
import msgpack

audio_bytes = b"\x00\x01\x02\x03"  # placeholder for real audio data

# Serialize
data = {
    "user_id": "user-123",
    "audio": audio_bytes,  # Raw bytes, no base64
    "premium": True,
}
payload = msgpack.packb(data)

# Deserialize (raw=False decodes msgpack strings to str; bin values stay bytes)
data = msgpack.unpackb(payload, raw=False)
```

```go
// Go usage
package messages

import "github.com/vmihailenco/msgpack/v5"

type Message struct {
	UserID string `msgpack:"user_id"`
	Audio  []byte `msgpack:"audio"` // raw bytes, no base64
}

// Decode unpacks a NATS payload into a Message.
func Decode(payload []byte) (m Message, err error) {
	err = msgpack.Unmarshal(payload, &m)
	return m, err
}
```

## Links

* [MessagePack Specification](https://msgpack.org)
* [msgpack-python](https://github.com/msgpack/msgpack-python)
* Related: [ADR-0003](0003-use-nats-for-messaging.md) - Message bus choice
* See: [BINARY_MESSAGES_AND_JETSTREAM.md](../specs/BINARY_MESSAGES_AND_JETSTREAM.md)
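
## Appendix: Payload Size Sketch

As a rough check on the base64 overhead cited for the JSON option, the sketch below compares the size of an audio chunk embedded in JSON (base64) with the size of the same bytes as a MessagePack `bin` value. It is a minimal illustration, not a benchmark: the helper names `base64_len` and `msgpack_bin_len` and the 48,000-byte chunk size are assumptions made for this example, and the MessagePack size is computed from the bin 8/16/32 header sizes in the specification rather than by calling a msgpack library.

```python
import base64
import json
import os


def base64_len(n: int) -> int:
    """Length of standard padded base64 text for n raw bytes."""
    return 4 * ((n + 2) // 3)


def msgpack_bin_len(n: int) -> int:
    """Encoded size of n raw bytes as a MessagePack bin value.

    Header sizes follow the MessagePack spec:
    bin 8 = 2 bytes, bin 16 = 3 bytes, bin 32 = 5 bytes.
    """
    if n < 2**8:
        return 2 + n
    if n < 2**16:
        return 3 + n
    if n < 2**32:
        return 5 + n
    raise ValueError("payload too large for a single bin value")


# Hypothetical audio chunk; random bytes stand in for real audio.
audio_bytes = os.urandom(48_000)

# JSON cannot carry raw bytes, so the audio must be base64-encoded first.
encoded = base64.b64encode(audio_bytes).decode("ascii")
json_payload = json.dumps({"audio": encoded}).encode("utf-8")

raw_size = len(audio_bytes)
b64_overhead = len(encoded) / raw_size - 1
bin_overhead = msgpack_bin_len(raw_size) / raw_size - 1

print(f"raw audio:          {raw_size} bytes")
print(f"base64 text:        {len(encoded)} bytes ({b64_overhead:.1%} overhead)")
print(f"JSON message:       {len(json_payload)} bytes")
print(f"msgpack bin value:  {msgpack_bin_len(raw_size)} bytes ({bin_overhead:.3%} overhead)")
```

For a 48,000-byte chunk, base64 produces 64,000 characters (about 33% more than the raw audio), while the `bin` encoding adds only a 3-byte header. The full message also includes map and key framing in both formats, which this sketch does not model on the MessagePack side.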