feat: add comprehensive architecture documentation

- Add AGENT-ONBOARDING.md for AI agents
- Add ARCHITECTURE.md with full system overview
- Add TECH-STACK.md with complete technology inventory
- Add DOMAIN-MODEL.md with entities and bounded contexts
- Add CODING-CONVENTIONS.md with patterns and practices
- Add GLOSSARY.md with terminology reference
- Add C4 diagrams (Context and Container levels)
- Add 10 ADRs documenting key decisions:
  - Talos Linux, NATS, MessagePack, Multi-GPU strategy
  - GitOps with Flux, KServe, Milvus, Dual workflow engines
  - Envoy Gateway
- Add specs directory with JetStream configuration
- Add diagrams for GPU allocation and data flows

Based on analysis of homelab-k8s2 and llm-workflows repositories
and kubectl cluster-info dump data.
2026-02-01 14:30:05 -05:00
parent 4d4f6f464c
commit 832cda34bd
26 changed files with 3805 additions and 2 deletions

@@ -0,0 +1,79 @@
# Record Architecture Decisions
* Status: accepted
* Date: 2025-11-30
* Deciders: Billy Davies
* Technical Story: Initial setup of homelab documentation
## Context and Problem Statement
As the homelab infrastructure grows in complexity with AI/ML services, multi-GPU configurations, and event-driven architectures, we need a way to document and communicate significant architectural decisions. Without such a record, the rationale behind each choice gets lost, which makes future changes risky and onboarding difficult.
## Decision Drivers
* Need to preserve context for why decisions were made
* Enable future maintainers (including AI agents) to understand the system
* Provide a structured way to evaluate alternatives
* Support the wiki/design process for iterative improvements
## Considered Options
* Informal documentation in README files
* Wiki pages without structure
* Architecture Decision Records (ADRs)
* No documentation (rely on code)
## Decision Outcome
Chosen option: "Architecture Decision Records (ADRs)", because they provide a structured format that captures context, alternatives, and consequences. They're lightweight, version-controlled, and well-suited for technical decisions.
### Positive Consequences
* Clear historical record of decisions
* Structured format makes decisions searchable
* Forces consideration of alternatives
* Git-versioned alongside code
* AI agents can parse and understand decisions
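To illustrate the searchability and machine-readability points above, here is a minimal sketch of how a script or agent might read the metadata bullets at the top of an ADR in this format. The function name `parse_adr_metadata` and the inline sample are hypothetical and not part of this repository; the sketch assumes every ADR keeps the `* Key: value` header bullets used in this file.

```python
import re

# Matches header bullets such as "* Status: accepted".
META_LINE = re.compile(r"^\*\s+(?P<key>[^:]+):\s*(?P<value>.+)$")


def parse_adr_metadata(text):
    """Return (title, metadata) from the header of a MADR-style ADR.

    Hypothetical helper: reads lines until the first "## " section heading,
    treating the first "# " line as the title and "* Key: value" bullets
    as metadata. Keys are lower-cased for easier lookup.
    """
    title = None
    meta = {}
    for raw in text.splitlines():
        line = raw.strip()
        if line.startswith("## "):
            break  # metadata ends where the first section begins
        if line.startswith("# ") and title is None:
            title = line[2:].strip()
            continue
        match = META_LINE.match(line)
        if match:
            meta[match.group("key").strip().lower()] = match.group("value").strip()
    return title, meta


# Sample header copied from this ADR.
SAMPLE = """# Record Architecture Decisions
* Status: accepted
* Date: 2025-11-30
* Deciders: Billy Davies
## Context and Problem Statement
"""

title, meta = parse_adr_metadata(SAMPLE)
print(title, meta["status"], meta["date"])
```

Run over a directory of ADRs, a helper like this could list decisions by status, which would also help surface records that have gone stale.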
### Negative Consequences
* Requires discipline to create ADRs
* May accumulate outdated decisions over time
* Additional overhead for simple decisions
## Pros and Cons of the Options
### Informal README documentation
* Good, because low friction
* Good, because close to code
* Bad, because no structure for alternatives
* Bad, because decisions get buried in prose
### Wiki pages
* Good, because easy to edit
* Good, because supports rich formatting
* Bad, because separate from code repository
* Bad, because no enforced structure
### ADRs
* Good, because structured format
* Good, because version controlled
* Good, because captures alternatives considered
* Good, because industry-standard practice
* Bad, because requires creating new files
* Bad, because may seem bureaucratic for small decisions
### No documentation
* Good, because no overhead
* Bad, because context is lost
* Bad, because makes onboarding difficult
* Bad, because risky for future changes
## Links
* Based on [MADR template](https://adr.github.io/madr/)
* [ADR GitHub organization](https://adr.github.io/)