📐 Coding Conventions
Patterns, practices, and folder structure conventions for DaviesTechLabs repositories
Repository Conventions
homelab-k8s2 (Infrastructure)
kubernetes/
├── apps/                              # Application deployments
│   └── {namespace}/                   # One folder per namespace
│       └── {app}/                     # One folder per application
│           ├── app/                   # Kubernetes manifests
│           │   ├── kustomization.yaml
│           │   ├── helmrelease.yaml   # OR individual manifests
│           │   └── ...
│           └── ks.yaml                # Flux Kustomization
├── components/                        # Reusable Kustomize components
└── flux/                              # Flux system configuration
Naming Conventions:
- Namespaces: lowercase with hyphens (ai-ml, cert-manager)
- Apps: lowercase with hyphens (chat-handler, voice-assistant)
- Secrets: {app}-{type} (e.g., milvus-credentials)
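The naming rules above can be checked mechanically. The following is a minimal sketch, not tooling that exists in these repositories: the helper names `is_valid_name` and `secret_name` are hypothetical, and the 63-character cap is an assumption borrowed from the Kubernetes limit on DNS-label names such as namespaces.

```python
import re

# Lowercase alphanumerics separated by single hyphens, e.g. "ai-ml".
NAME_RE = re.compile(r"[a-z0-9]+(?:-[a-z0-9]+)*")

# Assumption: reuse the Kubernetes DNS-label length limit for all names.
MAX_NAME_LENGTH = 63


def is_valid_name(name: str) -> bool:
    """Return True if name is lowercase-with-hyphens and not too long."""
    return len(name) <= MAX_NAME_LENGTH and NAME_RE.fullmatch(name) is not None


def secret_name(app: str, kind: str) -> str:
    """Build a secret name following the {app}-{type} convention."""
    name = f"{app}-{kind}"
    if not is_valid_name(name):
        raise ValueError(f"invalid secret name: {name!r}")
    return name
```

For example, `secret_name("milvus", "credentials")` yields the `milvus-credentials` name used above, while `is_valid_name("AI_ML")` is rejected.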
AI/ML Repos (git.daviestechlabs.io/daviestechlabs)
handler-base/ # Shared Go module for all NATS handlers
├── clients/ # HTTP clients (LLM, STT, TTS, embeddings, reranker)
├── config/ # Env-based configuration (struct tags)
├── gen/messagespb/ # Generated protobuf stubs
├── handler/ # Typed NATS message handler with OTel + health wiring
├── health/ # HTTP health + readiness server
├── messages/ # Type aliases from generated protobuf stubs
├── natsutil/ # NATS publish/request with protobuf encoding
├── proto/messages/v1/ # .proto schema source
├── go.mod
└── buf.yaml # buf protobuf toolchain config
chat-handler/ # Text chat service (Go)
voice-assistant/ # Voice pipeline service (Go)
pipeline-bridge/ # Workflow engine bridge (Go)
stt-module/ # Speech-to-text bridge (Go)
tts-module/ # Text-to-speech bridge (Go)
├── main.go # Service entry point
├── main_test.go # Unit tests
├── e2e_test.go # End-to-end tests
├── go.mod # Go module (depends on handler-base)
├── Dockerfile # Distroless container (~20 MB)
└── renovate.json # Dependency update config
argo/ # Argo WorkflowTemplates
├── {workflow-name}.yaml
kubeflow/ # Kubeflow Pipelines
├── {pipeline}_pipeline.py
kuberay-images/ # GPU worker images
├── dockerfiles/
└── ray-serve/
Python Conventions
Package Management (ADR-0012)
Use uv for local development and pip in Docker for reproducibility:
# Install uv (one-time)
curl -LsSf https://astral.sh/uv/install.sh | sh
# Create virtual environment and install
uv venv
source .venv/bin/activate
uv pip install -e ".[dev]"
# Or use uv sync with lock file
uv sync
# Update lock file after changing pyproject.toml
uv lock
# Run tests
uv run pytest
Code Formatting & Linting (Ruff)
All Python code must pass ruff check and ruff format before merge. Ruff is configured in each repo's pyproject.toml:
[tool.ruff]
line-length = 100
target-version = "py311"
[tool.ruff.lint]
select = ["E", "F", "W", "I", "UP", "B", "C4", "SIM"]
ignore = ["E501"] # Line length handled by formatter
[tool.ruff.format]
quote-style = "double"
Required dev dependencies:
[project.optional-dependencies]
dev = [
    "pytest>=8.0.0",
    "pytest-asyncio>=0.23.0",
    "pytest-cov>=4.0.0", # For coverage in handler-base
    "ruff>=0.1.0",
]
Local workflow:
# Check and auto-fix
uv run ruff check --fix .
# Format code
uv run ruff format .
# Verify before commit
uv run ruff check . && uv run ruff format --check .
CI enforcement: All repos run ruff in the lint job. Commits that fail linting will not pass CI.
Kubeflow pipeline variables: For Kubeflow DSL pipelines, terminal task assignments that appear unused should have # noqa: F841 comments, as these define the DAG structure:
# Step 6: Final step (defines DAG dependency)
tts_task = synthesize_speech(text=llm_task.output) # noqa: F841
Project Structure
// Go handler services use the handler-base shared module
import (
	"git.daviestechlabs.io/daviestechlabs/handler-base/clients"
	"git.daviestechlabs.io/daviestechlabs/handler-base/config"
	"git.daviestechlabs.io/daviestechlabs/handler-base/handler"
	"git.daviestechlabs.io/daviestechlabs/handler-base/health"
	"git.daviestechlabs.io/daviestechlabs/handler-base/messages"
	"git.daviestechlabs.io/daviestechlabs/handler-base/natsutil"
)

# Python remains for Ray Serve, Kubeflow pipelines, Gradio UIs
from dataclasses import dataclass

from nats.aio.msg import Msg

# Use async/await for I/O
async def handle_message(msg: Msg) -> None:
    ...

# Use dataclasses for structured data
@dataclass
class ChatRequest:
    user_id: str
    message: str
    enable_rag: bool = True
Naming
| Element | Convention | Example |
|---|---|---|
| Files | snake_case | chat_handler.py |
| Classes | PascalCase | ChatHandler |
| Functions | snake_case | process_message |
| Constants | UPPER_SNAKE | NATS_URL |
| Private | Leading underscore | _internal_method |
Type Hints
# Always use type hints. Prefer built-in generics (list, dict): with
# target-version py311, ruff's UP rules flag typing.List and typing.Dict.
from typing import Any

async def query_rag(
    query: str,
    collection: str = "knowledge_base",
    top_k: int = 5,
) -> list[dict[str, Any]]:
    ...
Error Handling
import logging

logger = logging.getLogger(__name__)

# Use specific exceptions
class RAGQueryError(Exception):
    """Raised when RAG query fails."""

# Log errors with context, then re-raise with the original cause attached
try:
    result = await milvus.search(...)
except Exception as e:
    logger.error(f"RAG query failed: {e}", extra={"query": query})
    raise RAGQueryError(f"Failed to query collection {collection}") from e
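The snippet above depends on a live Milvus client and an enclosing coroutine. As a self-contained illustration of the same log-then-wrap pattern, the sketch below swaps in a hypothetical `FailingSearchClient` (not a real Milvus API) so that the exception chaining can be observed directly.

```python
import asyncio
import logging

logger = logging.getLogger(__name__)


class RAGQueryError(Exception):
    """Raised when RAG query fails."""


class FailingSearchClient:
    """Hypothetical stand-in for a vector-store client; every search fails."""

    async def search(self, collection: str, query: str) -> list[dict]:
        raise ConnectionError("vector store unreachable")


async def query_rag(client, query: str, collection: str = "knowledge_base") -> list[dict]:
    """Run a search, logging and wrapping any failure in RAGQueryError."""
    try:
        return await client.search(collection, query)
    except Exception as e:
        # Structured context travels in `extra`; the cause is kept via `from e`.
        logger.error("RAG query failed: %s", e, extra={"query": query})
        raise RAGQueryError(f"Failed to query collection {collection}") from e


def demo() -> RAGQueryError:
    """Trigger the failure path and return the wrapped exception."""
    try:
        asyncio.run(query_rag(FailingSearchClient(), "what is NATS?"))
    except RAGQueryError as err:
        return err
    raise AssertionError("expected RAGQueryError")
```

Callers catch only `RAGQueryError`, while the original `ConnectionError` stays reachable through `__cause__` for debugging.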
NATS Message Handling
All NATS handler services use Go with Protocol Buffers encoding (see ADR-0061):
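// Go NATS handler (production pattern)
func (h *Handler) handleMessage(msg *nats.Msg) {
	// Assumption: no caller context is available here, so start from Background.
	ctx := context.Background()

	var req messages.ChatRequest
	if err := proto.Unmarshal(msg.Data, &req); err != nil {
		h.logger.Error("failed to unmarshal", "error", err)
		return
	}

	// Process
	result, err := h.process(ctx, &req)
	if err != nil {
		h.logger.Error("handler error", "error", err)
		msg.Nak()
		return
	}

	// Reply if request-reply pattern
	if msg.Reply != "" {
		data, err := proto.Marshal(result)
		if err != nil {
			h.logger.Error("failed to marshal reply", "error", err)
			msg.Nak()
			return
		}
		if err := msg.Respond(data); err != nil {
			h.logger.Error("failed to respond", "error", err)
		}
	}
	msg.Ack()
}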
Python NATS is still used in Ray Serve runtime_env and Kubeflow pipeline components where needed, but all dedicated NATS handler services are Go.
Kubernetes Manifest Conventions
Labels
metadata:
  labels:
    # Required
    app.kubernetes.io/name: chat-handler
    app.kubernetes.io/instance: chat-handler
    app.kubernetes.io/component: handler
    app.kubernetes.io/part-of: ai-platform
    # Optional
    app.kubernetes.io/version: "1.0.0"
    app.kubernetes.io/managed-by: flux
Annotations
metadata:
  annotations:
    # Reloader for config changes
    reloader.stakater.com/auto: "true"
    # Documentation
    description: "Handles chat messages via NATS"
Resource Requests
resources:
  requests:
    cpu: 100m
    memory: 256Mi
  limits:
    cpu: 500m
    memory: 512Mi

# GPU workloads (set only the resource that matches the node's GPU vendor)
resources:
  limits:
    amd.com/gpu: 1 # AMD
    nvidia.com/gpu: 1 # NVIDIA
Health Checks
livenessProbe:
  httpGet:
    path: /health
    port: 8080
  initialDelaySeconds: 10
  periodSeconds: 30
readinessProbe:
  httpGet:
    path: /ready
    port: 8080
  initialDelaySeconds: 5
  periodSeconds: 10
Flux/GitOps Conventions
Kustomization Structure
# ks.yaml - Flux Kustomization
apiVersion: kustomize.toolkit.fluxcd.io/v1
kind: Kustomization
metadata:
  name: &app chat-handler
  namespace: flux-system
spec:
  targetNamespace: ai-ml
  commonMetadata:
    labels:
      app.kubernetes.io/name: *app
  path: ./kubernetes/apps/ai-ml/chat-handler/app
  prune: true
  sourceRef:
    kind: GitRepository
    name: flux-system
  wait: true
  interval: 30m
  retryInterval: 1m
  timeout: 5m
HelmRelease Structure
apiVersion: helm.toolkit.fluxcd.io/v2
kind: HelmRelease
metadata:
  name: milvus
spec:
  interval: 30m
  chart:
    spec:
      chart: milvus
      version: 4.x.x
      sourceRef:
        kind: HelmRepository
        name: milvus
        namespace: flux-system
  values:
    # Values here
Secret References
# Never hardcode secrets
env:
  - name: DATABASE_PASSWORD
    valueFrom:
      secretKeyRef:
        name: postgres-credentials
        key: password
NATS Subject Conventions
Hierarchy
ai.{domain}.{scope}.{action}
Examples:
ai.chat.user.{userId}.message # User chat message
ai.chat.response.{requestId} # Chat response
ai.voice.user.{userId}.request # Voice request
ai.pipeline.trigger # Pipeline trigger
Wildcards
ai.chat.> # All chat events
ai.chat.user.*.message # All user messages
ai.*.response.{id} # Any response type
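To make the wildcard semantics concrete: in NATS, `*` matches exactly one token and `>` matches one or more trailing tokens. The sketch below reimplements that matching in plain Python for illustration only. The NATS server does this itself, and both function names are hypothetical.

```python
def subject_matches(pattern: str, subject: str) -> bool:
    """Return True if a NATS-style subscription pattern matches a subject."""
    p_tokens = pattern.split(".")
    s_tokens = subject.split(".")
    for i, token in enumerate(p_tokens):
        if token == ">":
            # '>' is only valid as the last token and needs at least one
            # remaining subject token to match.
            return i == len(p_tokens) - 1 and len(s_tokens) > i
        if i >= len(s_tokens):
            return False
        if token != "*" and token != s_tokens[i]:
            return False
    return len(p_tokens) == len(s_tokens)


def user_message_subject(user_id: str) -> str:
    """Build ai.chat.user.{userId}.message, rejecting unsafe user IDs."""
    if not user_id or any(ch in user_id for ch in ".*> \t\r\n"):
        raise ValueError(f"user_id is not a valid subject token: {user_id!r}")
    return f"ai.chat.user.{user_id}.message"
```

With these definitions, `ai.chat.>` matches `ai.chat.user.42.message` but not the bare `ai.chat`, and `ai.chat.user.*.message` matches any single user ID. Rejecting dots and wildcard characters in the user ID keeps a caller-supplied value from changing the subject's depth.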
Git Conventions
Commit Messages
type(scope): subject
body (optional)
footer (optional)
Types:
- feat: New feature
- fix: Bug fix
- docs: Documentation
- style: Formatting
- refactor: Code restructuring
- test: Tests
- chore: Maintenance
Examples:
feat(chat-handler): add streaming response support
fix(voice): handle empty audio gracefully
docs(adr): add decision for MessagePack format
Branch Naming
feature/short-description
fix/issue-number-description
docs/what-changed
Configuration Conventions
Environment Variables
# Use pydantic-settings or similar
from pydantic_settings import BaseSettings, SettingsConfigDict

class Settings(BaseSettings):
    # pydantic-settings targets Pydantic v2, where model_config replaces
    # the deprecated inner `class Config`.
    model_config = SettingsConfigDict(env_prefix="")  # No prefix

    nats_url: str = "nats://localhost:4222"
    vllm_url: str = "http://localhost:8000"
    milvus_host: str = "localhost"
    milvus_port: int = 19530
    log_level: str = "INFO"
ConfigMaps
apiVersion: v1
kind: ConfigMap
metadata:
  name: ai-services-config
data:
  NATS_URL: "nats://nats.ai-ml.svc.cluster.local:4222"
  VLLM_URL: "http://llm-draft.ai-ml.svc.cluster.local:8000/v1"
  # ... other non-sensitive config
Documentation Conventions
ADR Format
See decisions/0000-template.md
Code Comments
# Use docstrings for public functions
async def query_rag(query: str) -> list[dict]:
    """
    Query the RAG system for relevant documents.

    Args:
        query: The search query string

    Returns:
        List of document chunks with scores

    Raises:
        RAGQueryError: If the query fails
    """
    ...
README Files
Each application should have a README with:
- Purpose
- Configuration
- Deployment
- Local development
- API documentation (if applicable)
Anti-Patterns to Avoid
| Don't | Do Instead |
|---|---|
| kubectl apply directly | Commit to Git, let Flux deploy |
| Hardcode secrets | Use External Secrets Operator |
| Use latest image tags | Pin to specific versions |
| Skip health checks | Always define liveness/readiness |
| Ignore resource limits | Set appropriate requests/limits |
| Use JSON for NATS messages | Use Protocol Buffers (see ADR-0061) |
| Write handler services in Python | Use Go with handler-base module (ADR-0061) |
| Synchronous I/O in handlers | Use goroutines / async patterns |
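Some of these rules can be enforced by a small lint step. As one example, the sketch below flags image references that are unpinned. `is_pinned` is a hypothetical helper, the image names in the usage note are made up, and treating a missing tag as unpinned reflects the container-runtime default of resolving it to latest.

```python
def is_pinned(image: str) -> bool:
    """Return True if an image reference has a digest or a non-latest tag."""
    if "@sha256:" in image:
        return True  # digest references are always pinned
    # Only look after the last '/', so a registry port is not read as a tag.
    last_segment = image.rsplit("/", 1)[-1]
    if ":" not in last_segment:
        return False  # no tag resolves to "latest" implicitly
    tag = last_segment.rsplit(":", 1)[1]
    return tag not in ("", "latest")
```

For instance, `is_pinned("registry.example.com/chat-handler:1.0.0")` passes, while `registry.example.com:5000/chat-handler` and `chat-handler:latest` are both reported as unpinned.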
Related Documents
- TECH-STACK.md - Technologies used
- ARCHITECTURE.md - System design
- decisions/ - Why we made certain choices