- AGENT-ONBOARDING: New repo map with daviestechlabs Gitea repos - TECH-STACK: Reference handler-base instead of llm-workflows - CODING-CONVENTIONS: Update project structure for new repos - ADR 0006: Update GitRepository examples for Gitea repos llm-workflows has been split into: - handler-base, chat-handler, voice-assistant - kuberay-images, argo, kubeflow, mlflow, gradio-ui
8.9 KiB
8.9 KiB
📐 Coding Conventions
Patterns, practices, and folder structure conventions for DaviesTechLabs repositories
Repository Conventions
homelab-k8s2 (Infrastructure)
kubernetes/
├── apps/ # Application deployments
│ └── {namespace}/ # One folder per namespace
│ └── {app}/ # One folder per application
│ ├── app/ # Kubernetes manifests
│ │ ├── kustomization.yaml
│ │ ├── helmrelease.yaml # OR individual manifests
│ │ └── ...
│ └── ks.yaml # Flux Kustomization
├── components/ # Reusable Kustomize components
└── flux/ # Flux system configuration
Naming Conventions:
- Namespaces: lowercase with hyphens (
ai-ml,cert-manager) - Apps: lowercase with hyphens (
chat-handler,voice-assistant) - Secrets:
{app}-{type}(e.g.,milvus-credentials)
AI/ML Repos (git.daviestechlabs.io/daviestechlabs)
handler-base/ # Shared library for all handlers
├── handler_base/
│ ├── handler.py # Base Handler class
│ ├── nats_client.py # NATS wrapper
│ ├── config.py # Pydantic Settings
│ ├── health.py # K8s probes
│ ├── telemetry.py # OpenTelemetry
│ └── clients/ # Service clients
└── pyproject.toml
chat-handler/ # Text chat service
voice-assistant/ # Voice pipeline service
├── {name}.py # Standalone version
├── {name}_v2.py # Handler-base version (preferred)
└── Dockerfile.v2
argo/ # Argo WorkflowTemplates
├── {workflow-name}.yaml
kubeflow/ # Kubeflow Pipelines
├── {pipeline}_pipeline.py
kuberay-images/ # GPU worker images
├── dockerfiles/
└── ray-serve/
Python Conventions
Project Structure
# Use async/await for I/O
async def handle_message(msg: Msg) -> None:
...
# Use dataclasses for structured data
@dataclass
class ChatRequest:
user_id: str
message: str
enable_rag: bool = True
# Use msgpack for NATS messages
import msgpack
data = msgpack.packb({"key": "value"})
Naming
| Element | Convention | Example |
|---|---|---|
| Files | snake_case | chat_handler.py |
| Classes | PascalCase | ChatHandler |
| Functions | snake_case | process_message |
| Constants | UPPER_SNAKE | NATS_URL |
| Private | Leading underscore | _internal_method |
Type Hints
# Always use type hints
from typing import Optional, List, Dict, Any
async def query_rag(
query: str,
collection: str = "knowledge_base",
top_k: int = 5,
) -> List[Dict[str, Any]]:
...
Error Handling
# Use specific exceptions
class RAGQueryError(Exception):
"""Raised when RAG query fails."""
pass
# Log errors with context
import logging
logger = logging.getLogger(__name__)
try:
result = await milvus.search(...)
except Exception as e:
logger.error(f"RAG query failed: {e}", extra={"query": query})
raise RAGQueryError(f"Failed to query collection {collection}") from e
NATS Message Handling
import nats
import msgpack
async def message_handler(msg: Msg) -> None:
try:
# Decode MessagePack
data = msgpack.unpackb(msg.data, raw=False)
# Process
result = await process(data)
# Reply if request-reply pattern
if msg.reply:
await msg.respond(msgpack.packb(result))
# Acknowledge for JetStream
await msg.ack()
except Exception as e:
logger.error(f"Handler error: {e}")
# NAK for retry (JetStream)
await msg.nak()
Kubernetes Manifest Conventions
Labels
metadata:
labels:
# Required
app.kubernetes.io/name: chat-handler
app.kubernetes.io/instance: chat-handler
app.kubernetes.io/component: handler
app.kubernetes.io/part-of: ai-platform
# Optional
app.kubernetes.io/version: "1.0.0"
app.kubernetes.io/managed-by: flux
Annotations
metadata:
annotations:
# Reloader for config changes
reloader.stakater.com/auto: "true"
# Documentation
description: "Handles chat messages via NATS"
Resource Requests
resources:
requests:
cpu: 100m
memory: 256Mi
limits:
cpu: 500m
memory: 512Mi
# GPU workloads
resources:
limits:
amd.com/gpu: 1 # AMD
nvidia.com/gpu: 1 # NVIDIA
Health Checks
livenessProbe:
httpGet:
path: /health
port: 8080
initialDelaySeconds: 10
periodSeconds: 30
readinessProbe:
httpGet:
path: /ready
port: 8080
initialDelaySeconds: 5
periodSeconds: 10
Flux/GitOps Conventions
Kustomization Structure
# ks.yaml - Flux Kustomization
apiVersion: kustomize.toolkit.fluxcd.io/v1
kind: Kustomization
metadata:
name: &app chat-handler
namespace: flux-system
spec:
targetNamespace: ai-ml
commonMetadata:
labels:
app.kubernetes.io/name: *app
path: ./kubernetes/apps/ai-ml/chat-handler/app
prune: true
sourceRef:
kind: GitRepository
name: flux-system
wait: true
interval: 30m
retryInterval: 1m
timeout: 5m
HelmRelease Structure
apiVersion: helm.toolkit.fluxcd.io/v2
kind: HelmRelease
metadata:
name: milvus
spec:
interval: 30m
chart:
spec:
chart: milvus
version: 4.x.x
sourceRef:
kind: HelmRepository
name: milvus
namespace: flux-system
values:
# Values here
Secret References
# Never hardcode secrets
env:
- name: DATABASE_PASSWORD
valueFrom:
secretKeyRef:
name: postgres-credentials
key: password
NATS Subject Conventions
Hierarchy
ai.{domain}.{scope}.{action}
Examples:
ai.chat.user.{userId}.message # User chat message
ai.chat.response.{requestId} # Chat response
ai.voice.user.{userId}.request # Voice request
ai.pipeline.trigger # Pipeline trigger
Wildcards
ai.chat.> # All chat events
ai.chat.user.*.message # All user messages
ai.*.response.{id} # Any response type
Git Conventions
Commit Messages
type(scope): subject
body (optional)
footer (optional)
Types:
feat: New featurefix: Bug fixdocs: Documentationstyle: Formattingrefactor: Code restructuringtest: Testschore: Maintenance
Examples:
feat(chat-handler): add streaming response support
fix(voice): handle empty audio gracefully
docs(adr): add decision for MessagePack format
Branch Naming
feature/short-description
fix/issue-number-description
docs/what-changed
Configuration Conventions
Environment Variables
# Use pydantic-settings or similar
from pydantic_settings import BaseSettings
class Settings(BaseSettings):
nats_url: str = "nats://localhost:4222"
vllm_url: str = "http://localhost:8000"
milvus_host: str = "localhost"
milvus_port: int = 19530
log_level: str = "INFO"
class Config:
env_prefix = "" # No prefix
ConfigMaps
apiVersion: v1
kind: ConfigMap
metadata:
name: ai-services-config
data:
NATS_URL: "nats://nats.ai-ml.svc.cluster.local:4222"
VLLM_URL: "http://llm-draft.ai-ml.svc.cluster.local:8000/v1"
# ... other non-sensitive config
Documentation Conventions
ADR Format
See decisions/0000-template.md
Code Comments
# Use docstrings for public functions
async def query_rag(query: str) -> List[Dict]:
"""
Query the RAG system for relevant documents.
Args:
query: The search query string
Returns:
List of document chunks with scores
Raises:
RAGQueryError: If the query fails
"""
...
README Files
Each application should have a README with:
- Purpose
- Configuration
- Deployment
- Local development
- API documentation (if applicable)
Anti-Patterns to Avoid
| Don't | Do Instead |
|---|---|
kubectl apply directly |
Commit to Git, let Flux deploy |
| Hardcode secrets | Use External Secrets Operator |
Use latest image tags |
Pin to specific versions |
| Skip health checks | Always define liveness/readiness |
| Ignore resource limits | Set appropriate requests/limits |
| Use JSON for NATS messages | Use MessagePack (binary) |
| Synchronous I/O in handlers | Use async/await |
Related Documents
- TECH-STACK.md - Technologies used
- ARCHITECTURE.md - System design
- decisions/ - Why we made certain choices