From 730ab32b5d25619d7132f14235d86eec899d0950 Mon Sep 17 00:00:00 2001
From: "Billy D."
Date: Mon, 2 Feb 2026 07:26:43 -0500
Subject: [PATCH] docs: add ADR-0014 for Docker build best practices

Documents standardized Docker patterns:
- BuildKit syntax 1.7 with cache mounts
- uv for Python package installation (10-100x faster)
- OCI Image Spec labels
- HEALTHCHECK directives
- Non-root execution
- Version pinning with ranges

Complements ADR-0012 (uv) and ADR-0013 (CI/CD)
---
 decisions/0014-docker-build-best-practices.md | 162 ++++++++++++++++++
 1 file changed, 162 insertions(+)
 create mode 100644 decisions/0014-docker-build-best-practices.md

diff --git a/decisions/0014-docker-build-best-practices.md b/decisions/0014-docker-build-best-practices.md
new file mode 100644
index 0000000..cbd0dcf
--- /dev/null
+++ b/decisions/0014-docker-build-best-practices.md
@@ -0,0 +1,162 @@
+# ADR-0014: Docker Build Best Practices
+
+## Status
+
+Accepted
+
+## Date
+
+2026-02-02
+
+## Context
+
+Our ML/AI platform relies heavily on containerized services, particularly GPU workers
+for KubeRay that include large dependencies (PyTorch, vLLM, ROCm, CUDA). These images
+can take 30+ minutes to build and exceed 10 GB in size. We need standardized practices
+to ensure:
+
+1. **Fast rebuilds** - Avoid re-downloading dependencies on every build
+2. **Reproducibility** - Consistent builds across different machines
+3. **Security** - Non-root execution, minimal attack surface
+4. **Observability** - Proper metadata for image management
+5. **Consistency** - Same patterns across all Dockerfiles
+
+## Decision
+
+We adopt the following Docker build best practices across all repositories:
+
+### 1. BuildKit Syntax and Features
+
+```dockerfile
+# syntax=docker/dockerfile:1.7
+```
+
+All Dockerfiles use BuildKit syntax 1.7+ for cache mount support.
+
+### 2. Use uv for Python Package Installation
+
+Replace pip with uv for dramatically faster installs (10-100x):
+
+```dockerfile
+# Install uv
+COPY --from=ghcr.io/astral-sh/uv:latest /uv /usr/local/bin/uv
+
+# Install packages with cache mount
+RUN --mount=type=cache,target=/root/.cache/uv \
+    uv pip install --system \
+    'package>=1.0,<2.0'
+```
+
+Benefits:
+- Parallel downloads and installs
+- Better dependency resolution
+- Consistent with ADR-0012 (uv for Python development)
+
+### 3. Cache Mounts for Package Managers
+
+```dockerfile
+# APT cache mount
+RUN --mount=type=cache,target=/var/cache/apt,sharing=locked \
+    --mount=type=cache,target=/var/lib/apt,sharing=locked \
+    apt-get update && apt-get install -y --no-install-recommends \
+    package1 package2
+
+# uv/pip cache mount
+RUN --mount=type=cache,target=/home/ray/.cache/uv,uid=1000,gid=1000 \
+    uv pip install --system 'package>=1.0'
+```
+
+### 4. OCI Image Specification Labels
+
+All images include standard metadata:
+
+```dockerfile
+LABEL org.opencontainers.image.title="Service Name"
+LABEL org.opencontainers.image.description="Service description"
+LABEL org.opencontainers.image.vendor="DaviesTechLabs"
+LABEL org.opencontainers.image.source="https://git.daviestechlabs.io/daviestechlabs/repo"
+LABEL org.opencontainers.image.licenses="MIT"
+```
+
+### 5. Health Checks
+
+All service images include HEALTHCHECK:
+
+```dockerfile
+HEALTHCHECK --interval=30s --timeout=10s --start-period=60s --retries=3 \
+    CMD curl -f http://localhost:8000/health || exit 1
+```
+
+### 6. Non-Root Execution
+
+Services run as unprivileged users (for example `ray`, `appuser`, or `1000:1000`):
+
+```dockerfile
+USER ray
+```
+
+### 7. Version Pinning with Ranges
+
+Dependencies use minimum version with upper bound:
+
+```dockerfile
+RUN uv pip install --system \
+    'transformers>=4.35.0,<5.0' \
+    'torch>=2.0.0,<3.0'
+```
+
+### 8. Layer Optimization
+
+- Combine related commands into single RUN layers
+- Order from least to most frequently changing
+- Use multi-stage builds to reduce final image size
+
+### 9. .dockerignore
+
+All repos include a `.dockerignore`:
+
+```
+.git
+.gitea
+*.md
+__pycache__/
+*.pyc
+.venv/
+.mypy_cache/
+.pytest_cache/
+.ruff_cache/
+```
+
+### 10. Makefile Integration
+
+Standard targets for building and linting:
+
+```makefile
+lint:
+	hadolint Dockerfile
+
+build:
+	docker buildx build --platform linux/amd64 --load -t image:tag .
+```
+
+## Consequences
+
+### Positive
+
+- **10-100x faster pip operations** with uv cache mounts
+- **Consistent builds** via bounded version ranges
+- **Better observability** through OCI labels
+- **Improved security** with non-root execution
+- **Faster CI/CD** through BuildKit caching
+
+### Negative
+
+- **Requires Docker BuildKit** - Must use `DOCKER_BUILDKIT=1` or buildx
+- **Cache invalidation complexity** - Cache mounts persist across builds
+- **Learning curve** - Developers must understand BuildKit syntax
+
+## Related ADRs
+
+- [ADR-0011](0011-kuberay-unified-gpu-backend.md) - KubeRay GPU backend
+- [ADR-0012](0012-use-uv-for-python-development.md) - uv for Python development
+- [ADR-0013](0013-gitea-actions-for-ci.md) - Gitea Actions CI/CD
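
The practices in sections 1 through 7 of the ADR can be combined in one Dockerfile. The following is a minimal sketch, not taken from any repository covered by this ADR. The base image `python:3.12-slim`, the `appuser` account, the `fastapi` and `uvicorn` packages, and the `app.py` entry point are all hypothetical and chosen only for illustration. Two details are assumptions about typical Debian-based images rather than statements from the ADR: the `docker-clean` apt hook is removed so the APT cache mount can retain downloaded packages, and `UV_LINK_MODE=copy` is set because the uv cache sits on a separate mount.

```dockerfile
# syntax=docker/dockerfile:1.7

# Hypothetical base image; substitute the project's real base.
FROM python:3.12-slim

# OCI metadata (placeholder values).
LABEL org.opencontainers.image.title="example-service" \
      org.opencontainers.image.vendor="DaviesTechLabs" \
      org.opencontainers.image.licenses="MIT"

# Debian-based images delete downloaded .deb files after each install.
# Removing that hook lets the cache mounts below keep them between builds.
RUN rm -f /etc/apt/apt.conf.d/docker-clean; \
    echo 'Binary::apt::APT::Keep-Downloaded-Packages "true";' \
    > /etc/apt/apt.conf.d/keep-cache

# curl is needed by the HEALTHCHECK further down.
RUN --mount=type=cache,target=/var/cache/apt,sharing=locked \
    --mount=type=cache,target=/var/lib/apt,sharing=locked \
    apt-get update && apt-get install -y --no-install-recommends curl

COPY --from=ghcr.io/astral-sh/uv:latest /uv /usr/local/bin/uv

# This step runs as root, so the uv cache lives under /root.
# Note the absence of --no-cache, which would bypass the mounted cache.
RUN --mount=type=cache,target=/root/.cache/uv \
    UV_LINK_MODE=copy uv pip install --system \
    'fastapi>=0.110,<1.0' \
    'uvicorn>=0.29,<1.0'

# Create the unprivileged user and switch to it after all root-only steps.
RUN useradd --create-home --uid 1000 appuser
USER appuser
WORKDIR /home/appuser

# Application code changes most often, so it is copied last.
# app.py is a hypothetical module exposing an ASGI object named "app".
COPY --chown=appuser:appuser app.py .

HEALTHCHECK --interval=30s --timeout=10s --start-period=60s --retries=3 \
    CMD curl -f http://localhost:8000/health || exit 1

CMD ["uvicorn", "app:app", "--host", "0.0.0.0", "--port", "8000"]
```

Comments sit on their own lines because Dockerfile instructions do not accept trailing `#` comments. The image could be built with the ADR's `make build` target, assuming an `app.py` exists in the build context and serves a `/health` route on port 8000.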