docs: add ADR-0014 for Docker build best practices
Documents standardized Docker patterns:
- BuildKit syntax 1.7 with cache mounts
- uv for Python package installation (10-100x faster)
- OCI Image Spec labels
- HEALTHCHECK directives
- Non-root execution
- Version pinning with ranges

Complements ADR-0012 (uv) and ADR-0013 (CI/CD)
decisions/0014-docker-build-best-practices.md
@@ -0,0 +1,162 @@
# ADR-0014: Docker Build Best Practices

## Status

Accepted

## Date

2026-02-02

## Context

Our ML/AI platform relies heavily on containerized services, particularly GPU workers
for KubeRay that include large dependencies (PyTorch, vLLM, ROCm, CUDA). These images
can take 30+ minutes to build and exceed 10 GB in size. We need standardized practices
to ensure:

1. **Fast rebuilds** - Avoid re-downloading dependencies on every build
2. **Reproducibility** - Consistent builds across different machines
3. **Security** - Non-root execution, minimal attack surface
4. **Observability** - Proper metadata for image management
5. **Consistency** - Same patterns across all Dockerfiles

## Decision

We adopt the following Docker build best practices across all repositories:

### 1. BuildKit Syntax and Features

```dockerfile
# syntax=docker/dockerfile:1.7
```

All Dockerfiles declare BuildKit syntax 1.7+ on their first line so that cache mounts (`RUN --mount=type=cache`) are available.
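
A minimal sketch of where the directive sits, assuming a `python:3.12-slim` base image and a `requests` dependency (both illustrative choices, not ones this ADR prescribes). BuildKit only recognizes parser directives at the very top of the file, so the `# syntax` line has to come before any other comment, blank line, or instruction:

```dockerfile
# syntax=docker/dockerfile:1.7
# Ordinary comments may follow the directive, but must not precede it.

# Assumed base image, for illustration only.
FROM python:3.12-slim

# With the frontend selected above, RUN accepts --mount flags.
# pip is used here only to keep the sketch self-contained.
RUN --mount=type=cache,target=/root/.cache/pip \
    pip install 'requests>=2.31,<3.0'
```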

### 2. Use uv for Python Package Installation

Replace pip with uv for substantially faster installs (uv's published benchmarks report 10-100x over pip):

```dockerfile
# Install uv
COPY --from=ghcr.io/astral-sh/uv:latest /uv /usr/local/bin/uv

# Install packages with cache mount
# (no --no-cache here: that flag would bypass the mounted cache)
RUN --mount=type=cache,target=/root/.cache/uv \
    uv pip install --system \
        'package>=1.0,<2.0'
```

Benefits:
- Parallel downloads and installs
- Better dependency resolution
- Consistent with ADR-0012 (uv for Python development)
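
Because the `uv:latest` tag floats, two builds of the same Dockerfile can pick up different uv releases. The following is a hedged sketch of one way to pin it. The tag `0.5.11` is a placeholder (substitute whichever release the repository has validated), and the `python:3.12-slim` base and the `requirements.txt` in the build context are assumptions as well:

```dockerfile
# syntax=docker/dockerfile:1.7
FROM python:3.12-slim

# Pin the uv image to an explicit release instead of :latest.
# The tag below is a placeholder, not a version this ADR mandates.
COPY --from=ghcr.io/astral-sh/uv:0.5.11 /uv /usr/local/bin/uv

# The cache mount sits on a different filesystem than site-packages,
# so ask uv to copy files rather than hard-link them.
ENV UV_LINK_MODE=copy

# Assumed dependency manifest in the build context.
COPY requirements.txt /tmp/requirements.txt
RUN --mount=type=cache,target=/root/.cache/uv \
    uv pip install --system -r /tmp/requirements.txt
```

Setting `UV_LINK_MODE=copy` follows uv's own Docker guidance for cache mounts; without it uv falls back to copying anyway but prints a warning on each install.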

### 3. Cache Mounts for Package Managers

```dockerfile
# APT cache mount
RUN --mount=type=cache,target=/var/cache/apt,sharing=locked \
    --mount=type=cache,target=/var/lib/apt,sharing=locked \
    apt-get update && apt-get install -y --no-install-recommends \
        package1 package2

# uv/pip cache mount, owned by the non-root ray user (1000:1000)
RUN --mount=type=cache,target=/home/ray/.cache/uv,uid=1000,gid=1000 \
    uv pip install --system 'package>=1.0,<2.0'
```
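
On Debian- and Ubuntu-based images, the stock `/etc/apt/apt.conf.d/docker-clean` hook deletes downloaded packages after each install, which would leave the APT cache mount empty. The sketch below is adapted from the pattern in Docker's cache-mount documentation. It assumes a `debian:bookworm-slim` base and uses `curl` and `ca-certificates` as stand-in packages:

```dockerfile
# syntax=docker/dockerfile:1.7
FROM debian:bookworm-slim

# Disable the auto-clean hook and keep downloaded .deb files so the
# cache mounts below actually retain something between builds.
RUN rm -f /etc/apt/apt.conf.d/docker-clean && \
    echo 'Binary::apt::APT::Keep-Downloaded-Packages "true";' \
        > /etc/apt/apt.conf.d/keep-cache

# sharing=locked serializes concurrent builds that use the same cache.
RUN --mount=type=cache,target=/var/cache/apt,sharing=locked \
    --mount=type=cache,target=/var/lib/apt,sharing=locked \
    apt-get update && apt-get install -y --no-install-recommends \
        curl ca-certificates
```

Since the package lists and archives live in the mounts rather than in the layer, no trailing `rm -rf /var/lib/apt/lists/*` is needed to keep the image small.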

### 4. OCI Image Specification Labels

All images include standard metadata:

```dockerfile
LABEL org.opencontainers.image.title="Service Name"
LABEL org.opencontainers.image.description="Service description"
LABEL org.opencontainers.image.vendor="DaviesTechLabs"
LABEL org.opencontainers.image.source="https://git.daviestechlabs.io/daviestechlabs/repo"
LABEL org.opencontainers.image.licenses="MIT"
```
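
The labels above are static. The OCI annotation set also defines `version`, `revision`, and `created` keys, which change per build. One possible way to populate them is through build arguments; the argument names `VERSION`, `VCS_REF`, and `BUILD_DATE` and the base image are hypothetical choices for this sketch:

```dockerfile
# syntax=docker/dockerfile:1.7
FROM python:3.12-slim

# Hypothetical build arguments, supplied by CI via --build-arg.
# The defaults keep local builds working when nothing is passed.
ARG VERSION=dev
ARG VCS_REF=unknown
ARG BUILD_DATE=unknown

# Per-build metadata, using keys defined by the OCI image spec.
LABEL org.opencontainers.image.version="${VERSION}" \
      org.opencontainers.image.revision="${VCS_REF}" \
      org.opencontainers.image.created="${BUILD_DATE}"
```

A CI job could pass the values with, for example, `--build-arg VCS_REF="$(git rev-parse HEAD)"`. Placing these instructions near the end of the Dockerfile keeps the changing values from invalidating the cache of earlier, more expensive layers.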

### 5. Health Checks

All service images include HEALTHCHECK:

```dockerfile
HEALTHCHECK --interval=30s --timeout=10s --start-period=60s --retries=3 \
    CMD curl -f http://localhost:8000/health || exit 1
```
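
The `curl` probe only works if curl is installed in the image, which slim base images often omit. For Python services, a possible alternative is to probe with the standard library. The sketch assumes the same `/health` endpoint on port 8000 and an illustrative `python:3.12-slim` base:

```dockerfile
# syntax=docker/dockerfile:1.7
FROM python:3.12-slim

# Exec-form probe that needs nothing beyond the Python standard library.
# urlopen raises on connection errors and on 4xx/5xx responses, so the
# interpreter exits non-zero and Docker records the check as failed.
HEALTHCHECK --interval=30s --timeout=10s --start-period=60s --retries=3 \
    CMD ["python", "-c", "import urllib.request; urllib.request.urlopen('http://localhost:8000/health', timeout=5)"]
```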

### 6. Non-Root Execution

Services run as unprivileged users:

```dockerfile
# Dockerfile comments must be on their own line, not after an instruction.
# Use the image's unprivileged user: ray, appuser, or 1000:1000.
USER ray
```
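
Images such as the Ray base images already ship a `ray` user. For images that do not, the account has to be created before `USER` can reference it. A minimal sketch, assuming a Debian-based `python:3.12-slim` image and the `appuser` / `1000:1000` names mentioned above:

```dockerfile
# syntax=docker/dockerfile:1.7
FROM python:3.12-slim

# Create a dedicated unprivileged account with fixed IDs.
RUN groupadd --gid 1000 appuser && \
    useradd --uid 1000 --gid 1000 --create-home \
        --shell /usr/sbin/nologin appuser

# Copy application files so they are owned by that account.
WORKDIR /app
COPY --chown=1000:1000 . /app

# Numeric IDs let Kubernetes verify runAsNonRoot without resolving names.
USER 1000:1000
```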

### 7. Version Pinning with Ranges

Dependencies use minimum version with upper bound:

```dockerfile
RUN uv pip install --system \
    'transformers>=4.35.0,<5.0' \
    'torch>=2.0.0,<3.0'
```

### 8. Layer Optimization

- Combine related commands into single RUN layers
- Order from least to most frequently changing
- Use multi-stage builds to reduce final image size
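
The three points above can be sketched in one Dockerfile. This is an illustration under assumptions, not a template the ADR mandates: the base image, `requirements.txt`, the `src/` layout, and the `src.main` entry point are all hypothetical. It also installs into a virtual environment instead of using `--system`, so that the installed packages live in a single directory that can be copied between stages:

```dockerfile
# syntax=docker/dockerfile:1.7

# Build stage: carries uv and the download cache, neither of which ships.
FROM python:3.12-slim AS builder
COPY --from=ghcr.io/astral-sh/uv:latest /uv /usr/local/bin/uv
ENV UV_LINK_MODE=copy
RUN python -m venv /opt/venv

# Least frequently changing input first: the dependency manifest.
COPY requirements.txt /tmp/requirements.txt
RUN --mount=type=cache,target=/root/.cache/uv \
    uv pip install --python /opt/venv/bin/python -r /tmp/requirements.txt

# Runtime stage: same base, only the populated environment is copied in.
FROM python:3.12-slim
COPY --from=builder /opt/venv /opt/venv
ENV PATH="/opt/venv/bin:$PATH"

# Most frequently changing input last: application source.
WORKDIR /app
COPY src/ /app/src/
CMD ["python", "-m", "src.main"]
```

With this ordering, a source-only change reuses every layer up to the final `COPY`, and the runtime image contains neither uv nor its cache.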

### 9. .dockerignore

All repos include a `.dockerignore`. Patterns are matched relative to the build context root, so `**/` is used for entries that can appear at any depth:

```
.git
.gitea
*.md
**/__pycache__/
**/*.pyc
.venv/
.mypy_cache/
.pytest_cache/
.ruff_cache/
```

### 10. Makefile Integration

Standard targets for building and linting:

```makefile
# Neither target produces a file with its own name.
.PHONY: lint build

lint:
	hadolint Dockerfile

build:
	docker buildx build --platform linux/amd64 --load -t image:tag .
```

## Consequences

### Positive

- **Faster package installs** with uv (10-100x faster than pip per its published benchmarks) and cache mounts
- **Consistent builds** via lockfiles and version pinning
- **Better observability** through OCI labels
- **Improved security** with non-root execution
- **Faster CI/CD** through BuildKit caching

### Negative

- **Requires Docker BuildKit** - The default builder since Docker Engine 23.0; older engines must set `DOCKER_BUILDKIT=1` or use buildx
- **Cache invalidation complexity** - Cache mounts persist across builds, so stale entries must be cleared explicitly (for example with `docker builder prune`)
- **Learning curve** - Developers must understand BuildKit syntax

## Related ADRs

- [ADR-0011](0011-kuberay-unified-gpu-backend.md) - KubeRay GPU backend
- [ADR-0012](0012-use-uv-for-python-development.md) - uv for Python development
- [ADR-0013](0013-gitea-actions-for-ci.md) - Gitea Actions CI/CD