tuning up runner improvements.

2026-02-06 07:53:31 -05:00
parent 80fb911e22
commit dd277f6459
2 changed files with 81 additions and 7 deletions
--- a/decisions/0014-docker-build-best-practices.md
+++ b/decisions/0014-docker-build-best-practices.md
@@ -110,8 +110,37 @@ RUN uv pip install --system \
 - Combine related commands into single RUN layers
 - Order from least to most frequently changing
 - Use multi-stage builds to reduce final image size
 - Use `COPY --link` for multi-stage `COPY --from` layers to make them independent
  of prior layers, improving cache reuse when base images change:
-### 9. .dockerignore
+```dockerfile
 # --link makes this layer reusable even if the base image changes
 COPY --link --from=rocm-source /opt/rocm /opt/rocm
 ```
 ### 9. Registry-Based BuildKit Cache
 Use `type=registry` cache instead of `type=gha` (which only works on GitHub Actions).
 This stores build cache layers directly in the container registry with zstd compression:
 ```yaml
 - name: Build and push
   uses: docker/build-push-action@v5
   with:
     cache-from: type=registry,ref=${{ env.REGISTRY }}/image:buildcache
     cache-to: type=registry,ref=${{ env.REGISTRY }}/image:buildcache,mode=max,image-manifest=true,compression=zstd
 ```
 Benefits:
 - Works on any CI system (Gitea Actions, Jenkins, etc.)
 - `mode=max` caches all layers, not just final image layers
 - `compression=zstd` is faster than gzip with similar compression ratios
 - Cache survives runner restarts (stored in registry, not ephemeral disk)
 **Important:** `type=gha` is a no-op on self-hosted Gitea runners — it requires
 GitHub's cache API. Always use `type=registry` for self-hosted CI.
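 Exporting cache to a registry typically needs a builder that supports cache export
 and a prior registry login. A minimal sketch of the surrounding workflow steps,
 assuming `docker/setup-buildx-action` and `docker/login-action` are used and that
 secrets named `REGISTRY_USER` and `REGISTRY_TOKEN` exist (the action versions,
 secret names, and the `latest` tag are illustrative assumptions, not taken from
 the actual workflow):
 ```yaml
 # Sketch only: action versions, secret names, and the image tag are assumptions.
 - name: Set up Buildx
   uses: docker/setup-buildx-action@v3
 - name: Log in to registry
   uses: docker/login-action@v3
   with:
     registry: ${{ env.REGISTRY }}
     username: ${{ secrets.REGISTRY_USER }}   # hypothetical secret name
     password: ${{ secrets.REGISTRY_TOKEN }}  # hypothetical secret name
 - name: Build and push
   uses: docker/build-push-action@v5
   with:
     context: .
     push: true
     tags: ${{ env.REGISTRY }}/image:latest   # illustrative tag
     cache-from: type=registry,ref=${{ env.REGISTRY }}/image:buildcache
     cache-to: type=registry,ref=${{ env.REGISTRY }}/image:buildcache,mode=max,image-manifest=true,compression=zstd
 ```
 Keeping the cache under a separate `buildcache` tag, as above, means cache pushes
 never overwrite a runnable image tag.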
 ### 10. .dockerignore
 All repos include a `.dockerignore`:
@@ -127,7 +156,7 @@ __pycache__/
 .ruff_cache/
 ```
-### 10. Makefile Integration
+### 11. Makefile Integration
 Standard targets for building and linting:
--- a/decisions/0031-gitea-cicd-strategy.md
+++ b/decisions/0031-gitea-cicd-strategy.md
@@ -286,13 +286,58 @@ on:
 See [kuberay-images/.gitea/workflows/build-push.yaml](https://git.daviestechlabs.io/daviestechlabs/kuberay-images/src/branch/main/.gitea/workflows/build-push.yaml) for complete example.
 ## Build Performance Tuning
 GPU worker images are 20-30GB+ due to ROCm/CUDA/PyTorch layers. Several optimizations
 are in place to avoid multi-hour rebuild/push cycles on every change.
 ### Registry-Based BuildKit Cache
 Use `type=registry` cache (not `type=gha`, which is a no-op on Gitea runners):
 ```yaml
 cache-from: type=registry,ref=${{ env.REGISTRY }}/image:buildcache
 cache-to: type=registry,ref=${{ env.REGISTRY }}/image:buildcache,mode=max,image-manifest=true,compression=zstd
 ```
 - `mode=max` caches all intermediate layers, not just the final image
 - `compression=zstd` is faster than gzip with comparable ratios
 - Cache is stored in the Gitea container registry alongside images
 - Only changed layers are rebuilt and pushed on subsequent builds
 ### Docker Daemon Tuning
 The runner's DinD daemon.json is configured for parallel transfers:
 ```json
 {
  "max-concurrent-uploads": 10,
  "max-concurrent-downloads": 10,
  "features": {
    "containerd-snapshotter": true
  }
 }
 ```
 Docker's defaults (5 concurrent uploads, 3 concurrent downloads) are too low for images with many large layers.
 ### Persistent DinD Layer Cache
 The runner mounts a 100Gi Longhorn PVC at `/home/rootless/.local/share/docker` to
 persist Docker's layer cache across pod restarts. Without this, every runner restart
 forces re-download of 10-20GB base images (ROCm, Ray, PyTorch).
 | Volume | Storage Class | Size | Purpose |
 |--------|---------------|------|---------|
 | `gitea-runner-data` | nfs-slow | 5Gi | Runner state, workspace |
 | `gitea-runner-docker-cache` | longhorn | 100Gi | Docker layer cache |
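 The cache volume in the table could be declared roughly as follows. This is a
 sketch, not the deployed manifest: the access mode, the volume name `docker-cache`,
 and the container name `dind` are assumptions; the claim name, storage class,
 size, and mount path come from the text above.
 ```yaml
 # Sketch only: accessModes is an assumption.
 apiVersion: v1
 kind: PersistentVolumeClaim
 metadata:
   name: gitea-runner-docker-cache
 spec:
   accessModes:
     - ReadWriteOnce
   storageClassName: longhorn
   resources:
     requests:
       storage: 100Gi
 ---
 # Fragment of the runner pod spec; volume and container names are hypothetical.
 volumes:
   - name: docker-cache
     persistentVolumeClaim:
       claimName: gitea-runner-docker-cache
 containers:
   - name: dind
     volumeMounts:
       - name: docker-cache
         mountPath: /home/rootless/.local/share/docker
 ```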
 ## Future Enhancements
-1. **Caching improvements** - Persistent layer cache across builds
+1. **Multi-arch builds** - ARM64 support for Raspberry Pi
-2. **Multi-arch builds** - ARM64 support for Raspberry Pi
+2. **Security scanning** - Trivy integration in CI
-3. **Security scanning** - Trivy integration in CI
+3. **Signed images** - Cosign for image signatures
-4. **Signed images** - Cosign for image signatures
+4. **SLSA provenance** - Supply chain attestations
 5. **SLSA provenance** - Supply chain attestations
 ## References