build: optimize Dockerfiles for production
Some checks failed
- Use BuildKit syntax 1.7 with cache mounts for apt/uv
- Switch from pip to uv for 10-100x faster installs (ADR-0014)
- Add OCI Image Spec labels for container metadata
- Add HEALTHCHECK directives for orchestration
- Add .dockerignore to reduce context size
- Update Makefile with buildx and lint target
- Add retry logic to ray-entrypoint.sh

Refs: ADR-0012 (uv), ADR-0014 (Docker best practices)
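The retry logic added to `ray-entrypoint.sh` is not part of the README diff shown here. As a rough illustration of the general pattern only, the sketch below retries a command with exponential backoff. The helper name `retry_with_backoff` and its arguments are hypothetical, and the real entrypoint may be structured differently.

```shell
# Hypothetical sketch; NOT the actual contents of ray-entrypoint.sh.
#
# Usage: retry_with_backoff MAX_ATTEMPTS BASE_DELAY COMMAND [ARGS...]
# Runs COMMAND until it succeeds, doubling the wait after each failure.
# Returns 0 on the first success, or 1 once MAX_ATTEMPTS have failed.
retry_with_backoff() {
  local max_attempts="$1"
  local delay="$2"
  local attempt=1
  shift 2

  while true; do
    # Success: stop retrying immediately.
    if "$@"; then
      return 0
    fi

    # Out of attempts: report and fail.
    if [ "$attempt" -ge "$max_attempts" ]; then
      echo "giving up after ${attempt} attempts" >&2
      return 1
    fi

    echo "attempt ${attempt} failed; retrying in ${delay}s" >&2
    sleep "$delay"
    delay=$((delay * 2))      # exponential backoff: 1s, 2s, 4s, ...
    attempt=$((attempt + 1))
  done
}
```

An entrypoint could then call something like `retry_with_backoff 5 1 ray health-check --address "$RAY_HEAD_ADDRESS"`, which would wait 1, 2, 4, and 8 seconds between its five attempts. `RAY_HEAD_ADDRESS` is an assumed variable name, not one confirmed by this commit.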
28
README.md
@@ -2,18 +2,29 @@
 
 GPU-specific Ray worker images for the DaviesTechLabs AI/ML platform.
 
+## Features
+
+- **BuildKit optimized**: Cache mounts for apt and pip speed up rebuilds
+- **OCI compliant**: Standard image labels (`org.opencontainers.image.*`)
+- **Health checks**: Built-in HEALTHCHECK for container orchestration
+- **Non-root execution**: Ray runs as unprivileged `ray` user
+- **Retry logic**: Entrypoint waits for Ray head with exponential backoff
+
 ## Images
 
-| Image | GPU Target | Workloads | Registry |
-|-------|------------|-----------|----------|
-| `ray-worker-nvidia` | NVIDIA CUDA (RTX 2070) | Whisper STT, XTTS TTS | `git.daviestechlabs.io/daviestechlabs/ray-worker-nvidia` |
-| `ray-worker-rdna2` | AMD ROCm (Radeon 680M) | BGE Embeddings | `git.daviestechlabs.io/daviestechlabs/ray-worker-rdna2` |
-| `ray-worker-strixhalo` | AMD ROCm (Strix Halo) | vLLM, BGE | `git.daviestechlabs.io/daviestechlabs/ray-worker-strixhalo` |
-| `ray-worker-intel` | Intel XPU (Arc) | BGE Reranker | `git.daviestechlabs.io/daviestechlabs/ray-worker-intel` |
+| Image | GPU Target | Workloads | Base |
+|-------|------------|-----------|------|
+| `ray-worker-nvidia` | NVIDIA CUDA 12.1 (RTX 2070) | Whisper STT, XTTS TTS | `rayproject/ray-ml:2.53.0-py310-cu121` |
+| `ray-worker-rdna2` | AMD ROCm 6.4 (Radeon 680M) | BGE Embeddings | `rocm/pytorch:rocm6.4_ubuntu22.04_py3.10_pytorch_release_2.6.0` |
+| `ray-worker-strixhalo` | AMD ROCm 7.1 (Strix Halo) | vLLM, BGE | `rocm/pytorch:rocm7.1_ubuntu24.04_py3.12_pytorch_release_2.8.0` |
+| `ray-worker-intel` | Intel XPU (Arc) | BGE Reranker | `rayproject/ray-ml:2.53.0-py310` |
 
 ## Building Locally
 
 ```bash
+# Lint Dockerfiles (requires hadolint)
+make lint
+
 # Build all images
 make build-all
 
@@ -24,8 +35,11 @@ make build-strixhalo
 make build-intel
 
 # Push to Gitea registry (requires login)
-docker login git.daviestechlabs.io
+make login
 make push-all
+
+# Release with version tag
+make VERSION=v1.0.0 release
 ```
 
 ## CI/CD
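
The `org.opencontainers.image.*` labels listed under Features can be declared in a Dockerfile or passed at build time. This diff does not show which approach the repository takes, so the following is only a sketch of the build-time variant. `oci_label_args` is a hypothetical helper that prints `--label` flags for `docker buildx build`, using annotation keys defined by the OCI Image Spec.

```shell
# Hypothetical helper; the repository's Makefile and Dockerfiles may set
# these labels differently.
#
# Usage: oci_label_args TITLE VERSION REVISION
# Prints one "--label key=value" flag per line.
# Assumes the values contain no whitespace, so the output can be
# word-split safely when passed to docker.
oci_label_args() {
  printf -- '--label org.opencontainers.image.title=%s\n' "$1"
  printf -- '--label org.opencontainers.image.version=%s\n' "$2"
  printf -- '--label org.opencontainers.image.revision=%s\n' "$3"
}

# Example call (the tag format and build context are assumptions):
#   docker buildx build \
#     $(oci_label_args ray-worker-nvidia v1.0.0 "$(git rev-parse HEAD)") \
#     -t git.daviestechlabs.io/daviestechlabs/ray-worker-nvidia:v1.0.0 .
```

Labels applied this way can be checked afterwards with `docker inspect` on the built image.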