# KubeRay Worker Images

GPU-specific Ray worker images for the DaviesTechLabs AI/ML platform.

## Features

- **BuildKit optimized**: Cache mounts for apt and pip speed up rebuilds
- **OCI compliant**: Standard image labels (`org.opencontainers.image.*`)
- **Health checks**: Built-in HEALTHCHECK for container orchestration
- **Non-root execution**: Ray runs as unprivileged `ray` user
- **Retry logic**: Entrypoint waits for Ray head with exponential backoff

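
The retry behaviour listed above can be illustrated with a small shell helper. This is a minimal sketch of exponential backoff in general, not the contents of `ray-entrypoint.sh`; the names `retry_with_backoff`, `MAX_ATTEMPTS`, and `BASE_DELAY` are hypothetical.

```bash
#!/usr/bin/env bash
# Sketch only: retry a command, doubling the wait after each failure.
# retry_with_backoff, MAX_ATTEMPTS and BASE_DELAY are illustrative names
# and are not taken from ray-entrypoint.sh.

retry_with_backoff() {
  local max_attempts="${MAX_ATTEMPTS:-5}"
  local delay="${BASE_DELAY:-1}"
  local attempt=1

  # Run the given command until it succeeds or attempts are exhausted.
  until "$@"; do
    if [ "${attempt}" -ge "${max_attempts}" ]; then
      echo "giving up after ${attempt} attempts: $*" >&2
      return 1
    fi
    echo "attempt ${attempt} failed; retrying in ${delay}s" >&2
    sleep "${delay}"
    delay=$(( delay * 2 ))       # exponential backoff: 1s, 2s, 4s, ...
    attempt=$(( attempt + 1 ))
  done
}

# Possible usage inside an entrypoint (assumes the Ray CLI is installed
# and that the head node listens on Ray's default port 6379):
#   retry_with_backoff ray health-check --address "${RAY_HEAD_SVC}:6379"
```

Capping the number of attempts lets the container exit with a non-zero status, so the orchestrator can restart it instead of leaving it waiting indefinitely.
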
## Images

| Image | GPU Target | Workloads | Base |
|-------|------------|-----------|------|
| `ray-worker-nvidia` | NVIDIA CUDA 12.1 (RTX 2070) | Whisper STT, XTTS TTS | `rayproject/ray-ml:2.53.0-py310-cu121` |
| `ray-worker-rdna2` | AMD ROCm 6.4 (Radeon 680M) | BGE Embeddings | `rocm/pytorch:rocm6.4_ubuntu22.04_py3.10_pytorch_release_2.6.0` |
| `ray-worker-strixhalo` | AMD ROCm 7.1 (Strix Halo) | vLLM, BGE | `rocm/pytorch:rocm7.1_ubuntu24.04_py3.12_pytorch_release_2.8.0` |
| `ray-worker-intel` | Intel XPU (Arc) | BGE Reranker | `rayproject/ray-ml:2.53.0-py310` |

## Building Locally

```bash
# Lint Dockerfiles (requires hadolint)
make lint

# Build all images
make build-all

# Build specific image
make build-nvidia
make build-rdna2
make build-strixhalo
make build-intel

# Push to Gitea registry (requires login)
make login
make push-all

# Release with version tag
make VERSION=v1.0.0 release
```

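
For readers without `make`, the build targets above might correspond roughly to one `docker buildx build` call per Dockerfile. The sketch below only prints the commands. The Dockerfile paths follow the directory layout in this README, while the helper names, the unprefixed image tag, the `latest` fallback, and the use of `--load` with the repository root as build context are assumptions, not a copy of the Makefile.

```bash
#!/usr/bin/env bash
# Sketch only: print the buildx commands the make targets may resemble.
# Helper names and the tag format are hypothetical.

IMAGES="nvidia rdna2 strixhalo intel"

# Path of the Dockerfile for a given target suffix (e.g. "nvidia").
dockerfile_for() {
  echo "dockerfiles/Dockerfile.ray-worker-$1"
}

# Compose a buildx command: $1 = target suffix, $2 = version tag.
buildx_cmd() {
  echo "docker buildx build --file $(dockerfile_for "$1") --tag ray-worker-$1:$2 --load ."
}

# Print one command per image, using VERSION if set, else "latest".
print_build_plan() {
  local target
  for target in ${IMAGES}; do
    buildx_cmd "${target}" "${VERSION:-latest}"
  done
}
```

Running `VERSION=v1.0.0 print_build_plan` lists four commands that can be reviewed before being executed by hand.
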
## CI/CD

Images are automatically built and pushed to the `git.daviestechlabs.io` package registry on:

- Push to `main` branch
- Git tag creation (e.g., `v1.0.0`)

### Gitea Actions Secrets Required

Add these secrets in the Gitea repository settings under Actions → Secrets:

| Secret | Description |
|--------|-------------|
| `REGISTRY_USER` | Gitea username |
| `REGISTRY_TOKEN` | Gitea access token with `package:write` scope |

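
The triggers and secrets above could be combined in a workflow step along the following lines. This is a hedged sketch rather than the contents of `build-push.yaml`: the `OWNER` value, the mapping of `main` to `latest`, and the function names are assumptions made for illustration.

```bash
#!/usr/bin/env bash
# Sketch only: derive an image reference from a Git ref and log in to the
# registry. OWNER and the main -> latest mapping are assumptions.

REGISTRY="git.daviestechlabs.io"
OWNER="daviestechlabs"   # hypothetical package owner

# Map a Git ref to an image tag: main -> latest, refs/tags/<tag> -> <tag>.
image_tag_for_ref() {
  case "$1" in
    refs/heads/main) echo "latest" ;;
    refs/tags/*)     echo "${1#refs/tags/}" ;;
    *)               return 1 ;;   # other refs are not published
  esac
}

# Full image reference: $1 = image name, $2 = Git ref.
image_ref() {
  local tag
  tag="$(image_tag_for_ref "$2")" || return 1
  echo "${REGISTRY}/${OWNER}/$1:${tag}"
}

# Log in using the two secrets; the token is passed on stdin so it does
# not appear in the process list or shell history.
registry_login() {
  printf '%s' "${REGISTRY_TOKEN}" |
    docker login "${REGISTRY}" --username "${REGISTRY_USER}" --password-stdin
}
```

For example, `image_ref ray-worker-nvidia refs/tags/v1.0.0` yields a reference tagged `v1.0.0`, while a feature-branch ref returns a non-zero status and produces no tag.
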
## Directory Structure

```
kuberay-images/
├── dockerfiles/
│   ├── Dockerfile.ray-worker-nvidia
│   ├── Dockerfile.ray-worker-rdna2
│   ├── Dockerfile.ray-worker-strixhalo
│   ├── Dockerfile.ray-worker-intel
│   └── ray-entrypoint.sh
├── ray-serve/
│   ├── serve_embeddings.py
│   ├── serve_whisper.py
│   ├── serve_tts.py
│   ├── serve_llm.py
│   ├── serve_reranker.py
│   └── requirements.txt
├── .gitea/workflows/
│   └── build-push.yaml
├── Makefile
└── README.md
```

## Environment Variables

| Variable | Description | Default |
|----------|-------------|---------|
| `RAY_HEAD_SVC` | Ray head service name | `ai-inference-raycluster-head-svc` |
| `GPU_RESOURCE` | Custom Ray resource name | Image-specific (`gpu_nvidia`, `gpu_amd`, etc.) |
| `NUM_GPUS` | Number of GPUs to expose | `1` |

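
As an illustration of how an entrypoint might consume these variables, the sketch below resolves each one with a fallback and assembles a `ray start` command line. It prints the command instead of running it. The port `6379` (Ray's default), the `gpu_nvidia` fallback, reusing `NUM_GPUS` as the custom resource count, and the function name are assumptions, not the behaviour of `ray-entrypoint.sh`.

```bash
#!/usr/bin/env bash
# Sketch only: resolve the documented variables and print a ray start
# command. Port 6379 and the gpu_nvidia fallback are assumptions.

build_ray_start_cmd() {
  local head_svc="${RAY_HEAD_SVC:-ai-inference-raycluster-head-svc}"
  local gpu_resource="${GPU_RESOURCE:-gpu_nvidia}"
  local num_gpus="${NUM_GPUS:-1}"

  # --resources takes a JSON object mapping custom resource names to counts.
  printf "ray start --address=%s:6379 --num-gpus=%s --resources='{\"%s\": %s}' --block\n" \
    "${head_svc}" "${num_gpus}" "${gpu_resource}" "${num_gpus}"
}
```

Overriding a variable changes only the matching part of the command, e.g. `GPU_RESOURCE=gpu_amd build_ray_start_cmd` advertises a `gpu_amd` resource instead.
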
## Node Allocation

| Node | Image | GPU | Memory |
|------|-------|-----|--------|
| elminster | `ray-worker-nvidia` | RTX 2070 | 8 GB VRAM |
| khelben | `ray-worker-strixhalo` | Strix Halo | 64 GB Unified |
| drizzt | `ray-worker-rdna2` | Radeon 680M | 12 GB VRAM |
| danilo | `ray-worker-intel` | Intel Arc | 16 GB Shared |

## Related

- [homelab-design](https://git.daviestechlabs.io/daviestechlabs/homelab-design) - Architecture documentation
- [homelab-k8s2](https://github.com/Billy-Davies-2/homelab-k8s2) - Kubernetes manifests