# KubeRay Worker Images

GPU-specific Ray worker images for the DaviesTechLabs AI/ML platform.

## Images

| Image | GPU Target | Workloads | Registry |
|-------|------------|-----------|----------|
| `ray-worker-nvidia` | NVIDIA CUDA (RTX 2070) | Whisper STT, XTTS TTS | `git.daviestechlabs.io/daviestechlabs/ray-worker-nvidia` |
| `ray-worker-rdna2` | AMD ROCm (Radeon 680M) | BGE Embeddings | `git.daviestechlabs.io/daviestechlabs/ray-worker-rdna2` |
| `ray-worker-strixhalo` | AMD ROCm (Strix Halo) | vLLM, BGE | `git.daviestechlabs.io/daviestechlabs/ray-worker-strixhalo` |
| `ray-worker-intel` | Intel XPU (Arc) | BGE Reranker | `git.daviestechlabs.io/daviestechlabs/ray-worker-intel` |

## Building Locally

```bash
# Build all images
make build-all

# Build specific image
make build-nvidia
make build-rdna2
make build-strixhalo
make build-intel

# Push to Gitea registry (requires login)
docker login git.daviestechlabs.io
make push-all
```

## CI/CD

Gitea Actions builds the images and pushes them to the `git.daviestechlabs.io` package registry on:
- Push to `main` branch
- Git tag creation (e.g., `v1.0.0`)
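
The two triggers above imply a mapping from Git ref to image tag. The sketch below shows one plausible mapping; it is not taken from `build-push.yaml`. The function name `image_tag_for_ref` and the choice of `latest` for `main` are assumptions for illustration.

```bash
# Hypothetical helper: derive an image tag from a full Git ref.
# Assumption: pushes to main publish "latest"; tag pushes reuse the tag name.
image_tag_for_ref() {
  ref="$1"
  case "$ref" in
    refs/heads/main) echo "latest" ;;
    refs/tags/*)     echo "${ref#refs/tags/}" ;;  # strip the refs/tags/ prefix
    *)               return 1 ;;                  # any other ref does not trigger a build
  esac
}
```

For example, `image_tag_for_ref refs/tags/v1.0.0` prints `v1.0.0`, while a feature-branch ref returns a non-zero status.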

### Gitea Actions Secrets Required

Add these secrets in Gitea repo settings → Actions → Secrets:

| Secret | Description |
|--------|-------------|
| `REGISTRY_USER` | Gitea username |
| `REGISTRY_TOKEN` | Gitea access token with the `write:package` scope |
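
A workflow step could use these two secrets to log in non-interactively, roughly as sketched below. This assumes the secrets are exposed to the job as environment variables of the same names. The function name `registry_login` is hypothetical, and the actual workflow may use a dedicated login action instead.

```bash
# Hypothetical login step; assumes REGISTRY_USER and REGISTRY_TOKEN are set in the environment.
registry_login() {
  # Fail early with a clear message if either secret is missing.
  : "${REGISTRY_USER:?REGISTRY_USER is not set}"
  : "${REGISTRY_TOKEN:?REGISTRY_TOKEN is not set}"
  # Pass the token on stdin so it does not appear in the process list.
  printf '%s' "$REGISTRY_TOKEN" |
    docker login git.daviestechlabs.io --username "$REGISTRY_USER" --password-stdin
}
```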

## Directory Structure

```
kuberay-images/
├── dockerfiles/
│   ├── Dockerfile.ray-worker-nvidia
│   ├── Dockerfile.ray-worker-rdna2
│   ├── Dockerfile.ray-worker-strixhalo
│   ├── Dockerfile.ray-worker-intel
│   └── ray-entrypoint.sh
├── ray-serve/
│   ├── serve_embeddings.py
│   ├── serve_whisper.py
│   ├── serve_tts.py
│   ├── serve_llm.py
│   ├── serve_reranker.py
│   └── requirements.txt
├── .gitea/workflows/
│   └── build-push.yaml
├── Makefile
└── README.md
```

## Environment Variables

| Variable | Description | Default |
|----------|-------------|---------|
| `RAY_HEAD_SVC` | Ray head service name | `ai-inference-raycluster-head-svc` |
| `GPU_RESOURCE` | Custom Ray resource name | Set per image (e.g., `gpu_nvidia`, `gpu_amd`) |
| `NUM_GPUS` | Number of GPUs to expose | `1` |
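
To show how these variables might fit together, here is a minimal sketch of a worker start command. It does not reproduce `ray-entrypoint.sh`. The function name `ray_start_cmd`, the head port `6379` (Ray's default), the fallback to `gpu_nvidia`, and the choice to advertise the custom resource with the same count as `NUM_GPUS` are all assumptions.

```bash
# Hypothetical sketch: build a `ray start` command line from the variables above.
ray_start_cmd() {
  head_svc="${RAY_HEAD_SVC:-ai-inference-raycluster-head-svc}"
  gpu_resource="${GPU_RESOURCE:-gpu_nvidia}"  # assumption: the real default depends on the image
  num_gpus="${NUM_GPUS:-1}"
  # Assumption: the head listens on 6379 and the custom resource count equals NUM_GPUS.
  printf "ray start --address=%s:6379 --num-gpus=%s --resources='{\"%s\": %s}' --block\n" \
    "$head_svc" "$num_gpus" "$gpu_resource" "$num_gpus"
}
```

With no variables set, `ray_start_cmd` prints `ray start --address=ai-inference-raycluster-head-svc:6379 --num-gpus=1 --resources='{"gpu_nvidia": 1}' --block`. Printing the command rather than running it keeps the sketch easy to inspect before wiring it into an entrypoint.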

## Node Allocation

| Node | Image | GPU | Memory |
|------|-------|-----|--------|
| elminster | ray-worker-nvidia | RTX 2070 | 8 GB VRAM |
| khelben | ray-worker-strixhalo | Strix Halo | 64 GB Unified |
| drizzt | ray-worker-rdna2 | Radeon 680M | 12 GB VRAM |
| danilo | ray-worker-intel | Intel Arc | 16 GB Shared |

## Related

- [homelab-design](https://git.daviestechlabs.io/daviestechlabs/homelab-design) - Architecture documentation
- [homelab-k8s2](https://github.com/Billy-Davies-2/homelab-k8s2) - Kubernetes manifests