Implements ADR-0024: Ray Repository Structure
- Ray Serve deployments for GPU-shared AI inference
- Published as a PyPI package for dynamic code loading
- Deployments: LLM, embeddings, reranker, whisper, TTS
- CI/CD workflow publishes to the Gitea PyPI registry on push to main
Extracted from kuberay-images repo per ADR-0024
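
For illustration only, here is a minimal sketch of how the deployments listed above might be described in a Ray Serve config that pulls the code from the published package at runtime. The package name `ray_serve_apps`, the `module:app` import paths, the deployment names, and the GPU fractions are all hypothetical and do not come from this repo; only the config keys (`applications`, `import_path`, `runtime_env`, `deployments`, `ray_actor_options`, `num_gpus`) follow the Ray Serve config schema.

```python
# Hypothetical GPU fractions per deployment; real values depend on model sizes.
DEPLOYMENT_GPU_SHARE = {
    "llm": 0.5,
    "embeddings": 0.1,
    "reranker": 0.1,
    "whisper": 0.2,
    "tts": 0.1,
}


def build_serve_config(pip_package, module, gpu_share):
    """Build a Serve-config-shaped dict with one application per deployment.

    Raises ValueError if the requested fractions exceed a single GPU.
    """
    total = sum(gpu_share.values())
    if total > 1.0 + 1e-9:  # tolerance for float rounding
        raise ValueError(f"GPU fractions sum to {total:.2f}, more than one GPU")

    applications = []
    for name, share in gpu_share.items():
        applications.append({
            "name": name,
            "route_prefix": f"/{name}",
            # Assumed layout: each deployment lives in its own submodule
            # and exposes a bound application object called `app`.
            "import_path": f"{module}.{name}:app",
            # The package is installed when the application starts, which is
            # what lets the code change without rebuilding the image.
            "runtime_env": {"pip": [pip_package]},
            # The deployment name here must match the one defined in code.
            "deployments": [
                {"name": name, "ray_actor_options": {"num_gpus": share}}
            ],
        })
    return {"applications": applications}


config = build_serve_config("ray_serve_apps", "ray_serve_apps", DEPLOYMENT_GPU_SHARE)
```

Serialized to YAML, a dict of this shape could be applied with `serve deploy`, assuming the cluster can reach the package index that hosts the package.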