# ray-serve-apps

Ray Serve deployments for GPU-shared AI inference. Published as a PyPI package to enable dynamic code loading by Ray clusters.
## Architecture

This repo contains application code only; Docker images live in `kuberay-images` and Kubernetes manifests in `homelab-k8s2`.
```
┌─────────────────┐     ┌─────────────────┐     ┌─────────────────┐
│ kuberay-images  │     │ ray-serve       │     │ homelab-k8s2    │
│                 │     │                 │     │                 │
│ Docker images   │     │ PyPI package    │     │ K8s manifests   │
│ (GPU runtimes)  │     │ (this repo)     │     │ (deployment)    │
└────────┬────────┘     └────────┬────────┘     └────────┬────────┘
         │                       │                       │
         ▼                       ▼                       ▼
 Container Registry         PyPI Registry          GitOps (Flux)
```
## Deployments

| Module | Purpose | Hardware Target |
|---|---|---|
| `serve_llm` | vLLM OpenAI-compatible API | Strix Halo (ROCm) |
| `serve_embeddings` | Sentence Transformers | Any GPU |
| `serve_reranker` | Cross-encoder reranking | Any GPU |
| `serve_whisper` | Faster Whisper STT | NVIDIA/Intel |
| `serve_tts` | Coqui TTS | Any GPU |
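For example, `serve_llm` exposes an OpenAI-compatible API, so any OpenAI-style client can talk to it. A minimal request-building sketch (the host, port, and model name below are placeholders, not values defined by this repo):

```python
# Sketch of a chat-completion request against the serve_llm deployment.
# The base URL and model name are assumptions; adjust for your cluster.
import json
import urllib.request


def build_chat_request(base_url: str, model: str, prompt: str) -> urllib.request.Request:
    """Build an OpenAI-compatible POST /v1/chat/completions request."""
    payload = {
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
    }
    return urllib.request.Request(
        f"{base_url}/v1/chat/completions",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


req = build_chat_request("http://ray-serve.example.internal:8000", "my-model", "Hello")
# urllib.request.urlopen(req) would send it; omitted here since the URL is a placeholder.
```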
## Installation

```bash
# From Gitea PyPI
pip install ray-serve-apps --index-url https://git.daviestechlabs.io/api/packages/daviestechlabs/pypi/simple

# With optional dependencies (quoted so shells don't glob the brackets)
pip install "ray-serve-apps[llm]"         # vLLM support
pip install "ray-serve-apps[embeddings]"  # Sentence Transformers
pip install "ray-serve-apps[stt]"         # Faster Whisper
pip install "ray-serve-apps[tts]"         # Coqui TTS
```
## Usage

Ray clusters pull this package at runtime:

```yaml
# In RayService spec
rayClusterConfig:
  headGroupSpec:
    template:
      spec:
        containers:
        - name: ray-head
          command:
          - /bin/bash
          - -c
          - |
            pip install ray-serve-apps==1.0.0 --index-url https://git.daviestechlabs.io/api/packages/daviestechlabs/pypi/simple
            ray start --head --dashboard-host=0.0.0.0
            serve run ray_serve.serve_llm:app
```
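Alternatively, the same app can be declared in the RayService's `serveConfigV2` block so Ray Serve installs the package per application via `runtime_env`. A sketch, assuming the container's pip is already configured to use the Gitea index (e.g. via `PIP_INDEX_URL` in the image); the `llm` name and `/llm` route prefix are illustrative:

```yaml
# Declarative alternative: runtime_env installs the package per application.
serveConfigV2: |
  applications:
  - name: llm
    route_prefix: /llm
    import_path: ray_serve.serve_llm:app
    runtime_env:
      pip:
      - ray-serve-apps==1.0.0
```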
## Development

```bash
# Install dev dependencies
pip install -e ".[dev]"

# Lint
ruff check .
ruff format .

# Test
pytest
```
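Unit tests for a package like this typically exercise pure helper logic without starting a Ray cluster. A hypothetical pytest-style example (the `chunk_text` helper below is illustrative, not an actual function in ray-serve-apps):

```python
# tests/test_chunking.py -- illustrative; chunk_text is a hypothetical helper.

def chunk_text(text: str, max_chars: int) -> list[str]:
    """Split text into pieces of at most max_chars characters."""
    return [text[i:i + max_chars] for i in range(0, len(text), max_chars)]


def test_chunk_text_respects_limit():
    chunks = chunk_text("abcdefghij", 4)
    assert chunks == ["abcd", "efgh", "ij"]
    assert all(len(c) <= 4 for c in chunks)
```

Keeping deployment classes thin wrappers around plain functions like this makes them testable without GPU hardware or a running cluster.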
## Publishing

Pushes to `main` automatically publish to Gitea PyPI via CI/CD.

To bump the version, edit `pyproject.toml`:

```toml
[project]
version = "1.1.0"
```