
ray-serve-apps

Ray Serve deployments for GPU-shared AI inference. Published as a PyPI package to enable dynamic code loading by Ray clusters.

Architecture

This repo contains application code only - no Docker images or Kubernetes manifests.

┌─────────────────┐     ┌─────────────────┐     ┌─────────────────┐
│  kuberay-images │     │   ray-serve     │     │  homelab-k8s2   │
│                 │     │                 │     │                 │
│  Docker images  │     │  PyPI package   │     │  K8s manifests  │
│  (GPU runtimes) │     │  (this repo)    │     │  (deployment)   │
└────────┬────────┘     └────────┬────────┘     └────────┬────────┘
         │                       │                       │
         ▼                       ▼                       ▼
   Container Registry      PyPI Registry           GitOps (Flux)

Deployments

Module            Purpose                     Hardware Target
serve_llm         vLLM OpenAI-compatible API  Strix Halo (ROCm)
serve_embeddings  Sentence Transformers       Any GPU
serve_reranker    Cross-encoder reranking     Any GPU
serve_whisper     Faster Whisper STT          NVIDIA/Intel
serve_tts         Coqui TTS                   Any GPU
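Each module exposes a Ray Serve application. As a rough illustration only, a minimal embeddings deployment might look like the sketch below; the class name, model name, and fractional-GPU value are assumptions, not the repo's actual code:

```python
from ray import serve


# Fractional num_gpus lets several deployments share one GPU,
# matching the "GPU-shared" goal stated above (0.25 is an example value).
@serve.deployment(ray_actor_options={"num_gpus": 0.25})
class Embedder:
    def __init__(self, model_name: str = "sentence-transformers/all-MiniLM-L6-v2"):
        # Heavy import deferred so it only runs inside the replica actor.
        from sentence_transformers import SentenceTransformer

        self.model = SentenceTransformer(model_name)

    async def __call__(self, request):
        body = await request.json()
        return {"embeddings": self.model.encode(body["texts"]).tolist()}


# `serve run ray_serve.serve_embeddings:app` would target this binding.
app = Embedder.bind()
```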

Installation

# From Gitea PyPI
pip install ray-serve-apps --index-url https://git.daviestechlabs.io/api/packages/daviestechlabs/pypi/simple

# With optional dependencies
pip install "ray-serve-apps[llm]"        # vLLM support
pip install "ray-serve-apps[embeddings]" # Sentence Transformers
pip install "ray-serve-apps[stt]"        # Faster Whisper
pip install "ray-serve-apps[tts]"        # Coqui TTS

Usage

Ray clusters pull this package at runtime:

# In RayService spec
rayClusterConfig:
  headGroupSpec:
    template:
      spec:
        containers:
          - name: ray-head
            command:
              - /bin/bash
              - -c
              - |
                pip install ray-serve-apps==1.0.0 --index-url https://git.daviestechlabs.io/api/packages/daviestechlabs/pypi/simple
                ray start --head --dashboard-host=0.0.0.0
                serve run ray_serve.serve_llm:app
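Once serve_llm is running, clients talk to it like any OpenAI-compatible endpoint. A minimal request-builder sketch, where the base URL, port, and model name are placeholder assumptions:

```python
import json


def build_chat_request(base_url: str, model: str, prompt: str) -> tuple[str, bytes]:
    """Build the URL and JSON body for an OpenAI-style chat completion call."""
    url = f"{base_url.rstrip('/')}/v1/chat/completions"
    body = json.dumps(
        {
            "model": model,
            "messages": [{"role": "user", "content": prompt}],
        }
    ).encode()
    return url, body


# Hypothetical cluster address and model name:
url, body = build_chat_request("http://ray-head:8000", "my-model", "Hello")
# To actually send it against a live cluster:
#   import urllib.request
#   req = urllib.request.Request(url, data=body,
#                                headers={"Content-Type": "application/json"})
#   print(urllib.request.urlopen(req).read())
```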

Development

# Install dev dependencies
pip install -e ".[dev]"

# Lint
ruff check .
ruff format .

# Test
pytest

Publishing

Pushes to main automatically publish to Gitea PyPI via CI/CD.

To bump version, edit pyproject.toml:

[project]
version = "1.1.0"
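Per the repo's CI notes, the pipeline can also derive the next version from commit-message keywords ('major' bumps major, 'minor'/'feature' bumps minor, anything else bumps patch). A minimal sketch of that convention, not the CI's actual implementation:

```python
def bump_version(version: str, message: str) -> str:
    """Keyword-based semver bump: 'major' -> major (reset minor/patch),
    'minor'/'feature' -> minor (reset patch), anything else -> patch."""
    major, minor, patch = (int(part) for part in version.split("."))
    msg = message.lower()
    if "major" in msg:
        return f"{major + 1}.0.0"
    if "minor" in msg or "feature" in msg:
        return f"{major}.{minor + 1}.0"
    return f"{major}.{minor}.{patch + 1}"
```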