
ray-serve-apps

Ray Serve deployments for GPU-shared AI inference. Published as a PyPI package to enable dynamic code loading by Ray clusters.

Architecture

This repo contains application code only - no Docker images or Kubernetes manifests.

┌─────────────────┐     ┌─────────────────┐     ┌─────────────────┐
│  kuberay-images │     │   ray-serve     │     │  homelab-k8s2   │
│                 │     │                 │     │                 │
│  Docker images  │     │  PyPI package   │     │  K8s manifests  │
│  (GPU runtimes) │     │  (this repo)    │     │  (deployment)   │
└────────┬────────┘     └────────┬────────┘     └────────┬────────┘
         │                       │                       │
         ▼                       ▼                       ▼
   Container Registry      PyPI Registry           GitOps (Flux)

Deployments

Module            Purpose                     Hardware Target
serve_llm         vLLM OpenAI-compatible API  Strix Halo (ROCm)
serve_embeddings  Sentence Transformers       Any GPU
serve_reranker    Cross-encoder reranking     Any GPU
serve_whisper     Faster Whisper STT          NVIDIA/Intel
serve_tts         Coqui TTS                   Any GPU
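Each module exposes a Ray Serve application. As a rough illustration only, a minimal embeddings deployment might look like the sketch below; the class name, model name, and fractional-GPU value are assumptions, not the repo's actual code:

```python
from ray import serve


# Fractional num_gpus lets several deployments share one GPU,
# matching the "GPU-shared" goal stated above (0.25 is an example value).
@serve.deployment(ray_actor_options={"num_gpus": 0.25})
class Embedder:
    def __init__(self, model_name: str = "sentence-transformers/all-MiniLM-L6-v2"):
        # Heavy import deferred so it only runs inside the replica actor.
        from sentence_transformers import SentenceTransformer

        self.model = SentenceTransformer(model_name)

    async def __call__(self, request):
        body = await request.json()
        return {"embeddings": self.model.encode(body["texts"]).tolist()}


# `serve run ray_serve.serve_embeddings:app` would target this binding.
app = Embedder.bind()
```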

Installation

# From Gitea PyPI
pip install ray-serve-apps --index-url https://git.daviestechlabs.io/api/packages/daviestechlabs/pypi/simple

# With optional dependencies
pip install "ray-serve-apps[llm]"        # vLLM support
pip install "ray-serve-apps[embeddings]" # Sentence Transformers
pip install "ray-serve-apps[stt]"        # Faster Whisper
pip install "ray-serve-apps[tts]"        # Coqui TTS

Usage

Ray clusters pull this package at runtime:

# In RayService spec
rayClusterConfig:
  headGroupSpec:
    template:
      spec:
        containers:
          - name: ray-head
            command:
              - /bin/bash
              - -c
              - |
                pip install ray-serve-apps==1.0.0 --index-url https://git.daviestechlabs.io/api/packages/daviestechlabs/pypi/simple
                ray start --head --dashboard-host=0.0.0.0
                serve run ray_serve.serve_llm:app
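Once serve_llm is running, clients talk to it like any OpenAI-compatible endpoint. A minimal request-builder sketch, where the base URL, port, and model name are placeholder assumptions:

```python
import json


def build_chat_request(base_url: str, model: str, prompt: str) -> tuple[str, bytes]:
    """Build the URL and JSON body for an OpenAI-style chat completion call."""
    url = f"{base_url.rstrip('/')}/v1/chat/completions"
    body = json.dumps(
        {
            "model": model,
            "messages": [{"role": "user", "content": prompt}],
        }
    ).encode()
    return url, body


# Hypothetical cluster address and model name:
url, body = build_chat_request("http://ray-head:8000", "my-model", "Hello")
# To actually send it against a live cluster:
#   import urllib.request
#   req = urllib.request.Request(url, data=body,
#                                headers={"Content-Type": "application/json"})
#   print(urllib.request.urlopen(req).read())
```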

Development

# Install dev dependencies
pip install -e ".[dev]"

# Lint
ruff check .
ruff format .

# Test
pytest

Publishing

Pushes to main automatically publish to Gitea PyPI via CI/CD.

To bump version, edit pyproject.toml:

[project]
version = "1.1.0"
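Per the repo's CI notes, the pipeline can also derive the next version from commit-message keywords ('major' bumps major, 'minor'/'feature' bumps minor, anything else bumps patch). A minimal sketch of that convention, not the CI's actual implementation:

```python
def bump_version(version: str, message: str) -> str:
    """Keyword-based semver bump: 'major' -> major (reset minor/patch),
    'minor'/'feature' -> minor (reset patch), anything else -> patch."""
    major, minor, patch = (int(part) for part in version.split("."))
    msg = message.lower()
    if "major" in msg:
        return f"{major + 1}.0.0"
    if "minor" in msg or "feature" in msg:
        return f"{major}.{minor + 1}.0"
    return f"{major}.{minor}.{patch + 1}"
```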