Implements ADR-0024: Ray Repository Structure

- Ray Serve deployments for GPU-shared AI inference
- Published as PyPI package for dynamic code loading
- Deployments: LLM, embeddings, reranker, whisper, TTS
- CI/CD workflow publishes to Gitea PyPI on push to main

Extracted from kuberay-images repo per ADR-0024
25 lines
306 B
Plaintext
# Ray Serve dependencies
ray[serve]==2.53.0

# LLM inference
vllm

# Embeddings and reranking
sentence-transformers

# Speech-to-text
faster-whisper

# Text-to-speech
TTS

# HTTP client
httpx

# Numerical computing
numpy
scipy

# Optional: Intel GPU support (for danilo node)
# intel-extension-for-pytorch