ray-serve

daviestechlabs/ray-serve

Commit Graph

Author

SHA1

Message

Date

Billy D.

15e4b8afa3

fix: make mlflow_logger import optional with no-op fallback

Build and Publish ray-serve-apps / build-and-publish (push) Successful in 11s

The strixhalo LLM worker uses py_executable pointing to the Docker
image venv which doesn't have the updated ray-serve-apps package.
Wrap all InferenceLogger imports in try/except and guard usage with
None checks so apps degrade gracefully without MLflow logging.
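
A minimal sketch of the optional-import pattern this commit message describes, not the repository's actual code. The import path `ray_serve_apps.mlflow_logger`, the helper `handle_request`, and the `log` method on the logger are assumptions made for illustration; only the `InferenceLogger` name, the try/except wrapper, and the None checks come from the message.

```python
import time

try:
    # Assumed import path; the commit does not show the real module layout.
    from ray_serve_apps.mlflow_logger import InferenceLogger
except ImportError:
    # Older images without the updated package fall back to "no logger".
    InferenceLogger = None


def handle_request(prompt: str, logger=None) -> str:
    """Hypothetical request handler that logs only when a logger exists."""
    start = time.perf_counter()
    result = prompt.upper()  # stand-in for the real inference call
    latency_s = time.perf_counter() - start

    # Guard every use with a None check so the app keeps serving
    # when MLflow logging is unavailable.
    if logger is not None:
        logger.log({"latency_s": latency_s})  # hypothetical method name
    return result
```

With this shape, a worker whose environment lacks the updated package still answers requests; it simply skips the logging branch.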

2026-02-12 07:01:17 -05:00

Billy D.

7ec2107e0c

feat: add MLflow inference logging to all Ray Serve apps

Build and Publish ray-serve-apps / build-and-publish (push) Successful in 16s

- Add mlflow_logger.py: lightweight REST-based MLflow logger (no mlflow dep)
- Instrument serve_llm.py with latency, token counts, tokens/sec metrics
- Instrument serve_embeddings.py with latency, batch_size, total_tokens
- Instrument serve_whisper.py with latency, audio_duration, realtime_factor
- Instrument serve_tts.py with latency, audio_duration, text_chars
- Instrument serve_reranker.py with latency, num_pairs, top_k
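
The metrics listed above could be derived and packaged roughly as follows. This is a sketch under assumptions, not the contents of `mlflow_logger.py`: the function names are hypothetical, `realtime_factor` is assumed to mean processing time divided by audio duration, and the payload shape follows the MLflow REST `runs/log-batch` request body, which the commit does not confirm is the endpoint used.

```python
import time


def llm_metrics(latency_s: float, prompt_tokens: int, completion_tokens: int) -> dict:
    """Latency, token counts, and throughput for one LLM request (hypothetical helper)."""
    tokens_per_sec = completion_tokens / latency_s if latency_s > 0 else 0.0
    return {
        "latency_s": latency_s,
        "prompt_tokens": prompt_tokens,
        "completion_tokens": completion_tokens,
        "tokens_per_sec": tokens_per_sec,
    }


def whisper_metrics(latency_s: float, audio_duration_s: float) -> dict:
    """Assumes realtime_factor = processing time / audio duration."""
    rtf = latency_s / audio_duration_s if audio_duration_s > 0 else 0.0
    return {
        "latency_s": latency_s,
        "audio_duration": audio_duration_s,
        "realtime_factor": rtf,
    }


def to_log_batch_payload(run_id: str, metrics: dict, timestamp_ms=None) -> dict:
    """Build a JSON body in the shape of MLflow's runs/log-batch request."""
    ts = int(time.time() * 1000) if timestamp_ms is None else timestamp_ms
    return {
        "run_id": run_id,
        "metrics": [
            {"key": key, "value": float(value), "timestamp": ts, "step": 0}
            for key, value in metrics.items()
        ],
    }
```

Because the payload is a plain dictionary, it can be sent with any HTTP client, which is consistent with a logger that avoids depending on the `mlflow` package.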

2026-02-12 06:14:30 -05:00

Billy D.

8ef914ec12

feat: initial ray-serve-apps PyPI package

Build and Publish ray-serve-apps / lint (push) Failing after 11m2s

Build and Publish ray-serve-apps / publish (push) Has been cancelled

Implements ADR-0024: Ray Repository Structure

- Ray Serve deployments for GPU-shared AI inference
- Published as PyPI package for dynamic code loading
- Deployments: LLM, embeddings, reranker, whisper, TTS
- CI/CD workflow publishes to Gitea PyPI on push to main

Extracted from kuberay-images repo per ADR-0024

2026-02-03 07:03:39 -05:00

3 Commits