# Argo Workflows

ML training and batch inference workflows for the DaviesTechLabs AI/ML platform.

## Workflows

| Workflow | Description | Trigger |
|----------|-------------|---------|
| `batch-inference` | Run LLM inference on batch inputs | `ai.pipeline.trigger` (pipeline="batch-inference") |
| `qlora-training` | Train QLoRA adapters from Milvus data | `ai.pipeline.trigger` (pipeline="qlora-training") |
| `hybrid-ml-training` | Multi-GPU distributed training | `ai.pipeline.trigger` (pipeline="hybrid-ml-training") |
| `coqui-voice-training` | XTTS voice cloning/training | `ai.pipeline.trigger` (pipeline="coqui-voice-training") |
| `document-ingestion` | Ingest documents into Milvus | `ai.pipeline.trigger` (pipeline="document-ingestion") |

## Integration

| File | Description |
|------|-------------|
| `eventsource-kfp.yaml` | Argo Events source for Kubeflow Pipelines integration |
| `kfp-integration.yaml` | Bridge workflows between Argo and Kubeflow |

## Architecture

```
NATS (ai.pipeline.trigger)
        │
        ▼
┌─────────────────┐
│   Argo Events   │
│   EventSource   │
└─────────────────┘
        │
        ▼
┌─────────────────┐
│   Argo Sensor   │
└─────────────────┘
        │
        ▼
┌─────────────────┐
│ WorkflowTemplate│
│   (batch-inf,   │
│   qlora, etc.)  │
└─────────────────┘
        │
        ├──▶ GPU Pods (AMD ROCm / NVIDIA CUDA)
        ├──▶ Milvus Vector DB
        ├──▶ vLLM / Ray Serve
        └──▶ MLflow Tracking
```

## Workflow Details

### batch-inference

Batch LLM inference with optional RAG:

```bash
argo submit batch-inference.yaml \
  -p input-url="s3://bucket/inputs.json" \
  -p output-url="s3://bucket/outputs.json" \
  -p use-rag="true" \
  -p max-tokens="500"
```

### qlora-training

Fine-tune QLoRA adapters on data drawn from Milvus collections:

```bash
argo submit qlora-training.yaml \
  -p reference-model="mistralai/Mistral-7B-Instruct-v0.3" \
  -p output-name="my-adapter" \
  -p milvus-collections="docs,wiki" \
  -p num-epochs="3"
```

### coqui-voice-training

Train XTTS voice models:

```bash
argo submit coqui-voice-training.yaml \
  -p voice-name="my-voice" \
  -p audio-samples-url="s3://bucket/samples/"
```

### document-ingestion

Ingest documents into Milvus:

```bash
argo submit document-ingestion.yaml \
  -p source-url="s3://bucket/docs/" \
  -p collection="knowledge_base" \
  -p chunk-size="512"
```

## NATS Trigger Format

Workflows are triggered by publishing to the NATS subject `ai.pipeline.trigger`:

```json
{
  "pipeline": "qlora-training",
  "parameters": {
    "reference-model": "mistralai/Mistral-7B-Instruct-v0.3",
    "output-name": "custom-adapter",
    "num-epochs": "5"
  }
}
```

## GPU Scheduling

Workflows use node affinity for GPU allocation:

| Node | GPU | Best For |
|------|-----|----------|
| khelben | AMD Strix Halo 64GB | Large model training, vLLM |
| elminster | NVIDIA RTX 2070 | Whisper, XTTS |
| drizzt | AMD Radeon 680M | Embeddings |
| danilo | Intel Arc | Reranker |

## Related

- [homelab-design](https://git.daviestechlabs.io/daviestechlabs/homelab-design) - Architecture docs
- [kuberay-images](https://git.daviestechlabs.io/daviestechlabs/kuberay-images) - Ray worker images
- [handler-base](https://git.daviestechlabs.io/daviestechlabs/handler-base) - Handler library
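The EventSource → Sensor → WorkflowTemplate chain in the architecture diagram can be sketched as an Argo Events Sensor that submits a Workflow from a template. All resource names below (`ai-pipeline`, `nats-pipeline`, `ai-pipeline-trigger`) are hypothetical stand-ins for the manifests actually deployed with this repo:

```yaml
# Sketch of the Sensor stage: submit a Workflow from a WorkflowTemplate when
# a NATS trigger event arrives. Resource names here are assumptions.
apiVersion: argoproj.io/v1alpha1
kind: Sensor
metadata:
  name: ai-pipeline
spec:
  dependencies:
    - name: pipeline-trigger
      eventSourceName: nats-pipeline     # EventSource watching ai.pipeline.trigger
      eventName: ai-pipeline-trigger
  triggers:
    - template:
        name: submit-workflow
        argoWorkflow:
          operation: submit
          source:
            resource:
              apiVersion: argoproj.io/v1alpha1
              kind: Workflow
              metadata:
                generateName: qlora-training-
              spec:
                workflowTemplateRef:
                  name: qlora-training
```

A production Sensor would additionally map the event payload's `parameters` onto the submitted Workflow's arguments via the trigger's `parameters` field.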
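The node-to-workload mapping in the GPU Scheduling table is typically expressed per step as a node selector or node affinity plus a GPU resource limit. A hypothetical WorkflowTemplate excerpt; the label key, image name, and device-plugin resource name are assumptions to check against the actual manifests:

```yaml
# Hypothetical step excerpt: pin training to the AMD node from the table.
- name: train
  nodeSelector:
    kubernetes.io/hostname: khelben        # AMD Strix Halo 64GB node
  container:
    image: registry.example/train:latest   # placeholder image
    resources:
      limits:
        amd.com/gpu: "1"                   # AMD ROCm device-plugin resource
```

On NVIDIA nodes such as elminster, the limit would instead use the `nvidia.com/gpu` resource exposed by the NVIDIA device plugin.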
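The NATS trigger format can also be exercised programmatically. A minimal Python sketch, assuming the third-party `nats-py` client; the helper names (`build_trigger`, `trigger_pipeline`) and the server URL are illustrative, not part of this repo:

```python
import json


def build_trigger(pipeline: str, parameters: dict) -> bytes:
    """Encode an ai.pipeline.trigger payload.

    Parameter values are stringified to match the trigger format shown in
    this README, where every value (e.g. "num-epochs": "5") is a JSON string.
    """
    payload = {
        "pipeline": pipeline,
        "parameters": {key: str(value) for key, value in parameters.items()},
    }
    return json.dumps(payload).encode("utf-8")


async def trigger_pipeline(nats_url: str, pipeline: str, parameters: dict) -> None:
    """Publish a trigger; requires `pip install nats-py` and a reachable server."""
    import nats  # imported lazily so build_trigger stays dependency-free

    nc = await nats.connect(nats_url)
    try:
        await nc.publish("ai.pipeline.trigger", build_trigger(pipeline, parameters))
        await nc.flush()
    finally:
        await nc.close()
```

From an async context this would be called as, e.g., `await trigger_pipeline("nats://nats:4222", "qlora-training", {"output-name": "custom-adapter", "num-epochs": 5})`.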