feat: add pipeline bridge for NATS to Argo/Kubeflow

- pipeline_bridge.py: Standalone bridge service
- pipeline_bridge_v2.py: Version built on the handler-base library
- Supports Argo Workflows and Kubeflow Pipelines
- Workflow monitoring and status publishing
- Dockerfile variants for standalone and handler-base
2026-02-02 06:23:21 -05:00
parent 57514f2b09
commit 50b1835688
7 changed files with 756 additions and 1 deletions

README.md
# Pipeline Bridge
Bridges NATS events to Kubeflow Pipelines and Argo Workflows.
## Overview
The Pipeline Bridge listens for pipeline trigger requests on NATS and submits them to the appropriate workflow engine (Argo Workflows or Kubeflow Pipelines). It monitors execution and publishes status updates back to NATS.
## NATS Subjects
| Subject | Direction | Description |
|---------|-----------|-------------|
| `ai.pipeline.trigger` | Subscribe | Pipeline trigger requests |
| `ai.pipeline.status.{request_id}` | Publish | Pipeline status updates |
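The subject convention above can be sketched as a pair of small helpers. This is an illustrative sketch, not code from the repository; the constant and function names are assumptions:

```python
# Subject the bridge subscribes to for trigger requests.
TRIGGER_SUBJECT = "ai.pipeline.trigger"


def status_subject(request_id: str) -> str:
    """Build the per-request subject that status updates are published to."""
    return f"ai.pipeline.status.{request_id}"
```

A consumer that wants updates for every request can subscribe to the wildcard `ai.pipeline.status.>` instead.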
## Supported Pipelines
| Pipeline | Engine | Description |
|----------|--------|-------------|
| `document-ingestion` | Argo | Ingest documents into Milvus |
| `batch-inference` | Argo | Run batch LLM inference |
| `model-evaluation` | Argo | Evaluate model performance |
| `rag-query` | Kubeflow | Execute RAG query pipeline |
| `voice-pipeline` | Kubeflow | Full voice assistant pipeline |
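Routing a trigger to the right engine amounts to a lookup over the table above. A minimal sketch (the dict and function names are hypothetical, not taken from the bridge's source):

```python
# Pipeline-to-engine routing table mirroring the supported pipelines above.
PIPELINE_ENGINES = {
    "document-ingestion": "argo",
    "batch-inference": "argo",
    "model-evaluation": "argo",
    "rag-query": "kubeflow",
    "voice-pipeline": "kubeflow",
}


def engine_for(pipeline: str) -> str:
    """Return the workflow engine for a pipeline, or raise for unknown names."""
    try:
        return PIPELINE_ENGINES[pipeline]
    except KeyError:
        raise ValueError(f"unknown pipeline: {pipeline!r}")
```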
## Request Format
```json
{
"request_id": "uuid",
"pipeline": "document-ingestion",
"parameters": {
"source-url": "s3://bucket/docs/",
"collection-name": "knowledge_base"
}
}
```
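A trigger handler would parse and minimally validate messages of this shape before dispatching them. A sketch under the assumption that `request_id` and `pipeline` are required while `parameters` defaults to empty (the function name is illustrative):

```python
import json

REQUIRED_FIELDS = ("request_id", "pipeline")


def parse_trigger(raw: bytes) -> dict:
    """Decode a trigger message and check the required fields are present."""
    req = json.loads(raw)
    missing = [f for f in REQUIRED_FIELDS if f not in req]
    if missing:
        raise ValueError(f"missing fields: {missing}")
    # Parameters are optional; default to an empty mapping.
    req.setdefault("parameters", {})
    return req
```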
## Response Format
```json
{
"request_id": "uuid",
"status": "submitted",
"pipeline": "document-ingestion",
"engine": "argo",
"run_id": "document-ingestion-abc123",
"message": "Pipeline submitted successfully",
"timestamp": "2026-01-03T12:00:00Z"
}
```
## Status Updates
The bridge publishes status updates as the workflow progresses:
- `submitted` - Workflow created
- `pending` - Waiting to start
- `running` - In progress
- `succeeded` - Completed successfully
- `failed` - Failed
- `error` - System error
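For Argo-backed pipelines, these statuses map naturally onto Argo's workflow phases (`Pending`, `Running`, `Succeeded`, `Failed`, `Error`), with `submitted` emitted by the bridge itself at creation time. A hedged sketch of that mapping; the names here are illustrative, not the bridge's actual implementation:

```python
# Map Argo Workflow phases to the bridge's status vocabulary.
ARGO_PHASE_TO_STATUS = {
    "Pending": "pending",
    "Running": "running",
    "Succeeded": "succeeded",
    "Failed": "failed",
    "Error": "error",
}


def bridge_status(argo_phase: str) -> str:
    """Translate an Argo phase; unknown phases are reported as system errors."""
    return ARGO_PHASE_TO_STATUS.get(argo_phase, "error")
```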
## Variants
### pipeline_bridge.py (Standalone)
Self-contained service that installs its dependencies with pip at startup. Good for simple deployments.

### pipeline_bridge_v2.py (handler-base)
Uses the handler-base library for standardized NATS handling, telemetry, and health checks.
## Environment Variables
| Variable | Default | Description |
|----------|---------|-------------|
| `NATS_URL` | `nats://nats.ai-ml.svc.cluster.local:4222` | NATS server URL |
| `KUBEFLOW_HOST` | `http://ml-pipeline.kubeflow.svc.cluster.local:8888` | Kubeflow Pipelines API |
| `ARGO_HOST` | `http://argo-server.argo.svc.cluster.local:2746` | Argo Workflows API |
| `ARGO_NAMESPACE` | `ai-ml` | Namespace for Argo Workflows |
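A typical way to consume this configuration is `os.environ.get` with the defaults from the table baked in. This is a sketch of the pattern, assuming no additional config layer:

```python
import os

# Defaults mirror the environment-variable table; override via the environment.
NATS_URL = os.environ.get("NATS_URL", "nats://nats.ai-ml.svc.cluster.local:4222")
KUBEFLOW_HOST = os.environ.get(
    "KUBEFLOW_HOST", "http://ml-pipeline.kubeflow.svc.cluster.local:8888"
)
ARGO_HOST = os.environ.get(
    "ARGO_HOST", "http://argo-server.argo.svc.cluster.local:2746"
)
ARGO_NAMESPACE = os.environ.get("ARGO_NAMESPACE", "ai-ml")
```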
## Building
```bash
# Standalone version
docker build -t pipeline-bridge:latest .
# handler-base version
docker build -f Dockerfile.v2 -t pipeline-bridge:v2 --build-arg BASE_TAG=latest .
```
## Testing
```bash
# Port-forward NATS
kubectl port-forward -n ai-ml svc/nats 4222:4222
# Trigger document ingestion
nats pub ai.pipeline.trigger '{
"request_id": "test-1",
"pipeline": "document-ingestion",
"parameters": {"source-url": "https://example.com/docs.txt"}
}'
# Monitor status
nats sub "ai.pipeline.status.>"
```
## License
MIT