# MLflow Utils

MLflow integration utilities for the DaviesTechLabs AI/ML platform.

## Installation

```bash
pip install -r requirements.txt
```

Or from Gitea:

```bash
pip install git+https://git.daviestechlabs.io/daviestechlabs/mlflow.git
```

## Modules

| Module | Description |
|--------|-------------|
| `client.py` | MLflow client configuration and helpers |
| `tracker.py` | General `MLflowTracker` for experiments |
| `inference_tracker.py` | Async inference metrics for NATS handlers |
| `model_registry.py` | Model Registry with KServe metadata |
| `kfp_components.py` | Kubeflow Pipeline MLflow components |
| `experiment_comparison.py` | Compare experiments and runs |
| `cli.py` | Command-line interface |

## Quick Start

```python
from mlflow_utils import get_mlflow_client, MLflowTracker

# Simple tracking
with MLflowTracker(experiment_name="my-experiment") as tracker:
    tracker.log_params({"learning_rate": 0.001})
    tracker.log_metrics({"accuracy": 0.95})
```

## Inference Tracking

For NATS handlers (chat-handler, voice-assistant):

```python
from mlflow_utils import InferenceMetricsTracker
from mlflow_utils.inference_tracker import InferenceMetrics

tracker = InferenceMetricsTracker(
    experiment_name="voice-assistant-prod",
    batch_size=100,  # Batch metrics before logging
)

# During request handling
metrics = InferenceMetrics(
    request_id="uuid",
    total_latency=1.5,
    llm_latency=0.8,
    input_tokens=150,
    output_tokens=200,
)
await tracker.log_inference(metrics)
```

## Model Registry

Register models with KServe deployment metadata:

```python
from mlflow_utils.model_registry import register_model_for_kserve

register_model_for_kserve(
    model_name="my-qlora-adapter",
    model_uri="runs:/abc123/model",
    kserve_runtime="kserve-vllm",
    gpu_type="amd-strixhalo",
)
```

## Kubeflow Components

Use in KFP pipelines:

```python
from mlflow_utils.kfp_components import (
    log_experiment_component,
    register_model_component,
)
```

## CLI

```bash
# List experiments
python -m mlflow_utils.cli list-experiments

# Compare runs
python -m mlflow_utils.cli compare-runs --experiment "qlora-training"

# Export metrics
python -m mlflow_utils.cli export --run-id abc123 --output metrics.json
```

## Configuration

| Environment Variable | Default | Description |
|---------------------|---------|-------------|
| `MLFLOW_TRACKING_URI` | `http://mlflow.mlflow.svc.cluster.local:80` | MLflow server |
| `MLFLOW_EXPERIMENT_NAME` | `default` | Default experiment |
| `MLFLOW_ENABLE_ASYNC` | `true` | Async logging for handlers |

## Module Structure

```
mlflow_utils/
├── __init__.py              # Public API
├── client.py                # Connection management
├── tracker.py               # General experiment tracker
├── inference_tracker.py     # Async inference metrics
├── model_registry.py        # Model registration + KServe
├── kfp_components.py        # Kubeflow components
├── experiment_comparison.py # Run comparison tools
└── cli.py                   # Command-line interface
```

## Related

- [handler-base](https://git.daviestechlabs.io/daviestechlabs/handler-base) - Uses the inference tracker
- [kubeflow](https://git.daviestechlabs.io/daviestechlabs/kubeflow) - KFP components
- [argo](https://git.daviestechlabs.io/daviestechlabs/argo) - Training workflows
- [homelab-design](https://git.daviestechlabs.io/daviestechlabs/homelab-design) - Architecture docs
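## Configuration Example

As a rough sketch of how the configuration variables documented above might be resolved (the `resolve_config` helper is hypothetical and not part of the package's public API; the defaults are the ones from the Configuration table):

```python
import os

# Documented defaults from the Configuration table above
DEFAULTS = {
    "MLFLOW_TRACKING_URI": "http://mlflow.mlflow.svc.cluster.local:80",
    "MLFLOW_EXPERIMENT_NAME": "default",
    "MLFLOW_ENABLE_ASYNC": "true",
}


def resolve_config(env=None):
    """Merge environment-variable overrides onto the documented defaults.

    Hypothetical helper for illustration only; `MLFLOW_ENABLE_ASYNC` is
    coerced from its string form to a bool.
    """
    env = os.environ if env is None else env
    cfg = {key: env.get(key, default) for key, default in DEFAULTS.items()}
    cfg["MLFLOW_ENABLE_ASYNC"] = cfg["MLFLOW_ENABLE_ASYNC"].lower() == "true"
    return cfg
```

Setting any of these variables in a handler's Deployment spec overrides the default; unset variables fall back to the values shown in the table.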