homelab-design/decisions/0055-internal-python-package-publishing.md
Billy D. 35f17d6342
docs: add ADR-0054 Kubeflow Pipeline CI/CD and ADR-0055 Internal Python Package Publishing
2026-02-13 14:44:45 -05:00

Internal Python Package Publishing

  • Status: accepted
  • Date: 2026-02-13
  • Deciders: Billy
  • Technical Story: Publish reusable Python packages to Gitea's built-in PyPI registry with automated CI

Context and Problem Statement

Shared Python libraries like mlflow_utils are used across multiple projects (handler-base, Kubeflow pipelines, Argo workflows). Currently these are consumed via git dependencies or copy-paste. This is fragile: there are no versioned releases, no quality gate before code reaches consumers, and no single source of truth for which version each consumer has installed.

How do we publish internal Python packages so they can be installed with pip install / uv add from a private registry, with automated quality checks and versioning?

Decision Drivers

  • Shared libraries are consumed by multiple services and pipelines
  • Need version pinning for reproducible builds
  • Quality gates (lint, format, test) should run before publishing
  • Must work with uv, pip, and KFP container images
  • Self-hosted — no PyPI.org or external registries
  • Consistent with existing CI patterns (ADR-0031, ADR-0015)

Considered Options

  1. Gitea's built-in PyPI registry with CI-driven publish
  2. Private PyPI server (pypiserver or devpi)
  3. Git-based dependencies (pip install git+https://...)
  4. Vendored copies in each consuming repository

Decision Outcome

Chosen option: Option 1 — Gitea's built-in PyPI registry, because Gitea already provides a packages API with PyPI compatibility, eliminating the need for another service. Combined with uv build and twine upload, the publish workflow is minimal.

Positive Consequences

  • Standard pip install mlflow-utils --index-url ... works everywhere
  • Semantic versioning with git tags provides clear release history
  • Lint + format + test gates prevent broken packages from publishing
  • No additional infrastructure — Gitea handles package storage
  • Consuming projects can pin exact versions

Negative Consequences

  • Registry credentials must be configured as CI secrets per repo
  • Gitea's PyPI registry is basic (no yanking, no project pages)
  • Version conflicts possible if consumers don't pin

Implementation

Package Structure

mlflow/
├── pyproject.toml          # hatchling build, ruff+pytest dev deps
├── uv.lock                 # Locked dependencies
├── mlflow_utils/
│   ├── __init__.py
│   ├── client.py
│   ├── tracker.py
│   ├── inference_tracker.py
│   ├── model_registry.py
│   ├── kfp_components.py
│   ├── experiment_comparison.py
│   └── cli.py              # CLI entrypoint: mlflow-utils
└── tests/
    └── test_smoke.py       # Import validation for all modules
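
The import validation in tests/test_smoke.py could look roughly like the sketch below. The module names come from the tree above; the helper names and the assumption that only kfp_components needs the optional kfp dependency are illustrative, not taken from the repository.

```python
import importlib
import importlib.util

# Module names from the package tree above.
MODULES = [
    "mlflow_utils.client",
    "mlflow_utils.tracker",
    "mlflow_utils.inference_tracker",
    "mlflow_utils.model_registry",
    "mlflow_utils.kfp_components",
    "mlflow_utils.experiment_comparison",
    "mlflow_utils.cli",
]

# Assumption: kfp_components is the only module that requires kfp.
OPTIONAL_DEPS = {"mlflow_utils.kfp_components": "kfp"}


def smoke_import(module_name, optional_dep=None):
    """Import module_name; return "skipped" if its optional dependency is absent."""
    if optional_dep is not None and importlib.util.find_spec(optional_dep) is None:
        return "skipped"
    importlib.import_module(module_name)  # raises ImportError on a broken module
    return "imported"


def run_smoke(modules=MODULES, optional_deps=OPTIONAL_DEPS):
    """Map each module name to "imported" or "skipped"."""
    return {name: smoke_import(name, optional_deps.get(name)) for name in modules}
```

Under pytest, the same skip behaviour is available through pytest.importorskip("kfp") at the top of the relevant test.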

CI Workflow

Four jobs in .gitea/workflows/ci.yaml:

| Job | Purpose | Gate |
|-----|---------|------|
| lint | ruff check + ruff format --check | Must pass |
| test | pytest -v | Must pass |
| publish | Build + upload to Gitea PyPI + tag | After lint+test, main only |
| notify | ntfy success/failure notification | Always |
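
The gating in this table amounts to a small decision rule. The sketch below models it in Python purely as an illustration; the real conditions live in ci.yaml, and the ref string and function name here are assumptions.

```python
from typing import Dict

MAIN_REF = "refs/heads/main"  # assumed ref for the main branch


def plan_jobs(lint_ok: bool, test_ok: bool, ref: str) -> Dict[str, bool]:
    """Return which downstream jobs would run for a given lint/test outcome."""
    gates_passed = lint_ok and test_ok
    return {
        # publish needs both gates green and a push to main
        "publish": gates_passed and ref == MAIN_REF,
        # notify runs regardless of earlier results
        "notify": True,
    }
```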

Key Design Decisions

uv over pip for CI: All jobs use uv installed via curl -LsSf https://astral.sh/uv/install.sh | sh rather than the astral-sh/setup-uv GitHub Action, which is unavailable in Gitea's act runner. uv sync --frozen --extra dev ensures reproducible installs from the lockfile.

uvx twine for publishing: uv pip install twine --system is blocked by Debian's externally-managed environment marker (PEP 668), so the publish job runs uvx twine upload instead, which executes twine in an ephemeral virtual environment.

Semantic versioning from commit messages: Same pattern as ADR-0015. Commit prefixes (major:, feat:, fix:) determine the version bump. The publish step patches the version in pyproject.toml at build time via sed, builds with uv build, uploads with twine, then creates the git tag.
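
The bump-and-patch step can be sketched in Python as follows. This is not the workflow's actual script (which uses sed). The mapping of major:, feat:, and fix: to major, minor, and patch bumps is an assumption about ADR-0015, and all function names are hypothetical.

```python
import re
from typing import Optional

# Assumed prefix-to-bump mapping; ADR-0015 defines the authoritative rules.
BUMP_BY_PREFIX = {"major:": "major", "feat:": "minor", "fix:": "patch"}


def bump_level(commit_subject: str) -> Optional[str]:
    """Return the bump level for a commit subject, or None if no prefix matches."""
    for prefix, level in BUMP_BY_PREFIX.items():
        if commit_subject.startswith(prefix):
            return level
    return None


def next_version(current: str, level: str) -> str:
    """Apply a major/minor/patch bump to an X.Y.Z version string."""
    major, minor, patch = (int(part) for part in current.split("."))
    if level == "major":
        return f"{major + 1}.0.0"
    if level == "minor":
        return f"{major}.{minor + 1}.0"
    if level == "patch":
        return f"{major}.{minor}.{patch + 1}"
    raise ValueError(f"unknown bump level: {level}")


def patch_pyproject(text: str, new_version: str) -> str:
    """Rewrite the first line-initial `version = "..."` entry, as a sed one-liner would."""
    return re.sub(
        r'^version\s*=\s*"[^"]*"',
        f'version = "{new_version}"',
        text,
        count=1,
        flags=re.MULTILINE,
    )
```

For example, next_version("1.2.3", bump_level("feat: add tracker")) yields "1.3.0" under the assumed mapping.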

Registry Configuration

| Setting | Value |
|---------|-------|
| Registry URL | http://gitea-http.gitea.svc.cluster.local:3000/api/packages/daviestechlabs/pypi |
| Auth | REGISTRY_USER + REGISTRY_TOKEN repo secrets (Gitea admin credentials) |
| External URL | https://git.daviestechlabs.io/api/packages/daviestechlabs/pypi/simple/ |
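
One way these settings could feed the upload step is through twine's standard environment variables (TWINE_REPOSITORY_URL, TWINE_USERNAME, TWINE_PASSWORD). The sketch below only builds the command and environment; whether the actual workflow passes credentials this way or via flags is not stated here, and the function name is hypothetical.

```python
from typing import Dict, List, Mapping, Tuple

REGISTRY_URL = (
    "http://gitea-http.gitea.svc.cluster.local:3000"
    "/api/packages/daviestechlabs/pypi"
)


def twine_upload_command(
    secrets: Mapping[str, str], files: List[str]
) -> Tuple[List[str], Dict[str, str]]:
    """Build argv and extra env for `uvx twine upload` without running it."""
    env = {
        "TWINE_REPOSITORY_URL": REGISTRY_URL,
        # CI secrets named in the table above
        "TWINE_USERNAME": secrets["REGISTRY_USER"],
        "TWINE_PASSWORD": secrets["REGISTRY_TOKEN"],
    }
    # --non-interactive makes twine fail instead of prompting for credentials
    argv = ["uvx", "twine", "upload", "--non-interactive", *files]
    return argv, env
```

A caller would expand dist/* with glob.glob and pass the result as files, then run subprocess.run(argv, env={**os.environ, **env}, check=True).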

Consuming Packages

From any project or Dockerfile:

# uv
uv add mlflow-utils --index-url https://git.daviestechlabs.io/api/packages/daviestechlabs/pypi/simple/

# pip
pip install mlflow-utils --index-url https://git.daviestechlabs.io/api/packages/daviestechlabs/pypi/simple/
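
Since pinning is left to consumers, a service or pipeline step can verify at start-up that the expected release is installed. A minimal sketch using only the standard library; the helper names are hypothetical and any version passed in is the caller's own pin.

```python
from importlib import metadata


def pinned_requirement(name: str, version: str) -> str:
    """Format an exact pin, e.g. for a requirements file or `pip install`."""
    return f"{name}=={version}"


def installed_matches(name: str, expected: str) -> bool:
    """True only if distribution `name` is installed at exactly `expected`."""
    try:
        return metadata.version(name) == expected
    except metadata.PackageNotFoundError:
        return False
```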

Quality Gates

| Tool | Check | Config |
|------|-------|--------|
| ruff check | Lint rules (F, E, W, I) | line-length = 120 in pyproject.toml |
| ruff format | Code formatting | Consistent with check config |
| pytest | Import smoke tests, unit tests | kfp auto-skipped if not installed |

Future Packages

This pattern applies to any shared Python library:

| Package | Repository | Status |
|---------|------------|--------|
| mlflow-utils | mlflow | Published |
| handler-base | handler-base | Candidate |
| ray-serve-apps | ray-serve | Candidate |

Links

  • Related to ADR-0012 (uv for Python)
  • Related to ADR-0015 (semantic versioning)
  • Related to ADR-0031 (Gitea CI/CD patterns)
  • Related to ADR-0047 (mlflow_utils library)
  • Updates ADR-0020 (internal registry — now includes PyPI)