- stt.yaml: rename WHISPER_URL to STT_URL to match what stt.py reads
- tts.py: fix WAV handling (BytesIO), improve sentence splitting, and make
_read_wav_bytes robust with wave, soundfile, and raw-PCM fallbacks
- Add __pycache__/ to .gitignore
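
For reviewers, a minimal sketch of what a wave, then soundfile, then raw-PCM
fallback chain can look like. This is not the code in tts.py: the function
name read_wav_bytes, the return shape, the fallback order, and the 22050 Hz
default are assumptions made for illustration.

```python
import io
import wave

try:
    import soundfile as sf  # optional third-party decoder
except (ImportError, OSError):  # OSError: libsndfile shared library missing
    sf = None

ASSUMED_RATE = 22050  # hypothetical sample rate for headerless PCM


def read_wav_bytes(data, default_rate=ASSUMED_RATE):
    """Return (frames, sample_rate, decoder_name) for an audio payload."""
    # 1) Standard library parser: handles plain PCM WAV files.
    try:
        with wave.open(io.BytesIO(data), "rb") as wav:
            return wav.readframes(wav.getnframes()), wav.getframerate(), "wave"
    except (wave.Error, EOFError):
        pass

    # 2) soundfile, if installed: covers WAV variants that wave rejects.
    if sf is not None:
        try:
            samples, rate = sf.read(io.BytesIO(data), dtype="int16")
            return samples.tobytes(), rate, "soundfile"
        except Exception:
            pass

    # 3) Last resort: treat the payload as headerless PCM at the assumed rate.
    return bytes(data), default_rate, "raw"
```

Wrapping the payload in a fresh io.BytesIO for each attempt means a failed
parser cannot leave the read position mid-stream for the next one.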
Each UI now logs per-request metrics to MLflow:
- llm.py: latency, tokens/sec, prompt/completion tokens (gradio-llm-tuning)
- embeddings.py: latency, text length, batch size (gradio-embeddings-tuning)
- stt.py: latency, audio duration, real-time factor (gradio-stt-tuning)
- tts.py: latency, text length, audio duration (gradio-tts-tuning)
MLflow imports are guarded with try/except so the UIs still work if MLflow
is unreachable. Each Gradio instance keeps one persistent run, and metrics
are logged in batches via MlflowClient.log_batch().
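
A rough sketch of the pattern described above, using stt.py's metrics as the
example. The helper names (stt_metrics, start_persistent_run,
log_request_metrics) and the metric keys are hypothetical, and the real-time
factor is assumed to be latency divided by audio duration; the actual
modules may differ.

```python
import time

try:
    from mlflow.entities import Metric
    from mlflow.tracking import MlflowClient
except ImportError:  # MLflow not installed: logging becomes a no-op
    Metric = None
    MlflowClient = None


def stt_metrics(latency_s, audio_duration_s):
    """Per-request metrics; real-time factor = processing time / audio length."""
    rtf = latency_s / audio_duration_s if audio_duration_s > 0 else 0.0
    return {
        "latency_s": latency_s,
        "audio_duration_s": audio_duration_s,
        "real_time_factor": rtf,
    }


def start_persistent_run(experiment_name):
    """Create one run to reuse for the whole UI process, or (None, None)."""
    if MlflowClient is None:
        return None, None
    try:
        client = MlflowClient()  # honors MLFLOW_TRACKING_URI if it is set
        experiment = client.get_experiment_by_name(experiment_name)
        if experiment is not None:
            experiment_id = experiment.experiment_id
        else:
            experiment_id = client.create_experiment(experiment_name)
        run = client.create_run(experiment_id)
        return client, run.info.run_id
    except Exception:  # server unreachable or misconfigured
        return None, None


def log_request_metrics(client, run_id, metrics, step):
    """Send one request's metrics in a single log_batch call; never raises."""
    if client is None or Metric is None:
        return False
    try:
        now_ms = int(time.time() * 1000)
        batch = [Metric(key, float(value), now_ms, step)
                 for key, value in metrics.items()]
        client.log_batch(run_id, metrics=batch)
        return True
    except Exception:
        return False
```

A UI would call start_persistent_run("gradio-stt-tuning") once at startup and
log_request_metrics(...) after each request, with the request count as step.
Returning False instead of raising keeps a tracking outage from breaking the
request itself.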