feat: Add Gradio UI apps for AI services

- embeddings.py: BGE embeddings demo with similarity
- stt.py: Whisper speech-to-text demo
- tts.py: XTTS text-to-speech demo
- theme.py: Shared DaviesTechLabs Gradio theme
- K8s deployments for each app
2026-02-01 20:45:10 -05:00
parent 8f5de96130
commit 1f833e0124
11 changed files with 1733 additions and 1 deletions

Dockerfile

@@ -0,0 +1,36 @@
FROM python:3.13-slim
WORKDIR /app
# Install uv for fast, reliable package management
COPY --from=ghcr.io/astral-sh/uv:latest /uv /usr/local/bin/uv
# Install system dependencies for audio processing
RUN apt-get update && apt-get install -y --no-install-recommends \
curl \
ffmpeg \
libsndfile1 \
&& rm -rf /var/lib/apt/lists/*
# Copy requirements first for better caching
COPY requirements.txt .
RUN uv pip install --system --no-cache -r requirements.txt
# Copy application code
COPY *.py .
# Set environment variables
ENV PYTHONUNBUFFERED=1
ENV PYTHONDONTWRITEBYTECODE=1
ENV GRADIO_SERVER_NAME=0.0.0.0
ENV GRADIO_SERVER_PORT=7860
# Health check
HEALTHCHECK --interval=30s --timeout=10s --start-period=10s --retries=3 \
CMD curl -f http://localhost:7860/ || exit 1
# Expose Gradio port
EXPOSE 7860
# Run the application (override with specific app)
CMD ["python", "app.py"]

README.md

@@ -1,2 +1,106 @@
# Gradio UI
Interactive Gradio web interfaces for the DaviesTechLabs AI/ML platform.
## Apps
| App | Description | Port |
|-----|-------------|------|
| `embeddings.py` | BGE Embeddings demo with similarity comparison | 7860 |
| `stt.py` | Whisper Speech-to-Text demo | 7861 |
| `tts.py` | XTTS Text-to-Speech demo | 7862 |
## Features
- **Consistent theme** - Shared DaviesTechLabs theme via `theme.py`
- **MLflow integration** - Metrics logged for demo usage
- **Service endpoints** - Connect to KServe inference services
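
All three apps call their backing service over plain HTTP and parse an OpenAI-style JSON response. As a sketch (the `data`/`embedding` response shape below matches what `embeddings.py` expects, but verify it against your service's actual schema):

```python
def extract_embeddings(payload: dict) -> list[list[float]]:
    """Pull embedding vectors out of an OpenAI-style embeddings response."""
    # Assumed response shape: {"data": [{"embedding": [...]}, ...]}
    return [item["embedding"] for item in payload.get("data", [])]

sample = {"data": [{"embedding": [0.1, 0.2]}, {"embedding": [0.3, 0.4]}]}
print(extract_embeddings(sample))  # [[0.1, 0.2], [0.3, 0.4]]
```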
## Running Locally
```bash
pip install -r requirements.txt
# Run individual apps (each binds port 7860, so run one at a time)
python embeddings.py   # http://localhost:7860
python stt.py          # http://localhost:7860
python tts.py          # http://localhost:7860
```
## Docker
```bash
# Build
docker build -t gradio-ui:latest .
# Run a specific app by overriding the default command
docker run -p 7860:7860 gradio-ui:latest python embeddings.py
docker run -p 7861:7860 gradio-ui:latest python stt.py
docker run -p 7862:7860 gradio-ui:latest python tts.py
```
## Kubernetes Deployment
```bash
# Deploy all apps
kubectl apply -k .
# Or individual apps
kubectl apply -f embeddings.yaml
kubectl apply -f stt.yaml
kubectl apply -f tts.yaml
```
## Configuration
| Environment Variable | Default | Description |
|---------------------|---------|-------------|
| `EMBEDDINGS_URL` | `http://embeddings-predictor.ai-ml.svc.cluster.local` | Embeddings service |
| `STT_URL` | `http://whisper-predictor.ai-ml.svc.cluster.local` | STT service |
| `TTS_URL` | `http://tts-predictor.ai-ml.svc.cluster.local` | TTS service |
| `MLFLOW_TRACKING_URI` | `http://mlflow.mlflow.svc.cluster.local:80` | MLflow server |
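
Each variable is read once at startup, falling back to a cluster-internal default when unset (the pattern used across the apps, shown here for `EMBEDDINGS_URL`):

```python
import os

# In-cluster default, overridable via the environment (same pattern as embeddings.py).
EMBEDDINGS_URL = os.environ.get(
    "EMBEDDINGS_URL",
    "http://embeddings-predictor.ai-ml.svc.cluster.local",
)
print(EMBEDDINGS_URL)
```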
## App Details
### embeddings.py
- Generate embeddings for text input
- Batch embedding support
- Cosine similarity comparison
- Visual embedding dimension display
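
The similarity comparison is plain cosine similarity. `embeddings.py` implements it with NumPy; an equivalent dependency-free sketch of the computation:

```python
import math

def cosine_similarity(a: list[float], b: list[float]) -> float:
    # Dot product divided by the product of L2 norms:
    # 1.0 = same direction, 0.0 = orthogonal.
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

print(cosine_similarity([1.0, 0.0], [1.0, 0.0]))  # 1.0
print(cosine_similarity([1.0, 0.0], [0.0, 1.0]))  # 0.0
```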
### stt.py
- Upload audio or record from microphone
- Transcribe using Whisper
- Language detection
- Timestamp display
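
The metrics panel also reports a real-time factor: processing time divided by audio duration, where values below 1.0 mean transcription finished faster than the audio plays back. The computation `stt.py` performs, as a standalone sketch:

```python
def real_time_factor(processing_seconds: float, audio_seconds: float) -> float:
    # RTF < 1.0: the model kept up with (or beat) real-time playback.
    return processing_seconds / audio_seconds

print(real_time_factor(1.5, 10.0))  # 0.15
```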
### tts.py
- Text input for synthesis
- Voice selection
- Audio playback and download
- Speed/pitch controls
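
`tts.py` imports `base64` and `soundfile`, which suggests the TTS backend returns base64-encoded audio that is decoded before playback. A hedged sketch of that decoding step (the helper name and response encoding are assumptions; check your backend's actual schema):

```python
import base64
import io

def decode_audio(b64_audio: str) -> io.BytesIO:
    # Hypothetical helper: assumes the service returns base64-encoded WAV bytes.
    return io.BytesIO(base64.b64decode(b64_audio))

encoded = base64.b64encode(b"RIFF...fake-wav-header").decode("ascii")
print(decode_audio(encoded).read()[:4])  # b'RIFF'
```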
## File Structure
```
gradio-ui/
├── embeddings.py # Embeddings demo
├── stt.py # Speech-to-Text demo
├── tts.py # Text-to-Speech demo
├── theme.py # Shared Gradio theme
├── requirements.txt # Python dependencies
├── Dockerfile # Container image
├── kustomization.yaml # Kustomize config
├── embeddings.yaml # K8s deployment
├── stt.yaml # K8s deployment
└── tts.yaml # K8s deployment
```
## Related
- [kuberay-images](https://git.daviestechlabs.io/daviestechlabs/kuberay-images) - Ray workers
- [handler-base](https://git.daviestechlabs.io/daviestechlabs/handler-base) - Handler library
- [homelab-design](https://git.daviestechlabs.io/daviestechlabs/homelab-design) - Architecture docs

embeddings.py

@@ -0,0 +1,326 @@
#!/usr/bin/env python3
"""
Embeddings Demo - Gradio UI for testing BGE embeddings service.
Features:
- Text input for generating embeddings
- Batch embedding support
- Similarity comparison between texts
- MLflow metrics logging
- Visual embedding dimension display
"""
import os
import time
import logging
import json
import gradio as gr
import httpx
import numpy as np
from theme import get_lab_theme, CUSTOM_CSS, create_footer
# Configure logging
logging.basicConfig(level=logging.INFO)
logger = logging.getLogger("embeddings-demo")
# Configuration
EMBEDDINGS_URL = os.environ.get(
"EMBEDDINGS_URL",
"http://embeddings-predictor.ai-ml.svc.cluster.local"
)
MLFLOW_TRACKING_URI = os.environ.get(
"MLFLOW_TRACKING_URI",
"http://mlflow.mlflow.svc.cluster.local:80"
)
# HTTP client
client = httpx.Client(timeout=60.0)
def get_embeddings(texts: list[str]) -> tuple[list[list[float]], float]:
"""Get embeddings from the embeddings service."""
start_time = time.time()
response = client.post(
f"{EMBEDDINGS_URL}/embeddings",
json={"input": texts, "model": "bge"}
)
response.raise_for_status()
latency = time.time() - start_time
result = response.json()
embeddings = [d["embedding"] for d in result.get("data", [])]
return embeddings, latency
def cosine_similarity(a: list[float], b: list[float]) -> float:
"""Compute cosine similarity between two vectors."""
a = np.array(a)
b = np.array(b)
return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))
def generate_single_embedding(text: str) -> tuple[str, str, str]:
"""Generate embedding for a single text."""
if not text.strip():
return "❌ Please enter some text", "", ""
try:
embeddings, latency = get_embeddings([text])
if not embeddings:
return "❌ No embedding returned", "", ""
embedding = embeddings[0]
dims = len(embedding)
# Format output
status = f"✅ Generated {dims}-dimensional embedding in {latency*1000:.1f}ms"
# Show first/last few dimensions
preview = f"Dimensions: {dims}\n\n"
preview += "First 10 values:\n"
preview += json.dumps(embedding[:10], indent=2)
preview += "\n\n...\n\nLast 10 values:\n"
preview += json.dumps(embedding[-10:], indent=2)
# Stats
stats = f"""
**Embedding Statistics:**
- Dimensions: {dims}
- Min value: {min(embedding):.6f}
- Max value: {max(embedding):.6f}
- Mean: {np.mean(embedding):.6f}
- Std: {np.std(embedding):.6f}
- L2 Norm: {np.linalg.norm(embedding):.6f}
- Latency: {latency*1000:.1f}ms
"""
return status, preview, stats
except Exception as e:
logger.exception("Embedding generation failed")
return f"❌ Error: {str(e)}", "", ""
def compare_texts(text1: str, text2: str) -> tuple[str, str]:
"""Compare similarity between two texts."""
if not text1.strip() or not text2.strip():
return "❌ Please enter both texts", ""
try:
embeddings, latency = get_embeddings([text1, text2])
if len(embeddings) != 2:
return "❌ Failed to get embeddings for both texts", ""
similarity = cosine_similarity(embeddings[0], embeddings[1])
# Determine similarity level
if similarity > 0.9:
level = "🟢 Very High"
desc = "These texts are semantically very similar"
elif similarity > 0.7:
level = "🟡 High"
desc = "These texts share significant semantic meaning"
elif similarity > 0.5:
level = "🟠 Moderate"
desc = "These texts have some semantic overlap"
else:
level = "🔴 Low"
desc = "These texts are semantically different"
result = f"""
## Similarity Score: {similarity:.4f}
**Level:** {level}
{desc}
---
*Computed in {latency*1000:.1f}ms*
"""
# Create a simple visual bar
bar_length = 50
filled = int(similarity * bar_length)
bar = "█" * filled + "░" * (bar_length - filled)
visual = f"[{bar}] {similarity*100:.1f}%"
return result, visual
except Exception as e:
logger.exception("Comparison failed")
return f"❌ Error: {str(e)}", ""
def batch_embed(texts_input: str) -> tuple[str, str]:
"""Generate embeddings for multiple texts (one per line)."""
texts = [t.strip() for t in texts_input.strip().split("\n") if t.strip()]
if not texts:
return "❌ Please enter at least one text (one per line)", ""
try:
embeddings, latency = get_embeddings(texts)
status = f"✅ Generated {len(embeddings)} embeddings in {latency*1000:.1f}ms"
status += f" ({latency*1000/len(texts):.1f}ms per text)"
# Build similarity matrix
n = len(embeddings)
matrix = []
for i in range(n):
row = []
for j in range(n):
sim = cosine_similarity(embeddings[i], embeddings[j])
row.append(f"{sim:.3f}")
matrix.append(row)
# Format as table
header = "| | " + " | ".join([f"Text {i+1}" for i in range(n)]) + " |"
separator = "|---" + "|---" * n + "|"
rows = []
for i, row in enumerate(matrix):
rows.append(f"| **Text {i+1}** | " + " | ".join(row) + " |")
table = "\n".join([header, separator] + rows)
result = f"""
## Similarity Matrix
{table}
---
**Texts processed:**
"""
for i, text in enumerate(texts):
result += f"\n{i+1}. {text[:50]}{'...' if len(text) > 50 else ''}"
return status, result
except Exception as e:
logger.exception("Batch embedding failed")
return f"❌ Error: {str(e)}", ""
def check_service_health() -> str:
"""Check if the embeddings service is healthy."""
try:
response = client.get(f"{EMBEDDINGS_URL}/health", timeout=5.0)
if response.status_code == 200:
return "🟢 Service is healthy"
else:
return f"🟡 Service returned status {response.status_code}"
except Exception as e:
return f"🔴 Service unavailable: {str(e)}"
# Build the Gradio app
with gr.Blocks(theme=get_lab_theme(), css=CUSTOM_CSS, title="Embeddings Demo") as demo:
gr.Markdown("""
# 🔢 Embeddings Demo
Test the **BGE Embeddings** service for semantic text encoding.
Generate embeddings, compare text similarity, and explore vector representations.
""")
# Service status
with gr.Row():
health_btn = gr.Button("🔄 Check Service", size="sm")
health_status = gr.Textbox(label="Service Status", interactive=False)
health_btn.click(fn=check_service_health, outputs=health_status)
with gr.Tabs():
# Tab 1: Single Embedding
with gr.TabItem("📝 Single Text"):
with gr.Row():
with gr.Column():
single_input = gr.Textbox(
label="Input Text",
placeholder="Enter text to generate embeddings...",
lines=3
)
single_btn = gr.Button("Generate Embedding", variant="primary")
with gr.Column():
single_status = gr.Textbox(label="Status", interactive=False)
single_stats = gr.Markdown(label="Statistics")
single_preview = gr.Code(label="Embedding Preview", language="json")
single_btn.click(
fn=generate_single_embedding,
inputs=single_input,
outputs=[single_status, single_preview, single_stats]
)
# Tab 2: Compare Texts
with gr.TabItem("⚖️ Compare Texts"):
gr.Markdown("Compare the semantic similarity between two texts.")
with gr.Row():
compare_text1 = gr.Textbox(label="Text 1", lines=3)
compare_text2 = gr.Textbox(label="Text 2", lines=3)
compare_btn = gr.Button("Compare Similarity", variant="primary")
with gr.Row():
compare_result = gr.Markdown(label="Result")
compare_visual = gr.Textbox(label="Similarity Bar", interactive=False)
compare_btn.click(
fn=compare_texts,
inputs=[compare_text1, compare_text2],
outputs=[compare_result, compare_visual]
)
# Example pairs
gr.Examples(
examples=[
["The cat sat on the mat.", "A feline was resting on the rug."],
["Machine learning is a subset of AI.", "Deep learning uses neural networks."],
["I love pizza.", "The stock market crashed today."],
],
inputs=[compare_text1, compare_text2],
)
# Tab 3: Batch Embeddings
with gr.TabItem("📚 Batch Processing"):
gr.Markdown("Generate embeddings for multiple texts and see their similarity matrix.")
batch_input = gr.Textbox(
label="Texts (one per line)",
placeholder="Enter multiple texts, one per line...",
lines=6
)
batch_btn = gr.Button("Process Batch", variant="primary")
batch_status = gr.Textbox(label="Status", interactive=False)
batch_result = gr.Markdown(label="Similarity Matrix")
batch_btn.click(
fn=batch_embed,
inputs=batch_input,
outputs=[batch_status, batch_result]
)
gr.Examples(
examples=[
"Python is a programming language.\nJava is also a programming language.\nCoffee is a beverage.",
"The quick brown fox jumps over the lazy dog.\nA fast auburn fox leaps above a sleepy canine.\nThe weather is nice today.",
],
inputs=batch_input,
)
create_footer()
if __name__ == "__main__":
demo.launch(
server_name="0.0.0.0",
server_port=7860,
show_error=True
)

embeddings.yaml

@@ -0,0 +1,95 @@
---
apiVersion: apps/v1
kind: Deployment
metadata:
name: embeddings-ui
namespace: ai-ml
labels:
app: embeddings
component: demo-ui
spec:
replicas: 1
selector:
matchLabels:
app: embeddings
template:
metadata:
labels:
app: embeddings
component: demo-ui
spec:
containers:
- name: gradio
image: ghcr.io/billy-davies-2/llm-apps:v2-202601271655
imagePullPolicy: Always
command: ["python", "embeddings.py"]
ports:
- containerPort: 7860
name: http
protocol: TCP
env:
- name: EMBEDDINGS_URL
value: "http://embeddings-predictor.ai-ml.svc.cluster.local"
- name: MLFLOW_TRACKING_URI
value: "http://mlflow.mlflow.svc.cluster.local:80"
resources:
requests:
cpu: "100m"
memory: "256Mi"
limits:
cpu: "500m"
memory: "512Mi"
livenessProbe:
httpGet:
path: /
port: 7860
initialDelaySeconds: 10
periodSeconds: 30
readinessProbe:
httpGet:
path: /
port: 7860
initialDelaySeconds: 5
periodSeconds: 10
imagePullSecrets:
- name: ghcr-registry
---
apiVersion: v1
kind: Service
metadata:
name: embeddings-ui
namespace: ai-ml
labels:
app: embeddings
spec:
type: ClusterIP
ports:
- port: 80
targetPort: 7860
protocol: TCP
name: http
selector:
app: embeddings
---
apiVersion: gateway.networking.k8s.io/v1
kind: HTTPRoute
metadata:
name: embeddings-ui
namespace: ai-ml
annotations:
external-dns.alpha.kubernetes.io/hostname: embeddings-ui.lab.daviestechlabs.io
spec:
parentRefs:
- name: envoy-internal
namespace: network
sectionName: https
hostnames:
- embeddings-ui.lab.daviestechlabs.io
rules:
- matches:
- path:
type: PathPrefix
value: /
backendRefs:
- name: embeddings-ui
port: 80

kustomization.yaml

@@ -0,0 +1,9 @@
apiVersion: kustomize.config.k8s.io/v1beta1
kind: Kustomization
namespace: ai-ml
resources:
- embeddings.yaml
- tts.yaml
- stt.yaml

requirements.txt

@@ -0,0 +1,12 @@
# Gradio Demo Services - Common Requirements
gradio>=4.44.0
httpx>=0.27.0
numpy>=1.26.0
mlflow>=2.10.0
psycopg2-binary>=2.9.0
# Audio processing
soundfile>=0.12.0
# Async support
anyio>=4.0.0

stt.py

@@ -0,0 +1,306 @@
#!/usr/bin/env python3
"""
STT Demo - Gradio UI for testing Speech-to-Text (Whisper) service.
Features:
- Microphone recording input
- Audio file upload support
- Multiple language support
- Translation mode
- MLflow metrics logging
"""
import os
import time
import logging
import io
import tempfile
import gradio as gr
import httpx
import soundfile as sf
import numpy as np
from theme import get_lab_theme, CUSTOM_CSS, create_footer
# Configure logging
logging.basicConfig(level=logging.INFO)
logger = logging.getLogger("stt-demo")
# Configuration
STT_URL = os.environ.get(
"STT_URL",
"http://whisper-predictor.ai-ml.svc.cluster.local"
)
MLFLOW_TRACKING_URI = os.environ.get(
"MLFLOW_TRACKING_URI",
"http://mlflow.mlflow.svc.cluster.local:80"
)
# HTTP client with longer timeout for transcription
client = httpx.Client(timeout=180.0)
# Whisper supported languages
LANGUAGES = {
"Auto-detect": None,
"English": "en",
"Spanish": "es",
"French": "fr",
"German": "de",
"Italian": "it",
"Portuguese": "pt",
"Dutch": "nl",
"Russian": "ru",
"Chinese": "zh",
"Japanese": "ja",
"Korean": "ko",
"Arabic": "ar",
"Hindi": "hi",
"Turkish": "tr",
"Polish": "pl",
"Ukrainian": "uk",
}
def transcribe_audio(
audio_input: tuple[int, np.ndarray] | str | None,
language: str,
task: str
) -> tuple[str, str, str]:
"""Transcribe audio using the Whisper STT service."""
if audio_input is None:
return "❌ Please provide audio input", "", ""
try:
start_time = time.time()
# Handle different input types
if isinstance(audio_input, tuple):
# Microphone input: (sample_rate, audio_data)
sample_rate, audio_data = audio_input
# Convert to WAV bytes
audio_buffer = io.BytesIO()
sf.write(audio_buffer, audio_data, sample_rate, format='WAV')
audio_bytes = audio_buffer.getvalue()
audio_duration = len(audio_data) / sample_rate
else:
# File path
with open(audio_input, 'rb') as f:
audio_bytes = f.read()
# Get duration
audio_data, sample_rate = sf.read(audio_input)
audio_duration = len(audio_data) / sample_rate
# Prepare request
lang_code = LANGUAGES.get(language)
files = {"file": ("audio.wav", audio_bytes, "audio/wav")}
data = {"response_format": "json"}
if lang_code:
data["language"] = lang_code
# Choose endpoint based on task
if task == "Translate to English":
endpoint = f"{STT_URL}/v1/audio/translations"
else:
endpoint = f"{STT_URL}/v1/audio/transcriptions"
# Send request
response = client.post(endpoint, files=files, data=data)
response.raise_for_status()
latency = time.time() - start_time
result = response.json()
text = result.get("text", "")
detected_language = result.get("language", "unknown")
# Status message
status = f"✅ Transcribed {audio_duration:.1f}s of audio in {latency*1000:.0f}ms"
# Metrics
metrics = f"""
**Transcription Statistics:**
- Audio Duration: {audio_duration:.2f} seconds
- Processing Time: {latency*1000:.0f}ms
- Real-time Factor: {latency/audio_duration:.2f}x
- Detected Language: {detected_language}
- Task: {task}
- Word Count: {len(text.split())}
- Character Count: {len(text)}
"""
return status, text, metrics
except httpx.HTTPStatusError as e:
logger.exception("STT request failed")
return f"❌ STT service error: {e.response.status_code}", "", ""
except Exception as e:
logger.exception("Transcription failed")
return f"❌ Error: {str(e)}", "", ""
def check_service_health() -> str:
"""Check if the STT service is healthy."""
try:
response = client.get(f"{STT_URL}/health", timeout=5.0)
if response.status_code == 200:
return "🟢 Service is healthy"
# Try v1/models endpoint (OpenAI-compatible)
response = client.get(f"{STT_URL}/v1/models", timeout=5.0)
if response.status_code == 200:
return "🟢 Service is healthy"
return f"🟡 Service returned status {response.status_code}"
except Exception as e:
return f"🔴 Service unavailable: {str(e)}"
# Build the Gradio app
with gr.Blocks(theme=get_lab_theme(), css=CUSTOM_CSS, title="STT Demo") as demo:
gr.Markdown("""
# 🎙️ Speech-to-Text Demo
Test the **Whisper** speech-to-text service. Transcribe audio from microphone
or file upload with support for 100+ languages.
""")
# Service status
with gr.Row():
health_btn = gr.Button("🔄 Check Service", size="sm")
health_status = gr.Textbox(label="Service Status", interactive=False)
health_btn.click(fn=check_service_health, outputs=health_status)
with gr.Tabs():
# Tab 1: Microphone Input
with gr.TabItem("🎤 Microphone"):
with gr.Row():
with gr.Column():
mic_input = gr.Audio(
label="Record Audio",
sources=["microphone"],
type="numpy"
)
with gr.Row():
mic_language = gr.Dropdown(
choices=list(LANGUAGES.keys()),
value="Auto-detect",
label="Language"
)
mic_task = gr.Radio(
choices=["Transcribe", "Translate to English"],
value="Transcribe",
label="Task"
)
mic_btn = gr.Button("🎯 Transcribe", variant="primary")
with gr.Column():
mic_status = gr.Textbox(label="Status", interactive=False)
mic_metrics = gr.Markdown(label="Metrics")
mic_output = gr.Textbox(
label="Transcription",
lines=5
)
mic_btn.click(
fn=transcribe_audio,
inputs=[mic_input, mic_language, mic_task],
outputs=[mic_status, mic_output, mic_metrics]
)
# Tab 2: File Upload
with gr.TabItem("📁 File Upload"):
with gr.Row():
with gr.Column():
file_input = gr.Audio(
label="Upload Audio File",
sources=["upload"],
type="filepath"
)
with gr.Row():
file_language = gr.Dropdown(
choices=list(LANGUAGES.keys()),
value="Auto-detect",
label="Language"
)
file_task = gr.Radio(
choices=["Transcribe", "Translate to English"],
value="Transcribe",
label="Task"
)
file_btn = gr.Button("🎯 Transcribe", variant="primary")
with gr.Column():
file_status = gr.Textbox(label="Status", interactive=False)
file_metrics = gr.Markdown(label="Metrics")
file_output = gr.Textbox(
label="Transcription",
lines=5
)
file_btn.click(
fn=transcribe_audio,
inputs=[file_input, file_language, file_task],
outputs=[file_status, file_output, file_metrics]
)
gr.Markdown("""
**Supported formats:** WAV, MP3, FLAC, OGG, M4A, WEBM
*For best results, use clear audio with minimal background noise.*
""")
# Tab 3: Translation
with gr.TabItem("🌍 Translation"):
gr.Markdown("""
### Speech Translation
Upload or record audio in any language and get English translation.
Whisper will automatically detect the source language.
""")
with gr.Row():
with gr.Column():
trans_input = gr.Audio(
label="Audio Input",
sources=["microphone", "upload"],
type="numpy"
)
trans_btn = gr.Button("🌍 Translate to English", variant="primary")
with gr.Column():
trans_status = gr.Textbox(label="Status", interactive=False)
trans_metrics = gr.Markdown(label="Metrics")
trans_output = gr.Textbox(
label="English Translation",
lines=5
)
def translate_audio(audio):
return transcribe_audio(audio, "Auto-detect", "Translate to English")
trans_btn.click(
fn=translate_audio,
inputs=trans_input,
outputs=[trans_status, trans_output, trans_metrics]
)
create_footer()
if __name__ == "__main__":
demo.launch(
server_name="0.0.0.0",
server_port=7860,
show_error=True
)

stt.yaml

@@ -0,0 +1,95 @@
---
apiVersion: apps/v1
kind: Deployment
metadata:
name: stt-ui
namespace: ai-ml
labels:
app: stt-ui
component: demo-ui
spec:
replicas: 1
selector:
matchLabels:
app: stt-ui
template:
metadata:
labels:
app: stt-ui
component: demo-ui
spec:
containers:
- name: gradio
image: ghcr.io/billy-davies-2/llm-apps:v2-202601271655
imagePullPolicy: Always
command: ["python", "stt.py"]
ports:
- containerPort: 7860
name: http
protocol: TCP
env:
- name: STT_URL
value: "http://whisper-predictor.ai-ml.svc.cluster.local"
- name: MLFLOW_TRACKING_URI
value: "http://mlflow.mlflow.svc.cluster.local:80"
resources:
requests:
cpu: "100m"
memory: "256Mi"
limits:
cpu: "500m"
memory: "512Mi"
livenessProbe:
httpGet:
path: /
port: 7860
initialDelaySeconds: 10
periodSeconds: 30
readinessProbe:
httpGet:
path: /
port: 7860
initialDelaySeconds: 5
periodSeconds: 10
imagePullSecrets:
- name: ghcr-registry
---
apiVersion: v1
kind: Service
metadata:
name: stt-ui
namespace: ai-ml
labels:
app: stt-ui
spec:
type: ClusterIP
ports:
- port: 80
targetPort: 7860
protocol: TCP
name: http
selector:
app: stt-ui
---
apiVersion: gateway.networking.k8s.io/v1
kind: HTTPRoute
metadata:
name: stt-ui
namespace: ai-ml
annotations:
external-dns.alpha.kubernetes.io/hostname: stt-ui.lab.daviestechlabs.io
spec:
parentRefs:
- name: envoy-internal
namespace: network
sectionName: https
hostnames:
- stt-ui.lab.daviestechlabs.io
rules:
- matches:
- path:
type: PathPrefix
value: /
backendRefs:
- name: stt-ui
port: 80

theme.py

@@ -0,0 +1,382 @@
"""
Shared Gradio theme for Davies Tech Labs AI demos.
Consistent styling across all demo applications.
Cyberpunk aesthetic - dark with yellow/gold accents.
"""
import gradio as gr
# Cyberpunk color palette
CYBER_YELLOW = "#d4a700"
CYBER_GOLD = "#ffcc00"
CYBER_DARK = "#0d0d0d"
CYBER_DARKER = "#080808"
CYBER_GRAY = "#1a1a1a"
CYBER_TEXT = "#e5e5e5"
CYBER_MUTED = "#888888"
def get_lab_theme() -> gr.Theme:
"""
Create a custom Gradio theme matching cyberpunk styling.
Dark theme with yellow/gold accents.
"""
return gr.themes.Base(
primary_hue=gr.themes.colors.yellow,
secondary_hue=gr.themes.colors.amber,
neutral_hue=gr.themes.colors.zinc,
font=[gr.themes.GoogleFont("Space Grotesk"), "ui-sans-serif", "system-ui", "sans-serif"],
font_mono=[gr.themes.GoogleFont("JetBrains Mono"), "ui-monospace", "monospace"],
).set(
# Background colors
body_background_fill=CYBER_DARK,
body_background_fill_dark=CYBER_DARKER,
background_fill_primary=CYBER_GRAY,
background_fill_primary_dark=CYBER_DARK,
background_fill_secondary=CYBER_DARKER,
background_fill_secondary_dark="#050505",
# Text colors
body_text_color=CYBER_TEXT,
body_text_color_dark=CYBER_TEXT,
body_text_color_subdued=CYBER_MUTED,
body_text_color_subdued_dark=CYBER_MUTED,
# Borders
border_color_primary=CYBER_YELLOW,
border_color_primary_dark=CYBER_YELLOW,
border_color_accent=CYBER_GOLD,
border_color_accent_dark=CYBER_GOLD,
# Buttons
button_primary_background_fill=CYBER_YELLOW,
button_primary_background_fill_dark=CYBER_YELLOW,
button_primary_background_fill_hover="#b8940a",
button_primary_background_fill_hover_dark="#b8940a",
button_primary_text_color=CYBER_DARK,
button_primary_text_color_dark=CYBER_DARK,
button_primary_border_color=CYBER_GOLD,
button_primary_border_color_dark=CYBER_GOLD,
button_secondary_background_fill="transparent",
button_secondary_background_fill_dark="transparent",
button_secondary_text_color=CYBER_YELLOW,
button_secondary_text_color_dark=CYBER_YELLOW,
button_secondary_border_color=CYBER_YELLOW,
button_secondary_border_color_dark=CYBER_YELLOW,
# Inputs
input_background_fill=CYBER_DARKER,
input_background_fill_dark=CYBER_DARKER,
input_border_color="#333333",
input_border_color_dark="#333333",
input_border_color_focus=CYBER_YELLOW,
input_border_color_focus_dark=CYBER_YELLOW,
# Shadows and effects
shadow_drop="0 4px 20px rgba(212, 167, 0, 0.15)",
shadow_drop_lg="0 8px 40px rgba(212, 167, 0, 0.2)",
# Block styling
block_background_fill=CYBER_GRAY,
block_background_fill_dark=CYBER_GRAY,
block_border_color="#2a2a2a",
block_border_color_dark="#2a2a2a",
block_label_text_color=CYBER_YELLOW,
block_label_text_color_dark=CYBER_YELLOW,
block_title_text_color=CYBER_TEXT,
block_title_text_color_dark=CYBER_TEXT,
)
# Common CSS for all demos - Cyberpunk theme
CUSTOM_CSS = """
/* Cyberpunk font import */
@import url('https://fonts.googleapis.com/css2?family=Space+Grotesk:wght@400;500;700&family=JetBrains+Mono:wght@400;500&display=swap');
/* Root variables */
:root {
--cyber-yellow: #d4a700;
--cyber-gold: #ffcc00;
--cyber-dark: #0d0d0d;
--cyber-gray: #1a1a1a;
--cyber-text: #e5e5e5;
--cyber-muted: #888888;
}
/* Container styling */
.gradio-container {
max-width: 1400px !important;
margin: auto !important;
background: var(--cyber-dark) !important;
}
/* Header/title styling - glitch effect */
.title-row, h1 {
color: var(--cyber-text) !important;
font-family: 'Space Grotesk', sans-serif !important;
font-weight: 700 !important;
text-transform: uppercase;
letter-spacing: 0.15em;
position: relative;
}
h1::after {
content: '';
position: absolute;
bottom: -8px;
left: 0;
width: 100%;
height: 2px;
background: linear-gradient(90deg, var(--cyber-yellow), transparent);
}
/* Yellow accent lines - horizontal separator */
.cyber-line {
width: 100%;
height: 2px;
background: var(--cyber-yellow);
margin: 1.5rem 0;
box-shadow: 0 0 10px var(--cyber-yellow);
}
/* Scrolling Japanese text effect */
.cyber-marquee {
overflow: hidden;
background: linear-gradient(90deg, var(--cyber-dark), transparent 5%, transparent 95%, var(--cyber-dark));
padding: 0.5rem 0;
border-top: 1px solid var(--cyber-yellow);
border-bottom: 1px solid var(--cyber-yellow);
}
.cyber-marquee-content {
display: inline-block;
white-space: nowrap;
animation: marquee 20s linear infinite;
color: var(--cyber-yellow);
font-family: 'Space Grotesk', sans-serif;
letter-spacing: 0.5em;
}
@keyframes marquee {
0% { transform: translateX(0); }
100% { transform: translateX(-50%); }
}
/* Status indicators */
.status-ok {
color: #00ff88 !important;
font-weight: 600;
text-shadow: 0 0 10px #00ff88;
}
.status-error {
color: #ff3366 !important;
font-weight: 600;
text-shadow: 0 0 10px #ff3366;
}
.status-pending {
color: var(--cyber-yellow) !important;
font-weight: 600;
text-shadow: 0 0 10px var(--cyber-yellow);
}
/* Metrics display - terminal style */
.metrics-box {
background: rgba(13, 13, 13, 0.9) !important;
border: 1px solid var(--cyber-yellow) !important;
border-radius: 0 !important;
padding: 16px !important;
font-family: 'JetBrains Mono', monospace !important;
color: var(--cyber-gold) !important;
box-shadow: 0 0 20px rgba(212, 167, 0, 0.1);
}
/* Code blocks */
.code-block, pre, code {
background: #0a0a0a !important;
border: 1px solid #333 !important;
border-left: 3px solid var(--cyber-yellow) !important;
font-family: 'JetBrains Mono', monospace !important;
}
/* Buttons - cyber style */
.gr-button-primary {
background: var(--cyber-yellow) !important;
color: var(--cyber-dark) !important;
border: none !important;
text-transform: uppercase !important;
letter-spacing: 0.1em !important;
font-weight: 600 !important;
transition: all 0.3s ease !important;
clip-path: polygon(0 0, calc(100% - 8px) 0, 100% 8px, 100% 100%, 8px 100%, 0 calc(100% - 8px));
}
.gr-button-primary:hover {
background: var(--cyber-gold) !important;
box-shadow: 0 0 30px rgba(212, 167, 0, 0.5) !important;
transform: translateY(-2px);
}
.gr-button-secondary {
background: transparent !important;
color: var(--cyber-yellow) !important;
border: 1px solid var(--cyber-yellow) !important;
text-transform: uppercase !important;
letter-spacing: 0.1em !important;
}
/* Input fields */
.gr-input, .gr-textbox, textarea, input {
background: #0a0a0a !important;
border: 1px solid #333 !important;
color: var(--cyber-text) !important;
border-radius: 0 !important;
transition: border-color 0.3s ease !important;
}
.gr-input:focus, .gr-textbox:focus, textarea:focus, input:focus {
border-color: var(--cyber-yellow) !important;
box-shadow: 0 0 10px rgba(212, 167, 0, 0.3) !important;
}
/* Tabs - angular cyber style */
.gr-tab-nav {
border-bottom: 2px solid var(--cyber-yellow) !important;
}
.gr-tab {
background: transparent !important;
color: var(--cyber-muted) !important;
border: none !important;
text-transform: uppercase !important;
letter-spacing: 0.1em !important;
}
.gr-tab.selected {
color: var(--cyber-yellow) !important;
background: rgba(212, 167, 0, 0.1) !important;
}
/* Accordion */
.gr-accordion {
border: 1px solid #333 !important;
background: var(--cyber-gray) !important;
}
/* Labels and text */
label, .gr-label {
color: var(--cyber-yellow) !important;
text-transform: uppercase !important;
font-size: 0.75rem !important;
letter-spacing: 0.1em !important;
}
/* Slider styling */
.gr-slider {
--slider-color: var(--cyber-yellow) !important;
}
/* Footer - cyber style */
.footer {
text-align: center;
color: #666;
font-size: 0.8rem;
padding: 1.5rem;
border-top: 1px solid #333;
margin-top: 2rem;
font-family: 'JetBrains Mono', monospace;
letter-spacing: 0.05em;
}
.footer a {
color: var(--cyber-yellow);
text-decoration: none;
transition: all 0.3s ease;
}
.footer a:hover {
text-shadow: 0 0 10px var(--cyber-yellow);
}
/* Cyber badge/tag */
.cyber-badge {
display: inline-block;
padding: 4px 12px;
background: transparent;
border: 1px solid var(--cyber-yellow);
color: var(--cyber-yellow);
font-size: 0.7rem;
text-transform: uppercase;
letter-spacing: 0.1em;
font-family: 'JetBrains Mono', monospace;
}
/* Progress bars */
.progress-bar {
background: #1a1a1a !important;
border: 1px solid #333 !important;
}
.progress-bar-fill {
background: linear-gradient(90deg, var(--cyber-yellow), var(--cyber-gold)) !important;
}
/* Scrollbar styling */
::-webkit-scrollbar {
width: 8px;
height: 8px;
}
::-webkit-scrollbar-track {
background: var(--cyber-dark);
}
::-webkit-scrollbar-thumb {
background: #333;
border: 1px solid var(--cyber-yellow);
}
::-webkit-scrollbar-thumb:hover {
background: #444;
}
/* Glowing text effect utility */
.glow-text {
text-shadow: 0 0 10px var(--cyber-yellow), 0 0 20px var(--cyber-yellow);
}
"""
def create_header(title: str, description: str) -> gr.Markdown:
"""Create a cyberpunk-style header for demo apps."""
# Japanese text for marquee effect
jp_text = "サイバー · コマース · フューチャー · "
return gr.Markdown(f"""
<div style="margin-bottom: 2rem;">
<div style="display: flex; justify-content: space-between; align-items: center; margin-bottom: 1rem;">
<span class="cyber-badge">STORE</span>
<span class="cyber-badge">v2.0</span>
<span class="cyber-badge">ONLINE</span>
</div>
<h1 style="font-size: 3rem; margin: 0; letter-spacing: 0.2em;">{title.upper()}</h1>
<p style="color: #888; margin-top: 0.5rem; font-family: 'JetBrains Mono', monospace; font-size: 0.9rem;">{description}</p>
<div class="cyber-line"></div>
<div class="cyber-marquee">
<span class="cyber-marquee-content">{jp_text * 8}</span>
</div>
</div>
""")
def create_footer() -> gr.Markdown:
"""Create a cyberpunk-style footer for demo apps."""
return gr.Markdown("""
<div class="cyber-line"></div>
<div class="footer">
<span class="cyber-badge" style="margin-right: 1rem;">AR</span>
<span style="color: #666;">DAVIES TECH LABS</span>
<span style="color: #444; margin: 0 1rem;">·</span>
<a href="https://mlflow.lab.daviestechlabs.io" target="_blank">MLFLOW</a>
<span style="color: #444; margin: 0 1rem;">·</span>
<a href="https://kubeflow.lab.daviestechlabs.io" target="_blank">KUBEFLOW</a>
<span style="color: #444; margin: 0 1rem;">·</span>
<span style="color: #666;">[ スクロール ]</span>
</div>
""")

272
tts.py Normal file

@@ -0,0 +1,272 @@
#!/usr/bin/env python3
"""
TTS Demo - Gradio UI for testing Text-to-Speech service.
Features:
- Text input with language selection
- Audio playback of synthesized speech
- Voice/speaker selection (when available)
- MLflow tracking URI configuration (via MLFLOW_TRACKING_URI)
- Multiple TTS backends support (Coqui XTTS, Piper, etc.)
"""
import os
import time
import logging
import io
import base64
import gradio as gr
import httpx
import soundfile as sf
import numpy as np
from theme import get_lab_theme, CUSTOM_CSS, create_footer
# Configure logging
logging.basicConfig(level=logging.INFO)
logger = logging.getLogger("tts-demo")
# Configuration
TTS_URL = os.environ.get(
"TTS_URL",
"http://tts-predictor.ai-ml.svc.cluster.local"
)
MLFLOW_TRACKING_URI = os.environ.get(
"MLFLOW_TRACKING_URI",
"http://mlflow.mlflow.svc.cluster.local:80"
)
# HTTP client with longer timeout for audio generation
client = httpx.Client(timeout=120.0)
# Supported languages for XTTS
LANGUAGES = {
"English": "en",
"Spanish": "es",
"French": "fr",
"German": "de",
"Italian": "it",
"Portuguese": "pt",
"Polish": "pl",
"Turkish": "tr",
"Russian": "ru",
"Dutch": "nl",
"Czech": "cs",
"Arabic": "ar",
"Chinese": "zh-cn",
"Japanese": "ja",
"Korean": "ko",
"Hungarian": "hu",
}
def synthesize_speech(text: str, language: str) -> tuple[str, tuple[int, np.ndarray] | None, str]:
"""Synthesize speech from text using the TTS service."""
if not text.strip():
return "❌ Please enter some text", None, ""
lang_code = LANGUAGES.get(language, "en")
try:
start_time = time.time()
# Call TTS service (Coqui XTTS API format)
response = client.get(
f"{TTS_URL}/api/tts",
params={"text": text, "language_id": lang_code}
)
response.raise_for_status()
latency = time.time() - start_time
audio_bytes = response.content
# Parse audio data
audio_io = io.BytesIO(audio_bytes)
audio_data, sample_rate = sf.read(audio_io)
# Calculate duration
# len() counts frames for both mono (1-D) and multi-channel (2-D) arrays
duration = len(audio_data) / sample_rate
# Status message
status = f"✅ Generated {duration:.2f}s of audio in {latency*1000:.0f}ms"
# Metrics
metrics = f"""
**Audio Statistics:**
- Duration: {duration:.2f} seconds
- Sample Rate: {sample_rate} Hz
- Size: {len(audio_bytes) / 1024:.1f} KB
- Generation Time: {latency*1000:.0f}ms
- Real-time Factor: {latency/duration:.2f}x
- Language: {language} ({lang_code})
- Characters: {len(text)}
- Chars/sec: {len(text)/latency:.1f}
"""
return status, (sample_rate, audio_data), metrics
except httpx.HTTPStatusError as e:
logger.exception("TTS request failed")
return f"❌ TTS service error: {e.response.status_code}", None, ""
except Exception as e:
logger.exception("TTS synthesis failed")
return f"❌ Error: {str(e)}", None, ""
def check_service_health() -> str:
"""Check if the TTS service is healthy."""
try:
# Try the health endpoint first
response = client.get(f"{TTS_URL}/health", timeout=5.0)
if response.status_code == 200:
return "🟢 Service is healthy"
# Fall back to root endpoint
response = client.get(f"{TTS_URL}/", timeout=5.0)
if response.status_code == 200:
return "🟢 Service is responding"
return f"🟡 Service returned status {response.status_code}"
except Exception as e:
return f"🔴 Service unavailable: {str(e)}"
# Build the Gradio app
with gr.Blocks(theme=get_lab_theme(), css=CUSTOM_CSS, title="TTS Demo") as demo:
gr.Markdown("""
# 🔊 Text-to-Speech Demo
Test the **Coqui XTTS** text-to-speech service. Convert text to natural-sounding speech
in multiple languages.
""")
# Service status
with gr.Row():
health_btn = gr.Button("🔄 Check Service", size="sm")
health_status = gr.Textbox(label="Service Status", interactive=False)
health_btn.click(fn=check_service_health, outputs=health_status)
with gr.Tabs():
# Tab 1: Basic TTS
with gr.TabItem("🎤 Text to Speech"):
with gr.Row():
with gr.Column(scale=2):
text_input = gr.Textbox(
label="Text to Synthesize",
placeholder="Enter text to convert to speech...",
lines=5,
max_lines=10
)
with gr.Row():
language = gr.Dropdown(
choices=list(LANGUAGES.keys()),
value="English",
label="Language"
)
synthesize_btn = gr.Button("🔊 Synthesize", variant="primary", scale=2)
with gr.Column(scale=1):
status_output = gr.Textbox(label="Status", interactive=False)
metrics_output = gr.Markdown(label="Metrics")
audio_output = gr.Audio(label="Generated Audio", type="numpy")
synthesize_btn.click(
fn=synthesize_speech,
inputs=[text_input, language],
outputs=[status_output, audio_output, metrics_output]
)
# Example texts
gr.Examples(
examples=[
["Hello! Welcome to Davies Tech Labs. This is a demonstration of our text-to-speech system.", "English"],
["The quick brown fox jumps over the lazy dog. This sentence contains every letter of the alphabet.", "English"],
["Bonjour! Bienvenue au laboratoire technique de Davies.", "French"],
["Hola! Bienvenido al laboratorio de tecnología.", "Spanish"],
["Guten Tag! Willkommen im Techniklabor.", "German"],
],
inputs=[text_input, language],
)
# Tab 2: Comparison
with gr.TabItem("🔄 Language Comparison"):
gr.Markdown("Compare the same text in different languages.")
compare_text = gr.Textbox(
label="Text to Compare",
value="Hello, how are you today?",
lines=2
)
with gr.Row():
lang1 = gr.Dropdown(choices=list(LANGUAGES.keys()), value="English", label="Language 1")
lang2 = gr.Dropdown(choices=list(LANGUAGES.keys()), value="Spanish", label="Language 2")
compare_btn = gr.Button("Compare Languages", variant="primary")
with gr.Row():
with gr.Column():
gr.Markdown("### Language 1")
audio1 = gr.Audio(label="Audio 1", type="numpy")
status1 = gr.Textbox(label="Status", interactive=False)
with gr.Column():
gr.Markdown("### Language 2")
audio2 = gr.Audio(label="Audio 2", type="numpy")
status2 = gr.Textbox(label="Status", interactive=False)
def compare_languages(text, l1, l2):
s1, a1, _ = synthesize_speech(text, l1)
s2, a2, _ = synthesize_speech(text, l2)
return s1, a1, s2, a2
compare_btn.click(
fn=compare_languages,
inputs=[compare_text, lang1, lang2],
outputs=[status1, audio1, status2, audio2]
)
# Tab 3: Batch Processing
with gr.TabItem("📚 Batch Synthesis"):
gr.Markdown("Synthesize multiple texts at once (one per line).")
batch_input = gr.Textbox(
label="Texts (one per line)",
placeholder="Enter multiple texts, one per line...",
lines=6
)
batch_lang = gr.Dropdown(
choices=list(LANGUAGES.keys()),
value="English",
label="Language"
)
batch_btn = gr.Button("Synthesize All", variant="primary")
batch_status = gr.Textbox(label="Status", interactive=False)
batch_audio = gr.Audio(label="Combined Audio", type="numpy")
def synthesize_batch(texts: str, lang: str):
# Synthesize each non-empty line, then concatenate the clips
lines = [t.strip() for t in texts.splitlines() if t.strip()]
if not lines:
return "❌ Please enter at least one line of text", None
clips, rate = [], None
for i, line in enumerate(lines, start=1):
status, audio, _ = synthesize_speech(line, lang)
if audio is None:
return f"❌ Line {i} failed: {status}", None
rate, data = audio
clips.append(data)
return f"✅ Synthesized {len(clips)} lines", (rate, np.concatenate(clips))
batch_btn.click(fn=synthesize_batch, inputs=[batch_input, batch_lang], outputs=[batch_status, batch_audio])
gr.Markdown("""
*Note: For batch processing of many texts, consider using the API directly
or the Kubeflow pipeline for better throughput.*
""")
create_footer()
if __name__ == "__main__":
demo.launch(
server_name=os.environ.get("GRADIO_SERVER_NAME", "0.0.0.0"),
server_port=int(os.environ.get("GRADIO_SERVER_PORT", "7860")),
show_error=True
)
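The `/api/tts` call made by `synthesize_speech` can also be exercised without the UI. A minimal stdlib sketch of the same GET request — the `localhost` base URL is an assumption (e.g. after a `kubectl port-forward`); in-cluster the predictor URL above applies:

```python
from urllib.parse import urlencode

# Hypothetical local endpoint; inside the cluster the app above uses
# http://tts-predictor.ai-ml.svc.cluster.local instead.
TTS_URL = "http://localhost:8080"

def build_tts_request(text: str, lang_code: str) -> str:
    """Build the GET URL for the Coqui-style /api/tts endpoint used above."""
    return f"{TTS_URL}/api/tts?{urlencode({'text': text, 'language_id': lang_code})}"

url = build_tts_request("Hello world", "en")
print(url)  # http://localhost:8080/api/tts?text=Hello+world&language_id=en
# Fetching the URL (e.g. with urllib.request.urlopen) returns WAV bytes.
```

The response body is raw audio, which is why `synthesize_speech` hands `response.content` straight to `soundfile` for decoding.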

95
tts.yaml Normal file

@@ -0,0 +1,95 @@
---
apiVersion: apps/v1
kind: Deployment
metadata:
name: tts-ui
namespace: ai-ml
labels:
app: tts-ui
component: demo-ui
spec:
replicas: 1
selector:
matchLabels:
app: tts-ui
template:
metadata:
labels:
app: tts-ui
component: demo-ui
spec:
containers:
- name: gradio
image: ghcr.io/billy-davies-2/llm-apps:v2-202601271655
imagePullPolicy: Always
command: ["python", "tts.py"]
ports:
- containerPort: 7860
name: http
protocol: TCP
env:
- name: TTS_URL
value: "http://tts-predictor.ai-ml.svc.cluster.local"
- name: MLFLOW_TRACKING_URI
value: "http://mlflow.mlflow.svc.cluster.local:80"
resources:
requests:
cpu: "100m"
memory: "256Mi"
limits:
cpu: "500m"
memory: "512Mi"
livenessProbe:
httpGet:
path: /
port: 7860
initialDelaySeconds: 10
periodSeconds: 30
readinessProbe:
httpGet:
path: /
port: 7860
initialDelaySeconds: 5
periodSeconds: 10
imagePullSecrets:
- name: ghcr-registry
---
apiVersion: v1
kind: Service
metadata:
name: tts-ui
namespace: ai-ml
labels:
app: tts-ui
spec:
type: ClusterIP
ports:
- port: 80
targetPort: 7860
protocol: TCP
name: http
selector:
app: tts-ui
---
apiVersion: gateway.networking.k8s.io/v1
kind: HTTPRoute
metadata:
name: tts-ui
namespace: ai-ml
annotations:
external-dns.alpha.kubernetes.io/hostname: tts-ui.lab.daviestechlabs.io
spec:
parentRefs:
- name: envoy-internal
namespace: network
sectionName: https
hostnames:
- tts-ui.lab.daviestechlabs.io
rules:
- matches:
- path:
type: PathPrefix
value: /
backendRefs:
- name: tts-ui
port: 80
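The commit adds a parallel Deployment/Service/HTTPRoute manifest per app, so the set can be applied as one unit with a kustomization. A sketch only — the sibling file names (`embeddings.yaml`, `stt.yaml`) are assumed from the commit description:

```yaml
# kustomization.yaml — hypothetical aggregation of the per-app manifests
apiVersion: kustomize.config.k8s.io/v1beta1
kind: Kustomization
resources:
  - embeddings.yaml
  - stt.yaml
  - tts.yaml
```

With this in place, `kubectl apply -k .` deploys all three UIs together instead of applying each manifest individually.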