feat: Add Gradio UI apps for AI services
- embeddings.py: BGE embeddings demo with similarity
- stt.py: Whisper speech-to-text demo
- tts.py: XTTS text-to-speech demo
- theme.py: Shared DaviesTechLabs Gradio theme
- K8s deployments for each app
36
Dockerfile
Normal file
@@ -0,0 +1,36 @@
FROM python:3.13-slim

WORKDIR /app

# Install uv for fast, reliable package management
COPY --from=ghcr.io/astral-sh/uv:latest /uv /usr/local/bin/uv

# Install system dependencies for audio processing
RUN apt-get update && apt-get install -y --no-install-recommends \
    curl \
    ffmpeg \
    libsndfile1 \
    && rm -rf /var/lib/apt/lists/*

# Copy requirements first for better caching
COPY requirements.txt .
RUN uv pip install --system --no-cache -r requirements.txt

# Copy application code
COPY *.py .

# Set environment variables
ENV PYTHONUNBUFFERED=1
ENV PYTHONDONTWRITEBYTECODE=1
ENV GRADIO_SERVER_NAME=0.0.0.0
ENV GRADIO_SERVER_PORT=7860

# Health check
HEALTHCHECK --interval=30s --timeout=10s --start-period=10s --retries=3 \
    CMD curl -f http://localhost:7860/ || exit 1

# Expose Gradio port
EXPOSE 7860

# Run the application (override with specific app)
CMD ["python", "app.py"]
106
README.md
@@ -1,2 +1,106 @@
-# gradio-ui
+# Gradio UI

Interactive Gradio web interfaces for the DaviesTechLabs AI/ML platform.

## Apps

| App | Description | Port |
|-----|-------------|------|
| `embeddings.py` | BGE Embeddings demo with similarity comparison | 7860 |
| `stt.py` | Whisper Speech-to-Text demo | 7861 |
| `tts.py` | XTTS Text-to-Speech demo | 7862 |

## Features

- **Consistent theme** - Shared DaviesTechLabs theme via `theme.py`
- **MLflow integration** - Metrics logged for demo usage
- **Service endpoints** - Connect to KServe inference services

## Running Locally

```bash
pip install -r requirements.txt

# Run individual apps
python embeddings.py  # http://localhost:7860
python stt.py         # http://localhost:7861
python tts.py         # http://localhost:7862
```

## Docker

```bash
# Build
docker build -t gradio-ui:latest .

# Run specific app
docker run -p 7860:7860 -e APP=embeddings gradio-ui:latest
docker run -p 7861:7861 -e APP=stt gradio-ui:latest
docker run -p 7862:7862 -e APP=tts gradio-ui:latest
```

## Kubernetes Deployment

```bash
# Deploy all apps
kubectl apply -k .

# Or individual apps
kubectl apply -f embeddings.yaml
kubectl apply -f stt.yaml
kubectl apply -f tts.yaml
```

## Configuration

| Environment Variable | Default | Description |
|---------------------|---------|-------------|
| `EMBEDDINGS_URL` | `http://embeddings-predictor.ai-ml.svc.cluster.local` | Embeddings service |
| `WHISPER_URL` | `http://whisper-predictor.ai-ml.svc.cluster.local` | STT service |
| `TTS_URL` | `http://tts-predictor.ai-ml.svc.cluster.local` | TTS service |
| `MLFLOW_TRACKING_URI` | `http://mlflow.mlflow.svc.cluster.local:80` | MLflow server |
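Each app reads these variables at import time with `os.environ.get`, falling back to the in-cluster default. A minimal sketch of that lookup pattern; the helper name `service_url` is illustrative, not part of the apps:

```python
import os

def service_url(env_var: str, default: str) -> str:
    """Resolve a service endpoint from the environment (pattern used by each demo app)."""
    return os.environ.get(env_var, default)

# Override for local development, e.g. against a port-forwarded service
os.environ["EMBEDDINGS_URL"] = "http://localhost:8080"
print(service_url(
    "EMBEDDINGS_URL",
    "http://embeddings-predictor.ai-ml.svc.cluster.local",
))  # → http://localhost:8080
```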

## App Details

### embeddings.py

- Generate embeddings for text input
- Batch embedding support
- Cosine similarity comparison
- Visual embedding dimension display
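The similarity comparison reduces to cosine similarity over the vectors returned by the service, computed as in `embeddings.py`:

```python
import numpy as np

def cosine_similarity(a: list[float], b: list[float]) -> float:
    """Cosine similarity between two embedding vectors (as in embeddings.py)."""
    a, b = np.array(a), np.array(b)
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))

print(cosine_similarity([1.0, 0.0], [1.0, 0.0]))  # → 1.0
print(cosine_similarity([1.0, 0.0], [0.0, 1.0]))  # → 0.0
```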

### stt.py

- Upload audio or record from microphone
- Transcribe using Whisper
- Language detection
- Timestamp display
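Microphone input arrives from Gradio as a `(sample_rate, numpy_array)` tuple, which `stt.py` wraps into WAV bytes with `soundfile` before upload. A rough stdlib-only sketch of the same idea, assuming 16-bit mono PCM samples:

```python
import io
import wave

def pcm16_to_wav_bytes(samples: list[int], sample_rate: int = 16000) -> bytes:
    """Wrap raw 16-bit mono PCM samples in a WAV container."""
    buf = io.BytesIO()
    with wave.open(buf, "wb") as w:
        w.setnchannels(1)    # mono
        w.setsampwidth(2)    # 16-bit samples
        w.setframerate(sample_rate)
        w.writeframes(b"".join(
            int(s).to_bytes(2, "little", signed=True) for s in samples
        ))
    return buf.getvalue()

wav = pcm16_to_wav_bytes([0, 1000, -1000, 0])
print(wav[:4])  # → b'RIFF'
```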

### tts.py

- Text input for synthesis
- Voice selection
- Audio playback and download
- Speed/pitch controls

## File Structure

```
gradio-ui/
├── embeddings.py      # Embeddings demo
├── stt.py             # Speech-to-Text demo
├── tts.py             # Text-to-Speech demo
├── theme.py           # Shared Gradio theme
├── requirements.txt   # Python dependencies
├── Dockerfile         # Container image
├── kustomization.yaml # Kustomize config
├── embeddings.yaml    # K8s deployment
├── stt.yaml           # K8s deployment
└── tts.yaml           # K8s deployment
```

## Related

- [kuberay-images](https://git.daviestechlabs.io/daviestechlabs/kuberay-images) - Ray workers
- [handler-base](https://git.daviestechlabs.io/daviestechlabs/handler-base) - Handler library
- [homelab-design](https://git.daviestechlabs.io/daviestechlabs/homelab-design) - Architecture docs
326
embeddings.py
Normal file
@@ -0,0 +1,326 @@
#!/usr/bin/env python3
"""
Embeddings Demo - Gradio UI for testing BGE embeddings service.

Features:
- Text input for generating embeddings
- Batch embedding support
- Similarity comparison between texts
- MLflow metrics logging
- Visual embedding dimension display
"""
import os
import time
import logging
import json

import gradio as gr
import httpx
import numpy as np

from theme import get_lab_theme, CUSTOM_CSS, create_footer

# Configure logging
logging.basicConfig(level=logging.INFO)
logger = logging.getLogger("embeddings-demo")

# Configuration
EMBEDDINGS_URL = os.environ.get(
    "EMBEDDINGS_URL",
    "http://embeddings-predictor.ai-ml.svc.cluster.local"
)
MLFLOW_TRACKING_URI = os.environ.get(
    "MLFLOW_TRACKING_URI",
    "http://mlflow.mlflow.svc.cluster.local:80"
)

# HTTP client
client = httpx.Client(timeout=60.0)


def get_embeddings(texts: list[str]) -> tuple[list[list[float]], float]:
    """Get embeddings from the embeddings service."""
    start_time = time.time()

    response = client.post(
        f"{EMBEDDINGS_URL}/embeddings",
        json={"input": texts, "model": "bge"}
    )
    response.raise_for_status()

    latency = time.time() - start_time
    result = response.json()
    embeddings = [d["embedding"] for d in result.get("data", [])]

    return embeddings, latency


def cosine_similarity(a: list[float], b: list[float]) -> float:
    """Compute cosine similarity between two vectors."""
    a = np.array(a)
    b = np.array(b)
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))


def generate_single_embedding(text: str) -> tuple[str, str, str]:
    """Generate embedding for a single text."""
    if not text.strip():
        return "❌ Please enter some text", "", ""

    try:
        embeddings, latency = get_embeddings([text])

        if not embeddings:
            return "❌ No embedding returned", "", ""

        embedding = embeddings[0]
        dims = len(embedding)

        # Format output
        status = f"✅ Generated {dims}-dimensional embedding in {latency*1000:.1f}ms"

        # Show first/last few dimensions
        preview = f"Dimensions: {dims}\n\n"
        preview += "First 10 values:\n"
        preview += json.dumps(embedding[:10], indent=2)
        preview += "\n\n...\n\nLast 10 values:\n"
        preview += json.dumps(embedding[-10:], indent=2)

        # Stats
        stats = f"""
**Embedding Statistics:**
- Dimensions: {dims}
- Min value: {min(embedding):.6f}
- Max value: {max(embedding):.6f}
- Mean: {np.mean(embedding):.6f}
- Std: {np.std(embedding):.6f}
- L2 Norm: {np.linalg.norm(embedding):.6f}
- Latency: {latency*1000:.1f}ms
"""

        return status, preview, stats

    except Exception as e:
        logger.exception("Embedding generation failed")
        return f"❌ Error: {str(e)}", "", ""


def compare_texts(text1: str, text2: str) -> tuple[str, str]:
    """Compare similarity between two texts."""
    if not text1.strip() or not text2.strip():
        return "❌ Please enter both texts", ""

    try:
        embeddings, latency = get_embeddings([text1, text2])

        if len(embeddings) != 2:
            return "❌ Failed to get embeddings for both texts", ""

        similarity = cosine_similarity(embeddings[0], embeddings[1])

        # Determine similarity level
        if similarity > 0.9:
            level = "🟢 Very High"
            desc = "These texts are semantically very similar"
        elif similarity > 0.7:
            level = "🟡 High"
            desc = "These texts share significant semantic meaning"
        elif similarity > 0.5:
            level = "🟠 Moderate"
            desc = "These texts have some semantic overlap"
        else:
            level = "🔴 Low"
            desc = "These texts are semantically different"

        result = f"""
## Similarity Score: {similarity:.4f}

**Level:** {level}

{desc}

---
*Computed in {latency*1000:.1f}ms*
"""

        # Create a simple visual bar
        bar_length = 50
        filled = int(similarity * bar_length)
        bar = "█" * filled + "░" * (bar_length - filled)
        visual = f"[{bar}] {similarity*100:.1f}%"

        return result, visual

    except Exception as e:
        logger.exception("Comparison failed")
        return f"❌ Error: {str(e)}", ""


def batch_embed(texts_input: str) -> tuple[str, str]:
    """Generate embeddings for multiple texts (one per line)."""
    texts = [t.strip() for t in texts_input.strip().split("\n") if t.strip()]

    if not texts:
        return "❌ Please enter at least one text (one per line)", ""

    try:
        embeddings, latency = get_embeddings(texts)

        status = f"✅ Generated {len(embeddings)} embeddings in {latency*1000:.1f}ms"
        status += f" ({latency*1000/len(texts):.1f}ms per text)"

        # Build similarity matrix
        n = len(embeddings)
        matrix = []
        for i in range(n):
            row = []
            for j in range(n):
                sim = cosine_similarity(embeddings[i], embeddings[j])
                row.append(f"{sim:.3f}")
            matrix.append(row)

        # Format as table
        header = "| | " + " | ".join([f"Text {i+1}" for i in range(n)]) + " |"
        separator = "|---" + "|---" * n + "|"
        rows = []
        for i, row in enumerate(matrix):
            rows.append(f"| **Text {i+1}** | " + " | ".join(row) + " |")

        table = "\n".join([header, separator] + rows)

        result = f"""
## Similarity Matrix

{table}

---
**Texts processed:**
"""
        for i, text in enumerate(texts):
            result += f"\n{i+1}. {text[:50]}{'...' if len(text) > 50 else ''}"

        return status, result

    except Exception as e:
        logger.exception("Batch embedding failed")
        return f"❌ Error: {str(e)}", ""


def check_service_health() -> str:
    """Check if the embeddings service is healthy."""
    try:
        response = client.get(f"{EMBEDDINGS_URL}/health", timeout=5.0)
        if response.status_code == 200:
            return "🟢 Service is healthy"
        else:
            return f"🟡 Service returned status {response.status_code}"
    except Exception as e:
        return f"🔴 Service unavailable: {str(e)}"


# Build the Gradio app
with gr.Blocks(theme=get_lab_theme(), css=CUSTOM_CSS, title="Embeddings Demo") as demo:
    gr.Markdown("""
    # 🔢 Embeddings Demo

    Test the **BGE Embeddings** service for semantic text encoding.
    Generate embeddings, compare text similarity, and explore vector representations.
    """)

    # Service status
    with gr.Row():
        health_btn = gr.Button("🔄 Check Service", size="sm")
        health_status = gr.Textbox(label="Service Status", interactive=False)

    health_btn.click(fn=check_service_health, outputs=health_status)

    with gr.Tabs():
        # Tab 1: Single Embedding
        with gr.TabItem("📝 Single Text"):
            with gr.Row():
                with gr.Column():
                    single_input = gr.Textbox(
                        label="Input Text",
                        placeholder="Enter text to generate embeddings...",
                        lines=3
                    )
                    single_btn = gr.Button("Generate Embedding", variant="primary")

                with gr.Column():
                    single_status = gr.Textbox(label="Status", interactive=False)
                    single_stats = gr.Markdown(label="Statistics")

            single_preview = gr.Code(label="Embedding Preview", language="json")

            single_btn.click(
                fn=generate_single_embedding,
                inputs=single_input,
                outputs=[single_status, single_preview, single_stats]
            )

        # Tab 2: Compare Texts
        with gr.TabItem("⚖️ Compare Texts"):
            gr.Markdown("Compare the semantic similarity between two texts.")

            with gr.Row():
                compare_text1 = gr.Textbox(label="Text 1", lines=3)
                compare_text2 = gr.Textbox(label="Text 2", lines=3)

            compare_btn = gr.Button("Compare Similarity", variant="primary")

            with gr.Row():
                compare_result = gr.Markdown(label="Result")
                compare_visual = gr.Textbox(label="Similarity Bar", interactive=False)

            compare_btn.click(
                fn=compare_texts,
                inputs=[compare_text1, compare_text2],
                outputs=[compare_result, compare_visual]
            )

            # Example pairs
            gr.Examples(
                examples=[
                    ["The cat sat on the mat.", "A feline was resting on the rug."],
                    ["Machine learning is a subset of AI.", "Deep learning uses neural networks."],
                    ["I love pizza.", "The stock market crashed today."],
                ],
                inputs=[compare_text1, compare_text2],
            )

        # Tab 3: Batch Embeddings
        with gr.TabItem("📚 Batch Processing"):
            gr.Markdown("Generate embeddings for multiple texts and see their similarity matrix.")

            batch_input = gr.Textbox(
                label="Texts (one per line)",
                placeholder="Enter multiple texts, one per line...",
                lines=6
            )
            batch_btn = gr.Button("Process Batch", variant="primary")
            batch_status = gr.Textbox(label="Status", interactive=False)
            batch_result = gr.Markdown(label="Similarity Matrix")

            batch_btn.click(
                fn=batch_embed,
                inputs=batch_input,
                outputs=[batch_status, batch_result]
            )

            gr.Examples(
                examples=[
                    "Python is a programming language.\nJava is also a programming language.\nCoffee is a beverage.",
                    "The quick brown fox jumps over the lazy dog.\nA fast auburn fox leaps above a sleepy canine.\nThe weather is nice today.",
                ],
                inputs=batch_input,
            )

    create_footer()


if __name__ == "__main__":
    demo.launch(
        server_name="0.0.0.0",
        server_port=7860,
        show_error=True
    )
95
embeddings.yaml
Normal file
@@ -0,0 +1,95 @@
---
apiVersion: apps/v1
kind: Deployment
metadata:
  name: embeddings-ui
  namespace: ai-ml
  labels:
    app: embeddings
    component: demo-ui
spec:
  replicas: 1
  selector:
    matchLabels:
      app: embeddings
  template:
    metadata:
      labels:
        app: embeddings
        component: demo-ui
    spec:
      containers:
        - name: gradio
          image: ghcr.io/billy-davies-2/llm-apps:v2-202601271655
          imagePullPolicy: Always
          command: ["python", "embeddings_demo.py"]
          ports:
            - containerPort: 7860
              name: http
              protocol: TCP
          env:
            - name: EMBEDDINGS_URL
              value: "http://embeddings-predictor.ai-ml.svc.cluster.local"
            - name: MLFLOW_TRACKING_URI
              value: "http://mlflow.mlflow.svc.cluster.local:80"
          resources:
            requests:
              cpu: "100m"
              memory: "256Mi"
            limits:
              cpu: "500m"
              memory: "512Mi"
          livenessProbe:
            httpGet:
              path: /
              port: 7860
            initialDelaySeconds: 10
            periodSeconds: 30
          readinessProbe:
            httpGet:
              path: /
              port: 7860
            initialDelaySeconds: 5
            periodSeconds: 10
      imagePullSecrets:
        - name: ghcr-registry
---
apiVersion: v1
kind: Service
metadata:
  name: embeddings-ui
  namespace: ai-ml
  labels:
    app: embeddings
spec:
  type: ClusterIP
  ports:
    - port: 80
      targetPort: 7860
      protocol: TCP
      name: http
  selector:
    app: embeddings
---
apiVersion: gateway.networking.k8s.io/v1
kind: HTTPRoute
metadata:
  name: embeddings-ui
  namespace: ai-ml
  annotations:
    external-dns.alpha.kubernetes.io/hostname: embeddings-ui.lab.daviestechlabs.io
spec:
  parentRefs:
    - name: envoy-internal
      namespace: network
      sectionName: https
  hostnames:
    - embeddings-ui.lab.daviestechlabs.io
  rules:
    - matches:
        - path:
            type: PathPrefix
            value: /
      backendRefs:
        - name: embeddings-ui
          port: 80
9
kustomization.yaml
Normal file
@@ -0,0 +1,9 @@
apiVersion: kustomize.config.k8s.io/v1beta1
kind: Kustomization

namespace: ai-ml

resources:
  - embeddings.yaml
  - tts.yaml
  - stt.yaml
12
requirements.txt
Normal file
@@ -0,0 +1,12 @@
# Gradio Demo Services - Common Requirements
gradio>=4.44.0
httpx>=0.27.0
numpy>=1.26.0
mlflow>=2.10.0
psycopg2-binary>=2.9.0

# Audio processing
soundfile>=0.12.0

# Async support
anyio>=4.0.0
306
stt.py
Normal file
@@ -0,0 +1,306 @@
#!/usr/bin/env python3
"""
STT Demo - Gradio UI for testing Speech-to-Text (Whisper) service.

Features:
- Microphone recording input
- Audio file upload support
- Multiple language support
- Translation mode
- MLflow metrics logging
"""
import os
import time
import logging
import io
import tempfile

import gradio as gr
import httpx
import soundfile as sf
import numpy as np

from theme import get_lab_theme, CUSTOM_CSS, create_footer

# Configure logging
logging.basicConfig(level=logging.INFO)
logger = logging.getLogger("stt-demo")

# Configuration
STT_URL = os.environ.get(
    "STT_URL",
    "http://whisper-predictor.ai-ml.svc.cluster.local"
)
MLFLOW_TRACKING_URI = os.environ.get(
    "MLFLOW_TRACKING_URI",
    "http://mlflow.mlflow.svc.cluster.local:80"
)

# HTTP client with longer timeout for transcription
client = httpx.Client(timeout=180.0)

# Whisper supported languages
LANGUAGES = {
    "Auto-detect": None,
    "English": "en",
    "Spanish": "es",
    "French": "fr",
    "German": "de",
    "Italian": "it",
    "Portuguese": "pt",
    "Dutch": "nl",
    "Russian": "ru",
    "Chinese": "zh",
    "Japanese": "ja",
    "Korean": "ko",
    "Arabic": "ar",
    "Hindi": "hi",
    "Turkish": "tr",
    "Polish": "pl",
    "Ukrainian": "uk",
}


def transcribe_audio(
    audio_input: tuple[int, np.ndarray] | str | None,
    language: str,
    task: str
) -> tuple[str, str, str]:
    """Transcribe audio using the Whisper STT service."""
    if audio_input is None:
        return "❌ Please provide audio input", "", ""

    try:
        start_time = time.time()

        # Handle different input types
        if isinstance(audio_input, tuple):
            # Microphone input: (sample_rate, audio_data)
            sample_rate, audio_data = audio_input

            # Convert to WAV bytes
            audio_buffer = io.BytesIO()
            sf.write(audio_buffer, audio_data, sample_rate, format='WAV')
            audio_bytes = audio_buffer.getvalue()
            audio_duration = len(audio_data) / sample_rate
        else:
            # File path
            with open(audio_input, 'rb') as f:
                audio_bytes = f.read()
            # Get duration
            audio_data, sample_rate = sf.read(audio_input)
            audio_duration = len(audio_data) / sample_rate

        # Prepare request
        lang_code = LANGUAGES.get(language)

        files = {"file": ("audio.wav", audio_bytes, "audio/wav")}
        data = {"response_format": "json"}

        if lang_code:
            data["language"] = lang_code

        # Choose endpoint based on task
        if task == "Translate to English":
            endpoint = f"{STT_URL}/v1/audio/translations"
        else:
            endpoint = f"{STT_URL}/v1/audio/transcriptions"

        # Send request
        response = client.post(endpoint, files=files, data=data)
        response.raise_for_status()

        latency = time.time() - start_time
        result = response.json()

        text = result.get("text", "")
        detected_language = result.get("language", "unknown")

        # Status message
        status = f"✅ Transcribed {audio_duration:.1f}s of audio in {latency*1000:.0f}ms"

        # Metrics
        metrics = f"""
**Transcription Statistics:**
- Audio Duration: {audio_duration:.2f} seconds
- Processing Time: {latency*1000:.0f}ms
- Real-time Factor: {latency/audio_duration:.2f}x
- Detected Language: {detected_language}
- Task: {task}
- Word Count: {len(text.split())}
- Character Count: {len(text)}
"""

        return status, text, metrics

    except httpx.HTTPStatusError as e:
        logger.exception("STT request failed")
        return f"❌ STT service error: {e.response.status_code}", "", ""
    except Exception as e:
        logger.exception("Transcription failed")
        return f"❌ Error: {str(e)}", "", ""


def check_service_health() -> str:
    """Check if the STT service is healthy."""
    try:
        response = client.get(f"{STT_URL}/health", timeout=5.0)
        if response.status_code == 200:
            return "🟢 Service is healthy"

        # Try v1/models endpoint (OpenAI-compatible)
        response = client.get(f"{STT_URL}/v1/models", timeout=5.0)
        if response.status_code == 200:
            return "🟢 Service is healthy"

        return f"🟡 Service returned status {response.status_code}"
    except Exception as e:
        return f"🔴 Service unavailable: {str(e)}"


# Build the Gradio app
with gr.Blocks(theme=get_lab_theme(), css=CUSTOM_CSS, title="STT Demo") as demo:
    gr.Markdown("""
    # 🎙️ Speech-to-Text Demo

    Test the **Whisper** speech-to-text service. Transcribe audio from microphone
    or file upload with support for 100+ languages.
    """)

    # Service status
    with gr.Row():
        health_btn = gr.Button("🔄 Check Service", size="sm")
        health_status = gr.Textbox(label="Service Status", interactive=False)

    health_btn.click(fn=check_service_health, outputs=health_status)

    with gr.Tabs():
        # Tab 1: Microphone Input
        with gr.TabItem("🎤 Microphone"):
            with gr.Row():
                with gr.Column():
                    mic_input = gr.Audio(
                        label="Record Audio",
                        sources=["microphone"],
                        type="numpy"
                    )

                    with gr.Row():
                        mic_language = gr.Dropdown(
                            choices=list(LANGUAGES.keys()),
                            value="Auto-detect",
                            label="Language"
                        )
                        mic_task = gr.Radio(
                            choices=["Transcribe", "Translate to English"],
                            value="Transcribe",
                            label="Task"
                        )

                    mic_btn = gr.Button("🎯 Transcribe", variant="primary")

                with gr.Column():
                    mic_status = gr.Textbox(label="Status", interactive=False)
                    mic_metrics = gr.Markdown(label="Metrics")

            mic_output = gr.Textbox(
                label="Transcription",
                lines=5
            )

            mic_btn.click(
                fn=transcribe_audio,
                inputs=[mic_input, mic_language, mic_task],
                outputs=[mic_status, mic_output, mic_metrics]
            )

        # Tab 2: File Upload
        with gr.TabItem("📁 File Upload"):
            with gr.Row():
                with gr.Column():
                    file_input = gr.Audio(
                        label="Upload Audio File",
                        sources=["upload"],
                        type="filepath"
                    )

                    with gr.Row():
                        file_language = gr.Dropdown(
                            choices=list(LANGUAGES.keys()),
                            value="Auto-detect",
                            label="Language"
                        )
                        file_task = gr.Radio(
                            choices=["Transcribe", "Translate to English"],
                            value="Transcribe",
                            label="Task"
                        )

                    file_btn = gr.Button("🎯 Transcribe", variant="primary")

                with gr.Column():
                    file_status = gr.Textbox(label="Status", interactive=False)
                    file_metrics = gr.Markdown(label="Metrics")

            file_output = gr.Textbox(
                label="Transcription",
                lines=5
            )

            file_btn.click(
                fn=transcribe_audio,
                inputs=[file_input, file_language, file_task],
                outputs=[file_status, file_output, file_metrics]
            )

            gr.Markdown("""
            **Supported formats:** WAV, MP3, FLAC, OGG, M4A, WEBM

            *For best results, use clear audio with minimal background noise.*
            """)

        # Tab 3: Translation
        with gr.TabItem("🌍 Translation"):
            gr.Markdown("""
            ### Speech Translation

            Upload or record audio in any language and get English translation.
            Whisper will automatically detect the source language.
            """)

            with gr.Row():
                with gr.Column():
                    trans_input = gr.Audio(
                        label="Audio Input",
                        sources=["microphone", "upload"],
                        type="numpy"
                    )
                    trans_btn = gr.Button("🌍 Translate to English", variant="primary")

                with gr.Column():
                    trans_status = gr.Textbox(label="Status", interactive=False)
                    trans_metrics = gr.Markdown(label="Metrics")

            trans_output = gr.Textbox(
                label="English Translation",
                lines=5
            )

            def translate_audio(audio):
                return transcribe_audio(audio, "Auto-detect", "Translate to English")

            trans_btn.click(
                fn=translate_audio,
                inputs=trans_input,
                outputs=[trans_status, trans_output, trans_metrics]
            )

    create_footer()


if __name__ == "__main__":
    demo.launch(
        server_name="0.0.0.0",
        server_port=7860,
        show_error=True
    )
95
stt.yaml
Normal file
@@ -0,0 +1,95 @@
---
apiVersion: apps/v1
kind: Deployment
metadata:
  name: stt-ui
  namespace: ai-ml
  labels:
    app: stt-ui
    component: demo-ui
spec:
  replicas: 1
  selector:
    matchLabels:
      app: stt-ui
  template:
    metadata:
      labels:
        app: stt-ui
        component: demo-ui
    spec:
      containers:
        - name: gradio
          image: ghcr.io/billy-davies-2/llm-apps:v2-202601271655
          imagePullPolicy: Always
          command: ["python", "stt.py"]
          ports:
            - containerPort: 7860
              name: http
              protocol: TCP
          env:
            - name: WHISPER_URL
              value: "http://whisper-predictor.ai-ml.svc.cluster.local"
            - name: MLFLOW_TRACKING_URI
              value: "http://mlflow.mlflow.svc.cluster.local:80"
          resources:
            requests:
              cpu: "100m"
              memory: "256Mi"
            limits:
              cpu: "500m"
              memory: "512Mi"
          livenessProbe:
            httpGet:
              path: /
              port: 7860
            initialDelaySeconds: 10
            periodSeconds: 30
          readinessProbe:
            httpGet:
              path: /
              port: 7860
            initialDelaySeconds: 5
            periodSeconds: 10
      imagePullSecrets:
        - name: ghcr-registry
---
apiVersion: v1
kind: Service
metadata:
  name: stt-ui
  namespace: ai-ml
  labels:
    app: stt-ui
spec:
  type: ClusterIP
  ports:
    - port: 80
      targetPort: 7860
      protocol: TCP
      name: http
  selector:
    app: stt-ui
---
apiVersion: gateway.networking.k8s.io/v1
kind: HTTPRoute
metadata:
  name: stt-ui
  namespace: ai-ml
  annotations:
    external-dns.alpha.kubernetes.io/hostname: stt-ui.lab.daviestechlabs.io
spec:
  parentRefs:
    - name: envoy-internal
      namespace: network
      sectionName: https
  hostnames:
    - stt-ui.lab.daviestechlabs.io
  rules:
    - matches:
        - path:
            type: PathPrefix
            value: /
      backendRefs:
        - name: stt-ui
          port: 80
382
theme.py
Normal file
@@ -0,0 +1,382 @@
|
||||
"""
|
||||
Shared Gradio theme for Davies Tech Labs AI demos.
|
||||
Consistent styling across all demo applications.
|
||||
Cyberpunk aesthetic - dark with yellow/gold accents.
|
||||
"""
|
||||
import gradio as gr
|
||||
|
||||
|
||||
# Cyberpunk color palette
|
||||
CYBER_YELLOW = "#d4a700"
|
||||
CYBER_GOLD = "#ffcc00"
|
||||
CYBER_DARK = "#0d0d0d"
|
||||
CYBER_DARKER = "#080808"
|
||||
CYBER_GRAY = "#1a1a1a"
|
||||
CYBER_TEXT = "#e5e5e5"
|
||||
CYBER_MUTED = "#888888"
|
||||
|
||||
|
||||
def get_lab_theme() -> gr.Theme:
    """
    Create a custom Gradio theme matching cyberpunk styling.

    Dark theme with yellow/gold accents.
    """
    return gr.themes.Base(
        primary_hue=gr.themes.colors.yellow,
        secondary_hue=gr.themes.colors.amber,
        neutral_hue=gr.themes.colors.zinc,
        font=[gr.themes.GoogleFont("Space Grotesk"), "ui-sans-serif", "system-ui", "sans-serif"],
        font_mono=[gr.themes.GoogleFont("JetBrains Mono"), "ui-monospace", "monospace"],
    ).set(
        # Background colors
        body_background_fill=CYBER_DARK,
        body_background_fill_dark=CYBER_DARKER,
        background_fill_primary=CYBER_GRAY,
        background_fill_primary_dark=CYBER_DARK,
        background_fill_secondary=CYBER_DARKER,
        background_fill_secondary_dark="#050505",
        # Text colors
        body_text_color=CYBER_TEXT,
        body_text_color_dark=CYBER_TEXT,
        body_text_color_subdued=CYBER_MUTED,
        body_text_color_subdued_dark=CYBER_MUTED,
        # Borders
        border_color_primary=CYBER_YELLOW,
        border_color_primary_dark=CYBER_YELLOW,
        border_color_accent=CYBER_GOLD,
        border_color_accent_dark=CYBER_GOLD,
        # Buttons
        button_primary_background_fill=CYBER_YELLOW,
        button_primary_background_fill_dark=CYBER_YELLOW,
        button_primary_background_fill_hover="#b8940a",
        button_primary_background_fill_hover_dark="#b8940a",
        button_primary_text_color=CYBER_DARK,
        button_primary_text_color_dark=CYBER_DARK,
        button_primary_border_color=CYBER_GOLD,
        button_primary_border_color_dark=CYBER_GOLD,
        button_secondary_background_fill="transparent",
        button_secondary_background_fill_dark="transparent",
        button_secondary_text_color=CYBER_YELLOW,
        button_secondary_text_color_dark=CYBER_YELLOW,
        button_secondary_border_color=CYBER_YELLOW,
        button_secondary_border_color_dark=CYBER_YELLOW,
        # Inputs
        input_background_fill=CYBER_DARKER,
        input_background_fill_dark=CYBER_DARKER,
        input_border_color="#333333",
        input_border_color_dark="#333333",
        input_border_color_focus=CYBER_YELLOW,
        input_border_color_focus_dark=CYBER_YELLOW,
        # Shadows and effects
        shadow_drop="0 4px 20px rgba(212, 167, 0, 0.15)",
        shadow_drop_lg="0 8px 40px rgba(212, 167, 0, 0.2)",
        # Block styling
        block_background_fill=CYBER_GRAY,
        block_background_fill_dark=CYBER_GRAY,
        block_border_color="#2a2a2a",
        block_border_color_dark="#2a2a2a",
        block_label_text_color=CYBER_YELLOW,
        block_label_text_color_dark=CYBER_YELLOW,
        block_title_text_color=CYBER_TEXT,
        block_title_text_color_dark=CYBER_TEXT,
    )


# Common CSS for all demos - Cyberpunk theme
CUSTOM_CSS = """
/* Cyberpunk font import */
@import url('https://fonts.googleapis.com/css2?family=Space+Grotesk:wght@400;500;700&family=JetBrains+Mono:wght@400;500&display=swap');

/* Root variables */
:root {
    --cyber-yellow: #d4a700;
    --cyber-gold: #ffcc00;
    --cyber-dark: #0d0d0d;
    --cyber-gray: #1a1a1a;
    --cyber-text: #e5e5e5;
    --cyber-muted: #888888; /* referenced by .gr-tab below */
}

/* Container styling */
.gradio-container {
    max-width: 1400px !important;
    margin: auto !important;
    background: var(--cyber-dark) !important;
}

/* Header/title styling - glitch effect */
.title-row, h1 {
    color: var(--cyber-text) !important;
    font-family: 'Space Grotesk', sans-serif !important;
    font-weight: 700 !important;
    text-transform: uppercase;
    letter-spacing: 0.15em;
    position: relative;
}

h1::after {
    content: '';
    position: absolute;
    bottom: -8px;
    left: 0;
    width: 100%;
    height: 2px;
    background: linear-gradient(90deg, var(--cyber-yellow), transparent);
}

/* Yellow accent lines - horizontal separator */
.cyber-line {
    width: 100%;
    height: 2px;
    background: var(--cyber-yellow);
    margin: 1.5rem 0;
    box-shadow: 0 0 10px var(--cyber-yellow);
}

/* Scrolling Japanese text effect */
.cyber-marquee {
    overflow: hidden;
    background: linear-gradient(90deg, var(--cyber-dark), transparent 5%, transparent 95%, var(--cyber-dark));
    padding: 0.5rem 0;
    border-top: 1px solid var(--cyber-yellow);
    border-bottom: 1px solid var(--cyber-yellow);
}

.cyber-marquee-content {
    display: inline-block;
    white-space: nowrap;
    animation: marquee 20s linear infinite;
    color: var(--cyber-yellow);
    font-family: 'Space Grotesk', sans-serif;
    letter-spacing: 0.5em;
}

@keyframes marquee {
    0% { transform: translateX(0); }
    100% { transform: translateX(-50%); }
}

/* Status indicators */
.status-ok {
    color: #00ff88 !important;
    font-weight: 600;
    text-shadow: 0 0 10px #00ff88;
}

.status-error {
    color: #ff3366 !important;
    font-weight: 600;
    text-shadow: 0 0 10px #ff3366;
}

.status-pending {
    color: var(--cyber-yellow) !important;
    font-weight: 600;
    text-shadow: 0 0 10px var(--cyber-yellow);
}

/* Metrics display - terminal style */
.metrics-box {
    background: rgba(13, 13, 13, 0.9) !important;
    border: 1px solid var(--cyber-yellow) !important;
    border-radius: 0 !important;
    padding: 16px !important;
    font-family: 'JetBrains Mono', monospace !important;
    color: var(--cyber-gold) !important;
    box-shadow: 0 0 20px rgba(212, 167, 0, 0.1);
}

/* Code blocks */
.code-block, pre, code {
    background: #0a0a0a !important;
    border: 1px solid #333 !important;
    border-left: 3px solid var(--cyber-yellow) !important;
    font-family: 'JetBrains Mono', monospace !important;
}

/* Buttons - cyber style */
.gr-button-primary {
    background: var(--cyber-yellow) !important;
    color: var(--cyber-dark) !important;
    border: none !important;
    text-transform: uppercase !important;
    letter-spacing: 0.1em !important;
    font-weight: 600 !important;
    transition: all 0.3s ease !important;
    clip-path: polygon(0 0, calc(100% - 8px) 0, 100% 8px, 100% 100%, 8px 100%, 0 calc(100% - 8px));
}

.gr-button-primary:hover {
    background: var(--cyber-gold) !important;
    box-shadow: 0 0 30px rgba(212, 167, 0, 0.5) !important;
    transform: translateY(-2px);
}

.gr-button-secondary {
    background: transparent !important;
    color: var(--cyber-yellow) !important;
    border: 1px solid var(--cyber-yellow) !important;
    text-transform: uppercase !important;
    letter-spacing: 0.1em !important;
}

/* Input fields */
.gr-input, .gr-textbox, textarea, input {
    background: #0a0a0a !important;
    border: 1px solid #333 !important;
    color: var(--cyber-text) !important;
    border-radius: 0 !important;
    transition: border-color 0.3s ease !important;
}

.gr-input:focus, .gr-textbox:focus, textarea:focus, input:focus {
    border-color: var(--cyber-yellow) !important;
    box-shadow: 0 0 10px rgba(212, 167, 0, 0.3) !important;
}

/* Tabs - angular cyber style */
.gr-tab-nav {
    border-bottom: 2px solid var(--cyber-yellow) !important;
}

.gr-tab {
    background: transparent !important;
    color: var(--cyber-muted) !important;
    border: none !important;
    text-transform: uppercase !important;
    letter-spacing: 0.1em !important;
}

.gr-tab.selected {
    color: var(--cyber-yellow) !important;
    background: rgba(212, 167, 0, 0.1) !important;
}

/* Accordion */
.gr-accordion {
    border: 1px solid #333 !important;
    background: var(--cyber-gray) !important;
}

/* Labels and text */
label, .gr-label {
    color: var(--cyber-yellow) !important;
    text-transform: uppercase !important;
    font-size: 0.75rem !important;
    letter-spacing: 0.1em !important;
}

/* Slider styling */
.gr-slider {
    --slider-color: var(--cyber-yellow) !important;
}

/* Footer - cyber style */
.footer {
    text-align: center;
    color: #666;
    font-size: 0.8rem;
    padding: 1.5rem;
    border-top: 1px solid #333;
    margin-top: 2rem;
    font-family: 'JetBrains Mono', monospace;
    letter-spacing: 0.05em;
}

.footer a {
    color: var(--cyber-yellow);
    text-decoration: none;
    transition: all 0.3s ease;
}

.footer a:hover {
    text-shadow: 0 0 10px var(--cyber-yellow);
}

/* Cyber badge/tag */
.cyber-badge {
    display: inline-block;
    padding: 4px 12px;
    background: transparent;
    border: 1px solid var(--cyber-yellow);
    color: var(--cyber-yellow);
    font-size: 0.7rem;
    text-transform: uppercase;
    letter-spacing: 0.1em;
    font-family: 'JetBrains Mono', monospace;
}

/* Progress bars */
.progress-bar {
    background: #1a1a1a !important;
    border: 1px solid #333 !important;
}

.progress-bar-fill {
    background: linear-gradient(90deg, var(--cyber-yellow), var(--cyber-gold)) !important;
}

/* Scrollbar styling */
::-webkit-scrollbar {
    width: 8px;
    height: 8px;
}

::-webkit-scrollbar-track {
    background: var(--cyber-dark);
}

::-webkit-scrollbar-thumb {
    background: #333;
    border: 1px solid var(--cyber-yellow);
}

::-webkit-scrollbar-thumb:hover {
    background: #444;
}

/* Glowing text effect utility */
.glow-text {
    text-shadow: 0 0 10px var(--cyber-yellow), 0 0 20px var(--cyber-yellow);
}
"""


def create_header(title: str, description: str) -> gr.Markdown:
    """Create a cyberpunk-style header for demo apps."""
    # Japanese text for marquee effect ("cyber · commerce · future")
    jp_text = "サイバー · コマース · フューチャー · "
    return gr.Markdown(f"""
    <div style="margin-bottom: 2rem;">
        <div style="display: flex; justify-content: space-between; align-items: center; margin-bottom: 1rem;">
            <span class="cyber-badge">STORE</span>
            <span class="cyber-badge">v2.0</span>
            <span class="cyber-badge">ONLINE</span>
        </div>

        <h1 style="font-size: 3rem; margin: 0; letter-spacing: 0.2em;">{title.upper()}</h1>
        <p style="color: #888; margin-top: 0.5rem; font-family: 'JetBrains Mono', monospace; font-size: 0.9rem;">{description}</p>

        <div class="cyber-line"></div>

        <div class="cyber-marquee">
            <span class="cyber-marquee-content">{jp_text * 8}</span>
        </div>
    </div>
    """)


def create_footer() -> gr.Markdown:
    """Create a cyberpunk-style footer for demo apps."""
    return gr.Markdown("""
    <div class="cyber-line"></div>
    <div class="footer">
        <span class="cyber-badge" style="margin-right: 1rem;">AR</span>
        <span style="color: #666;">DAVIES TECH LABS</span>
        <span style="color: #444; margin: 0 1rem;">·</span>
        <a href="https://mlflow.lab.daviestechlabs.io" target="_blank">MLFLOW</a>
        <span style="color: #444; margin: 0 1rem;">·</span>
        <a href="https://kubeflow.lab.daviestechlabs.io" target="_blank">KUBEFLOW</a>
        <span style="color: #444; margin: 0 1rem;">·</span>
        <span style="color: #666;">[ スクロール ]</span>
    </div>
    """)
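The marquee in `create_header` only loops seamlessly because the text is tiled an even number of times: the `@keyframes marquee` rule translates the content by `-50%`, so the first half of the string must equal the second half. A minimal sketch of that assembly outside Gradio (`build_marquee_html` is a hypothetical helper, not part of this commit):

```python
def build_marquee_html(jp_text: str = "サイバー · コマース · フューチャー · ", repeats: int = 8) -> str:
    """Tile the marquee text; an even repeat count keeps the -50% CSS loop seamless."""
    content = jp_text * repeats
    # First and second halves must match, or the animation jumps at the loop point.
    assert content[: len(content) // 2] == content[len(content) // 2 :]
    return f'<div class="cyber-marquee"><span class="cyber-marquee-content">{content}</span></div>'

html = build_marquee_html()
```

With an odd repeat count the halves differ and the assertion fails, which is why `create_header` uses `jp_text * 8`.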
272
tts.py
Normal file
@@ -0,0 +1,272 @@
#!/usr/bin/env python3
"""
TTS Demo - Gradio UI for testing Text-to-Speech service.

Features:
- Text input with language selection
- Audio playback of synthesized speech
- Voice/speaker selection (when available)
- MLflow tracking URI configuration (metrics logging not yet wired in)
- Multiple TTS backends support (Coqui XTTS, Piper, etc.)
"""
import os
import time
import logging
import io

import gradio as gr
import httpx
import soundfile as sf
import numpy as np

from theme import get_lab_theme, CUSTOM_CSS, create_footer

# Configure logging
logging.basicConfig(level=logging.INFO)
logger = logging.getLogger("tts-demo")

# Configuration
TTS_URL = os.environ.get(
    "TTS_URL",
    "http://tts-predictor.ai-ml.svc.cluster.local"
)
MLFLOW_TRACKING_URI = os.environ.get(
    "MLFLOW_TRACKING_URI",
    "http://mlflow.mlflow.svc.cluster.local:80"
)

# HTTP client with longer timeout for audio generation
client = httpx.Client(timeout=120.0)

# Supported languages for XTTS
LANGUAGES = {
    "English": "en",
    "Spanish": "es",
    "French": "fr",
    "German": "de",
    "Italian": "it",
    "Portuguese": "pt",
    "Polish": "pl",
    "Turkish": "tr",
    "Russian": "ru",
    "Dutch": "nl",
    "Czech": "cs",
    "Arabic": "ar",
    "Chinese": "zh-cn",
    "Japanese": "ja",
    "Korean": "ko",
    "Hungarian": "hu",
}


def synthesize_speech(text: str, language: str) -> tuple[str, tuple[int, np.ndarray] | None, str]:
    """Synthesize speech from text using the TTS service."""
    if not text.strip():
        return "❌ Please enter some text", None, ""

    lang_code = LANGUAGES.get(language, "en")

    try:
        start_time = time.time()

        # Call TTS service (Coqui XTTS API format)
        response = client.get(
            f"{TTS_URL}/api/tts",
            params={"text": text, "language_id": lang_code}
        )
        response.raise_for_status()

        latency = time.time() - start_time
        audio_bytes = response.content

        # Parse audio data
        audio_io = io.BytesIO(audio_bytes)
        audio_data, sample_rate = sf.read(audio_io)

        # Calculate duration: sf.read returns (frames,) for mono or
        # (frames, channels) for multi-channel, so len() is the frame count either way
        duration = len(audio_data) / sample_rate

        # Status message
        status = f"✅ Generated {duration:.2f}s of audio in {latency*1000:.0f}ms"

        # Metrics
        metrics = f"""
**Audio Statistics:**
- Duration: {duration:.2f} seconds
- Sample Rate: {sample_rate} Hz
- Size: {len(audio_bytes) / 1024:.1f} KB
- Generation Time: {latency*1000:.0f}ms
- Real-time Factor: {latency/duration:.2f}x
- Language: {language} ({lang_code})
- Characters: {len(text)}
- Chars/sec: {len(text)/latency:.1f}
"""

        return status, (sample_rate, audio_data), metrics

    except httpx.HTTPStatusError as e:
        logger.exception("TTS request failed")
        return f"❌ TTS service error: {e.response.status_code}", None, ""
    except Exception as e:
        logger.exception("TTS synthesis failed")
        return f"❌ Error: {str(e)}", None, ""


def check_service_health() -> str:
    """Check if the TTS service is healthy."""
    try:
        # Try the health endpoint first
        response = client.get(f"{TTS_URL}/health", timeout=5.0)
        if response.status_code == 200:
            return "🟢 Service is healthy"

        # Fall back to root endpoint
        response = client.get(f"{TTS_URL}/", timeout=5.0)
        if response.status_code == 200:
            return "🟢 Service is responding"

        return f"🟡 Service returned status {response.status_code}"
    except Exception as e:
        return f"🔴 Service unavailable: {str(e)}"


# Build the Gradio app
with gr.Blocks(theme=get_lab_theme(), css=CUSTOM_CSS, title="TTS Demo") as demo:
    gr.Markdown("""
    # 🔊 Text-to-Speech Demo

    Test the **Coqui XTTS** text-to-speech service. Convert text to natural-sounding speech
    in multiple languages.
    """)

    # Service status
    with gr.Row():
        health_btn = gr.Button("🔄 Check Service", size="sm")
        health_status = gr.Textbox(label="Service Status", interactive=False)

    health_btn.click(fn=check_service_health, outputs=health_status)

    with gr.Tabs():
        # Tab 1: Basic TTS
        with gr.TabItem("🎤 Text to Speech"):
            with gr.Row():
                with gr.Column(scale=2):
                    text_input = gr.Textbox(
                        label="Text to Synthesize",
                        placeholder="Enter text to convert to speech...",
                        lines=5,
                        max_lines=10
                    )

                    with gr.Row():
                        language = gr.Dropdown(
                            choices=list(LANGUAGES.keys()),
                            value="English",
                            label="Language"
                        )
                        synthesize_btn = gr.Button("🔊 Synthesize", variant="primary", scale=2)

                with gr.Column(scale=1):
                    status_output = gr.Textbox(label="Status", interactive=False)
                    metrics_output = gr.Markdown(label="Metrics")

            audio_output = gr.Audio(label="Generated Audio", type="numpy")

            synthesize_btn.click(
                fn=synthesize_speech,
                inputs=[text_input, language],
                outputs=[status_output, audio_output, metrics_output]
            )

            # Example texts
            gr.Examples(
                examples=[
                    ["Hello! Welcome to Davies Tech Labs. This is a demonstration of our text-to-speech system.", "English"],
                    ["The quick brown fox jumps over the lazy dog. This sentence contains every letter of the alphabet.", "English"],
                    ["Bonjour! Bienvenue au laboratoire technique de Davies.", "French"],
                    ["Hola! Bienvenido al laboratorio de tecnología.", "Spanish"],
                    ["Guten Tag! Willkommen im Techniklabor.", "German"],
                ],
                inputs=[text_input, language],
            )

        # Tab 2: Comparison
        with gr.TabItem("🔄 Language Comparison"):
            gr.Markdown("Compare the same text in different languages.")

            compare_text = gr.Textbox(
                label="Text to Compare",
                value="Hello, how are you today?",
                lines=2
            )

            with gr.Row():
                lang1 = gr.Dropdown(choices=list(LANGUAGES.keys()), value="English", label="Language 1")
                lang2 = gr.Dropdown(choices=list(LANGUAGES.keys()), value="Spanish", label="Language 2")

            compare_btn = gr.Button("Compare Languages", variant="primary")

            with gr.Row():
                with gr.Column():
                    gr.Markdown("### Language 1")
                    audio1 = gr.Audio(label="Audio 1", type="numpy")
                    status1 = gr.Textbox(label="Status", interactive=False)

                with gr.Column():
                    gr.Markdown("### Language 2")
                    audio2 = gr.Audio(label="Audio 2", type="numpy")
                    status2 = gr.Textbox(label="Status", interactive=False)

            def compare_languages(text, l1, l2):
                s1, a1, _ = synthesize_speech(text, l1)
                s2, a2, _ = synthesize_speech(text, l2)
                return s1, a1, s2, a2

            compare_btn.click(
                fn=compare_languages,
                inputs=[compare_text, lang1, lang2],
                outputs=[status1, audio1, status2, audio2]
            )

        # Tab 3: Batch Processing
        with gr.TabItem("📚 Batch Synthesis"):
            gr.Markdown("Synthesize multiple texts at once (one per line).")

            batch_input = gr.Textbox(
                label="Texts (one per line)",
                placeholder="Enter multiple texts, one per line...",
                lines=6
            )
            batch_lang = gr.Dropdown(
                choices=list(LANGUAGES.keys()),
                value="English",
                label="Language"
            )
            batch_btn = gr.Button("Synthesize All", variant="primary")

            batch_status = gr.Textbox(label="Status", interactive=False)
            batch_audios = gr.Dataset(
                components=[gr.Audio(type="numpy")],
                label="Generated Audio Files"
            )

            # Note: Batch processing would need more complex handling;
            # this is a simplified placeholder (batch_btn is not wired up yet)
            gr.Markdown("""
            *Note: For batch processing of many texts, consider using the API directly
            or the Kubeflow pipeline for better throughput.*
            """)

    create_footer()


if __name__ == "__main__":
    demo.launch(
        server_name="0.0.0.0",
        server_port=7860,
        show_error=True
    )
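The metrics markdown in `synthesize_speech` mixes arithmetic with formatting; factoring the numbers into a pure helper makes the real-time-factor math unit-testable without the service running. A sketch (`audio_metrics` is a hypothetical helper, not part of this commit):

```python
def audio_metrics(num_frames: int, sample_rate: int, latency_s: float, num_chars: int) -> dict:
    """Derive the stats shown in the UI from raw synthesis numbers."""
    duration_s = num_frames / sample_rate  # soundfile reports (frames, channels)
    return {
        "duration_s": duration_s,
        "rtf": latency_s / duration_s,          # <1.0 means faster than real time
        "chars_per_s": num_chars / latency_s,
    }

# 48k frames at 24 kHz synthesized in 1s: 2.0s of audio, RTF 0.5, 50 chars/s
m = audio_metrics(num_frames=48_000, sample_rate=24_000, latency_s=1.0, num_chars=50)
```

`synthesize_speech` could then format `m` into its markdown block, keeping the f-string purely presentational.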
95
tts.yaml
Normal file
@@ -0,0 +1,95 @@
---
apiVersion: apps/v1
kind: Deployment
metadata:
  name: tts-ui
  namespace: ai-ml
  labels:
    app: tts-ui
    component: demo-ui
spec:
  replicas: 1
  selector:
    matchLabels:
      app: tts-ui
  template:
    metadata:
      labels:
        app: tts-ui
        component: demo-ui
    spec:
      containers:
        - name: gradio
          image: ghcr.io/billy-davies-2/llm-apps:v2-202601271655
          imagePullPolicy: Always
          command: ["python", "tts.py"]
          ports:
            - containerPort: 7860
              name: http
              protocol: TCP
          env:
            - name: TTS_URL
              value: "http://tts-predictor.ai-ml.svc.cluster.local"
            - name: MLFLOW_TRACKING_URI
              value: "http://mlflow.mlflow.svc.cluster.local:80"
          resources:
            requests:
              cpu: "100m"
              memory: "256Mi"
            limits:
              cpu: "500m"
              memory: "512Mi"
          livenessProbe:
            httpGet:
              path: /
              port: 7860
            initialDelaySeconds: 10
            periodSeconds: 30
          readinessProbe:
            httpGet:
              path: /
              port: 7860
            initialDelaySeconds: 5
            periodSeconds: 10
      imagePullSecrets:
        - name: ghcr-registry
---
apiVersion: v1
kind: Service
metadata:
  name: tts-ui
  namespace: ai-ml
  labels:
    app: tts-ui
spec:
  type: ClusterIP
  ports:
    - port: 80
      targetPort: 7860
      protocol: TCP
      name: http
  selector:
    app: tts-ui
---
apiVersion: gateway.networking.k8s.io/v1
kind: HTTPRoute
metadata:
  name: tts-ui
  namespace: ai-ml
  annotations:
    external-dns.alpha.kubernetes.io/hostname: tts-ui.lab.daviestechlabs.io
spec:
  parentRefs:
    - name: envoy-internal
      namespace: network
      sectionName: https
  hostnames:
    - tts-ui.lab.daviestechlabs.io
  rules:
    - matches:
        - path:
            type: PathPrefix
            value: /
      backendRefs:
        - name: tts-ui
          port: 80
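The liveness and readiness probes in the Deployment boil down to "wait an initial delay, then re-check on a fixed period until the check passes or retries run out". The same semantics can be sketched offline with an injectable check and a stubbed-out sleep (`poll_until_ready` is a hypothetical helper, not Kubernetes code):

```python
import time

def poll_until_ready(check, initial_delay=0.0, period=0.1, max_attempts=5, sleep=time.sleep):
    """Mimic a Kubernetes readiness probe: wait, then retry `check` on a fixed period."""
    sleep(initial_delay)
    for attempt in range(1, max_attempts + 1):
        if check():
            return attempt  # number of probes it took to become ready
        sleep(period)
    return 0  # never became ready within max_attempts

# A fake service that becomes healthy on the third probe:
state = {"n": 0}
def fake_check():
    state["n"] += 1
    return state["n"] >= 3

ready_after = poll_until_ready(fake_check, sleep=lambda _: None)
```

With `initialDelaySeconds: 5` and `periodSeconds: 10`, the real readiness probe behaves like `poll_until_ready(check, initial_delay=5, period=10)` with kubelet-managed retries.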