# Docker

This guide covers how to run the transcription process on your local machine using Docker Compose.
## Table of Contents

- Prerequisites
- Initial Setup
- Running Transcription
- Configuration Options
- Managing Containers
- Viewing Logs and Progress
- Troubleshooting
## Prerequisites

### Required Software

- Docker (version 20.10+)
- Docker Compose (version 2.0+ or Docker Desktop)
- NVIDIA Container Toolkit (for GPU support)
### Install Docker

**Ubuntu/Debian:**

```sh
# Install Docker
curl -fsSL https://get.docker.com | sh
sudo usermod -aG docker $USER

# Log out and back in, then verify
docker --version
```

**macOS/Windows:** Download and install Docker Desktop.
### Install NVIDIA Container Toolkit (GPU Support)

For NVIDIA GPU support on Linux:

```sh
# Add the NVIDIA repository
distribution=$(. /etc/os-release; echo $ID$VERSION_ID)
curl -s -L https://nvidia.github.io/nvidia-docker/gpgkey | sudo apt-key add -
curl -s -L https://nvidia.github.io/nvidia-docker/$distribution/nvidia-docker.list | \
  sudo tee /etc/apt/sources.list.d/nvidia-docker.list

# Install
sudo apt-get update
sudo apt-get install -y nvidia-container-toolkit

# Restart Docker
sudo systemctl restart docker

# Verify GPU access
docker run --rm --gpus all nvidia/cuda:12.4.0-base-ubuntu22.04 nvidia-smi
```
### Google OAuth Credentials

Ensure you have valid credentials:

```sh
# Verify credentials exist
ls -la credentials.json token.json

# Refresh the token if needed
arandu info
```
## Initial Setup

### 1. Clone/Navigate to Project

```sh
cd /path/to/arandu
```

### 2. Create Environment File

```sh
cp .env.example .env
```

### 3. Configure Settings

Edit `.env` to customize your setup:

```sh
# Whisper model (adjust based on your GPU VRAM)
ARANDU_MODEL_ID=openai/whisper-large-v3

# Number of workers (adjust based on GPU VRAM)
# 24GB VRAM: 4 workers
# 16GB VRAM: 2-3 workers
# 8GB VRAM: 1-2 workers
WORKERS=4

# Enable quantization (reduces VRAM usage by ~50%)
ARANDU_QUANTIZE=true

# Input catalog file
CATALOG_FILE=catalog.csv
```

### 4. Verify Input Catalog

```sh
ls -la input/catalog.csv
```

### 5. Build Docker Image

```sh
docker compose --profile gpu build arandu
```
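Before building, it can help to confirm that the files from the steps above are actually in place. Here is a small preflight check — an illustrative sketch, not a script shipped with the project; the file names simply follow the setup steps above:

```shell
# Preflight check: verify the files that the setup steps expect.
preflight() {
  for f in .env input/catalog.csv credentials.json token.json; do
    if [ -e "$f" ]; then
      echo "ok: $f"
    else
      echo "missing: $f"
    fi
  done
}
preflight
```

Any `missing:` line points at a step to revisit before running `docker compose build`.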
## Running Transcription

### GPU Mode (Recommended)

Run with NVIDIA GPU acceleration:

```sh
docker compose --profile gpu up arandu
```

### CPU Mode

Run on CPU only (slower, but works without a GPU):

```sh
docker compose --profile cpu up arandu-cpu
```

### Run in Background (Detached)

```sh
# GPU mode
docker compose --profile gpu up -d arandu

# CPU mode
docker compose --profile cpu up -d arandu-cpu
```

### Run with Custom Settings

Override settings without editing `.env`:

```sh
# Use more workers
WORKERS=6 docker compose --profile gpu up arandu

# Use a different model
ARANDU_MODEL_ID=openai/whisper-large-v3 docker compose --profile gpu up arandu

# Use a different catalog
CATALOG_FILE=my_subset.csv docker compose --profile gpu up arandu

# Combine multiple overrides
WORKERS=2 ARANDU_MODEL_ID=distil-whisper/distil-large-v3 docker compose --profile gpu up arandu
```
## Configuration Options

### Environment Variables

| Variable | Default | Description |
|---|---|---|
| `ARANDU_MODEL_ID` | `openai/whisper-large-v3` | Whisper model from Hugging Face |
| `WORKERS` | `4` | Number of parallel transcription workers |
| `ARANDU_QUANTIZE` | `true` | Enable 8-bit quantization (reduces VRAM) |
| `ARANDU_FORCE_CPU` | `false` | Force CPU execution |
| `CATALOG_FILE` | `catalog.csv` | Input catalog filename |
| `INPUT_DIR` | `./input` | Directory containing the catalog |
| `RESULTS_DIR` | `./results` | Output directory for results |
| `CREDENTIALS_DIR` | `./` | Directory containing credentials |
| `HF_CACHE_DIR` | `./cache/huggingface` | Hugging Face model cache |
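To make the defaults concrete, here is a minimal sketch of how such variables are typically resolved: each one is read from the environment with the table's default as a fallback. This is illustrative only — `load_config` and the hard-coded defaults below are not the project's actual code (a subset of the variables is shown):

```python
import os

# Defaults mirror the table above (subset shown for brevity).
DEFAULTS = {
    "ARANDU_MODEL_ID": "openai/whisper-large-v3",
    "WORKERS": "4",
    "ARANDU_QUANTIZE": "true",
    "CATALOG_FILE": "catalog.csv",
}

def load_config(env=None):
    """Resolve each setting from the environment, falling back to the defaults."""
    env = os.environ if env is None else env
    cfg = {key: env.get(key, default) for key, default in DEFAULTS.items()}
    cfg["WORKERS"] = int(cfg["WORKERS"])                                # numeric setting
    cfg["ARANDU_QUANTIZE"] = cfg["ARANDU_QUANTIZE"].lower() == "true"   # boolean flag
    return cfg
```

This is why an override like `WORKERS=6 docker compose --profile gpu up arandu` works: it just changes what the environment lookup returns inside the container.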
### Model Selection Guide

| Model | VRAM Required | Speed | Accuracy | Best For |
|---|---|---|---|---|
| `openai/whisper-large-v3` | ~10GB | Slow | Highest | Final production runs |
| `openai/whisper-large-v3-turbo` | ~6GB | Medium | High | Good balance |
| `distil-whisper/distil-large-v3` | ~3GB | Fast | Good | Quick processing, limited VRAM |
### Worker Configuration Guide

| GPU VRAM | Recommended Workers | With Quantization |
|---|---|---|
| 24GB (RTX 4090) | 3-4 | 5-6 |
| 16GB (RTX 4080) | 2-3 | 3-4 |
| 12GB (RTX 4070) | 1-2 | 2-3 |
| 8GB (RTX 3070) | 1 | 1-2 |
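The table reads as a simple lookup from available VRAM to a worker count. A hypothetical helper (not part of the project) that picks the low end of each recommended range might look like:

```python
def recommended_workers(vram_gb, quantize=False):
    """Return a conservative worker count based on the guide above.

    Uses the low end of each recommended range; illustrative only.
    """
    table = [
        # (minimum VRAM in GB, workers, workers with quantization)
        (24, 3, 5),
        (16, 2, 3),
        (12, 1, 2),
        (8, 1, 1),
    ]
    for min_vram, workers, quantized_workers in table:
        if vram_gb >= min_vram:
            return quantized_workers if quantize else workers
    return 1  # below 8GB, stay at a single worker
```

Start at the low end and increase the `WORKERS` value only if GPU memory headroom allows; an OOM error means the count is too high for your card.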
## Managing Containers

### View Running Containers

```sh
docker compose ps
```

### Stop Transcription

```sh
# Graceful stop (allows the current file to complete)
docker compose stop

# Force stop
docker compose kill
```

### Remove Containers

```sh
docker compose down
```

### Rebuild After Code Changes

```sh
docker compose build --no-cache
```

### Clean Up Docker Resources

```sh
# Remove stopped containers and unused images
docker system prune

# Remove everything including volumes (careful!)
docker system prune -a --volumes
```
## Viewing Logs and Progress

### View Live Logs

```sh
# Follow logs in real time
docker compose logs -f arandu

# View the last 100 lines
docker compose logs --tail 100 arandu
```

### Check Progress

```sh
# Count completed transcriptions
ls -1 results/*_transcription.json 2>/dev/null | wc -l

# View checkpoint status
cat results/checkpoint.json | python -m json.tool
```

### Detailed Progress Script

```sh
python -c "
import json
from pathlib import Path

checkpoint = Path('results/checkpoint.json')
if checkpoint.exists():
    with open(checkpoint) as f:
        cp = json.load(f)
    completed = len(cp.get('completed_files', []))
    failed = len(cp.get('failed_files', {}))
    total = cp.get('total_files', 'unknown')
    print(f'Progress: {completed}/{total} completed')
    print(f'Failed: {failed}')
    if cp.get('failed_files'):
        print('Failed files:')
        for fid, err in cp['failed_files'].items():
            print(f'  - {fid}: {err[:50]}...')
else:
    print('No checkpoint found - transcription not started')
"
```
## Troubleshooting

### Docker Build Fails

**Python package installation errors:**

```sh
# Clean the build cache and retry
docker compose build --no-cache
```

**Disk space issues:**

```sh
# Check available space
df -h

# Clean Docker resources
docker system prune -a
```

### GPU Not Detected

Verify the NVIDIA runtime:

```sh
# Check if the GPU is accessible
docker run --rm --gpus all nvidia/cuda:12.4.0-base-ubuntu22.04 nvidia-smi
```

Check the Docker Compose GPU config:

```sh
# Verify the GPU reservation in docker-compose.yml
docker compose config | grep -A 10 "deploy:"
```

Fall back to CPU mode:

```sh
docker compose --profile cpu up arandu-cpu
```

### Out of Memory (OOM) Errors

Reduce workers:

```sh
WORKERS=1 docker compose --profile gpu up arandu
```

Enable quantization:

```sh
ARANDU_QUANTIZE=true docker compose --profile gpu up arandu
```

Use a smaller model:

```sh
ARANDU_MODEL_ID=distil-whisper/distil-large-v3 docker compose --profile gpu up arandu
```
### OAuth Token Expired

Error message: `RefreshError` or authentication failure.

Solution:

```sh
# Stop the container
docker compose stop

# Refresh the token locally (outside Docker)
arandu info

# Restart transcription
docker compose --profile gpu up arandu
```
### Shared Memory Issues

Error: `RuntimeError: unable to open shared memory object`

Solution: Increase the shared memory size in `docker-compose.yml`:

```yaml
shm_size: '32gb'  # Increase from the default 16gb
```
### Network/Download Timeout

Pre-download models:

```sh
# Download the model before running transcription
docker compose --profile gpu run --rm arandu python -c "
from transformers import AutoModelForSpeechSeq2Seq, AutoProcessor
model_id = 'openai/whisper-large-v3'
AutoProcessor.from_pretrained(model_id)
AutoModelForSpeechSeq2Seq.from_pretrained(model_id)
print('Model downloaded successfully')
"
```
### Resume After Interruption

The checkpoint system handles resuming automatically. Simply restart:

```sh
docker compose --profile gpu up arandu
```

To start fresh:

```sh
rm results/checkpoint.json
docker compose --profile gpu up arandu
```
## Quick Reference

### Common Commands

```sh
# Build the image
docker compose --profile gpu build arandu

# Run with GPU
docker compose --profile gpu up arandu

# Run with CPU
docker compose --profile cpu up arandu-cpu

# Run in the background
docker compose --profile gpu up -d arandu

# View logs
docker compose logs -f arandu

# Stop
docker compose stop

# Clean up
docker compose down
```
### Example: Full Local Workflow

```sh
# 1. Setup
cp .env.example .env
# Edit .env as needed

# 2. Build
docker compose --profile gpu build arandu

# 3. Run transcription
docker compose --profile gpu up arandu

# 4. Check results
ls results/*_transcription.json | wc -l

# 5. Clean up
docker compose down
```
### Example: Quick Test Run

Test with a small subset of files:

```sh
# Create a test catalog with 5 files
head -6 input/catalog.csv > input/test_catalog.csv

# Run the test
CATALOG_FILE=test_catalog.csv WORKERS=1 docker compose --profile gpu up arandu

# Check results
ls results/
```