Docker

This guide covers how to run the transcription process on your local machine using Docker Compose.

  1. Prerequisites
  2. Initial Setup
  3. Running Transcription
  4. Configuration Options
  5. Managing Containers
  6. Viewing Logs and Progress
  7. Troubleshooting

Prerequisites

  • Docker (version 20.10+)
  • Docker Compose (version 2.0+ or Docker Desktop)
  • NVIDIA Container Toolkit (for GPU support)

Ubuntu/Debian:

Terminal window
# Install Docker
curl -fsSL https://get.docker.com | sh
sudo usermod -aG docker $USER
# Log out and back in, then verify
docker --version

macOS/Windows: Download and install Docker Desktop.

Install NVIDIA Container Toolkit (GPU Support)

For NVIDIA GPU support on Linux:

Terminal window
# Add NVIDIA repository
distribution=$(. /etc/os-release;echo $ID$VERSION_ID)
curl -s -L https://nvidia.github.io/nvidia-docker/gpgkey | sudo apt-key add -
curl -s -L https://nvidia.github.io/nvidia-docker/$distribution/nvidia-docker.list | \
sudo tee /etc/apt/sources.list.d/nvidia-docker.list
# Install
sudo apt-get update
sudo apt-get install -y nvidia-container-toolkit
# Restart Docker
sudo systemctl restart docker
# Verify GPU access
docker run --rm --gpus all nvidia/cuda:12.4.0-base-ubuntu22.04 nvidia-smi

Initial Setup

Ensure you have valid credentials:

Terminal window
# Verify credentials exist
ls -la credentials.json token.json
# Refresh token if needed
arandu info

Navigate to the project directory:

Terminal window
cd /path/to/arandu

Create your environment file from the template:

Terminal window
cp .env.example .env

Edit .env to customize your setup:

Terminal window
# Whisper model (adjust based on your GPU VRAM)
ARANDU_MODEL_ID=openai/whisper-large-v3
# Number of workers (adjust based on GPU VRAM)
# 24GB VRAM: 4 workers
# 16GB VRAM: 2-3 workers
# 8GB VRAM: 1-2 workers
WORKERS=4
# Enable quantization (reduces VRAM usage by ~50%)
ARANDU_QUANTIZE=true
# Input catalog file
CATALOG_FILE=catalog.csv
Verify that your input catalog exists:

Terminal window
ls -la input/catalog.csv

Build the Docker image:

Terminal window
docker compose --profile gpu build arandu

Running Transcription

Run with NVIDIA GPU acceleration:

Terminal window
docker compose --profile gpu up arandu

Run on CPU only (slower but works without GPU):

Terminal window
docker compose --profile cpu up arandu-cpu
Run in the background (detached mode):

Terminal window
# GPU mode
docker compose --profile gpu up -d arandu
# CPU mode
docker compose --profile cpu up -d arandu-cpu

Configuration Options

Override settings without editing .env:

Terminal window
# Use more workers
WORKERS=6 docker compose --profile gpu up arandu
# Use a different model
ARANDU_MODEL_ID=openai/whisper-large-v3 docker compose --profile gpu up arandu
# Use a different catalog
CATALOG_FILE=my_subset.csv docker compose --profile gpu up arandu
# Combine multiple overrides
WORKERS=2 ARANDU_MODEL_ID=distil-whisper/distil-large-v3 docker compose --profile gpu up arandu

Environment variables:

| Variable | Default | Description |
| --- | --- | --- |
| ARANDU_MODEL_ID | openai/whisper-large-v3 | Whisper model from Hugging Face |
| WORKERS | 4 | Number of parallel transcription workers |
| ARANDU_QUANTIZE | true | Enable 8-bit quantization (reduces VRAM) |
| ARANDU_FORCE_CPU | false | Force CPU execution |
| CATALOG_FILE | catalog.csv | Input catalog filename |
| INPUT_DIR | ./input | Directory containing catalog |
| RESULTS_DIR | ./results | Output directory for results |
| CREDENTIALS_DIR | ./ | Directory containing credentials |
| HF_CACHE_DIR | ./cache/huggingface | Hugging Face model cache |

Model options:

| Model | VRAM Required | Speed | Accuracy | Best For |
| --- | --- | --- | --- | --- |
| openai/whisper-large-v3 | ~10GB | Slow | Highest | Final production runs |
| openai/whisper-large-v3 | ~6GB | Medium | High | Good balance |
| distil-whisper/distil-large-v3 | ~3GB | Fast | Good | Quick processing, limited VRAM |

Recommended workers by GPU:

| GPU VRAM | Recommended Workers | With Quantization |
| --- | --- | --- |
| 24GB (RTX 4090) | 3-4 | 5-6 |
| 16GB (RTX 4080) | 2-3 | 3-4 |
| 12GB (RTX 4070) | 1-2 | 2-3 |
| 8GB (RTX 3070) | 1 | 1-2 |
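The worker table above boils down to a rough rule of thumb: budget about 6 GB of VRAM per worker, or about 4 GB with quantization. A small sketch of that heuristic (the function and ratios are illustrative assumptions, not part of the tool):

```python
def suggest_workers(vram_gb: float, quantized: bool = False) -> int:
    """Rough worker count from available VRAM.

    Assumes ~6 GB VRAM per worker, or ~4 GB with 8-bit quantization
    (ratios inferred from the table above).
    """
    per_worker_gb = 4.0 if quantized else 6.0
    return max(1, int(vram_gb // per_worker_gb))
```

For example, `suggest_workers(24)` gives 4 and `suggest_workers(8)` gives 1, matching the table's recommendations; when in doubt, start low and raise `WORKERS` while watching `nvidia-smi`.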

Managing Containers

Check container status:

Terminal window
docker compose ps
Terminal window
# Graceful stop (allows current file to complete)
docker compose stop
# Force stop
docker compose kill
Stop and remove containers:

Terminal window
docker compose down
Rebuild the image from scratch:

Terminal window
docker compose build --no-cache
Terminal window
# Remove stopped containers and unused images
docker system prune
# Remove everything including volumes (careful!)
docker system prune -a --volumes

Viewing Logs and Progress

Terminal window
# Follow logs in real-time
docker compose logs -f arandu
# View last 100 lines
docker compose logs --tail 100 arandu
Terminal window
# Count completed transcriptions
ls -1 results/*_transcription.json 2>/dev/null | wc -l
# View checkpoint status
cat results/checkpoint.json | python -m json.tool
Terminal window
python -c "
import json
from pathlib import Path

checkpoint = Path('results/checkpoint.json')
if checkpoint.exists():
    with open(checkpoint) as f:
        cp = json.load(f)
    completed = len(cp.get('completed_files', []))
    failed = len(cp.get('failed_files', {}))
    total = cp.get('total_files', 'unknown')
    print(f'Progress: {completed}/{total} completed')
    print(f'Failed: {failed}')
    if cp.get('failed_files'):
        print('Failed files:')
        for fid, err in cp['failed_files'].items():
            print(f' - {fid}: {err[:50]}...')
else:
    print('No checkpoint found - transcription not started')
"

Troubleshooting

Python package installation errors:

Terminal window
# Clean build cache and retry
docker compose build --no-cache

Disk space issues:

Terminal window
# Check available space
df -h
# Clean Docker resources
docker system prune -a

Verify NVIDIA runtime:

Terminal window
# Check if GPU is accessible
docker run --rm --gpus all nvidia/cuda:12.4.0-base-ubuntu22.04 nvidia-smi

Check Docker Compose GPU config:

Terminal window
# Verify GPU reservation in docker-compose.yml
docker compose config | grep -A 10 "deploy:"

Fall back to CPU mode:

Terminal window
docker compose --profile cpu up arandu-cpu

Reduce workers:

Terminal window
WORKERS=1 docker compose --profile gpu up arandu

Enable quantization:

Terminal window
ARANDU_QUANTIZE=true docker compose --profile gpu up arandu

Use smaller model:

Terminal window
ARANDU_MODEL_ID=distil-whisper/distil-large-v3 docker compose --profile gpu up arandu

Error message: RefreshError or authentication failure

Solution:

Terminal window
# Stop the container
docker compose stop
# Refresh token locally (outside Docker)
arandu info
# Restart transcription
docker compose --profile gpu up arandu

Error: RuntimeError: unable to open shared memory object

Solution: Increase shared memory size in docker-compose.yml:

shm_size: '32gb' # Increase from default 16gb
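For reference, a sketch of where that setting lives in docker-compose.yml (the service name matches the commands above; the surrounding keys are placeholders for your existing configuration):

```yaml
services:
  arandu:
    # ... your existing build/image, environment, and volumes ...
    shm_size: '32gb'   # raise if workers crash with shared-memory errors
```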

Pre-download models:

Terminal window
# Download model before running transcription
docker compose --profile gpu run --rm arandu python -c "
from transformers import AutoModelForSpeechSeq2Seq, AutoProcessor
model_id = 'openai/whisper-large-v3'
AutoProcessor.from_pretrained(model_id)
AutoModelForSpeechSeq2Seq.from_pretrained(model_id)
print('Model downloaded successfully')
"

The checkpoint system automatically handles resume. Simply restart:

Terminal window
docker compose --profile gpu up arandu
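Conceptually, resume amounts to filtering the catalog against the checkpoint's `completed_files` list (the schema shown in the progress script above). A minimal sketch under that assumption; `pending_items` is illustrative, not the tool's actual function:

```python
import json
from pathlib import Path


def pending_items(catalog_ids, checkpoint_path="results/checkpoint.json"):
    """Return the catalog IDs not yet listed in the checkpoint's completed_files."""
    path = Path(checkpoint_path)
    if not path.exists():
        return list(catalog_ids)  # no checkpoint yet: everything is pending
    cp = json.loads(path.read_text())
    done = set(cp.get("completed_files", []))
    return [i for i in catalog_ids if i not in done]
```

This is why deleting `checkpoint.json` (below) restarts from scratch: with no checkpoint file, every catalog entry counts as pending.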

To start fresh:

Terminal window
rm results/checkpoint.json
docker compose --profile gpu up arandu

Command summary:

Terminal window
# Build image
docker compose --profile gpu build arandu
# Run with GPU
docker compose --profile gpu up arandu
# Run with CPU
docker compose --profile cpu up arandu-cpu
# Run in background
docker compose --profile gpu up -d arandu
# View logs
docker compose logs -f arandu
# Stop
docker compose stop
# Clean up
docker compose down
Complete workflow example:

Terminal window
# 1. Setup
cp .env.example .env
# Edit .env as needed
# 2. Build
docker compose --profile gpu build arandu
# 3. Run transcription
docker compose --profile gpu up arandu
# 4. Check results
ls results/*_transcription.json | wc -l
# 5. Clean up
docker compose down

Test with a small subset of files:

Terminal window
# Create a test catalog with 5 files
head -6 input/catalog.csv > input/test_catalog.csv
# Run test
CATALOG_FILE=test_catalog.csv WORKERS=1 docker compose --profile gpu up arandu
# Check results
ls results/