
Getting Started

This guide helps you set up Arandu and run your first pipeline. Depending on the features you use, you will need:

  • Python 3.13+
  • FFmpeg (for audio/video processing)
  • uv (recommended) or pip
  • Google Drive credentials (for Drive integration)
  • Ollama or OpenAI API key (for QA/KG pipelines)
  • Docker (for containerized deployment)
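A small pre-flight script can show which of the tools listed above are already available. This is a sketch and not part of Arandu: the function name `check_prerequisites` is hypothetical, and it only checks that each executable is on PATH and that the running interpreter meets the Python 3.13+ requirement. It does not check Google Drive credentials or API keys.

```python
import shutil
import sys

# Executables named in the prerequisites list; which ones you actually
# need depends on the features you plan to use.
TOOLS = ("ffmpeg", "uv", "pip", "ollama", "docker")


def check_prerequisites(min_python=(3, 13)):
    """Return a dict mapping each prerequisite name to True or False."""
    report = {"python": sys.version_info[:2] >= min_python}
    for tool in TOOLS:
        # shutil.which returns None when the executable is not on PATH
        report[tool] = shutil.which(tool) is not None
    return report


if __name__ == "__main__":
    for name, ok in check_prerequisites().items():
        print(f"{name:8} {'found' if ok else 'missing'}")
```

A "missing" entry is only a problem if you need the matching feature, for example Docker for containerized deployment.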
Install with uv (recommended):
# Clone repository
git clone https://github.com/FredDsR/arandu.git
cd arandu
# Install dependencies
uv sync
# Verify installation
uv run arandu --help
Or install with pip:
# Clone repository
git clone https://github.com/FredDsR/arandu.git
cd arandu
# Install in editable mode
pip install -e .
# Verify installation
arandu --help
Install FFmpeg:
# Ubuntu/Debian
sudo apt-get install ffmpeg
# macOS
brew install ffmpeg
# Verify installation
ffmpeg -version
Run your first transcription:
arandu transcribe audio.mp3
arandu info

This shows your hardware configuration (CPU, GPU, memory).

# Use faster turbo model
arandu transcribe audio.mp3 --model-id openai/whisper-large-v3-turbo
# Use quantization for reduced VRAM
arandu transcribe audio.mp3 --quantize
# Force CPU execution
arandu transcribe audio.mp3 --cpu
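When calling the CLI from a script, the options above can be assembled programmatically. The sketch below is an illustration only: `build_transcribe_command` and `transcribe` are hypothetical helper names, and it uses just the three flags shown above (`--model-id`, `--quantize`, `--cpu`). Whether Arandu accepts these flags in combination is an assumption.

```python
import subprocess


def build_transcribe_command(audio_path, model_id=None, quantize=False, cpu=False):
    """Build the argv list for `arandu transcribe` from the flags shown above."""
    cmd = ["arandu", "transcribe", str(audio_path)]
    if model_id:
        cmd += ["--model-id", model_id]
    if quantize:
        cmd.append("--quantize")  # reduced VRAM usage
    if cpu:
        cmd.append("--cpu")  # force CPU execution
    return cmd


def transcribe(audio_path, **options):
    """Run the command; raises CalledProcessError on a non-zero exit code."""
    return subprocess.run(build_transcribe_command(audio_path, **options), check=True)
```

For example, `build_transcribe_command("audio.mp3", quantize=True)` returns `["arandu", "transcribe", "audio.mp3", "--quantize"]`. Passing the arguments as a list avoids shell quoting problems with file names that contain spaces.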

For processing files from Google Drive:

  1. Get credentials from Google Cloud Console
  2. Enable the Google Drive API
  3. Create OAuth2 credentials and download as credentials.json
  4. Place credentials.json in the project root
# Transcribe from Google Drive
arandu drive-transcribe <file-id> --credentials credentials.json
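Before running the Drive command, it can help to confirm that the downloaded file is in place and looks like an OAuth2 client file. The sketch below assumes the usual layout of files downloaded from Google Cloud Console, where the client details sit under a top-level `installed` (desktop app) or `web` key. The function name `inspect_credentials` is hypothetical, and which client type Arandu expects is not stated in this guide.

```python
import json
from pathlib import Path


def inspect_credentials(path="credentials.json"):
    """Return the OAuth client type ("installed" or "web"), or raise ValueError."""
    file = Path(path)
    if not file.is_file():
        raise ValueError(f"{file} not found; download it from Google Cloud Console")
    try:
        data = json.loads(file.read_text(encoding="utf-8"))
    except json.JSONDecodeError as exc:
        raise ValueError(f"{file} is not valid JSON: {exc}") from exc
    for client_type in ("installed", "web"):
        section = data.get(client_type) if isinstance(data, dict) else None
        # A usable client section carries at least a client_id
        if isinstance(section, dict) and "client_id" in section:
            return client_type
    raise ValueError(f"{file} does not look like an OAuth2 client file")
```

Run it from the project root, where step 4 places the file, so the default relative path resolves.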
To use Ollama for the QA/KG pipelines:
# Install Ollama
curl -fsSL https://ollama.ai/install.sh | sh
# Pull a model
ollama pull qwen3:14b
# Start Ollama server
ollama serve
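Once `ollama serve` is running, you can confirm that the server is reachable and that the model was pulled. This sketch queries Ollama's `/api/tags` endpoint, which lists local models. It assumes the default address `http://localhost:11434`, and `list_ollama_models` is a hypothetical helper name.

```python
import json
import urllib.error
import urllib.request


def list_ollama_models(base_url="http://localhost:11434", timeout=3.0):
    """Return names of locally pulled models, or None if the server is unreachable."""
    try:
        with urllib.request.urlopen(f"{base_url}/api/tags", timeout=timeout) as resp:
            payload = json.load(resp)
    except (urllib.error.URLError, OSError, ValueError):
        # Connection refused, timeout, or a response that is not JSON
        return None
    return [model["name"] for model in payload.get("models", [])]


if __name__ == "__main__":
    models = list_ollama_models()
    if models is None:
        print("Ollama server not reachable; is `ollama serve` running?")
    else:
        print("qwen3:14b pulled:", "qwen3:14b" in models)
```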
To use OpenAI instead, set your API key:
export OPENAI_API_KEY=sk-...
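Because `export` only affects the current shell session, a quick check from Python can confirm that the variable is visible to child processes. This is a minimal sketch: `openai_key_status` is a hypothetical helper, and it prints only a short prefix so the key does not end up in logs.

```python
import os


def openai_key_status(env=None):
    """Describe whether OPENAI_API_KEY is set, without revealing the key."""
    env = os.environ if env is None else env
    key = env.get("OPENAI_API_KEY", "").strip()
    if not key:
        return "missing"
    return f"set ({key[:3]}..., {len(key)} chars)"


if __name__ == "__main__":
    print("OPENAI_API_KEY:", openai_key_status())
```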
Task                    | Guide
Process multiple files  | Transcription Guide
Validate transcriptions | Transcription Validation Guide
Generate QA pairs       | QA Generation Guide
Build knowledge graphs  | KG Construction Guide
Evaluate quality        | Evaluation Guide
Configure settings      | Configuration Reference
Audio/Video Files
       │
       ▼
┌──────────────┐     ┌──────────────┐     ┌──────────────┐
│ Transcription│ ──▶ │      QA      │ ──▶ │      KG      │
│   Pipeline   │     │  Generation  │     │ Construction │
└──────────────┘     └──────────────┘     └──────────────┘
       │                    │                    │
       └────────────────────┴────────────────────┘
                            │
                            ▼
                     ┌──────────────┐
                     │  Evaluation  │
                     └──────────────┘
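The stage ordering in the diagram can be written down as a small dependency graph. The sketch below is only a reading of the diagram, not Arandu's internal API: the stage names and `STAGE_INPUTS` are hypothetical, and the assumption that each arrow means "consumes the output of" comes from how the boxes are drawn.

```python
from graphlib import TopologicalSorter

# Each stage maps to the stages it depends on, as read from the diagram.
STAGE_INPUTS = {
    "transcription": set(),
    "qa_generation": {"transcription"},
    "kg_construction": {"qa_generation"},
    "evaluation": {"transcription", "qa_generation", "kg_construction"},
}


def run_order(stages=STAGE_INPUTS):
    """Return the stages in an order that respects every dependency."""
    return list(TopologicalSorter(stages).static_order())


if __name__ == "__main__":
    print(" -> ".join(run_order()))
```

With this graph the order is fully determined: transcription first, then QA generation, then KG construction, with evaluation last because it draws on all three.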
pip install -e .
# or
uv sync
sudo apt-get install ffmpeg # Linux
brew install ffmpeg # macOS
# Use quantization
arandu transcribe audio.mp3 --quantize
# Or force CPU
arandu transcribe audio.mp3 --cpu

See also: Transcription | Transcription Validation | Configuration | CLI Reference