Installation

Default Setup

TeaRAGs works out of the box on Apple Silicon MacBooks (M1/M2/M3). Default settings are optimized for local development with Docker-based Qdrant and Ollama running on the same machine.

No configuration required if you're running:

  • Qdrant on http://localhost:6333
  • Ollama on http://localhost:11434
  • Default embedding model: unclemusclez/jina-embeddings-v2-base-code:latest

Just install, index, and search — defaults work immediately.
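
For reference, those defaults correspond to the environment below. This is not something you need to set; you would only set these variables to override them (the provider and model defaults are documented in the tables further down this page):

```shell
# Implied defaults — set these only to override them
QDRANT_URL=http://localhost:6333
EMBEDDING_BASE_URL=http://localhost:11434
EMBEDDING_PROVIDER=ollama
EMBEDDING_MODEL=unclemusclez/jina-embeddings-v2-base-code:latest
```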

Prerequisites & Installation

Quick Start: Everything in Docker

Easiest option — run both Qdrant and Ollama in Docker:

cd ~/yourpath/to/tea-rags

# Start both services (Qdrant + Ollama)
docker-compose up -d

# Pull the default embedding model
docker exec -it $(docker ps -q -f name=ollama) ollama pull unclemusclez/jina-embeddings-v2-base-code:latest

That's it! TeaRAGs will connect to http://localhost:6333 (Qdrant) and http://localhost:11434 (Ollama) automatically.
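
If your checkout does not already ship a compose file, a minimal docker-compose.yml for these two services could look like the sketch below. The service names, volume names, and image tags here are assumptions; prefer the compose file in the repo if one exists:

```yaml
services:
  qdrant:
    image: qdrant/qdrant:latest
    ports:
      - "6333:6333"          # REST API
    volumes:
      - ./qdrant_storage:/qdrant/storage
  ollama:
    image: ollama/ollama:latest
    ports:
      - "11434:11434"        # Ollama API
    volumes:
      - ollama_models:/root/.ollama

volumes:
  ollama_models:
```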


Qdrant Setup

Option 1: Docker

# Run Qdrant in Docker
docker run -d \
--name qdrant \
-p 6333:6333 \
-v $(pwd)/qdrant_storage:/qdrant/storage \
qdrant/qdrant:latest

With custom memory limit:

docker run -d \
--name qdrant \
-p 6333:6333 \
-v $(pwd)/qdrant_storage:/qdrant/storage \
--memory=4g \
qdrant/qdrant:latest

Option 2: Native Installation

macOS
brew install qdrant
Linux
wget https://github.com/qdrant/qdrant/releases/latest/download/qdrant-x86_64-unknown-linux-gnu.tar.gz
tar -xzf qdrant-x86_64-unknown-linux-gnu.tar.gz
./qdrant

Verify Qdrant

# Check Qdrant is running
curl http://localhost:6333/healthz
# Should return an OK message (e.g. "healthz check passed")

Ollama Setup

Option 1: Native Installation

macOS (Apple Silicon)
Recommended: Ollama Desktop App

On Mac with Apple Silicon (M1/M2/M3/M4), install the Ollama desktop app instead of Homebrew. The desktop app automatically uses Metal GPU acceleration, which is significantly faster for embedding generation. Homebrew installs a CLI-only version that may not leverage GPU optimally.

# Recommended: download the .dmg from https://ollama.com/download/mac
# Drag to Applications → launch → GPU acceleration works automatically

# Alternative: Homebrew (CLI-only, less optimal GPU)
brew install ollama
ollama serve
Linux
# Install script
curl -fsSL https://ollama.com/install.sh | sh

# Start Ollama
ollama serve
Windows
# Download installer from https://ollama.com/download
# Run the installer, Ollama starts automatically

Option 2: Docker

CPU-only (all platforms)
docker run -d \
--name ollama \
-p 11434:11434 \
-v ollama_models:/root/.ollama \
ollama/ollama:latest
With GPU (Linux + NVIDIA)
docker run -d \
--name ollama \
--gpus all \
-p 11434:11434 \
-v ollama_models:/root/.ollama \
ollama/ollama:latest
GPU Access in Docker

Docker GPU support requires NVIDIA Container Toolkit on Linux. On macOS, GPU acceleration is not available in Docker — use the native Ollama app for Metal GPU support.

Pull Embedding Model

# Default code-specialized model (768 dimensions)
ollama pull unclemusclez/jina-embeddings-v2-base-code:latest

# Alternative: nomic-embed-text (768 dimensions)
ollama pull nomic-embed-text:latest

# Alternative: mxbai-embed-large (1024 dimensions)
ollama pull mxbai-embed-large:latest
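
Because the Qdrant collection's vector size must match the model's output dimension, it can help to check what a pulled model actually produces. The helper below is a sketch (the function name `embedding_dim` is ours); it assumes Ollama is listening on localhost:11434 and that `jq` is installed:

```shell
# Hypothetical helper: print the embedding dimension a model produces.
# Assumes a running Ollama on localhost:11434 and jq on PATH.
embedding_dim() {
  curl -s http://localhost:11434/api/embeddings \
    -d "{\"model\": \"$1\", \"prompt\": \"test\"}" | jq '.embedding | length'
}

# With Ollama running:
#   embedding_dim unclemusclez/jina-embeddings-v2-base-code:latest
#   embedding_dim mxbai-embed-large:latest
```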

Verify Ollama

# Check Ollama is running
curl http://localhost:11434/api/version

# Test embedding generation (embedding-only models cannot be run interactively)
curl http://localhost:11434/api/embeddings \
  -d '{"model": "unclemusclez/jina-embeddings-v2-base-code:latest", "prompt": "test"}'

Installation Summary

Component | Recommended      | Alternative
Qdrant    | Docker           | Native binary
Ollama    | Native (for GPU) | Docker (CPU-only on Mac)

Recommended setup for MacBook:

  • Qdrant: Docker (simple, isolated)
  • Ollama: Native (GPU acceleration)

Recommended setup for Linux:

  • Qdrant: Docker (simple)
  • Ollama: Native or Docker with GPU (both support GPU)

All-in-Docker setup:

  • Best for: Quick testing, CPU-only, remote servers
  • Trade-off: No GPU acceleration on macOS

When to Configure

Configure environment variables when:

  • Using remote Qdrant or remote GPU server for embeddings
  • Switching to cloud providers (OpenAI, Cohere, Voyage AI)
  • Performance tuning for large codebases (1M+ LOC)
  • Enabling git enrichment (authorship, churn, bug-fix rates)

Essential Environment Variables

Set these in your MCP server configuration when deviating from defaults:

Connection URLs

Variable           | Default                | When to change
QDRANT_URL         | http://localhost:6333  | Qdrant on remote server or custom port
EMBEDDING_BASE_URL | http://localhost:11434 | Ollama on remote GPU server or custom port

Example: Remote GPU server

claude mcp add tea-rags -s user \
  -e QDRANT_URL=http://192.168.1.100:6333 \
  -e EMBEDDING_BASE_URL=http://192.168.1.100:11434 \
  -- node /path/to/tea-rags-mcp/build/index.js

Embedding Provider

Variable           | Default                                          | Options
EMBEDDING_PROVIDER | ollama                                           | ollama, openai, cohere, voyage
EMBEDDING_MODEL    | unclemusclez/jina-embeddings-v2-base-code:latest | Provider-specific model name

Example: OpenAI

claude mcp add tea-rags -s user \
  -e EMBEDDING_PROVIDER=openai \
  -e EMBEDDING_MODEL=text-embedding-3-small \
  -e OPENAI_API_KEY=sk-... \
  -- node /path/to/tea-rags-mcp/build/index.js

Git Enrichment

Variable                 | Default | Purpose
CODE_ENABLE_GIT_METADATA | false   | Enable git blame analysis (authorship, churn, task IDs)

Enable trajectory enrichment:

claude mcp add tea-rags -s user \
  -e CODE_ENABLE_GIT_METADATA=true \
  -- node /path/to/tea-rags-mcp/build/index.js
Tip: Git enrichment runs concurrently with embedding and does not increase indexing time. See Git Enrichments for the full list of signals and advanced usage.

Performance Tuning

Variable                 | Default | When to change
EMBEDDING_BATCH_SIZE     | 1024    | Tune via npm run tune for your hardware
EMBEDDING_CONCURRENCY    | 1       | Increase for remote GPU (2–4 typical)
QDRANT_UPSERT_BATCH_SIZE | 100     | Tune via npm run tune

Auto-tune for your setup:

npm run tune

Generates tuned_environment_variables.env with settings tuned to your hardware in about 60 seconds.
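
One way to apply the generated file is to export its values into your shell and then pass them on via -e flags. The file contents below are a placeholder stand-in; your real values come from npm run tune:

```shell
# Stand-in for the generated file (real values come from `npm run tune`)
cat > tuned_environment_variables.env <<'EOF'
EMBEDDING_BATCH_SIZE=256
EMBEDDING_CONCURRENCY=2
QDRANT_UPSERT_BATCH_SIZE=384
EOF

set -a                                # auto-export every variable assigned below
. ./tuned_environment_variables.env
set +a

echo "$EMBEDDING_BATCH_SIZE"          # value is now available for -e flags
```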

Info: For detailed benchmarks, batch size optimization, and hardware-specific recommendations, see the Performance Tuning Guide.

Configuration Workflow

1. Start with defaults

# Install and run with zero configuration
npm install
npm run build
claude mcp add tea-rags -s user -- node /path/to/tea-rags-mcp/build/index.js

2. Configure only what you need

# Remote Qdrant example
claude mcp add tea-rags -s user \
  -e QDRANT_URL=http://gpu-server:6333 \
  -e EMBEDDING_BASE_URL=http://gpu-server:11434 \
  -- node /path/to/tea-rags-mcp/build/index.js

3. Performance tune (optional)

# Auto-tune for your hardware
QDRANT_URL=http://gpu-server:6333 \
EMBEDDING_BASE_URL=http://gpu-server:11434 \
npm run tune

# Apply tuned values
claude mcp add tea-rags -s user \
  -e QDRANT_URL=http://gpu-server:6333 \
  -e EMBEDDING_BASE_URL=http://gpu-server:11434 \
  -e EMBEDDING_BATCH_SIZE=256 \
  -e EMBEDDING_CONCURRENCY=4 \
  -e QDRANT_UPSERT_BATCH_SIZE=384 \
  -- node /path/to/tea-rags-mcp/build/index.js

Quick Reference: Common Setups

Local MacBook (default)
# No configuration needed — defaults work out of the box
claude mcp add tea-rags -s user -- node /path/to/tea-rags-mcp/build/index.js
Remote GPU Server
claude mcp add tea-rags -s user \
  -e QDRANT_URL=http://192.168.1.100:6333 \
  -e EMBEDDING_BASE_URL=http://192.168.1.100:11434 \
  -e EMBEDDING_CONCURRENCY=4 \
  -- node /path/to/tea-rags-mcp/build/index.js
OpenAI Embeddings
claude mcp add tea-rags -s user \
  -e EMBEDDING_PROVIDER=openai \
  -e EMBEDDING_MODEL=text-embedding-3-small \
  -e OPENAI_API_KEY=sk-... \
  -- node /path/to/tea-rags-mcp/build/index.js
Production with Git Enrichment
claude mcp add tea-rags -s user \
  -e CODE_ENABLE_GIT_METADATA=true \
  -- node /path/to/tea-rags-mcp/build/index.js

Next Steps