Voyage AI

Cloud embedding provider that uses the Voyage AI Embeddings API and offers specialized models for code search.

Type: Cloud
Price: 🟡 Pay-per-use ($0.12/1M tokens)
Scale*: ~2.4M LoC
Default model: voyage-2
Dimensions: 1024
URL: voyageai.com

* Estimated lines of code for initial full indexing within 45 minutes based on default rate limits (300 RPM). Incremental reindexing is fast.

Key Features​

  • Code-specialized models: voyage-code-3 (recommended) and the legacy voyage-code-2 are specialized for source code
  • High throughput: 2,000 RPM and 3M TPM on Tier 1
  • Batch API: up to 1,000 texts per request, with a per-request token limit of 120k to 1M depending on the model
  • Custom base URL: supports self-hosted or proxy deployments
  • Input type awareness: separate modes for documents and queries
  • Built-in rate limiting: automatic retry with exponential backoff
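
The retry behavior named in the last bullet can be illustrated with a minimal sketch. This is not TeaRAGs' actual implementation: `RateLimitError`, `backoff_delays`, `call_with_backoff`, and the default base delay, cap, and jitter are all hypothetical names and values chosen for illustration.

```python
import random
import time
from typing import Callable, List, TypeVar

T = TypeVar("T")


class RateLimitError(Exception):
    """Hypothetical error raised when the API answers with HTTP 429."""


def backoff_delays(max_retries: int, base: float = 1.0, cap: float = 60.0) -> List[float]:
    """Delay before each retry: base * 2**attempt, capped at `cap` seconds."""
    return [min(cap, base * 2 ** attempt) for attempt in range(max_retries)]


def call_with_backoff(
    fn: Callable[[], T],
    max_retries: int = 5,
    base: float = 1.0,
    cap: float = 60.0,
    sleep: Callable[[float], None] = time.sleep,
) -> T:
    """Call `fn`, retrying on RateLimitError with exponentially growing waits."""
    for delay in backoff_delays(max_retries, base, cap):
        try:
            return fn()
        except RateLimitError:
            # Up to 10% random jitter keeps parallel workers from retrying in lockstep.
            sleep(delay + random.uniform(0, delay * 0.1))
    return fn()  # final attempt; any error propagates to the caller
```

Passing `sleep` as a parameter keeps the sketch testable without real waiting; a production limiter would typically also honor any retry hint the server returns.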

Setup​

1. Get an API key​

Sign up at dash.voyageai.com and create an API key.

2. Configure TeaRAGs​

export EMBEDDING_PROVIDER=voyage
export VOYAGE_API_KEY=pa-...

Configuration​

{
  "mcpServers": {
    "tea-rags": {
      "command": "node",
      "args": ["/path/to/tea-rags/build/index.js"],
      "env": {
        "EMBEDDING_PROVIDER": "voyage",
        "VOYAGE_API_KEY": "pa-...",
        "EMBEDDING_MODEL": "voyage-code-3"
      }
    }
  }
}

Tip: QDRANT_URL is not needed because Qdrant is built in and starts automatically. Set it only if you use an external Qdrant instance.

Optional variables:

| Variable | Description | Default |
| --- | --- | --- |
| EMBEDDING_MODEL | Voyage model name | voyage-2 |
| EMBEDDING_DIMENSIONS | Vector dimensions | 1024 (auto-detected) |
| EMBEDDING_TUNE_BATCH_SIZE | Texts per embedding batch | 128 |
| EMBEDDING_BASE_URL | Custom API URL | https://api.voyageai.com/v1 |
| EMBEDDING_TUNE_MAX_REQUESTS_PER_MINUTE | RPM limit for rate limiter | 300 |
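
To make the batch size and base URL settings concrete, the sketch below shows one way these variables could translate into embedding requests. It only builds request descriptions and sends nothing. The `/embeddings` path and the `input`, `model`, and `input_type` fields follow the public Voyage API as I understand it; how TeaRAGs itself assembles requests is an assumption, and `batches` and `build_requests` are hypothetical helper names.

```python
from typing import Dict, Iterator, List

DEFAULT_BASE_URL = "https://api.voyageai.com/v1"


def batches(texts: List[str], size: int) -> Iterator[List[str]]:
    """Yield consecutive slices of at most `size` texts."""
    for start in range(0, len(texts), size):
        yield texts[start:start + size]


def build_requests(texts: List[str], input_type: str, env: Dict[str, str]) -> List[dict]:
    """Describe one request per batch, using the defaults from the table above."""
    base_url = env.get("EMBEDDING_BASE_URL", DEFAULT_BASE_URL).rstrip("/")
    model = env.get("EMBEDDING_MODEL", "voyage-2")
    size = int(env.get("EMBEDDING_TUNE_BATCH_SIZE", "128"))
    return [
        {
            "url": f"{base_url}/embeddings",
            # input_type is "document" when indexing and "query" when searching.
            "json": {"input": batch, "model": model, "input_type": input_type},
        }
        for batch in batches(texts, size)
    ]


if __name__ == "__main__":
    chunks = [f"chunk {i}" for i in range(300)]
    for req in build_requests(chunks, "document", {"EMBEDDING_MODEL": "voyage-code-3"}):
        print(req["url"], len(req["json"]["input"]))
```

With the default batch size of 128, 300 chunks split into three requests of 128, 128, and 44 texts.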

Available Models​

| Model | Dimensions | Notes |
| --- | --- | --- |
| voyage-code-3 | 1024 | Recommended for code. Latest code-specialized model |
| voyage-3-large | 1024 | High quality, general purpose |
| voyage-2 | 1024 | Default. General purpose (legacy) |
| voyage-large-2 | 1536 | Higher quality, larger vectors (legacy) |
| voyage-code-2 | 1536 | Code-specialized (legacy) |
| voyage-lite-02-instruct | 1024 | Lightweight, instruction-tuned (legacy) |
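
Because the vector size differs between models, a collection built with one model may not accept vectors from another. The sketch below encodes the table as a lookup and resolves the dimension for a configured model. It is only an illustration: how TeaRAGs actually auto-detects dimensions is not described here, and `MODEL_DIMENSIONS` and `resolve_dimensions` are hypothetical names.

```python
from typing import Optional

# Dimensions copied from the model table above.
MODEL_DIMENSIONS = {
    "voyage-code-3": 1024,
    "voyage-3-large": 1024,
    "voyage-2": 1024,
    "voyage-large-2": 1536,
    "voyage-code-2": 1536,
    "voyage-lite-02-instruct": 1024,
}


def resolve_dimensions(model: str, override: Optional[str] = None) -> int:
    """Prefer an explicit EMBEDDING_DIMENSIONS value, else look the model up."""
    if override is not None:
        return int(override)
    if model not in MODEL_DIMENSIONS:
        raise ValueError(f"unknown model {model!r}; set EMBEDDING_DIMENSIONS explicitly")
    return MODEL_DIMENSIONS[model]
```

Switching from voyage-code-2 (1536) to voyage-code-3 (1024) changes the resolved value, which is a hint that existing vectors would need to be re-embedded.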

Rate Limits by Tier​

Voyage limits both requests per minute (RPM) and tokens per minute (TPM). The throughput bottleneck is typically TPM. Each request can batch up to 1,000 texts.

Base Limits (voyage-code-3)​

| Tier | Min. spend to unlock | RPM | TPM | Scale in 45 min* |
| --- | --- | --- | --- | --- |
| Tier 1 | none | 2,000 | 3M | ~2.4M LoC |
| Tier 2 | $100 | 4,000 | 6M | ~4.7M LoC |
| Tier 3 | $1,000 | 6,000 | 9M | ~7.1M LoC |

* Based on average code chunk of ~625 tokens.
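
As a rough cross-check of the scale column, the token budget for a 45-minute window can be converted into a chunk count. The sketch assumes the TPM limit is fully used for the whole window, which is an upper bound rather than a measured result; `chunks_in_window` is a hypothetical helper.

```python
AVG_CHUNK_TOKENS = 625   # average chunk size from the footnote above
WINDOW_MINUTES = 45

TIER_TPM = {"Tier 1": 3_000_000, "Tier 2": 6_000_000, "Tier 3": 9_000_000}


def chunks_in_window(tpm: int, minutes: int = WINDOW_MINUTES,
                     avg_chunk_tokens: int = AVG_CHUNK_TOKENS) -> int:
    """Upper bound on chunks embedded if the TPM budget is fully used."""
    return tpm * minutes // avg_chunk_tokens


if __name__ == "__main__":
    for tier, tpm in TIER_TPM.items():
        print(tier, chunks_in_window(tpm))
```

For Tier 1 this gives 3,000,000 × 45 / 625 = 216,000 chunks. Converting chunks to lines of code further depends on how many lines an average chunk covers, which the table does not state.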

TPM by Model Family (Tier 1)​

| Model | TPM | RPM | Max tokens/request |
| --- | --- | --- | --- |
| voyage-code-3, voyage-3-large | 3M | 2,000 | 120k |
| voyage-4, voyage-3.5 | 8M | 2,000 | 320k |
| voyage-4-lite, voyage-3.5-lite | 16M | 2,000 | 1M |

Max texts per request: 1,000

Tier 2 doubles and Tier 3 triples the Tier 1 TPM and RPM values.
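
The interaction between the per-request token cap, the 1,000-text cap, and the per-minute limits can be worked through with a short sketch. It assumes "120k" means 120,000 tokens and reuses the ~625-token average chunk from the tier table; `texts_per_request` and `binding_limit` are hypothetical helpers.

```python
AVG_CHUNK_TOKENS = 625        # average chunk size quoted in the tier table footnote
MAX_TEXTS_PER_REQUEST = 1000


def texts_per_request(max_tokens_per_request: int,
                      avg_chunk_tokens: int = AVG_CHUNK_TOKENS) -> int:
    """Largest batch that respects both the text-count cap and the token cap."""
    return min(MAX_TEXTS_PER_REQUEST, max_tokens_per_request // avg_chunk_tokens)


def binding_limit(tpm: int, rpm: int, max_tokens_per_request: int,
                  avg_chunk_tokens: int = AVG_CHUNK_TOKENS) -> str:
    """Return "TPM" or "RPM", whichever is exhausted first when batches are full."""
    tokens_per_full_request = (
        texts_per_request(max_tokens_per_request, avg_chunk_tokens) * avg_chunk_tokens
    )
    rpm_token_ceiling = rpm * tokens_per_full_request
    return "TPM" if tpm <= rpm_token_ceiling else "RPM"


if __name__ == "__main__":
    for name, tpm, cap in [
        ("voyage-code-3", 3_000_000, 120_000),
        ("voyage-4", 8_000_000, 320_000),
        ("voyage-4-lite", 16_000_000, 1_000_000),
    ]:
        print(name, texts_per_request(cap), binding_limit(tpm, 2_000, cap))
```

For voyage-code-3, the token cap allows at most 192 average-sized chunks per request, so the 1,000-text limit is never reached, and about 25 full requests per minute already consume the 3M TPM budget. That is far below 2,000 RPM, so TPM is the binding limit.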

When to Use​

  • Large codebases where code-specific embedding quality matters
  • Teams that want the best code search quality from a cloud provider
  • Projects where voyage-code-3 outperforms general-purpose models on the project's own codebase
  • Setups requiring a custom base URL (proxy, VPN, self-hosted gateway)