
OpenAI

Cloud embedding provider using the OpenAI Embeddings API. High quality, easy setup, widely adopted.

Type: Cloud
Price: 🟡 Pay-per-use ($0.02/1M tokens)
Scale*: ~800k–8M LoC (depends on API tier)
Default model: text-embedding-3-small
Dimensions: 1536
URL: platform.openai.com

* Estimated lines of code for initial full indexing within 45 minutes. Incremental reindexing is fast. Scale depends on your OpenAI API tier (TPM limits).

Key Features

  • High embedding quality — state-of-the-art models
  • Batch API — up to 2048 texts per request
  • Flexible dimensions — text-embedding-3-* models support custom dimension reduction
  • Built-in rate limiting — automatic retry with exponential backoff and Retry-After header support
  • 8,191 tokens per input — handles large code chunks without truncation
  • Familiar API — if you already have an OpenAI key, you're ready to go
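TeaRAGs' built-in retry logic isn't shown in its docs; a generic sketch of the pattern named above (exponential backoff with jitter, honoring Retry-After when the server sends it) might look like this — `RateLimitError` and its `retry_after` field are illustrative stand-ins, not the real SDK types:

```python
import random
import time

class RateLimitError(Exception):
    """Stand-in for an HTTP 429 error; `retry_after` mirrors the Retry-After header."""
    def __init__(self, retry_after=None):
        self.retry_after = retry_after

def with_backoff(call, max_retries=5):
    """Retry `call` on rate-limit errors. Prefer the server's Retry-After delay;
    otherwise back off exponentially with jitter, capped at 60 s."""
    for attempt in range(max_retries):
        try:
            return call()
        except RateLimitError as e:
            delay = e.retry_after if e.retry_after is not None else min(2 ** attempt + random.random(), 60)
            time.sleep(delay)
    return call()  # final attempt; a persistent error propagates to the caller
```

In practice `call` would be the embedding request for one batch of texts.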

Setup

1. Get an API key

Go to platform.openai.com/api-keys and create a new key.

2. Configure TeaRAGs

```bash
export EMBEDDING_PROVIDER=openai
export OPENAI_API_KEY=sk-...
```

Configuration

```json
{
  "mcpServers": {
    "tea-rags": {
      "command": "node",
      "args": ["/path/to/tea-rags/build/index.js"],
      "env": {
        "EMBEDDING_PROVIDER": "openai",
        "OPENAI_API_KEY": "sk-..."
      }
    }
  }
}
```
Tip: QDRANT_URL is not needed — Qdrant is built-in and starts automatically. Add it only if using an external Qdrant.

Optional variables:

| Variable | Description | Default |
|---|---|---|
| EMBEDDING_MODEL | OpenAI model name | text-embedding-3-small |
| EMBEDDING_DIMENSIONS | Vector dimensions (supports reduction) | 1536 (auto-detected) |
| EMBEDDING_TUNE_BATCH_SIZE | Texts per embedding batch | 2048 |
| EMBEDDING_TUNE_MAX_REQUESTS_PER_MINUTE | RPM limit for rate limiter | 3500 |
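To illustrate what EMBEDDING_TUNE_MAX_REQUESTS_PER_MINUTE controls, here is a minimal request pacer that spaces requests at least 60/RPM seconds apart — a sketch of the general technique, not TeaRAGs' actual implementation:

```python
import time

class RequestPacer:
    """Space successive requests at least 60/rpm seconds apart (illustrative only)."""
    def __init__(self, rpm):
        self.interval = 60.0 / rpm  # minimum gap between requests, in seconds
        self.next_ok = 0.0
    def wait(self):
        now = time.monotonic()
        if now < self.next_ok:
            time.sleep(self.next_ok - now)
            now = self.next_ok
        self.next_ok = now + self.interval

pacer = RequestPacer(rpm=3500)
print(round(pacer.interval * 1000, 2))  # ~17.14 ms between requests at the default RPM
```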

Available Models

| Model | Dimensions | Price | Notes |
|---|---|---|---|
| text-embedding-3-small | 1536 | $0.02/1M tokens | Default. Best price/quality ratio |
| text-embedding-3-large | 3072 | $0.13/1M tokens | Highest quality |
| text-embedding-ada-002 | 1536 | $0.10/1M tokens | Legacy, not recommended for new projects |

Dimension Reduction

text-embedding-3-* models support reducing dimensions without retraining. Lower dimensions mean smaller vectors, faster search, and less Qdrant storage:

```bash
EMBEDDING_MODEL=text-embedding-3-small
EMBEDDING_DIMENSIONS=512   # reduced from 1536
```
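Setting EMBEDDING_DIMENSIONS asks the API to return the shortened vectors directly. Per OpenAI's documentation, this is roughly equivalent to truncating the full embedding and re-normalizing it to unit length — a sketch of that equivalence:

```python
import math

def shorten(embedding, dims):
    """Truncate an embedding to `dims` values and re-normalize to unit length —
    roughly what the `dimensions` parameter does server-side for text-embedding-3-*."""
    cut = embedding[:dims]
    norm = math.sqrt(sum(x * x for x in cut))
    return [x / norm for x in cut]

vec = shorten([3.0, 4.0, 12.0], 2)  # toy 3-d "embedding", kept to 2 dims
print(vec)                          # [0.6, 0.8] — unit length again
```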

Rate Limits by Tier

The throughput bottleneck for OpenAI is TPM (tokens per minute), not RPM. Each request can batch up to 2048 texts, so RPM is rarely the limit.
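A sketch of why batching keeps RPM low: splitting a repo's chunks into request-sized batches (2048 is the API's documented per-request input cap; the chunk data here is made up):

```python
BATCH_SIZE = 2048  # max inputs per OpenAI embeddings request

def to_batches(texts, size=BATCH_SIZE):
    """Split texts into batches, each small enough for one embeddings request."""
    return [texts[i:i + size] for i in range(0, len(texts), size)]

chunks = [f"chunk-{i}" for i in range(5000)]   # hypothetical indexed code chunks
batches = to_batches(chunks)
print(len(batches), len(batches[-1]))          # 3 904 — 5000 chunks need only 3 requests
```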

| Tier | Min. spend to unlock | RPM | TPM | Scale in 45 min* |
|---|---|---|---|---|
| Free | — | 500 | 150k | ~120k LoC |
| Tier 1 | $5 | 500 | 1M | ~800k LoC |
| Tier 2 | $50 | 500 | 1M | ~800k LoC |
| Tier 3 | $100 | 5,000 | 5M | ~3.9M LoC |
| Tier 4 | $250 | 5,000 | 5M | ~3.9M LoC |
| Tier 5 | $1,000 | 10,000 | 10M | ~7.8M LoC |

* Based on average code chunk of ~625 tokens.
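As a worked example of the footnote's arithmetic (the LoC-per-chunk figure in the note below is inferred from the table, not stated anywhere):

```python
AVG_TOKENS_PER_CHUNK = 625  # from the footnote above
WINDOW_MINUTES = 45

def chunks_in_window(tpm):
    """How many ~625-token chunks fit through a given TPM budget in 45 minutes."""
    return tpm * WINDOW_MINUTES // AVG_TOKENS_PER_CHUNK

print(chunks_in_window(1_000_000))  # 72000 chunks — Tier 1
print(chunks_in_window(150_000))    # 10800 chunks — Free tier
```

At Tier 1, 72,000 chunks mapping to ~800k LoC implies roughly 11 lines of code per chunk, which matches the other rows of the table.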