Cohere

Cloud embedding provider using the Cohere Embed API. Strong multilingual support and competitive pricing.

Type: Cloud
Price: 🟡 Pay-per-use ($0.10/1M tokens)
Scale*: ~1M LoC
Default model: embed-english-v3.0
Dimensions: 1024
URL: cohere.com

* Estimated lines of code for initial full indexing within 45 minutes based on default rate limits. Incremental reindexing is fast.

Key Features

  • Multilingual embeddings — 100+ languages in a single model
  • Input type awareness — separate embeddings for documents vs queries
  • Batch API — up to 96 texts per request
  • Light models — 384-dimension variants for resource-constrained setups
  • Built-in rate limiting — automatic retry with exponential backoff
  • 2,000 inputs/min — rate limit applies to total inputs, not requests
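
The batching and input-type features above can be sketched as follows. This is a minimal illustration, not TeaRAGs' actual implementation: `chunked` and `build_embed_payloads` are hypothetical helpers, and the payload fields (`texts`, `model`, and `input_type` with the values `search_document` and `search_query`) are assumed to match Cohere's public Embed API request format, so check the current API reference before relying on them. No network call is made.

```python
from typing import Iterator

MAX_TEXTS_PER_REQUEST = 96  # batch limit stated on this page


def chunked(texts: list[str], size: int = MAX_TEXTS_PER_REQUEST) -> Iterator[list[str]]:
    """Yield consecutive slices of at most `size` texts."""
    if size < 1:
        raise ValueError("size must be at least 1")
    for start in range(0, len(texts), size):
        yield texts[start:start + size]


def build_embed_payloads(texts: list[str], *, is_query: bool = False,
                         model: str = "embed-english-v3.0") -> list[dict]:
    """Build one request body per batch (hypothetical helper, no network calls)."""
    # Documents and queries get different input types so they are embedded differently.
    input_type = "search_query" if is_query else "search_document"
    return [
        {"model": model, "texts": batch, "input_type": input_type}
        for batch in chunked(texts)
    ]


payloads = build_embed_payloads([f"chunk {i}" for i in range(200)])
print([len(p["texts"]) for p in payloads])  # [96, 96, 8]
```

Because the rate limit counts inputs rather than requests, the three requests above consume 200 inputs of the per-minute budget, the same as 200 single-text requests would.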

Setup

1. Get an API key

Sign up at dashboard.cohere.com and create an API key.

2. Configure TeaRAGs

export EMBEDDING_PROVIDER=cohere
export COHERE_API_KEY=...

Configuration

{
  "mcpServers": {
    "tea-rags": {
      "command": "node",
      "args": ["/path/to/tea-rags/build/index.js"],
      "env": {
        "EMBEDDING_PROVIDER": "cohere",
        "COHERE_API_KEY": "..."
      }
    }
  }
}
Tip: QDRANT_URL is not needed because Qdrant is built in and starts automatically. Set it only when using an external Qdrant instance.

Optional variables:

| Variable | Description | Default |
|---|---|---|
| EMBEDDING_MODEL | Cohere model name | embed-english-v3.0 |
| EMBEDDING_DIMENSIONS | Vector dimensions | 1024 (auto-detected) |
| EMBEDDING_TUNE_BATCH_SIZE | Texts per embedding batch | 96 |
| EMBEDDING_TUNE_MAX_REQUESTS_PER_MINUTE | RPM limit for rate limiter | 100 |
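
To make the fallback behaviour of these variables concrete, here is a minimal sketch of how a client could resolve them. `CohereSettings` and `load_settings` are hypothetical names used only for illustration; this is not how TeaRAGs itself reads its configuration, and treating an unset `EMBEDDING_DIMENSIONS` as "auto-detect" is an assumption based on the table.

```python
import os
from dataclasses import dataclass
from typing import Mapping, Optional


@dataclass(frozen=True)
class CohereSettings:
    model: str
    dimensions: Optional[int]  # None means "leave it to auto-detection" (assumption)
    batch_size: int
    max_requests_per_minute: int


def load_settings(env: Mapping[str, str] = os.environ) -> CohereSettings:
    """Read the optional variables, falling back to the documented defaults."""
    dims = env.get("EMBEDDING_DIMENSIONS")
    return CohereSettings(
        model=env.get("EMBEDDING_MODEL", "embed-english-v3.0"),
        dimensions=int(dims) if dims else None,
        batch_size=int(env.get("EMBEDDING_TUNE_BATCH_SIZE", "96")),
        max_requests_per_minute=int(
            env.get("EMBEDDING_TUNE_MAX_REQUESTS_PER_MINUTE", "100")
        ),
    )


# Only the model is overridden; everything else keeps its default.
settings = load_settings({"EMBEDDING_MODEL": "embed-multilingual-v3.0"})
print(settings.batch_size)  # 96
```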

Available Models

| Model | Dimensions | Notes |
|---|---|---|
| embed-english-v3.0 | 1024 | Default. Best quality for English |
| embed-multilingual-v3.0 | 1024 | 100+ languages |
| embed-english-light-v3.0 | 384 | Lightweight, faster |
| embed-multilingual-light-v3.0 | 384 | Lightweight, multilingual |
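
The practical difference between the 1024- and 384-dimension models shows up in vector storage. The sketch below estimates the raw vector size for 90,000 chunks (the 45-minute figure from the Rate Limits section). It assumes uncompressed float32 vectors and ignores index and metadata overhead, so real Qdrant usage will differ; `raw_vector_megabytes` is a hypothetical helper.

```python
MODEL_DIMENSIONS = {
    "embed-english-v3.0": 1024,
    "embed-multilingual-v3.0": 1024,
    "embed-english-light-v3.0": 384,
    "embed-multilingual-light-v3.0": 384,
}

BYTES_PER_FLOAT32 = 4  # assumption: vectors stored uncompressed as float32


def raw_vector_megabytes(model: str, chunk_count: int) -> float:
    """Raw vector payload in MB (10^6 bytes), ignoring index and metadata overhead."""
    return MODEL_DIMENSIONS[model] * BYTES_PER_FLOAT32 * chunk_count / 1_000_000


for name in ("embed-english-v3.0", "embed-english-light-v3.0"):
    # Roughly 369 MB for the 1024-dimension model and 138 MB for the light one.
    print(name, raw_vector_megabytes(name, 90_000))
```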

Rate Limits

| Limit | Trial | Production |
|---|---|---|
| Embed inputs/min | 2,000 | 2,000 |
| Max texts/request | 96 | 96 |
| EmbedJob RPM | 5 | 50 |

Cohere rate-limits embedding by inputs per minute (total texts across all requests), not by requests per minute (RPM). At 2,000 inputs/min, 45 minutes of indexing covers about 90k chunks (2,000 × 45); at ~625 tokens per code chunk, that is roughly 1M LoC. Contact support@cohere.com for higher limits.
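
The arithmetic behind that estimate can be written out as a short worked example. All inputs come from this page; the token total, the cost figure, and the implied lines per chunk are derived here for illustration and are approximations, not published numbers.

```python
INPUTS_PER_MINUTE = 2_000         # from the rate-limit table
WINDOW_MINUTES = 45               # indexing window used for the Scale estimate
TOKENS_PER_CHUNK = 625            # approximate average quoted in the text
PRICE_PER_MILLION_TOKENS = 0.10   # USD, from the pricing row
MAX_TEXTS_PER_REQUEST = 96
ESTIMATED_LOC = 1_000_000         # the ~1M LoC Scale figure

chunks = INPUTS_PER_MINUTE * WINDOW_MINUTES                 # 90,000 chunks
tokens = chunks * TOKENS_PER_CHUNK                          # 56,250,000 tokens
cost_usd = tokens / 1_000_000 * PRICE_PER_MILLION_TOKENS    # about 5.6 USD
# Full 96-text batches that fit in one minute before the input cap is reached.
full_batches_per_minute = INPUTS_PER_MINUTE // MAX_TEXTS_PER_REQUEST
# Lines of code per chunk implied by the two estimates (about 11).
implied_loc_per_chunk = ESTIMATED_LOC / chunks

print(chunks, tokens, full_batches_per_minute)  # 90000 56250000 20
```

With full batches, about 20 requests per minute are enough to reach the input cap, which is why the limit is better reasoned about in inputs than in requests.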

When to Use

  • Codebases with multilingual content (comments, docs, variable names in multiple languages)
  • Teams that need both code search and documentation search across languages
  • Projects where 1024 dimensions is a good balance between quality and storage