Cohere

Cloud embedding provider using the Cohere Embed API. Strong multilingual support and competitive pricing.


Type	Cloud
Price	🟡 Pay-per-use ($0.10/1M tokens)
Scale*	~1M LoC
Default model	`embed-english-v3.0`
Dimensions	1024
URL	cohere.com

* Estimated lines of code for initial full indexing within 45 minutes based on default rate limits. Incremental reindexing is fast.

Key Features

Multilingual embeddings — 100+ languages in a single model
Input type awareness — separate embeddings for documents vs queries
Batch API — up to 96 texts per request
Light models — 384-dimension variants for resource-constrained setups
Built-in rate limiting — automatic retry with exponential backoff
2,000 inputs/min — rate limit applies to total inputs, not requests

Setup

1. Get an API key

2. Configure TeaRAGs

export EMBEDDING_PROVIDER=cohere
export COHERE_API_KEY=...

Configuration

{
  "mcpServers": {
    "tea-rags": {
      "command": "node",
      "args": ["/path/to/tea-rags/build/index.js"],
      "env": {
        "EMBEDDING_PROVIDER": "cohere",
        "COHERE_API_KEY": "..."
      }
    }
  }
}

tip

QDRANT_URL is not needed — Qdrant is built-in and starts automatically. Add it only if using external Qdrant.

Optional variables:

Variable	Description	Default
`EMBEDDING_MODEL`	Cohere model name	`embed-english-v3.0`
`EMBEDDING_DIMENSIONS`	Vector dimensions	`1024` (auto-detected)
`EMBEDDING_TUNE_BATCH_SIZE`	Texts per embedding batch	`96`
`EMBEDDING_TUNE_MAX_REQUESTS_PER_MINUTE`	RPM limit for rate limiter	`100`

Available Models

Model	Dimensions	Notes
`embed-english-v3.0`	1024	Default. Best quality for English
`embed-multilingual-v3.0`	1024	100+ languages
`embed-english-light-v3.0`	384	Lightweight, faster
`embed-multilingual-light-v3.0`	384	Lightweight, multilingual

Rate Limits

	Trial	Production
Embed inputs/min	2,000	2,000
Max texts/request	96	96
EmbedJob RPM	5	50

Cohere limits by inputs per minute (total texts across all requests), not by RPM. With 2,000 inputs/min and ~625 tokens per code chunk, the 45-minute throughput is ~90k chunks ≈ ~1M LoC. Contact support@cohere.com for higher limits.

When to Use

Codebases with multilingual content (comments, docs, variable names in multiple languages)
Teams that need both code search and documentation search across languages
Projects where 1024 dimensions is a good balance between quality and storage

Key Features​

Setup​

1. Get an API key​

2. Configure TeaRAGs​

Configuration​

Available Models​

Rate Limits​

When to Use​