POST /v1/rerank

The /v1/rerank endpoint takes a query string and a list of candidate documents, and returns the documents ranked by relevance to the query. Reranking is typically used as a second pass after a fast initial retrieval step (for example, a vector similarity search in a RAG pipeline) to improve the quality of the top results passed to a language model. OpenOpen8's rerank endpoint is compatible with both the Cohere Rerank API format and the Jina Rerank API format.

Request body

model

string

required

The reranking model to use. For example, rerank-english-v3.0 (Cohere) or jina-reranker-v2-base-multilingual (Jina). The available models depend on your configured channels.

query

string

required

The search query to rank the documents against.

documents

string[] | object[]

required

The list of documents to rank. Each element can be a plain string, or an object with a text field containing the document text.

top_n

integer

The number of top-ranked results to return. If omitted, all documents are returned sorted by relevance score.

return_documents

boolean

If true, each result includes the original document text in addition to the index and relevance score. Defaults to false.

max_chunk_per_doc

integer

Maximum number of chunks per document when the provider splits long documents internally.

overlap_tokens

integer

Number of overlapping tokens between chunks when the provider splits long documents.
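
To show how the request fields fit together, here is a minimal Python sketch that assembles a request body. The helper name build_rerank_body is hypothetical and not part of any SDK. Leaving unset optional fields out of the body, rather than sending null, is an assumption about what the server accepts.

```python
from typing import Any, Dict, Optional, Sequence, Union

# A document is either a plain string or an object with a "text" field.
Document = Union[str, Dict[str, Any]]


def build_rerank_body(
    model: str,
    query: str,
    documents: Sequence[Document],
    top_n: Optional[int] = None,
    return_documents: Optional[bool] = None,
    max_chunk_per_doc: Optional[int] = None,
    overlap_tokens: Optional[int] = None,
) -> Dict[str, Any]:
    """Assemble a /v1/rerank request body (hypothetical helper)."""
    # Reject anything that is neither a string nor an object with a text field.
    for position, doc in enumerate(documents):
        is_text_object = isinstance(doc, dict) and isinstance(doc.get("text"), str)
        if not (isinstance(doc, str) or is_text_object):
            raise TypeError(
                f"documents[{position}] must be a string or an object with a text field"
            )

    body: Dict[str, Any] = {
        "model": model,
        "query": query,
        "documents": list(documents),
    }

    optional_fields = {
        "top_n": top_n,
        "return_documents": return_documents,
        "max_chunk_per_doc": max_chunk_per_doc,
        "overlap_tokens": overlap_tokens,
    }
    # Assumption: unset options are omitted so the server applies its defaults.
    body.update({key: value for key, value in optional_fields.items() if value is not None})
    return body
```

Calling build_rerank_body("rerank-english-v3.0", "capital of France", ["Paris", {"text": "Berlin"}], top_n=1) yields a body with only model, query, documents, and top_n set.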

Response

results

object[]

Ranked list of document results, ordered from most to least relevant.

Properties of each result object:

index

integer

The zero-based index of this document in the original documents array.

relevance_score

number

A score between 0.0 and 1.0 indicating how relevant this document is to the query. Higher is more relevant.

document

object

The original document, only present when return_documents is true.

Properties of document:

text

string

The document text.

usage

object

Token usage for the request.

Properties of usage:

prompt_tokens

integer

Number of tokens consumed by the query and documents.

total_tokens

integer

Total tokens processed.
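
Because each result carries an index into the original documents array, a client can recover the ranked texts even when return_documents is left at false. The Python sketch below illustrates that lookup. The function name ranked_texts is hypothetical, and the response dictionary is a hand-written sample in the documented shape, not output from a live call.

```python
from typing import Any, Dict, List, Sequence, Tuple, Union

Document = Union[str, Dict[str, Any]]


def ranked_texts(
    documents: Sequence[Document], response: Dict[str, Any]
) -> List[Tuple[float, str]]:
    """Pair each relevance score with the text of the document it refers to."""
    ranked = []
    for result in response["results"]:
        # index is zero-based and points into the documents list that was sent.
        original = documents[result["index"]]
        text = original if isinstance(original, str) else original["text"]
        ranked.append((result["relevance_score"], text))
    return ranked


documents = [
    "Paris is the capital and largest city of France.",
    {"text": "The Eiffel Tower is located in Paris."},
    "Berlin is the capital of Germany.",
    "France has a population of over 67 million.",
]

# Hand-written sample response (return_documents omitted, so no document field).
response = {
    "results": [
        {"index": 0, "relevance_score": 0.9921},
        {"index": 1, "relevance_score": 0.4103},
    ],
    "usage": {"prompt_tokens": 62, "total_tokens": 62},
}

for score, text in ranked_texts(documents, response):
    print(f"{score:.4f}  {text}")
```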

Example

curl https://openopen8.ai/v1/rerank \
  -H "Authorization: Bearer YOUR_TOKEN" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "rerank-english-v3.0",
    "query": "What is the capital of France?",
    "documents": [
      "Paris is the capital and largest city of France.",
      "The Eiffel Tower is located in Paris.",
      "Berlin is the capital of Germany.",
      "France has a population of over 67 million."
    ],
    "top_n": 2,
    "return_documents": true
  }'
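
For readers who prefer Python, the same request as the curl command above could be sketched with only the standard library. The function names make_rerank_request and send are hypothetical, and the 30-second timeout is an arbitrary assumption. The sketch builds the request object without sending it, so nothing touches the network until send is called.

```python
import json
import urllib.request

API_URL = "https://openopen8.ai/v1/rerank"


def make_rerank_request(token: str, body: dict) -> urllib.request.Request:
    """Build the POST request; nothing is sent until it is passed to send()."""
    return urllib.request.Request(
        API_URL,
        data=json.dumps(body).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {token}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


def send(request: urllib.request.Request, timeout: float = 30.0) -> dict:
    """Perform the HTTP call and decode the JSON response body."""
    # Assumption: a 30-second timeout; adjust to suit your deployment.
    with urllib.request.urlopen(request, timeout=timeout) as resp:
        return json.load(resp)


body = {
    "model": "rerank-english-v3.0",
    "query": "What is the capital of France?",
    "documents": [
        "Paris is the capital and largest city of France.",
        "The Eiffel Tower is located in Paris.",
        "Berlin is the capital of Germany.",
        "France has a population of over 67 million.",
    ],
    "top_n": 2,
    "return_documents": True,
}

request = make_rerank_request("YOUR_TOKEN", body)
# send(request) would perform the network call; replace YOUR_TOKEN first.
```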

Example response

{
  "results": [
    {
      "index": 0,
      "relevance_score": 0.9921,
      "document": { "text": "Paris is the capital and largest city of France." }
    },
    {
      "index": 1,
      "relevance_score": 0.4103,
      "document": { "text": "The Eiffel Tower is located in Paris." }
    }
  ],
  "usage": {
    "prompt_tokens": 62,
    "total_tokens": 62
  }
}

Supported providers

Provider	Example models
Cohere	`rerank-english-v3.0`, `rerank-multilingual-v3.0`
Jina	`jina-reranker-v2-base-multilingual`, `jina-reranker-v1-base-en`
