The /v1/embeddings endpoint converts one or more text strings into dense vector representations (embeddings). You can use these vectors to find semantically similar content, build retrieval-augmented generation (RAG) pipelines, cluster documents, or train classifiers. OpenOpen8 is compatible with the OpenAI Embeddings API format, so any OpenAI-compatible embedding client works without modification.

POST /v1/embeddings

Request body

model (string, required)
The embedding model to use. For example, text-embedding-3-small, text-embedding-3-large, or text-embedding-ada-002. The available models depend on your configured channels.
input (string | string[], required)
The text to embed. Can be a single string or an array of strings. Each string is embedded independently. Arrays are useful for batch embedding multiple documents in one request.
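For a large corpus, a common client-side pattern is to split the documents into fixed-size batches and send one request per batch, preserving input order. A minimal sketch (the batch size of 100 and the `embed_batch` callback are illustrative, not a documented limit of the API):

```python
from typing import Callable, Iterable

def batched(items: list[str], size: int) -> Iterable[list[str]]:
    """Yield successive fixed-size batches from a list of input strings."""
    for start in range(0, len(items), size):
        yield items[start:start + size]

def embed_corpus(docs: list[str],
                 embed_batch: Callable[[list[str]], list[list[float]]],
                 batch_size: int = 100) -> list[list[float]]:
    """Embed a corpus batch by batch; embed_batch wraps one POST /v1/embeddings call."""
    vectors: list[list[float]] = []
    for batch in batched(docs, batch_size):
        vectors.extend(embed_batch(batch))
    return vectors
```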
encoding_format (string, optional)
The format of the returned embedding vectors. float returns an array of floating-point numbers; base64 returns a base64-encoded binary string. Defaults to float.
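With encoding_format set to base64, the vector arrives as a base64 string of packed little-endian float32 values (this matches the OpenAI convention; verify against the model you use). A decoding sketch:

```python
import base64
import struct

def decode_embedding(b64: str) -> list[float]:
    """Decode a base64-encoded embedding into a list of floats.
    Assumes packed little-endian float32 values, as in the OpenAI format."""
    raw = base64.b64decode(b64)
    count = len(raw) // 4  # 4 bytes per float32
    return list(struct.unpack(f"<{count}f", raw))
```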
dimensions (integer, optional)
The number of dimensions for the output embedding vector. Supported only by certain models (e.g., text-embedding-3-small and text-embedding-3-large). Truncates the embedding to the specified length.
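For models that do not accept the dimensions parameter, you can shorten a returned vector client-side. After truncation, renormalize so cosine similarity still behaves as expected; the renormalization step is our recommendation, not part of the API:

```python
import math

def shorten(embedding: list[float], dim: int) -> list[float]:
    """Truncate an embedding to `dim` components and L2-normalize the result."""
    head = embedding[:dim]
    norm = math.sqrt(sum(x * x for x in head))
    return [x / norm for x in head] if norm > 0 else head
```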
user (string, optional)
An optional identifier for the end user making the request. Used for monitoring and abuse detection on the provider side.

Response

object (string)
Always "list".
model (string)
The model that generated the embeddings.
data (object[])
An array of embedding objects, one per input string, in the same order as the input. Following the OpenAI format, each object contains the vector (embedding), its position in the input array (index), and an object field set to "embedding".
usage (object)
Token usage for the request, with prompt_tokens and total_tokens counts.
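A successful response follows the OpenAI-compatible shape. The sketch below extracts the vectors in input order from a sample payload; the field values are illustrative, not real model output:

```python
# Sample response payload in the OpenAI-compatible shape (values are illustrative).
response = {
    "object": "list",
    "model": "text-embedding-3-small",
    "data": [
        {"object": "embedding", "index": 0, "embedding": [0.01, -0.02, 0.03]},
        {"object": "embedding", "index": 1, "embedding": [0.04, 0.00, -0.05]},
    ],
    "usage": {"prompt_tokens": 14, "total_tokens": 14},
}

# Sort by index so the vectors line up with the input array.
vectors = [item["embedding"]
           for item in sorted(response["data"], key=lambda d: d["index"])]
```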

Examples

curl https://openopen8.ai/v1/embeddings \
  -H "Authorization: Bearer YOUR_TOKEN" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "text-embedding-3-small",
    "input": [
      "OpenOpen8 is a unified AI gateway.",
      "You can embed multiple strings in one request."
    ],
    "encoding_format": "float"
  }'

Common use cases

  • Semantic search — embed your document corpus and a user query, then rank documents by cosine similarity to the query vector.
  • Retrieval-augmented generation (RAG) — retrieve the most relevant chunks from a knowledge base before passing them to a language model.
  • Clustering — group semantically related documents without labeled training data.
  • Classification — use embedding vectors as features for downstream classifiers.
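The semantic-search use case above boils down to ranking document vectors by cosine similarity to a query vector. A minimal sketch in plain Python; the vectors in the test are stand-ins for real embeddings:

```python
import math

def cosine(a: list[float], b: list[float]) -> float:
    """Cosine similarity between two vectors of equal length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))
    return dot / (norm_a * norm_b)

def rank(query: list[float],
         docs: dict[str, list[float]]) -> list[tuple[str, float]]:
    """Rank documents by similarity to the query, most similar first."""
    return sorted(((doc_id, cosine(query, vec)) for doc_id, vec in docs.items()),
                  key=lambda pair: pair[1], reverse=True)
```

Embeddings from the same model are directly comparable this way; never compare vectors produced by different embedding models.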