Gemini generateContent and streaming via OpenOpen8
Generate content using Google Gemini-compatible endpoints. Pass your OpenOpen8 token in the x-goog-api-key header or as a query parameter named key.
OpenOpen8 exposes Google Gemini-compatible endpoints at the /v1beta/models/{model} path, matching the format used by the Google AI SDKs and the Gemini REST API. You authenticate with your OpenOpen8 token using the x-goog-api-key header or a key query parameter — no Google credentials needed. OpenOpen8 routes the request to whichever upstream channel is configured for the model you specify in the URL path.
The model name is part of the URL path, not the request body. For example, to use gemini-2.0-flash, send a POST to /v1beta/models/gemini-2.0-flash:generateContent.
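Because the model is addressed through the URL path, a client only needs simple string assembly. A minimal sketch (base URL and token are placeholders):

```python
# The model name goes in the URL path, not the request body.
BASE = "https://openopen8.ai/v1beta"
MODEL = "gemini-2.0-flash"

url = f"{BASE}/models/{MODEL}:generateContent"
stream_url = f"{BASE}/models/{MODEL}:streamGenerateContent"

# Query-parameter auth is an alternative to the x-goog-api-key header:
url_with_key = f"{url}?key=YOUR_TOKEN"
```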
The response's candidates[].content.parts is an array of content parts. Each part has either a text field for text output or a functionCall field for tool invocations. When thinking is enabled, parts with thought: true contain the model's internal reasoning.
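Dispatching on those part types can be sketched as follows; the response object below is mocked, and the function name inside it is illustrative:

```python
# Mocked generateContent response, following the shape described above.
response = {
    "candidates": [{
        "content": {
            "role": "model",
            "parts": [
                # Hypothetical tool invocation for illustration
                {"functionCall": {"name": "get_river_length",
                                  "args": {"river": "Nile"}}},
                {"text": "Looking that up for you."},
            ],
        }
    }]
}

texts, calls = [], []
for part in response["candidates"][0]["content"]["parts"]:
    if "functionCall" in part:
        calls.append(part["functionCall"]["name"])  # tool invocation
    elif "text" in part:
        texts.append(part["text"])                  # plain text output
```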
For streaming, send the request to :streamGenerateContent instead of :generateContent. The server returns a newline-delimited stream of JSON objects, each representing a partial response chunk. Each chunk has the same structure as a non-streaming response, with partial candidates[].content.parts[].text.
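Assembling the final text from such a stream is a matter of parsing each line and concatenating the partial text fields. A sketch with mocked chunks (real code would read lines from the HTTP response instead):

```python
import json

# Two mocked newline-delimited chunks, each shaped like a partial response.
raw_stream = "\n".join([
    json.dumps({"candidates": [{"content": {"parts": [{"text": "The Nile"}]}}]}),
    json.dumps({"candidates": [{"content": {"parts": [{"text": " is the longest river."}]}}]}),
])

text = ""
for line in raw_stream.splitlines():
    if not line.strip():
        continue  # skip blank keep-alive lines
    chunk = json.loads(line)
    for part in chunk["candidates"][0]["content"]["parts"]:
        text += part.get("text", "")
```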
cURL
curl "https://openopen8.ai/v1beta/models/gemini-2.0-flash:generateContent" \
  -H "x-goog-api-key: YOUR_TOKEN" \
  -H "Content-Type: application/json" \
  -d '{
    "systemInstruction": {
      "parts": [{"text": "You are a geography expert."}]
    },
    "contents": [
      {"role": "user", "parts": [{"text": "What is the longest river in Africa?"}]},
      {"role": "model", "parts": [{"text": "The Nile is the longest river in Africa."}]},
      {"role": "user", "parts": [{"text": "How long is it in kilometers?"}]}
    ]
  }'
OpenOpen8 supports Gemini thinking models, which perform additional reasoning before generating a response. You have three ways to enable thinking:

1. Thinking model suffix — append -thinking to any supported model name:
POST /v1beta/models/gemini-2.5-flash-thinking:generateContent
POST /v1beta/models/gemini-2.5-pro-thinking:generateContent
2. Effort suffix — append -low, -medium, or -high for fine-grained control:
POST /v1beta/models/gemini-2.5-flash-high:generateContent
3. thinkingConfig in generationConfig — pass the configuration explicitly:
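A sketch of a request body with an explicit thinkingConfig. The includeThoughts field is described in this document; thinkingBudget follows the Gemini generationConfig shape, and the budget value here is illustrative:

```python
import json

# generateContent request body with explicit thinking configuration.
body = {
    "contents": [
        {"role": "user",
         "parts": [{"text": "What is the longest river in Africa?"}]}
    ],
    "generationConfig": {
        "thinkingConfig": {
            "includeThoughts": True,   # return thought parts in the response
            "thinkingBudget": 1024,    # illustrative token budget
        }
    },
}

payload = json.dumps(body)
```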
When thinking is active, parts with "thought": true in the response contain the model’s reasoning. These parts are not shown to end users by default — your application decides whether to display them.
If you only need text output and do not want to process thinking parts, set includeThoughts: false and let the model reason internally without including those tokens in the response body.
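When thoughts are included, an application typically separates them from the answer so it can decide what to display. A sketch over mocked parts data:

```python
# Mocked parts list from a thinking-enabled response.
parts = [
    {"text": "Comparing the Nile and the Congo...", "thought": True},
    {"text": "About 6,650 km."},
]

# Split internal reasoning from user-visible answer text.
thoughts = [p["text"] for p in parts if p.get("thought")]
answer = "".join(p["text"] for p in parts if not p.get("thought"))
```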