> ## Documentation Index
> Fetch the complete documentation index at: https://docs.sup.ai/llms.txt
> Use this file to discover all available pages before exploring further.
# Introduction
> Build AI-powered applications with Sup AI — the multi-model consensus engine for superior results.
Sup AI is a multi-model AI platform that delivers superior results through intelligent model orchestration and consensus-driven responses. Our API is fully compatible with the OpenAI API, making it easy to integrate into your existing applications.
## Why Sup AI?
* **Multi-model consensus** — Get better answers by combining outputs from multiple leading AI models, including Claude, GPT-5, Gemini, and more.
* **OpenAI compatibility** — Drop-in replacement for OpenAI's API. Use your existing code with minimal changes.
* **Automatic mode selection** — Automatically select the optimal mode for your task — from fast single-model responses to deep multi-model analysis.
* **Transparency** — Access model reasoning and confidence scores to understand how answers are generated.
## Platform Features
Sup AI handles the hard parts of building production AI applications so you don't have to.
### Location Awareness

Sup AI automatically detects user location from IP address or accepts explicit coordinates. This enables:
* **Localized responses** — Models understand the user's timezone, region, and local context
* **Internationalization** — Responses adapt to regional preferences, units, and cultural norms
* **Time-aware answers** — Accurate responses to "what time is it?" or "what's the weather?"
Pass `environment.location` with `{ "ip_address": "current" }` for automatic detection, or provide explicit coordinates for precise control.
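As a sketch, here is where `environment.location` sits in a request body. The exact schema — including the coordinate field names — is an assumption here, so check the API reference for the authoritative shape; with the OpenAI Python client, the same object can be passed via `extra_body`.

```python
import json

# Sketch of a chat-completions request body carrying location context.
# The exact `environment.location` schema is an assumption; consult the
# API reference for the authoritative field names.
def build_body(location: dict) -> dict:
    return {
        "model": "auto",
        "messages": [{"role": "user", "content": "What time is it here?"}],
        "environment": {"location": location},
    }

# Automatic detection from the caller's IP address:
auto_body = build_body({"ip_address": "current"})

# Explicit coordinates for precise control (values are illustrative):
explicit_body = build_body({"latitude": 48.8566, "longitude": 2.3522})

print(json.dumps(auto_body, indent=2))
```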
### Context Management

Never hit context window limits again. Sup AI intelligently manages conversation history:
* **Automatic compaction** — Long conversations are summarized to fit within model limits
* **Smart prioritization** — Recent and relevant context is preserved while older messages are condensed
* **Seamless experience** — No errors, no truncation — just continuous conversation
This means you can build applications with unlimited conversation length without worrying about token limits.
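Conceptually, the compaction strategy resembles the sketch below. This is purely illustrative — Sup AI performs compaction server-side, and the real system generates the summary with a model rather than a placeholder string.

```python
def compact(messages: list[dict], keep_recent: int = 6) -> list[dict]:
    """Illustrative sketch of automatic compaction: older messages are
    condensed into a single summary while recent ones are kept verbatim.
    Sup AI does this server-side with a model-generated summary."""
    if len(messages) <= keep_recent:
        return messages
    older, recent = messages[:-keep_recent], messages[-keep_recent:]
    # Placeholder summary; the real system summarizes `older` with a model.
    summary = {
        "role": "system",
        "content": f"[summary of {len(older)} earlier messages]",
    }
    return [summary] + recent
```

With this shape, a 100-message history becomes one summary message plus the most recent turns, so every request fits the model's window.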
### Adaptive Thinking

Each mode automatically calibrates thinking effort based on task complexity:
| Mode | Thinking Budget | Use Case |
| --------------- | --------------- | ------------------------------ |
| `fast` | 256 tokens | Quick, obvious answers |
| `thinking` | 2,048 tokens | Standard development work |
| `deep-thinking` | 3,072 tokens | Complex problem decomposition |
| `pro` | 4,096 tokens | Maximum rigor and verification |
Models with native thinking capabilities (Claude, GPT-5, Gemini) use these budgets to reason through problems before responding. The `auto` mode dynamically adjusts thinking effort based on detected complexity.
### Tool Call Repair

AI models sometimes generate malformed tool calls with invalid JSON or incorrect parameters. Sup AI automatically repairs these:
* **JSON repair** — Fixes syntax errors, unclosed brackets, and encoding issues
* **Parameter validation** — Ensures required fields are present and correctly typed
* **Graceful recovery** — Instead of failing, broken tool calls are fixed and retried
This dramatically improves reliability for agentic workflows where tool calling is critical.
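Because repair happens server-side, client code is just a standard OpenAI-style tool definition — nothing below is Sup-AI-specific, and the tool name and parameters are hypothetical. The last two lines illustrate the kind of malformed output the repair step handles before it reaches your tool handler.

```python
import json

# A standard OpenAI-style tool definition; the function name and
# parameters here are hypothetical.
weather_tool = {
    "type": "function",
    "function": {
        "name": "get_weather",
        "description": "Look up the current weather for a city.",
        "parameters": {
            "type": "object",
            "properties": {"city": {"type": "string"}},
            "required": ["city"],
        },
    },
}

# The kind of malformed arguments a model might emit, and the repaired
# form your handler actually receives (conceptual illustration only):
malformed = '{"city": "Paris"'     # unclosed brace from the model
repaired = json.loads(malformed + "}")
```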
### Automatic Retries

Transient failures happen — rate limits, network issues, model overload. Sup AI handles them automatically:
* **Intelligent backoff** — Exponential retry with jitter for rate limits
* **Model fallback** — If a model fails, backup models are tried automatically
* **Seamless recovery** — Retries are invisible to your application
You get reliable responses without implementing complex retry logic yourself.
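For illustration, exponential backoff with full jitter — the retry strategy described above — looks roughly like this. Sup AI applies it for you, so this is a sketch of the idea, not code you need to write.

```python
import random

def backoff_delays(base: float = 0.5, factor: float = 2.0, retries: int = 5):
    """Yield retry delays with exponential growth and full jitter:
    each delay is drawn uniformly from [0, base * factor**attempt],
    so concurrent clients don't retry in lockstep."""
    for attempt in range(retries):
        cap = base * factor ** attempt
        yield random.uniform(0.0, cap)
```

With the defaults, five retries are capped at 0.5, 1, 2, 4, and 8 seconds respectively.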
## Quick Start
Get started with Sup AI in under a minute:
1. Go to [sup.ai/api/keys](https://sup.ai/api/keys) and create an API key.
2. Use any OpenAI-compatible client with your Sup AI API key:
```python
from openai import OpenAI

client = OpenAI(
    base_url="https://api.sup.ai/v1/openai",
    api_key="YOUR_API_KEY",
)

response = client.chat.completions.create(
    model="auto",
    messages=[{"role": "user", "content": "Hello!"}],
)

print(response.choices[0].message.content)
```
Try different modes like `thinking` or `deep-thinking` for more complex tasks.
## Available Modes
| Mode | Models | Description |
| --------------- | ------ | ---------------------------------------------------- |
| `auto` | 1+ | Intelligently selects the optimal mode for your task |
| `fast` | 1 | Instant responses for trivial tasks |
| `thinking` | 3 | Default for most development work |
| `deep-thinking` | 6 | Advanced problem-solving for complex challenges |
| `pro` | 9 | Maximum rigor for high-stakes, mission-critical work |
## Next Steps
* Complete reference for the chat completions endpoint, including all request parameters and response types.
* Full OpenAPI specification with interactive examples.