LLM inference gateway
One endpoint, four providers. Pura scores task complexity and routes to the best model for the task. You get routing headers, overnight spend visibility, and a Lightning-funded paid path once the free tier runs out.
Quick start
The shortest useful path: mint a key, send one streaming request, then set "stream": false if your shell script expects a single JSON object instead of SSE frames.
curl -X POST https://api.pura.xyz/api/keys \
-H "Content-Type: application/json" \
-d '{"label":"my-agent"}'

curl -N https://api.pura.xyz/api/chat \
-H "Authorization: Bearer pura_YOUR_KEY" \
-H "Content-Type: application/json" \
-d '{"messages":[{"role":"user","content":"Hello"}]}'

curl https://api.pura.xyz/api/chat \
-H "Authorization: Bearer pura_YOUR_KEY" \
-H "Content-Type: application/json" \
-d '{"messages":[{"role":"user","content":"Hello"}],"stream":false}'

OpenAI SDK drop-in
Existing OpenAI-compatible clients can point at https://api.pura.xyz/api and keep the same request format.
import OpenAI from "openai";
const client = new OpenAI({
baseURL: "https://api.pura.xyz/api",
apiKey: process.env.PURA_API_KEY,
});
const res = await client.chat.completions.create({
model: "auto", // Pura picks the best model
messages: [{ role: "user", content: "Explain backpressure routing." }],
});

Providers
| provider | model | tier | cost per 1K tokens |
|---|---|---|---|
| Groq | llama-3.3-70b-versatile | cheap | $0.0003 |
| Gemini | gemini-2.0-flash | cheap | $0.0005 |
| OpenAI | gpt-4o | mid | $0.005 |
| Anthropic | claude-sonnet-4-20250514 | premium | $0.003 |
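
The per-1K-token prices in the table can be turned into a rough cost estimate. The sketch below is illustrative only: `estimateCostUsd` is a hypothetical helper, not part of Pura, and it assumes a single flat rate applies to every token, which may not match how the gateway actually bills input versus output tokens.

```typescript
// Per-1K-token prices copied from the provider table above.
const COST_PER_1K_USD: Record<string, number> = {
  "llama-3.3-70b-versatile": 0.0003,
  "gemini-2.0-flash": 0.0005,
  "gpt-4o": 0.005,
  "claude-sonnet-4-20250514": 0.003,
};

// Hypothetical helper: estimates cost assuming one flat rate for all tokens.
function estimateCostUsd(model: string, totalTokens: number): number {
  const rate = COST_PER_1K_USD[model];
  if (rate === undefined) {
    throw new Error(`Unknown model: ${model}`);
  }
  return (totalTokens / 1000) * rate;
}
```

For example, `estimateCostUsd("gpt-4o", 2000)` multiplies 2 (thousand tokens) by the table rate of $0.005. The `X-Pura-Cost` response header remains the value to trust for what a given request was estimated to cost.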
Response headers
| header | description |
|---|---|
| X-Pura-Provider | Provider that handled the request |
| X-Pura-Model | Model used |
| X-Pura-Cost | Estimated cost in USD |
| X-Pura-Tier | Complexity tier (cheap/mid/premium) |
| X-Pura-Budget-Remaining | Remaining daily budget |
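
A client can read these headers off any response to log where a request went. The following is a minimal sketch: `readPuraHeaders` and the `PuraRouting` shape are hypothetical names, and it assumes `X-Pura-Cost` and `X-Pura-Budget-Remaining` arrive as plain decimal strings, which the table does not state.

```typescript
// Anything with a Headers-like get(), e.g. the `headers` of a fetch Response.
interface HeaderSource {
  get(name: string): string | null;
}

// Hypothetical shape for the routing metadata listed in the table above.
interface PuraRouting {
  provider: string | null;
  model: string | null;
  costUsd: number | null;
  tier: string | null;
  budgetRemaining: number | null;
}

// Parse a header value as a number; null when absent or not numeric.
function toNumber(value: string | null): number | null {
  if (value === null || value.trim() === "") return null;
  const n = Number(value);
  return Number.isFinite(n) ? n : null;
}

function readPuraHeaders(headers: HeaderSource): PuraRouting {
  return {
    provider: headers.get("X-Pura-Provider"),
    model: headers.get("X-Pura-Model"),
    costUsd: toNumber(headers.get("X-Pura-Cost")), // assumed decimal USD
    tier: headers.get("X-Pura-Tier"),
    budgetRemaining: toNumber(headers.get("X-Pura-Budget-Remaining")), // assumed numeric
  };
}
```

With `fetch`, this would be called as `readPuraHeaders(response.headers)`. Missing or non-numeric values come back as `null` rather than throwing, so logging code does not break a request that otherwise succeeded.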
Cost report
curl https://api.pura.xyz/api/report \
-H "Authorization: Bearer pura_YOUR_KEY"
# Returns JSON:
# {
# "period": "24h",
# "totalSpendUsd": 0.042,
# "requestCount": 127,
# "averageCostUsd": 0.00033,
# "perModel": { "groq": { ... }, "openai": { ... } }
# }

Lightning wallet
The first 5,000 requests are free. After that, the gateway returns a Lightning invoice. Pay it, watch the status endpoint flip, then keep sending requests against the same key.
# Create a funding invoice (10,000 sats ~ $4)
curl -X POST https://api.pura.xyz/api/wallet/fund \
-H "Authorization: Bearer pura_YOUR_KEY" \
-H "Content-Type: application/json" \
-d '{"amount": 10000}'
# The response includes invoiceUrl and lightningUrl
# Check invoice status
curl "https://api.pura.xyz/api/wallet/status?invoiceId=INV_ID" \
-H "Authorization: Bearer pura_YOUR_KEY"
# Check balance
curl https://api.pura.xyz/api/wallet/balance \
-H "Authorization: Bearer pura_YOUR_KEY"
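
The "pay it, watch the status endpoint flip" step can be automated with a small polling loop. This is a sketch under stated assumptions: `waitForPayment` is a hypothetical helper, and the status value `"paid"` is a placeholder, because the response shape of the status endpoint is not documented here. The status check is passed in as a callback so the loop itself makes no claims about the API.

```typescript
// Callback that returns the current invoice status as a string.
type StatusCheck = () => Promise<string>;

interface WaitOptions {
  intervalMs?: number;  // delay between checks
  maxAttempts?: number; // give up after this many checks
  paidStatus?: string;  // ASSUMPTION: the value that means "settled"
}

// Polls until the status matches paidStatus; resolves false on timeout.
async function waitForPayment(
  checkStatus: StatusCheck,
  opts: WaitOptions = {},
): Promise<boolean> {
  const { intervalMs = 2000, maxAttempts = 30, paidStatus = "paid" } = opts;
  for (let attempt = 0; attempt < maxAttempts; attempt++) {
    if ((await checkStatus()) === paidStatus) {
      return true;
    }
    // Sleep between attempts, but not after the final one.
    if (attempt < maxAttempts - 1) {
      await new Promise<void>((resolve) => setTimeout(resolve, intervalMs));
    }
  }
  return false;
}
```

In practice `checkStatus` would call the `/api/wallet/status` endpoint shown above and return whichever field carries the invoice state. Once the function resolves `true`, requests can resume against the same key.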