LLM inference gateway

One endpoint, four providers. Pura scores task complexity and routes to the best model for the task. You get routing headers, overnight spend visibility, and a Lightning-funded paid path once the free tier runs out.



Quick start

The shortest useful path is: mint a key, send one streaming request, then set "stream": false if your shell script expects a single JSON object instead of SSE frames.

shell: create key
curl -X POST https://api.pura.xyz/api/keys \
  -H "Content-Type: application/json" \
  -d '{"label":"my-agent"}'

shell: streaming response
curl -N https://api.pura.xyz/api/chat \
  -H "Authorization: Bearer pura_YOUR_KEY" \
  -H "Content-Type: application/json" \
  -d '{"messages":[{"role":"user","content":"Hello"}]}'

shell: single JSON response
curl https://api.pura.xyz/api/chat \
  -H "Authorization: Bearer pura_YOUR_KEY" \
  -H "Content-Type: application/json" \
  -d '{"messages":[{"role":"user","content":"Hello"}],"stream":false}'
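
A client that consumes the streaming form has to split the SSE frames and decode each one. The sketch below shows one way to do that in TypeScript. It assumes the frames follow the OpenAI-style convention: `data: {json}` lines, text under `choices[0].delta.content`, and a final `data: [DONE]` sentinel. The gateway's exact frame format is not spelled out here, so treat those field names as assumptions. `collectStreamText` is a hypothetical helper name.

```typescript
// Assumed frame shape (OpenAI-style chat completion chunk); not confirmed by this page.
interface ChatChunk {
  choices?: { delta?: { content?: string } }[];
}

// Concatenate the text deltas found in a complete SSE body.
function collectStreamText(sseBody: string): string {
  let text = "";
  for (const line of sseBody.split(/\r?\n/)) {
    if (!line.startsWith("data:")) continue; // skip blank lines and other SSE fields
    const payload = line.slice("data:".length).trim();
    if (payload === "") continue;
    if (payload === "[DONE]") break; // assumed end-of-stream sentinel
    const chunk = JSON.parse(payload) as ChatChunk;
    text += chunk.choices?.[0]?.delta?.content ?? "";
  }
  return text;
}

// Hand-written sample frames, for illustration only.
const sample = [
  'data: {"choices":[{"delta":{"content":"Hel"}}]}',
  "",
  'data: {"choices":[{"delta":{"content":"lo"}}]}',
  "",
  "data: [DONE]",
  "",
].join("\n");

console.log(collectStreamText(sample)); // prints "Hello"
```

This version works on a complete body. A live stream arrives in arbitrary chunks, so a real reader would buffer partial lines before passing them to a function like this.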

OpenAI SDK drop-in

Existing OpenAI-compatible clients can point at https://api.pura.xyz/api and keep the same request format.

typescript: openai-compatible
import OpenAI from "openai";

const client = new OpenAI({
  baseURL: "https://api.pura.xyz/api",
  apiKey: process.env.PURA_API_KEY,
});

const res = await client.chat.completions.create({
  model: "auto",  // Pura picks the best model
  messages: [{ role: "user", content: "Explain backpressure routing." }],
});

Providers

provider   model                     tier     cost per 1K tokens
Groq       llama-3.3-70b-versatile   cheap    $0.0003
Gemini     gemini-2.0-flash          cheap    $0.0005
OpenAI     gpt-4o                    mid      $0.005
Anthropic  claude-sonnet-4-20250514  premium  $0.003
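
The per-1K rates above turn into a rough per-request estimate with one multiplication. The sketch below assumes a single blended rate per model, because the table does not distinguish input from output tokens. `RATE_PER_1K_USD` and `estimateCostUsd` are hypothetical names, not part of the gateway.

```typescript
// Per-1K-token rates copied from the provider table.
const RATE_PER_1K_USD: Record<string, number> = {
  "llama-3.3-70b-versatile": 0.0003,
  "gemini-2.0-flash": 0.0005,
  "gpt-4o": 0.005,
  "claude-sonnet-4-20250514": 0.003,
};

// Estimate cost as (tokens / 1000) * rate; assumes one blended rate per model.
function estimateCostUsd(model: string, tokens: number): number {
  const rate = RATE_PER_1K_USD[model];
  if (rate === undefined) throw new Error(`unknown model: ${model}`);
  return (tokens / 1000) * rate;
}

console.log(estimateCostUsd("gpt-4o", 2000).toFixed(4)); // prints "0.0100"
console.log(estimateCostUsd("llama-3.3-70b-versatile", 2000).toFixed(4)); // prints "0.0006"
```

The X-Pura-Cost response header remains the authoritative figure for a given request; an estimate like this is only useful for budgeting before a request is sent.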

Response headers

header                   description
X-Pura-Provider          Provider that handled the request
X-Pura-Model             Model used
X-Pura-Cost              Estimated cost in USD
X-Pura-Tier              Complexity tier (cheap/mid/premium)
X-Pura-Budget-Remaining  Remaining daily budget
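
A client that logs routing decisions can collect these headers into one typed object. The sketch below assumes the two numeric headers carry plain decimal strings, optionally prefixed with "$". The actual value format is not documented here, and the demo values are made up. `HeaderSource`, `RoutingInfo`, and `readRoutingInfo` are hypothetical names.

```typescript
// Anything with a get() lookup, such as the headers of a fetch Response.
interface HeaderSource {
  get(name: string): string | null;
}

interface RoutingInfo {
  provider: string | null;
  model: string | null;
  tier: string | null;
  costUsd: number | null;
  budgetRemaining: number | null;
}

// Parse a numeric header; assumes a plain decimal with an optional leading "$".
function toNumber(raw: string | null): number | null {
  if (raw === null) return null;
  const cleaned = raw.trim().replace(/^\$/, "");
  if (cleaned === "") return null;
  const n = Number(cleaned);
  return Number.isFinite(n) ? n : null;
}

function readRoutingInfo(headers: HeaderSource): RoutingInfo {
  return {
    provider: headers.get("X-Pura-Provider"),
    model: headers.get("X-Pura-Model"),
    tier: headers.get("X-Pura-Tier"),
    costUsd: toNumber(headers.get("X-Pura-Cost")),
    budgetRemaining: toNumber(headers.get("X-Pura-Budget-Remaining")),
  };
}

// Stand-in for a real response's headers; values are illustrative only.
const demoHeaders: HeaderSource = {
  get: (name) => {
    const values: Record<string, string> = {
      "x-pura-provider": "groq",
      "x-pura-model": "llama-3.3-70b-versatile",
      "x-pura-tier": "cheap",
      "x-pura-cost": "0.0006",
      "x-pura-budget-remaining": "4.25",
    };
    return values[name.toLowerCase()] ?? null;
  },
};

console.log(readRoutingInfo(demoHeaders));
```

With fetch, the same function can be called as `readRoutingInfo(response.headers)`, since the standard Headers object already exposes a case-insensitive `get`.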

Cost report

shell: report
curl https://api.pura.xyz/api/report \
  -H "Authorization: Bearer pura_YOUR_KEY"

# Returns JSON:
# {
#   "period": "24h",
#   "totalSpendUsd": 0.042,
#   "requestCount": 127,
#   "averageCostUsd": 0.00033,
#   "perModel": { "groq": { ... }, "openai": { ... } }
# }
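
The report fields map onto a small type, and averageCostUsd in the sample is consistent with totalSpendUsd divided by requestCount. A minimal sketch follows; the per-model entries are elided in the sample above, so they are typed as `unknown` here instead of guessed. `CostReport` and `averageCost` are hypothetical names.

```typescript
interface CostReport {
  period: string;
  totalSpendUsd: number;
  requestCount: number;
  averageCostUsd: number;
  perModel: Record<string, unknown>; // per-model shape is not shown in the sample
}

// Recompute the average locally; returns 0 when there were no requests.
function averageCost(report: CostReport): number {
  return report.requestCount === 0 ? 0 : report.totalSpendUsd / report.requestCount;
}

// Top-level values copied from the sample response; perModel left empty.
const sampleReport: CostReport = {
  period: "24h",
  totalSpendUsd: 0.042,
  requestCount: 127,
  averageCostUsd: 0.00033,
  perModel: {},
};

console.log(averageCost(sampleReport).toFixed(5)); // prints "0.00033"
```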

Lightning wallet

The first 5,000 requests are free. After that, the gateway returns a Lightning invoice. Pay it, watch the status endpoint flip, then keep sending requests against the same key.

shell: fund key
# Create a funding invoice (10,000 sats ~ $4)
curl -X POST https://api.pura.xyz/api/wallet/fund \
  -H "Authorization: Bearer pura_YOUR_KEY" \
  -H "Content-Type: application/json" \
  -d '{"amount": 10000}'

# The response includes invoiceUrl and lightningUrl

# Check invoice status
curl "https://api.pura.xyz/api/wallet/status?invoiceId=INV_ID" \
  -H "Authorization: Bearer pura_YOUR_KEY"

# Check balance
curl https://api.pura.xyz/api/wallet/balance \
  -H "Authorization: Bearer pura_YOUR_KEY"
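
The "pay it, watch the status endpoint flip" step can be automated with a small polling loop. The sketch below keeps the HTTP call out of the loop: the caller supplies `isPaid`, a hypothetical function that would query /api/wallet/status and decide whether the invoice is settled. The status response's field names are not documented here, so that decision is left to the caller. `waitForPayment` and its defaults are illustrative.

```typescript
// Hypothetical probe supplied by the caller: resolves true once the invoice is settled.
type PaidCheck = () => Promise<boolean>;

// Poll until isPaid() reports true or the attempts run out.
async function waitForPayment(
  isPaid: PaidCheck,
  maxAttempts = 30,
  intervalMs = 2000,
): Promise<boolean> {
  for (let attempt = 1; attempt <= maxAttempts; attempt++) {
    if (await isPaid()) return true;
    if (attempt < maxAttempts) {
      // Wait before the next check; no delay is needed after the final attempt.
      await new Promise<void>((resolve) => setTimeout(resolve, intervalMs));
    }
  }
  return false;
}
```

A fixed interval keeps the example short. For invoices that may take minutes to settle, a longer interval or a backoff would reduce load on the status endpoint.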