v17 live · THE AI CONTROL PLANE

One API. Every model. Your rules.

Route AI requests across providers, fall back when one breaks, cap spend per key, and trace every call in real time.

Start free → Read the docs ↗

Working API key in 60 seconds · BYOK-first · No credit card

uptime · 90d: 99.97% · 3 incidents, all routed around

providers: 11 connected + any OpenAI-compatible endpoint

p50 latency: 312ms · -12ms this week

LIVE ROUTING FLOW

See every routing decision.

A request enters through one API. LatentKit evaluates policy, credits, provider health, key caps, and fallback state, then routes the request to the best eligible provider.

Live · production · p50 284ms · last decision just now · trace_8f4a...2c

1. Request

Incoming POST /v1/chat

key: lk_live_...a4f2
app: customer-api
tokens: 1,204 in
cap: $482 / mo
used: $108.42

2. Policy

Policy core v17

strategies: priority · cost · balanced
fallbacks: 3 armed
budget: 82% used
canary: 12% gpt-5

3. Providers

OpenAI · managed credits · 318ms
Anthropic · BYOK primary · 284ms
Gemini · rate limited
Mistral · fallback ready · 402ms

route: cost -> anthropic · latency: 284ms · cost: $0.0014 · cache hit -38% · guards: 3 ok
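The decision shown in this flow can be sketched in a few lines of Python. This is an illustrative model only, not LatentKit's implementation: `Provider`, `plan_route`, and the selection rule (drop rate-limited providers, then order the rest by latency) are assumptions made for the example. The latencies are the ones displayed in the panel.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Provider:
    name: str
    latency_ms: Optional[int]  # None when no measurement is shown
    rate_limited: bool = False


def plan_route(providers, max_fallbacks=3):
    """Return (primary, fallbacks) chosen from the eligible providers.

    Assumed rule for this sketch: skip rate-limited providers,
    then rank the remaining ones by measured latency.
    """
    eligible = [
        p for p in providers
        if not p.rate_limited and p.latency_ms is not None
    ]
    if not eligible:
        raise RuntimeError("no eligible provider")
    ranked = sorted(eligible, key=lambda p: p.latency_ms)
    return ranked[0], ranked[1:1 + max_fallbacks]


providers = [
    Provider("openai", 318),
    Provider("anthropic", 284),
    Provider("gemini", None, rate_limited=True),
    Provider("mistral", 402),
]

primary, fallbacks = plan_route(providers)
print(primary.name, [p.name for p in fallbacks])
```

With the sample values, Anthropic becomes the primary route and OpenAI and Mistral form the fallback chain, which matches the route shown in the panel.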

CONTROL SURFACE

Everything you manage from one console.

LatentKit turns provider sprawl into product controls your team can operate.

Provider connections

Connect BYOK credentials, managed Platform Access, health checks, and provider priority.

Routing policies

Control strategy, fallback depth, response profile, route order, and publish state.

API keys and limits

Issue reveal-once keys with monthly caps, expirations, rotation, and instant revoke.

Platform credits

Separate BYOK provider billing from managed-credit usage, margin rules, and reloads.

Request traces

Inspect candidates, selected route, fallback chain, cost, latency, cache, and request ID.

Playground testing

Test chat, images, and embeddings through the same policy before production traffic.

Guardrails and privacy

Apply prompt controls, trace privacy settings, PII handling, and anomaly visibility.

Team operations

Manage roles, audit events, branded domains, white-label surfaces, and support workflows.
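As a rough illustration of the "API keys and limits" control, the sketch below models a key with a monthly cap, an expiration, and a revoke flag. `AppKey` and `can_spend` are hypothetical names for this example and are not part of the LatentKit SDK. The cap, usage, and per-request cost reuse the figures from the routing flow above, and the start date is arbitrary.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta, timezone


@dataclass
class AppKey:
    monthly_cap_usd: float
    used_usd: float
    expires_at: datetime
    revoked: bool = False

    def can_spend(self, cost_usd: float, now: datetime) -> bool:
        """True if the key is active and the request fits under the cap."""
        if self.revoked or now >= self.expires_at:
            return False
        return self.used_usd + cost_usd <= self.monthly_cap_usd


now = datetime(2025, 1, 1, tzinfo=timezone.utc)  # arbitrary example date
key = AppKey(
    monthly_cap_usd=482.0,
    used_usd=108.42,
    expires_at=now + timedelta(days=90),
)

print(key.can_spend(0.0014, now))                       # fits under the cap
print(key.can_spend(400.0, now))                        # would exceed the cap
print(key.can_spend(0.0014, now + timedelta(days=91)))  # key has expired
```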

SETUP FLOW

From signup to production traffic.

Start with zero setup

LatentKit prepares the default app, route, key, and quickstart path when Platform Access is enabled.

Choose access mode

Bring your own provider keys or use managed Platform Access where it is available for the route.

Shape the production policy

Set endpoint capability, response profile, fallback limit, provider order, and rollout timing.

Issue and test a key

Create a reveal-once app key with caps and expiration, then test the live policy in Playground.

Ship the base URL

Keep the OpenAI request shape and point production traffic at the LatentKit gateway.

console.latentkit.com

OpenAI: Managed ready
Anthropic: BYOK healthy
Gemini: needs key
Mistral: healthy

Workspace ready

default app · Production API · quickstart

Add BYOK credentials whenever you are ready.

Access and route (draft)

endpoint: /v1/chat · profile: balanced · max fallbacks: 3

Anthropic / Claude Sonnet · BYOK · eligible
OpenAI / GPT-4o · managed · eligible
Gemini / 1.5 Pro · BYOK · degraded

Policy order (saved)

Anthropic #1
OpenAI #2
Gemini #3

Create API key (reveal once)

Monthly cap: $500
Expires: 90 days
Playground test: passed

Python / LatentKit SDK

from latentkit import LatentKit

with LatentKit(api_key="USE_YOUR_LATENTKIT_KEY_HERE") as client:
    # response_profile selects one of the fast / balanced / deep profiles.
    response = client.chat.create(
        messages=[{"role": "user", "content": "Ship it"}],
        response_profile="balanced",
    )

    print(response["content"])

pip install latentkit · Policy live · Trace streaming

STRATEGY LAB

Compare routing behavior before it ships.

Same policy, different strategy. See how priority, cost, availability, and balanced mode distribute traffic.

Your app -> LatentKit

OpenAI: 100% of traffic · $0.018 / 420ms
Anthropic: 0% of traffic · $0.021 / 360ms
Gemini: 0% of traffic · $0.010 / degraded

Priority mode keeps production simple: one primary route, with the rest ready as fallback.
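To make the difference between strategies concrete, here is a minimal sketch of priority and cost selection using the sample costs from the panel above. `Route` and `primary_route` are hypothetical names, and the rule that a degraded route is never chosen as primary is an assumption. Availability and balanced modes are left out because the page does not say how they score routes.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class Route:
    name: str
    cost_usd: float
    healthy: bool


def primary_route(routes, strategy):
    """Pick the primary route; other healthy routes would act as fallbacks."""
    healthy = [r for r in routes if r.healthy]
    if not healthy:
        raise RuntimeError("no healthy route")
    if strategy == "priority":
        return healthy[0]  # first healthy route in configured order
    if strategy == "cost":
        return min(healthy, key=lambda r: r.cost_usd)
    raise ValueError(f"unsupported strategy: {strategy}")


# Configured order and costs taken from the sample panel.
routes = [
    Route("openai", 0.018, True),
    Route("anthropic", 0.021, True),
    Route("gemini", 0.010, False),  # shown as degraded
]

print(primary_route(routes, "priority").name)
print(primary_route(routes, "cost").name)

# If Gemini recovers, cost mode moves to it while priority mode stays put.
recovered = [replace(r, healthy=True) for r in routes]
print(primary_route(recovered, "cost").name)
print(primary_route(recovered, "priority").name)
```

With the sample data both modes select OpenAI, because the cheapest route (Gemini) is degraded. Once Gemini is healthy, cost mode switches to it and priority mode does not.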

POLICY PUBLISHING

Edit controls without redeploying your app.

Tune response profile, eligibility, fallback depth, and canary rollout from the console while your app keeps one endpoint.

fast: low latency
balanced: default depth
deep: best model

Anthropic / Claude Sonnet · BYOK primary · eligible
OpenAI / GPT-4o · Managed credits · eligible
Gemini / 1.5 Pro · BYOK fallback · degraded

Preview

Publish v17 with priority mode, balanced response profile, OpenAI as managed fallback, and Gemini excluded while degraded.
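A canary rollout sends a fixed share of traffic to the new route. The page does not describe how LatentKit splits that traffic, so the sketch below shows one common approach as an assumption: hash a stable identifier into 100 buckets and send the lowest buckets to the canary. `in_canary` is a hypothetical helper, and the 12% figure is the canary share shown elsewhere on this page.

```python
import hashlib


def in_canary(request_id: str, percent: int) -> bool:
    """Deterministically place a request in the canary bucket."""
    digest = hashlib.sha256(request_id.encode("utf-8")).digest()
    bucket = int.from_bytes(digest[:8], "big") % 100  # value in 0..99
    return bucket < percent


ids = [f"req_{i}" for i in range(10_000)]
share = sum(in_canary(r, 12) for r in ids) / len(ids)
print(f"canary share: {share:.1%}")
```

Because the bucket depends only on the identifier, the same request ID always lands on the same side of the split, which keeps retries and traces consistent.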

LIVE OBSERVABILITY

Watch every request.

Logs, route candidates, fallback attempts, cost, latency, cache, guardrails, and request IDs.

Request trace

candidate: Anthropic
selected: Claude Sonnet
fallback: armed
Request ID: req_91af

Selected first healthy BYOK route. No guardrail match. Stored with normal trace metadata.

DEVELOPER EXPERIENCE

Drop-in OpenAI shape.

Use the official Python and JavaScript SDKs, cURL, fetch, or anything that speaks OpenAI-compatible requests.

/v1/chat /v1/vision /v1/embed /v1/image /v1/queue

Python / LatentKit SDK

from latentkit import LatentKit

with LatentKit(api_key="USE_YOUR_LATENTKIT_KEY_HERE") as client:
    # response_profile selects one of the fast / balanced / deep profiles.
    response = client.chat.create(
        messages=[{"role": "user", "content": "Route this"}],
        response_profile="balanced",
    )

    print(response["content"])

JavaScript (TypeScript) / LatentKit SDK

import { LatentKit } from "@latentkit/sdk";

const client = new LatentKit({
  apiKey: process.env.LATENTKIT_KEY!,
});

const response = await client.chat.create({
  messages: [{ role: "user", content: "Route this" }],
  response_profile: "balanced",
});

console.log(response.content);

cURL

curl https://ai.latentkit.com/v1/chat \
  -H "Authorization: Bearer USE_YOUR_LATENTKIT_KEY_HERE" \
  -H "Content-Type: application/json" \
  -d '{
    "messages":[{"role":"user","content":"Route this"}],
    "response_profile":"balanced"
  }'
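For environments without an SDK, the same request can be built with the Python standard library alone. This sketch mirrors the cURL example above (same URL, headers, and body). `build_chat_request` is a hypothetical helper, and the `LATENTKIT_KEY` environment variable name is borrowed from the JavaScript example. The response format is not documented on this page, so the sending step is left commented out and simply prints whatever JSON comes back.

```python
import json
import os
import urllib.request

BASE_URL = "https://ai.latentkit.com"  # gateway host from the cURL example


def build_chat_request(api_key: str, content: str) -> urllib.request.Request:
    """Build a POST /v1/chat request with the body used in the examples."""
    body = {
        "messages": [{"role": "user", "content": content}],
        "response_profile": "balanced",
    }
    return urllib.request.Request(
        f"{BASE_URL}/v1/chat",
        data=json.dumps(body).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


api_key = os.environ.get("LATENTKIT_KEY", "USE_YOUR_LATENTKIT_KEY_HERE")
req = build_chat_request(api_key, "Route this")
print(req.get_method(), req.full_url)

# To send it (needs a valid key and network access):
# with urllib.request.urlopen(req, timeout=30) as resp:
#     print(json.load(resp))
```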

ENTERPRISE OPERATIONS

Run it under your brand.

Custom domains, white-label dashboard, roles, audit logs, guardrails, and support options.

domain: ai.yourcompany.com
roles: admin / operator / developer
audit: policy publish, key rotation
identity: SSO roadmap / inquiry

Acme AI Gateway · dashboard.acme.com

Spend today: $184.20
Fallback events: 12
Guardrail matches: 3

Policy v12 scheduled canary at 12%
Key lk_live_91af rotated by Tenant Admin
Custom domain certificate healthy

PRICING

Start free. Scale by policy.

Compare plans →

Free

$0

Unlimited BYOK under fair use, no LatentKit usage markup, one app, one policy, and a managed trial path.

Starter

$9/mo

5 apps, 8 provider connections, 10 policies, 5 members, a $5 activation credit, and Platform Access at provider cost + 5%.

Company

Custom

Custom client workspaces, domains, branding, audit/export, governance, limits, and support.
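The Starter plan's "provider cost + 5%" rule is simple arithmetic. A small sketch, assuming the markup applies to the full provider cost and the result is rounded to cents (`platform_access_charge` is a hypothetical helper, and the rounding rule is an assumption):

```python
from decimal import Decimal

PLATFORM_MARKUP = Decimal("0.05")  # Starter plan: provider cost + 5%


def platform_access_charge(provider_cost_usd: Decimal) -> Decimal:
    """Return the Platform Access charge for a given provider cost."""
    total = provider_cost_usd * (1 + PLATFORM_MARKUP)
    return total.quantize(Decimal("0.01"))


print(platform_access_charge(Decimal("100.00")))  # 105.00
```

BYOK traffic is described as carrying no LatentKit usage markup, so this calculation would apply only to managed Platform Access usage.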

WHY LATENTKIT

Stop maintaining the same AI gateway twice.

Roll your own

Flexible, but every fallback, budget, trace, and provider change becomes your infrastructure.

Single-vendor proxy

Useful for one provider family, but weaker when your app needs policy control across many providers.

LatentKit

One API layer for routing, provider connections, budgets, traces, credits, and team operations.

Ship AI traffic you can trust.

Connect a provider, publish a policy, ship in under 15 minutes.

Start free → Talk to sales →