KeyForge

Security infrastructure for the agent era

Four pillars that turn “we gave the bot an API key and hope for the best” into enforceable, verifiable policy.

Unified LLM API

One OpenAI-compatible endpoint for every model and provider.

Point your existing OpenAI SDK at the KeyForge base URL and every model becomes a string parameter — GPT-5.4, Claude Sonnet 4.6, Gemini 2.5 Flash and more, streamed or buffered, with JSON mode and tool calls. No per-provider SDKs, no request-shape translation in your agent code, no vendor lock-in baked into your prompts.

Drop-in /v1/chat/completions compatibility — switching is a base-URL change
Streaming responses with per-request latency, token, and cost metering
Swap models across providers without touching agent code

agent → vk_ key → KeyForge gateway → [OpenAI | Anthropic | Google | …]

Virtual Keys

Agents hold vk_ tokens. Real provider keys stay encrypted in the vault.

A vk_ key has no mathematical relationship to any provider credential. At request time the gateway resolves it to a policy — allowed models, remaining quota, spend cap, rate limit, expiry — and injects the real key server-side for the outbound call. A compromised agent leaks a capped, revocable token, not your provider account. Revocation is instant and surgical: kill one key, every other agent keeps running.

Real keys never appear in agent code, env vars, or logs
Per-key monthly quotas, dollar spend caps, and RPM rate limits
One-click revocation and optional auto-expiry dates

vk_a7f3… → policy { quota: 20k, cap: $50, rpm: 120 } → sk_… (vault, server-side only)

Rate-Limit Resilience

Key pools auto-shuffle on 429s so agents never stall.

Provider rate limits are enforced per key — so a pool of keys is a pool of independent rate-limit budgets against the same capacity. When a provider returns 429, KeyForge rotates the request onto the next healthy key in your pool and replays it: same provider, same model, same behavior. No exponential-backoff stalls, no silent fallback to a different model that breaks step seven of your pipeline.

Automatic rotation on 429 — the agent never sees the error
Pools per provider; cooldown tracking on rate-limited keys
Stays on your chosen model — no cross-provider behavior drift

429 on key #2 → shuffle → key #3 (same provider/model) → 200 OK

Tamper-Evident Audit Chain

HMAC hash-chained logs — cryptographic proof of what your agents did.

Every request writes an audit entry containing the model, tokens, cost, status, and timestamp. The entry is signed with HMAC-SHA256 and chained to the previous entry’s hash. Insert, delete, or edit any record and every hash after it breaks — tampering is mathematically provable, not just “unlikely.” After the 2026 LiteLLM supply-chain compromise, append-only is no longer enough: verify, don’t trust.

HMAC-SHA256 chaining from a genesis hash, per virtual key
One-click chain verification in the dashboard
Shareable signed audit reports with public verification pages

h₁ = HMAC(GENESIS + entry₁) → h₂ = HMAC(h₁ + entry₂) → h₃ = HMAC(h₂ + entry₃) → …