OpenAI-compatible gateway v5

Connect apps once, route them to local AI capacity.

The v5 gateway keeps OpenAI-shaped APIs while preparing the runtime proxy layer for local LLM servers, GPU nodes, and edge model runtimes.

Compatible routes

Drop-in API surface.

Method  Route                      Purpose
GET     /api/v1/models             Model discovery for compatible clients.
POST    /api/v1/chat/completions   OpenAI-style chat completions with route decision logging.
POST    /api/v1/embeddings         Embedding endpoint with usage metering.
POST    /api/router/simulate       Explainable routing decision before runtime execution.
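
Since all four routes hang off one base URL, a client could keep them in a small lookup table and resolve them at call time. The sketch below is illustrative only: the short route names (`models`, `chat`, `embeddings`, `simulate`), the `endpoint` helper, and the default base URL (borrowed from the curl example on this page) are assumptions, not part of the gateway.

```python
# Hypothetical route table mirroring the routes listed above.
# The short keys are illustrative names, not gateway identifiers.
ROUTES = {
    "models": ("GET", "/api/v1/models"),
    "chat": ("POST", "/api/v1/chat/completions"),
    "embeddings": ("POST", "/api/v1/embeddings"),
    "simulate": ("POST", "/api/router/simulate"),
}


def endpoint(name: str, base_url: str = "http://localhost:3000") -> tuple[str, str]:
    """Return (HTTP method, full URL) for a named route.

    Raises KeyError if the name is not in ROUTES.
    """
    method, path = ROUTES[name]
    # Strip a trailing slash so the base URL and path join cleanly.
    return method, base_url.rstrip("/") + path
```

Pointing an app at a different gateway then only means passing a different `base_url`.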
Example

Chat completions request.

curl -X POST http://localhost:3000/api/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{"model":"umami-swahili-small","messages":[{"role":"user","content":"Explain energy-aware AI edge nodes for Singapore, Tanzania, and Europe."}]}'
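
The same request can be built from Python. This is a minimal sketch using only the standard library. The helper names `build_chat_request` and `send` are hypothetical, and the base URL is taken from the curl command above. The response is decoded as JSON on the assumption that the gateway returns an OpenAI-style JSON body.

```python
import json
import urllib.request

# Base URL from the curl example; adjust for your deployment.
BASE_URL = "http://localhost:3000"


def build_chat_request(
    model: str, prompt: str, base_url: str = BASE_URL
) -> urllib.request.Request:
    """Build (but do not send) a POST to /api/v1/chat/completions."""
    body = {
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
    }
    return urllib.request.Request(
        url=base_url.rstrip("/") + "/api/v1/chat/completions",
        data=json.dumps(body).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def send(request: urllib.request.Request, timeout: float = 30.0) -> dict:
    """Send the request and decode the response body as JSON."""
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.load(response)
```

With a gateway running locally, `send(build_chat_request("umami-swahili-small", "..."))` would return the decoded response. Building and sending are split so the payload can be inspected or logged before any network call.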