Routing brain

Route inference by latency, power headroom, health, and workload fit.

The platform introduces a reusable routing engine and a simulation API so UmamiEdge can explain why a node was selected before forwarding traffic to a runtime.


Source: demo (Supabase when configured)

Selected node: Singapore Colo Node 01 (online)

Score: 124.32 (higher is better)

Candidates: 4 (healthy or degraded nodes)

Explainable routing

Candidate ranking.


Node	Status	Score	Latency	Power headroom	Reason
Singapore Colo Node 01	online	124.32	12 ms	8.4 kW	low-latency preference applied · latency within routing guardrail · sufficient power headroom · regional model workload preference applied
Frankfurt Sovereign Node	online	123.96	18 ms	9.2 kW	low-latency preference applied · latency within routing guardrail · sufficient power headroom · regional model workload preference applied
Dar Tower Node 01	online	116.58	21 ms	5.6 kW	low-latency preference applied · latency within routing guardrail · sufficient power headroom · regional model workload preference applied
Dubai Enterprise Campus Node	degraded	105.98	44 ms	3.1 kW	low-latency preference applied · latency within routing guardrail · sufficient power headroom · regional model workload preference applied
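
The ranking above can be illustrated with a minimal sketch. The page does not publish the engine's scoring formula, so the weights, the `Node` type, and the `eligible`, `score`, and `rank` helpers below are hypothetical; they do not reproduce the scores in the table. The sketch only shows the general shape: apply the latency and power guardrails, score what remains, and sort with the highest score first.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Node:
    name: str
    status: str          # "online" or "degraded", as shown in the table
    latency_ms: float
    available_kw: float  # power headroom


# Latency and headroom values copied from the candidate table.
NODES = [
    Node("Singapore Colo Node 01", "online", 12, 8.4),
    Node("Frankfurt Sovereign Node", "online", 18, 9.2),
    Node("Dar Tower Node 01", "online", 21, 5.6),
    Node("Dubai Enterprise Campus Node", "degraded", 44, 3.1),
]

# ASSUMPTION: illustrative weights only, not the engine's real formula.
LATENCY_PENALTY_PER_MS = 0.5
HEADROOM_BONUS_PER_KW = 2.0
DEGRADED_PENALTY = 10.0


def eligible(node: Node, max_latency_ms: float, min_available_kw: float) -> bool:
    """Guardrails: only healthy or degraded nodes within both limits qualify."""
    return (
        node.status in ("online", "degraded")
        and node.latency_ms <= max_latency_ms
        and node.available_kw >= min_available_kw
    )


def score(node: Node) -> float:
    """Higher is better: reward headroom, penalize latency and degraded health."""
    value = 100.0
    value -= LATENCY_PENALTY_PER_MS * node.latency_ms
    value += HEADROOM_BONUS_PER_KW * node.available_kw
    if node.status == "degraded":
        value -= DEGRADED_PENALTY
    return value


def rank(nodes: list[Node], max_latency_ms: float, min_available_kw: float) -> list[Node]:
    """Filter by guardrails, then sort by score, best candidate first."""
    candidates = [n for n in nodes if eligible(n, max_latency_ms, min_available_kw)]
    return sorted(candidates, key=score, reverse=True)


ranked = rank(NODES, max_latency_ms=160, min_available_kw=2)
for node in ranked:
    print(f"{node.name}: {score(node):.2f}")
```

With these assumed weights the order happens to match the table, but tightening a guardrail (for example a 15 ms latency limit) drops candidates before scoring rather than lowering their score.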

API

Routing simulation request.

curl -X POST http://localhost:3000/api/router/simulate \
  -H "Content-Type: application/json" \
  -d '{"model":"umami-swahili-small","estimatedTokens":1400,"maxLatencyMs":160,"minAvailableKw":2}'
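
The same request can be sent from a script. The sketch below uses only the Python standard library and mirrors the curl call's URL, header, and body. The helper names `build_simulation_request` and `simulate` are hypothetical, and the response shape is not documented on this page, so the code assumes only that the endpoint returns JSON.

```python
import json
import urllib.request

# URL taken from the curl example; adjust for your deployment.
SIMULATE_URL = "http://localhost:3000/api/router/simulate"


def build_simulation_request(payload: dict, url: str = SIMULATE_URL) -> urllib.request.Request:
    """Build the POST request without sending it."""
    body = json.dumps(payload).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def simulate(payload: dict, url: str = SIMULATE_URL, timeout: float = 10.0) -> dict:
    """Send the request and parse the reply.

    ASSUMPTION: the endpoint replies with a JSON body; its fields are
    not specified here, so the parsed value is returned unchanged.
    """
    request = build_simulation_request(payload, url)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))


# Same body as the curl example.
payload = {
    "model": "umami-swahili-small",
    "estimatedTokens": 1400,
    "maxLatencyMs": 160,
    "minAvailableKw": 2,
}
request = build_simulation_request(payload)
```

Calling `simulate(payload)` requires the local server from the curl example to be running; `build_simulation_request` alone can be used to inspect the request offline.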