v5 routing brain

Route inference by latency, power headroom, health, and workload fit.

v5 introduces a reusable routing engine and a simulation API so UmamiEdge can explain why a node was selected before forwarding traffic to a runtime.

Source: demo (Supabase when configured)
Selected node: Singapore Colo Node 01 (online)
Score: 124.32 (higher is better)
Candidates: 4 (healthy or degraded nodes)
Explainable routing

Candidate ranking.

| Node | Status | Score | Latency | Power headroom | Reason |
| --- | --- | --- | --- | --- | --- |
| Singapore Colo Node 01 | online | 124.32 | 12 ms | 8.4 kW | low-latency preference applied · latency within routing guardrail · sufficient power headroom · regional model workload preference applied |
| Frankfurt Sovereign Node | online | 123.96 | 18 ms | 9.2 kW | low-latency preference applied · latency within routing guardrail · sufficient power headroom · regional model workload preference applied |
| Dar Tower Node 01 | online | 116.58 | 21 ms | 5.6 kW | low-latency preference applied · latency within routing guardrail · sufficient power headroom · regional model workload preference applied |
| Dubai Enterprise Campus Node | degraded | 105.98 | 44 ms | 3.1 kW | low-latency preference applied · latency within routing guardrail · sufficient power headroom · regional model workload preference applied |
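
The selection step behind the candidate ranking above can be sketched in a few lines of Python. This is an illustrative sketch, not the UmamiEdge routing engine: the `Candidate` type, the `rank` function, and the set of routable statuses are hypothetical names, and treating the latency and power limits as hard filters (rather than score penalties) is an assumption. The scores are copied from the ranking as given, since the page does not state the scoring formula.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Candidate:
    """One row of the candidate ranking (hypothetical type for this sketch)."""
    name: str
    status: str
    score: float
    latency_ms: float
    headroom_kw: float


# Values copied from the candidate ranking; the scores are taken as given.
CANDIDATES = [
    Candidate("Singapore Colo Node 01", "online", 124.32, 12, 8.4),
    Candidate("Frankfurt Sovereign Node", "online", 123.96, 18, 9.2),
    Candidate("Dar Tower Node 01", "online", 116.58, 21, 5.6),
    Candidate("Dubai Enterprise Campus Node", "degraded", 105.98, 44, 3.1),
]

# Assumption: "online" is the healthy status, and degraded nodes stay routable.
ROUTABLE_STATUSES = {"online", "degraded"}


def rank(candidates, max_latency_ms, min_available_kw):
    """Drop nodes that miss a guardrail, then order the rest by score (highest first)."""
    eligible = [
        c for c in candidates
        if c.status in ROUTABLE_STATUSES
        and c.latency_ms <= max_latency_ms
        and c.headroom_kw >= min_available_kw
    ]
    return sorted(eligible, key=lambda c: c.score, reverse=True)


# Limits mirror the maxLatencyMs and minAvailableKw fields of the simulation request.
ranked = rank(CANDIDATES, max_latency_ms=160, min_available_kw=2)
selected = ranked[0] if ranked else None
print(selected.name if selected else "no eligible node")
```

With these limits all four nodes pass the guardrails and the highest-scoring node is selected. Tightening `max_latency_ms` to 20 would leave only the Singapore and Frankfurt nodes in the ranking.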
API

Routing simulation request.

curl -X POST http://localhost:3000/api/router/simulate \
  -H "Content-Type: application/json" \
  -d '{"model":"umami-swahili-small","estimatedTokens":1400,"maxLatencyMs":160,"minAvailableKw":2}'
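
The same request can be issued from Python with only the standard library. The sketch below mirrors the curl command: the URL, header, and body fields come from it, while the helper names `build_simulation_request` and `simulate` are hypothetical. The response shape is not documented here, so the sketch only assumes the endpoint returns JSON and makes no claim about its fields.

```python
import json
import urllib.request

# Endpoint taken from the curl example; adjust the host for other environments.
SIMULATE_URL = "http://localhost:3000/api/router/simulate"


def build_simulation_request(model, estimated_tokens, max_latency_ms,
                             min_available_kw, url=SIMULATE_URL):
    """Build the POST request without sending it (hypothetical helper)."""
    payload = {
        "model": model,
        "estimatedTokens": estimated_tokens,
        "maxLatencyMs": max_latency_ms,
        "minAvailableKw": min_available_kw,
    }
    return urllib.request.Request(
        url,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def simulate(request, timeout=10.0):
    """Send the request and parse the body, assuming the endpoint returns JSON."""
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))


# Same body as the curl example; call simulate(request) once the server is running.
request = build_simulation_request("umami-swahili-small", 1400, 160, 2)
```

Building the request separately from sending it keeps the payload inspectable without a running server.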