Route inference by latency, power headroom, health, and workload fit.
v5 introduces a reusable routing engine and a simulation API so UmamiEdge can explain why a node was selected before forwarding traffic to a runtime.
Explainable routing
Candidate ranking.
| Node | Status | Score | Latency | Power headroom | Reason |
|---|---|---|---|---|---|
| Singapore Colo Node 01 | online | 124.32 | 12 ms | 8.4 kW | low-latency preference applied · latency within routing guardrail · sufficient power headroom · regional model workload preference applied |
| Frankfurt Sovereign Node | online | 123.96 | 18 ms | 9.2 kW | low-latency preference applied · latency within routing guardrail · sufficient power headroom · regional model workload preference applied |
| Dar Tower Node 01 | online | 116.58 | 21 ms | 5.6 kW | low-latency preference applied · latency within routing guardrail · sufficient power headroom · regional model workload preference applied |
| Dubai Enterprise Campus Node | degraded | 105.98 | 44 ms | 3.1 kW | low-latency preference applied · latency within routing guardrail · sufficient power headroom · regional model workload preference applied |
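The ranking above can be approximated as a filter followed by a sort. The sketch below is a minimal illustration only: the page does not state how UmamiEdge computes scores, so it reuses the scores from the table as given, and the `Candidate` and `Guardrails` types and the `rankCandidates` function are hypothetical names. It assumes that `maxLatencyMs` and `minAvailableKw` act as hard guardrails on latency and power headroom.

```typescript
// Hypothetical shapes for illustration; not the actual UmamiEdge types.
interface Candidate {
  node: string;
  status: "online" | "degraded";
  score: number;      // taken from the table as-is; the scoring formula is not documented here
  latencyMs: number;
  headroomKw: number;
}

interface Guardrails {
  maxLatencyMs: number;   // assumed upper bound on node latency
  minAvailableKw: number; // assumed lower bound on power headroom
}

// Rows copied from the candidate ranking table.
const candidates: Candidate[] = [
  { node: "Singapore Colo Node 01", status: "online", score: 124.32, latencyMs: 12, headroomKw: 8.4 },
  { node: "Frankfurt Sovereign Node", status: "online", score: 123.96, latencyMs: 18, headroomKw: 9.2 },
  { node: "Dar Tower Node 01", status: "online", score: 116.58, latencyMs: 21, headroomKw: 5.6 },
  { node: "Dubai Enterprise Campus Node", status: "degraded", score: 105.98, latencyMs: 44, headroomKw: 3.1 },
];

// Drop nodes that violate either guardrail, then order by score, highest first.
// filter() returns a new array, so the input is not mutated by sort().
function rankCandidates(nodes: Candidate[], g: Guardrails): Candidate[] {
  return nodes
    .filter((n) => n.latencyMs <= g.maxLatencyMs && n.headroomKw >= g.minAvailableKw)
    .sort((a, b) => b.score - a.score);
}

const ranked = rankCandidates(candidates, { maxLatencyMs: 160, minAvailableKw: 2 });
console.log(ranked.map((n) => `${n.node} (${n.status}): ${n.score}`));
```

With guardrails of 160 ms and 2 kW, all four nodes in the table pass, which is consistent with every row carrying the "latency within routing guardrail" and "sufficient power headroom" reasons.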
API
Routing simulation request.
curl -X POST http://localhost:3000/api/router/simulate \
-H "Content-Type: application/json" \
-d '{"model":"umami-swahili-small","estimatedTokens":1400,"maxLatencyMs":160,"minAvailableKw":2}'
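The same request can be issued from TypeScript. This is a minimal sketch, assuming a runtime with a global `fetch` (for example Node 18 or later). The `SimulateRequest` type and the `buildSimulateRequest` and `simulateRoute` helpers are hypothetical names introduced here; only the URL, headers, and body fields come from the curl command above. The response shape is not documented on this page, so the result is left as `unknown`.

```typescript
// Request body fields as they appear in the curl example.
interface SimulateRequest {
  model: string;
  estimatedTokens: number;
  maxLatencyMs: number;
  minAvailableKw: number;
}

// Build the URL and fetch options separately so they can be inspected without a network call.
function buildSimulateRequest(body: SimulateRequest, baseUrl = "http://localhost:3000") {
  return {
    url: `${baseUrl}/api/router/simulate`,
    init: {
      method: "POST",
      headers: { "Content-Type": "application/json" },
      body: JSON.stringify(body),
    },
  };
}

// Send the simulation request. The response type is unknown because the
// source does not describe what the endpoint returns.
async function simulateRoute(body: SimulateRequest): Promise<unknown> {
  const { url, init } = buildSimulateRequest(body);
  const res = await fetch(url, init);
  if (!res.ok) {
    throw new Error(`Routing simulation failed: HTTP ${res.status}`);
  }
  return res.json();
}

const example = buildSimulateRequest({
  model: "umami-swahili-small",
  estimatedTokens: 1400,
  maxLatencyMs: 160,
  minAvailableKw: 2,
});
console.log(example.url, example.init.body);
```

Calling `simulateRoute(...)` with the same arguments would perform the actual POST against a locally running instance.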