Connect model runtimes to the edge network.
Register OpenAI-compatible runtimes hosted on GPU nodes, then route traffic through the gateway and routing engine. The data shown below comes from the demo organization.
## Registered runtimes

Inference endpoints by node.
| Name | Status | Type | Node | Base URL | Created |
|---|---|---|---|---|---|
| umami-swahili-inference-small | online | openai-compatible | tz-dar-edge-01 | http://127.0.0.1:8080/v1 | 4/28/2026, 11:45:54 AM |
| umami-document-ai-small | degraded | openai-compatible | tz-aru-edge-01 | http://127.0.0.1:8081/v1 | 4/28/2026, 11:45:54 AM |
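Because both runtimes are OpenAI-compatible, one way to verify the Status column is to probe each Base URL's standard `/models` route. The sketch below is an assumption, not the gateway's actual health check: the names and base URLs come from the table above, but the online/degraded classification logic is illustrative only.

```python
import json
import urllib.error
import urllib.request

# Base URLs taken from the table above; adjust for your own nodes.
RUNTIMES = {
    "umami-swahili-inference-small": "http://127.0.0.1:8080/v1",
    "umami-document-ai-small": "http://127.0.0.1:8081/v1",
}

def probe(base_url: str, timeout: float = 5.0) -> str:
    """Return 'online' if the OpenAI-compatible /models route answers
    with valid JSON, else 'degraded'."""
    try:
        with urllib.request.urlopen(f"{base_url}/models", timeout=timeout) as resp:
            json.load(resp)  # response must be parseable JSON to count as healthy
            return "online"
    except (urllib.error.URLError, ValueError, OSError):
        return "degraded"
```

A loop such as `{name: probe(url) for name, url in RUNTIMES.items()}` then reproduces the Status column for your deployment.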
## Register runtime API

Use this route after node provisioning.
```
POST /api/model-runtimes/register
```

```json
{
  "organizationId": "demo-org",
  "nodeId": "<node-id>",
  "name": "umami-swahili-inference-small",
  "baseUrl": "http://10.0.0.15:8080/v1",
  "runtimeType": "openai-compatible",
  "status": "online"
}
```
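The registration call can be scripted. The sketch below uses only the Python standard library; `GATEWAY_URL` is a hypothetical placeholder for wherever the register API is served, and the field values mirror the example body above.

```python
import json
import urllib.request

# Hypothetical gateway address; replace with your deployment's URL.
GATEWAY_URL = "http://localhost:3000"

def build_registration(organization_id: str, node_id: str,
                       name: str, base_url: str) -> dict:
    """Build the JSON body for POST /api/model-runtimes/register."""
    return {
        "organizationId": organization_id,
        "nodeId": node_id,
        "name": name,
        "baseUrl": base_url,
        "runtimeType": "openai-compatible",
        "status": "online",
    }

def register_runtime(payload: dict) -> dict:
    """POST the registration body and return the decoded JSON response."""
    req = urllib.request.Request(
        f"{GATEWAY_URL}/api/model-runtimes/register",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)

# Example body matching the runtime registered above.
payload = build_registration(
    "demo-org", "tz-dar-edge-01",
    "umami-swahili-inference-small", "http://10.0.0.15:8080/v1",
)
```

Calling `register_runtime(payload)` then performs the actual request once the gateway is reachable.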