📄️ Heimdall scheduler
Heimdall is the component that routes and schedules requests across multiple inference pods, deciding which pod should serve each request. It follows the Kubernetes Gateway API Inference Extension and can operate alongside various gateway controllers.
📄️ Odin inference service
Odin is the component that launches inference pods at scale. These pods run Moreh vLLM by default, but they can use open-source vLLM or SGLang instead when needed.