Odin API Reference
odin.moreh.io/v1alpha1
InferenceService
kubectl explain --api-version odin.moreh.io/v1alpha1 inferenceservice
| Field | Type | Description |
|---|---|---|
| apiVersion | string | APIVersion defines the versioned schema of this representation of an object. |
| kind | string | Kind is a string value representing the REST resource this object represents. |
| metadata | object | Standard object's metadata. |
| spec | InferenceServiceSpec | Specification of the desired behavior of the InferenceService. |
| status | InferenceServiceStatus | Most recently observed status of the InferenceService. |
InferenceServiceSpec
kubectl explain --api-version odin.moreh.io/v1alpha1 inferenceservice.spec
| Field | Type | Description |
|---|---|---|
| framework | string | Framework specifies the inference engine. Enum: vllm, sglang. |
| inferencePoolRefs | []LocalObjectReference | InferencePoolRefs is a list of references to InferencePools. |
| model | ModelSpec | Model identifies the model to serve. |
| parallelism | Parallelism | Parallelism defines the parallelism parameters for distributed inference. |
| replicas | integer | Number of replicas for Deployments or LeaderWorkerSets. Default is 1. |
| service | ServiceSpec | Service defines configuration for the Kubernetes Service associated with the InferenceService. |
| template | PodTemplateSpec | Template describes the pod template for Deployment or LeaderWorkerSet leader. |
| templateRefs | []TemplateReference | TemplateRefs is a list of references to InferenceServiceTemplates. |
| workerTemplate | PodTemplateSpec | WorkerTemplate describes the pod template for LeaderWorkerSet workers. |
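
The fields above can be combined into a manifest. The following is a minimal sketch in Python that builds one as a plain dict and prints it as JSON. The helper name `build_inference_service`, the resource name, and the model name are hypothetical, and the choice of which fields to set is an assumption: this reference does not state which spec fields are required.

```python
import json

API_VERSION = "odin.moreh.io/v1alpha1"
FRAMEWORKS = ("vllm", "sglang")  # enum values listed for spec.framework


def build_inference_service(name, model_name, framework="vllm", replicas=1):
    """Return an InferenceService manifest as a plain dict (illustrative helper)."""
    if framework not in FRAMEWORKS:
        raise ValueError(f"framework must be one of {FRAMEWORKS}, got {framework!r}")
    return {
        "apiVersion": API_VERSION,
        "kind": "InferenceService",
        "metadata": {"name": name},
        "spec": {
            "framework": framework,
            "model": {"name": model_name},  # ModelSpec.name is marked Required
            "replicas": replicas,           # documented default is 1
        },
    }


# Hypothetical resource and model names, for illustration only.
manifest = build_inference_service("demo-service", "example-org/example-model")
print(json.dumps(manifest, indent=2))
```

Because kubectl accepts JSON manifests as well as YAML, the printed output can be saved to a file and passed to `kubectl apply -f`.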
LocalObjectReference
| Field | Type | Description |
|---|---|---|
| name | string | Name is the name of the referent. Required. |
TemplateReference
kubectl explain --api-version odin.moreh.io/v1alpha1 inferenceservice.spec.templateRefs
| Field | Type | Description |
|---|---|---|
| group | string | Group is the group of the referent. |
| kind | string | Kind is the kind of the referent. |
| name | string | Name is the name of the referent. Required. |
Parallelism
kubectl explain --api-version odin.moreh.io/v1alpha1 inferenceservice.spec.parallelism
| Field | Type | Description |
|---|---|---|
| data | integer | Data parallelism size. |
| dataLocal | integer | DataLocal is the data local parallelism size. |
| dataRPCPort | integer | DataRPCPort is the data parallelism RPC port. |
| expert | boolean | Expert enables expert parallelism. |
| pipeline | integer | Pipeline parallelism size. |
| tensor | integer | Tensor parallelism size. |
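
As a rough illustration of the types in the table above, the following Python sketch checks a `parallelism` block before it is placed in a manifest. The helper `check_parallelism` is hypothetical and only checks field names and types; valid ranges and defaults are not stated in this reference, so none are enforced.

```python
# Field names and types copied from the Parallelism table.
PARALLELISM_FIELDS = {
    "data": int,
    "dataLocal": int,
    "dataRPCPort": int,
    "expert": bool,
    "pipeline": int,
    "tensor": int,
}


def check_parallelism(block):
    """Return a list of problems found in a parallelism dict (empty if none)."""
    problems = []
    for key, value in block.items():
        expected = PARALLELISM_FIELDS.get(key)
        if expected is None:
            problems.append(f"unknown field: {key}")
        # bool is a subclass of int in Python, so reject it explicitly for integer fields.
        elif expected is int and (isinstance(value, bool) or not isinstance(value, int)):
            problems.append(f"{key} must be an integer")
        elif expected is bool and not isinstance(value, bool):
            problems.append(f"{key} must be a boolean")
    return problems


# Illustrative values only.
ok_block = {"tensor": 4, "pipeline": 2, "expert": True}
bad_block = {"tensor": "4", "expert": 1, "sequence": 2}
print(check_parallelism(ok_block))
print(check_parallelism(bad_block))
```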
ModelSpec
kubectl explain --api-version odin.moreh.io/v1alpha1 inferenceservice.spec.model
| Field | Type | Description |
|---|---|---|
| name | string | Name is the model identifier (e.g. HuggingFace model ID). Required. |
ServiceSpec
kubectl explain --api-version odin.moreh.io/v1alpha1 inferenceservice.spec.service
| Field | Type | Description |
|---|---|---|
| target | string | Specifies which pods receive the mif.moreh.io/inferenceservice label for Service endpoint selection. For LeaderWorkerSet: leader, workers, all, or auto. For Deployment: target is ignored. Default: auto. |
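
The `target` rules above can be restated as a small Python sketch. The function `effective_target` is hypothetical and only mirrors the description in the table (allowed values, the `auto` default, and the Deployment case). How the controller resolves `auto` is not documented here and is not modeled.

```python
LWS_TARGETS = ("leader", "workers", "all", "auto")  # values listed for LeaderWorkerSet
DEFAULT_TARGET = "auto"


def effective_target(workload, target=None):
    """Return the target value that applies, or None when target is ignored."""
    if workload == "Deployment":
        return None  # target is ignored for Deployments
    if workload != "LeaderWorkerSet":
        raise ValueError(f"unexpected workload kind: {workload!r}")
    value = DEFAULT_TARGET if target is None else target
    if value not in LWS_TARGETS:
        raise ValueError(f"target must be one of {LWS_TARGETS}, got {value!r}")
    return value


print(effective_target("LeaderWorkerSet"))
print(effective_target("LeaderWorkerSet", "leader"))
print(effective_target("Deployment", "workers"))
```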
InferenceServiceStatus
kubectl explain --api-version odin.moreh.io/v1alpha1 inferenceservice.status
| Field | Type | Description |
|---|---|---|
| replicas | integer | Total number of groups that have been created (updated or not, ready or not). |
| updatedReplicas | integer | Number of groups that have been updated (ready or not). |
| readyReplicas | integer | Number of groups that are in a ready state (updated or not). |
| hpaPodSelector | string | Pod selector string for HPA to identify pods belonging to this InferenceService. |
| renderedSpec | InferenceServiceSpec | Fully resolved InferenceServiceSpec after merging templateRefs and template rendering. |
| conditions | []Condition | Conditions represent the latest available observations of an object's state. |
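
One way to read the three replica counters together is sketched below in Python. The helper `rollout_summary` and its `settled` flag are hypothetical conveniences, not fields or conditions of the API, and treating a missing counter as 0 is an assumption.

```python
def rollout_summary(status):
    """Summarize the replica counters of an InferenceService status dict."""
    replicas = status.get("replicas", 0)        # groups created
    updated = status.get("updatedReplicas", 0)  # groups updated, ready or not
    ready = status.get("readyReplicas", 0)      # groups ready, updated or not
    return {
        "pending_update": replicas - updated,
        "not_ready": replicas - ready,
        # Assumed notion of "settled": every created group is both updated and ready.
        "settled": replicas > 0 and updated == replicas and ready == replicas,
    }


# Illustrative status values, for example as read from `kubectl get ... -o json`.
mid_rollout = {"replicas": 3, "updatedReplicas": 2, "readyReplicas": 1}
print(rollout_summary(mid_rollout))
```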
InferenceServiceTemplate
kubectl explain --api-version odin.moreh.io/v1alpha1 inferenceservicetemplate
| Field | Type | Description |
|---|---|---|
| apiVersion | string | APIVersion defines the versioned schema of this representation of an object. |
| kind | string | Kind is a string value representing the REST resource this object represents. |
| metadata | object | Standard object's metadata. |
| spec | InferenceServiceTemplateSpec | Specification of the desired behavior of the InferenceServiceTemplate. |
InferenceServiceTemplateSpec
kubectl explain --api-version odin.moreh.io/v1alpha1 inferenceservicetemplate.spec
| Field | Type | Description |
|---|---|---|
| framework | string | Framework specifies the inference engine. Enum: vllm, sglang. |
| model | ModelSpec | Model identifies the model to serve. |
| parallelism | Parallelism | Parallelism defines the parallelism parameters for distributed inference. |
| service | ServiceSpec | Service defines configuration for the Kubernetes Service associated with the InferenceService. |
| template | PodTemplateSpec | Template describes the pod template for Deployment or LeaderWorkerSet leader. |
| workerTemplate | PodTemplateSpec | WorkerTemplate describes the pod template for LeaderWorkerSet workers. |
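
The template spec lists a subset of the InferenceServiceSpec fields: `replicas`, `inferencePoolRefs`, and `templateRefs` appear only on the InferenceService. The Python sketch below builds a template, checks that it uses only the fields in the table above, and builds an InferenceService that refers to it by name. All resource and model names, and the helper `unexpected_template_fields`, are hypothetical. How the controller merges the two specs is not described in this reference and is not modeled.

```python
API_VERSION = "odin.moreh.io/v1alpha1"

# Fields listed in the InferenceServiceTemplateSpec table.
TEMPLATE_SPEC_FIELDS = {
    "framework", "model", "parallelism", "service", "template", "workerTemplate",
}


def unexpected_template_fields(spec):
    """Return the sorted spec keys that the template table does not list."""
    return sorted(set(spec) - TEMPLATE_SPEC_FIELDS)


template = {
    "apiVersion": API_VERSION,
    "kind": "InferenceServiceTemplate",
    "metadata": {"name": "shared-vllm"},  # hypothetical name
    "spec": {"framework": "vllm", "parallelism": {"tensor": 2}},
}

service = {
    "apiVersion": API_VERSION,
    "kind": "InferenceService",
    "metadata": {"name": "demo-service"},  # hypothetical name
    "spec": {
        # TemplateReference: only name is marked Required.
        "templateRefs": [{"name": template["metadata"]["name"]}],
        "model": {"name": "example-org/example-model"},  # hypothetical model ID
        "replicas": 2,
    },
}

print(unexpected_template_fields(template["spec"]))
print(unexpected_template_fields(service["spec"]))
```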