Version: v0.4.0

# Odin API Reference

## odin.moreh.io/v1alpha1

### InferenceService

```shell
kubectl explain --api-version odin.moreh.io/v1alpha1 inferenceservice
```

| Field | Type | Description |
| --- | --- | --- |
| `apiVersion` | string | APIVersion defines the versioned schema of this representation of an object. |
| `kind` | string | Kind is a string value representing the REST resource this object represents. |
| `metadata` | object | Standard object's metadata. |
| `spec` | InferenceServiceSpec | Specification of the desired behavior of the InferenceService. |
| `status` | InferenceServiceStatus | Most recently observed status of the InferenceService. |

### InferenceServiceSpec

```shell
kubectl explain --api-version odin.moreh.io/v1alpha1 inferenceservice.spec
```

| Field | Type | Description |
| --- | --- | --- |
| `framework` | string | Framework specifies the inference engine. Enum: `vllm`, `sglang`. |
| `inferencePoolRefs` | []LocalObjectReference | InferencePoolRefs is a list of references to InferencePools. |
| `model` | ModelSpec | Model identifies the model to serve. |
| `parallelism` | Parallelism | Parallelism defines the parallelism parameters for distributed inference. |
| `replicas` | integer | Number of replicas for Deployments or LeaderWorkerSets. Default is 1. |
| `service` | ServiceSpec | Service defines configuration for the Kubernetes Service associated with the InferenceService. |
| `template` | PodTemplateSpec | Template describes the pod template for the Deployment or the LeaderWorkerSet leader. |
| `templateRefs` | []TemplateReference | TemplateRefs is a list of references to InferenceServiceTemplates. |
| `workerTemplate` | PodTemplateSpec | WorkerTemplate describes the pod template for LeaderWorkerSet workers. |
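A minimal manifest tying these fields together might look like the following. This is an illustrative sketch, not from the reference itself: the resource name, model ID, and container image are placeholders.

```yaml
apiVersion: odin.moreh.io/v1alpha1
kind: InferenceService
metadata:
  name: llama-3-8b                 # placeholder name
spec:
  framework: vllm                  # enum: vllm, sglang
  model:
    name: meta-llama/Meta-Llama-3-8B-Instruct   # HuggingFace model ID (placeholder)
  replicas: 2                      # defaults to 1 if omitted
  parallelism:
    tensor: 2
  template:                        # standard Kubernetes PodTemplateSpec
    spec:
      containers:
      - name: inference
        image: example.com/vllm:latest          # placeholder image
```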

### LocalObjectReference

| Field | Type | Description |
| --- | --- | --- |
| `name` | string | Name is the name of the referent. Required. |

### TemplateReference

```shell
kubectl explain --api-version odin.moreh.io/v1alpha1 inferenceservice.spec.templateRefs
```

| Field | Type | Description |
| --- | --- | --- |
| `group` | string | Group is the group of the referent. |
| `kind` | string | Kind is the kind of the referent. |
| `name` | string | Name is the name of the referent. Required. |

### Parallelism

```shell
kubectl explain --api-version odin.moreh.io/v1alpha1 inferenceservice.spec.parallelism
```

| Field | Type | Description |
| --- | --- | --- |
| `data` | integer | Data parallelism size. |
| `dataLocal` | integer | DataLocal is the node-local data parallelism size. |
| `dataRPCPort` | integer | DataRPCPort is the data parallelism RPC port. |
| `expert` | boolean | Expert enables expert parallelism. |
| `pipeline` | integer | Pipeline parallelism size. |
| `tensor` | integer | Tensor parallelism size. |
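A hypothetical multi-node configuration combining these knobs is sketched below. The comments interpret each field based on the brief descriptions above and common parallelism conventions; they are assumptions, not guarantees from this reference.

```yaml
parallelism:
  tensor: 8          # shard each layer across 8 devices (tensor parallelism)
  pipeline: 2        # split the model into 2 pipeline stages
  data: 4            # 4 data-parallel groups
  dataLocal: 2       # data-parallel ranks per node (assumed meaning)
  dataRPCPort: 5555  # RPC port used for data-parallel coordination
  expert: true       # enable expert parallelism, e.g. for MoE models
```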

### ModelSpec

```shell
kubectl explain --api-version odin.moreh.io/v1alpha1 inferenceservice.spec.model
```

| Field | Type | Description |
| --- | --- | --- |
| `name` | string | Name is the model identifier (e.g. a HuggingFace model ID). Required. |

### ServiceSpec

```shell
kubectl explain --api-version odin.moreh.io/v1alpha1 inferenceservice.spec.service
```

| Field | Type | Description |
| --- | --- | --- |
| `target` | string | Specifies which pods receive the `mif.moreh.io/inferenceservice` label for Service endpoint selection. For LeaderWorkerSet: `leader`, `workers`, `all`, or `auto`. For Deployment: `target` is ignored. Default: `auto`. |
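For example, to route Service traffic only to leader pods of a LeaderWorkerSet (a sketch; whether this fits your topology depends on how your engine exposes its serving endpoint):

```yaml
service:
  target: leader   # only leader pods get the mif.moreh.io/inferenceservice
                   # label, so only they are selected as Service endpoints
```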

### InferenceServiceStatus

```shell
kubectl explain --api-version odin.moreh.io/v1alpha1 inferenceservice.status
```

| Field | Type | Description |
| --- | --- | --- |
| `replicas` | integer | Total number of groups that have been created (updated or not, ready or not). |
| `updatedReplicas` | integer | Number of groups that have been updated (ready or not). |
| `readyReplicas` | integer | Number of groups that are in a ready state (updated or not). |
| `hpaPodSelector` | string | Pod selector string for HPA to identify pods belonging to this InferenceService. |
| `renderedSpec` | InferenceServiceSpec | Fully resolved InferenceServiceSpec after merging templateRefs and template rendering. |
| `conditions` | []Condition | Conditions represent the latest available observations of an object's state. |

### InferenceServiceTemplate

```shell
kubectl explain --api-version odin.moreh.io/v1alpha1 inferenceservicetemplate
```

| Field | Type | Description |
| --- | --- | --- |
| `apiVersion` | string | APIVersion defines the versioned schema of this representation of an object. |
| `kind` | string | Kind is a string value representing the REST resource this object represents. |
| `metadata` | object | Standard object's metadata. |
| `spec` | InferenceServiceTemplateSpec | Specification of the desired behavior of the InferenceServiceTemplate. |

### InferenceServiceTemplateSpec

```shell
kubectl explain --api-version odin.moreh.io/v1alpha1 inferenceservicetemplate.spec
```

| Field | Type | Description |
| --- | --- | --- |
| `framework` | string | Framework specifies the inference engine. Enum: `vllm`, `sglang`. |
| `model` | ModelSpec | Model identifies the model to serve. |
| `parallelism` | Parallelism | Parallelism defines the parallelism parameters for distributed inference. |
| `service` | ServiceSpec | Service defines configuration for the Kubernetes Service associated with the InferenceService. |
| `template` | PodTemplateSpec | Template describes the pod template for the Deployment or the LeaderWorkerSet leader. |
| `workerTemplate` | PodTemplateSpec | WorkerTemplate describes the pod template for LeaderWorkerSet workers. |
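Since an InferenceService can reference templates via `spec.templateRefs` (with the result surfaced in `status.renderedSpec`), shared defaults can be factored into an InferenceServiceTemplate. The sketch below uses placeholder names and model IDs; the exact merge precedence between a template and the referencing service is not specified in this reference.

```yaml
apiVersion: odin.moreh.io/v1alpha1
kind: InferenceServiceTemplate
metadata:
  name: vllm-defaults            # placeholder template name
spec:
  framework: vllm
  parallelism:
    tensor: 4
---
apiVersion: odin.moreh.io/v1alpha1
kind: InferenceService
metadata:
  name: mistral-7b               # placeholder name
spec:
  templateRefs:
  - name: vllm-defaults          # only name is required; group/kind are optional
  model:
    name: mistralai/Mistral-7B-Instruct-v0.2   # placeholder model ID
```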