Heimdall API Reference
inference.networking.k8s.io/v1
InferencePool
kubectl explain --api-version inference.networking.k8s.io/v1 inferencepools
| Field | Type | Description |
|---|---|---|
| apiVersion | string | APIVersion defines the versioned schema of this representation of an object. |
| kind | string | Kind is a string value representing the REST resource this object represents. |
| metadata | object | Standard object's metadata. |
| spec | InferencePoolSpec | Specification of the desired behavior of the InferencePool. |
InferencePoolSpec
kubectl explain --api-version inference.networking.k8s.io/v1 inferencepools.spec
| Field | Type | Description |
|---|---|---|
| endpointPickerRef | EndpointPickerRef | Reference to the EndpointPicker. |
| selector | LabelSelector | Selects the pods that belong to the inference pool. |
| targetPorts | []TargetPort | List of ports exposed by the inference pool. |
EndpointPickerRef
kubectl explain --api-version inference.networking.k8s.io/v1 inferencepools.spec.endpointPickerRef
| Field | Type | Description |
|---|---|---|
| failureMode | string | FailureMode configures how the parent handles the case when the Endpoint Picker extension is non-responsive. Defaults to "FailClose". |
| group | string | Group is the API group of the referent. Defaults to "" (the core API group). |
| kind | string | Kind is the Kubernetes resource kind of the referent. Defaults to "Service". |
| name | string | Name is the name of the referent API object. Required. |
| port | Port | Port is the port of the Endpoint Picker extension service. |
Port
| Field | Type | Description |
|---|---|---|
| number | integer | Number defines the port number to access the selected model server Pods. |
LabelSelector
| Field | Type | Description |
|---|---|---|
| matchLabels | map[string]string | matchLabels is a map of {key,value} pairs. |
TargetPort
| Field | Type | Description |
|---|---|---|
| number | integer | Number of the port. |
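
Taken together, the v1 types above describe a manifest shaped roughly like the one below. This is a minimal sketch, not a tested configuration: the resource name, the label, the Service name, and both port numbers are hypothetical placeholders, and `group`, `kind`, and `failureMode` are written out with their documented defaults only to show where they sit.

```yaml
apiVersion: inference.networking.k8s.io/v1
kind: InferencePool
metadata:
  name: example-pool              # hypothetical name
spec:
  selector:
    matchLabels:
      app: example-model-server   # hypothetical label carried by the model server pods
  targetPorts:
    - number: 8000                # hypothetical port exposed by the pool
  endpointPickerRef:
    group: ""                     # documented default
    kind: Service                 # documented default
    name: example-epp             # required; hypothetical Service name
    port:
      number: 9002                # hypothetical port of the Endpoint Picker service
    failureMode: FailClose        # documented default
```

Because `group`, `kind`, and `failureMode` have defaults, a shorter manifest could omit them and set only `name` and `port` under `endpointPickerRef`.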
inference.networking.k8s-x.io/v1alpha1
EndpointPickerConfig
| Field | Type | Description |
|---|---|---|
| data | DataLayerConfig | Data configures the DataLayer. It is required if the new DataLayer is enabled. |
| featureGates | []string | FeatureGates is a set of flags that enable various experimental features in the Endpoint Picker (EPP). |
| plugins | []PluginSpec | Plugins is the list of plugins that will be instantiated. See Plugins for more details. |
| saturationDetector | SaturationDetector | SaturationDetector, when present, specifies the configuration of the saturation detector. |
| schedulingProfiles | []SchedulingProfile | SchedulingProfiles is the list of named SchedulingProfiles that will be created. |
PluginSpec
For more details on available plugins, see Plugins.
| Field | Type | Description |
|---|---|---|
| name | string | Name provides a name for plugin entries to reference. If omitted, the value of the Plugin's Type field will be used. |
| parameters | object | Parameters are the set of parameters to be passed to the plugin's factory function. |
| type | string | Type specifies the plugin type to be instantiated. |
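
To illustrate how PluginSpec entries sit inside an EndpointPickerConfig, here is a minimal sketch. The `apiVersion` and `kind` values are copied from the headings above, and it is an assumption that the configuration file carries them as top-level keys. The plugin types (`example-filter`, `example-scorer`, `example-picker`), the name `tuned-scorer`, and the `exampleThreshold` parameter are hypothetical placeholders, not plugins or parameters known to ship with Heimdall.

```yaml
apiVersion: inference.networking.k8s-x.io/v1alpha1   # copied from the heading above
kind: EndpointPickerConfig                           # assumed top-level key
plugins:
  # name omitted: per the table, the name falls back to the type, "example-filter"
  - type: example-filter          # hypothetical plugin type
  # explicit name: other entries would reference this plugin as "tuned-scorer"
  - name: tuned-scorer            # hypothetical name
    type: example-scorer          # hypothetical plugin type
    parameters:
      exampleThreshold: 0.8       # hypothetical parameter passed to the factory function
  - type: example-picker          # hypothetical plugin type
```

Giving a plugin an explicit `name` is mainly useful when the same `type` is instantiated more than once with different `parameters`, since each instance then needs a distinct name to be referenced by.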
SchedulingProfile
A SchedulingProfile executes its plugins in order: Filters, Scorers, then Picker. Only plugins that implement the Filter, Scorer, or Picker interface can be referenced here. Profile handlers operate at the top level and are not part of individual profiles.
| Field | Type | Description |
|---|---|---|
| name | string | Name specifies the name of this SchedulingProfile. |
| plugins | []SchedulingPlugin | Plugins is the list of plugins for this SchedulingProfile. They are assigned to the appropriate "slots" based on their type. |
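
The slot assignment described above can be pictured with a fragment like the following. It is a sketch under stated assumptions: the fields of SchedulingPlugin are not described in this section, so the `pluginRef` key used to point at a plugin by name is assumed, and the plugin types and the profile name are hypothetical placeholders.

```yaml
# Fragment of an EndpointPickerConfig (sketch only).
plugins:
  - type: example-filter          # hypothetical; assumed to implement the Filter interface
  - type: example-scorer          # hypothetical; assumed to implement the Scorer interface
  - type: example-picker          # hypothetical; assumed to implement the Picker interface
schedulingProfiles:
  - name: default                 # hypothetical profile name
    plugins:
      # Each entry lands in a slot according to the referenced plugin's type;
      # the profile then runs Filters, Scorers, and finally the Picker.
      - pluginRef: example-filter   # `pluginRef` is an assumed field name
      - pluginRef: example-scorer
      - pluginRef: example-picker
```

A profile handler would not appear in this `plugins` list, since profile handlers operate at the top level rather than inside an individual profile.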