Version: v0.4.0

Heimdall API Reference

inference.networking.k8s.io/v1

InferencePool

kubectl explain --api-version inference.networking.k8s.io/v1 inferencepools

| Field | Type | Description |
| --- | --- | --- |
| apiVersion | string | APIVersion defines the versioned schema of this representation of an object. |
| kind | string | Kind is a string value representing the REST resource this object represents. |
| metadata | object | Standard object's metadata. |
| spec | InferencePoolSpec | Specification of the desired behavior of the InferencePool. |

InferencePoolSpec

kubectl explain --api-version inference.networking.k8s.io/v1 inferencepools.spec

| Field | Type | Description |
| --- | --- | --- |
| endpointPickerRef | EndpointPickerRef | Reference to the EndpointPicker. |
| selector | LabelSelector | Selects the pods that belong to the inference pool. |
| targetPorts | []TargetPort | List of ports exposed by the inference pool. |
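As an illustration, a minimal InferencePool manifest that sets the three spec fields above might look like the following. All names, labels, and port numbers are hypothetical placeholders, not defaults.

```yaml
apiVersion: inference.networking.k8s.io/v1
kind: InferencePool
metadata:
  name: example-pool                 # hypothetical
spec:
  selector:
    matchLabels:
      app: example-model-server      # hypothetical label on the model server pods
  targetPorts:
    - number: 8000                   # hypothetical port served by those pods
  endpointPickerRef:
    name: example-epp                # hypothetical name of the Endpoint Picker Service
    port:
      number: 9002                   # hypothetical
```

The endpointPickerRef fields not shown here (group, kind, failureMode) are left at their documented defaults.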

EndpointPickerRef

kubectl explain --api-version inference.networking.k8s.io/v1 inferencepools.spec.endpointPickerRef

| Field | Type | Description |
| --- | --- | --- |
| failureMode | string | FailureMode configures how the parent handles the case when the Endpoint Picker extension is non-responsive. Defaults to "FailClose". |
| group | string | Group is the group of the referent API object. Defaults to "". |
| kind | string | Kind is the Kubernetes resource kind of the referent. Defaults to "Service". |
| name | string | Name is the name of the referent API object. Required. |
| port | Port | Port is the port of the Endpoint Picker extension service. |
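For example, an endpointPickerRef that spells out the documented defaults but opts out of fail-closed behavior could be sketched as follows. The Service name and port number are hypothetical, and "FailOpen" is assumed here to be the alternative to "FailClose"; confirm the accepted values with the kubectl explain command above.

```yaml
# Fragment of InferencePool.spec
endpointPickerRef:
  group: ""               # documented default
  kind: Service           # documented default
  name: example-epp       # hypothetical; this field is required
  port:
    number: 9002          # hypothetical
  failureMode: FailOpen   # assumed value; the documented default is FailClose
```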

Port

| Field | Type | Description |
| --- | --- | --- |
| number | integer | Number defines the port number to access the selected model server Pods. |

LabelSelector

| Field | Type | Description |
| --- | --- | --- |
| matchLabels | map[string]string | matchLabels is a map of {key,value} pairs. |

TargetPort

| Field | Type | Description |
| --- | --- | --- |
| number | integer | Number of the port. |

inference.networking.k8s-x.io/v1alpha1

EndpointPickerConfig

| Field | Type | Description |
| --- | --- | --- |
| data | DataLayerConfig | Data configures the DataLayer. It is required if the new DataLayer is enabled. |
| featureGates | []string | FeatureGates is a set of flags that enable various experimental features with the EPP. |
| plugins | []PluginSpec | Plugins is the list of plugins that will be instantiated. See Plugins for more details. |
| saturationDetector | SaturationDetector | SaturationDetector, when present, specifies the configuration of the saturation detector. |
| schedulingProfiles | []SchedulingProfile | SchedulingProfiles is the list of named SchedulingProfiles that will be created. |

PluginSpec

For more details on available plugins, see Plugins.

| Field | Type | Description |
| --- | --- | --- |
| name | string | Name provides a name for plugin entries to reference. If omitted, the value of the Plugin's Type field will be used. |
| parameters | object | Parameters are the set of parameters to be passed to the plugin's factory function. |
| type | string | Type specifies the plugin type to be instantiated. |
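A sketch of a top-level plugins list showing both naming forms. The plugin types and the parameter key are hypothetical placeholders; substitute types and parameters listed on the Plugins page.

```yaml
# Fragment of an EndpointPickerConfig
plugins:
  # No name given, so other entries refer to this plugin by its type,
  # "example-filter".
  - type: example-filter
  # Explicit names distinguish two instances of the same (hypothetical) type.
  - name: strict-scorer
    type: example-scorer
    parameters:
      exampleThreshold: 0.9   # hypothetical parameter passed to the factory function
  - name: lenient-scorer
    type: example-scorer
    parameters:
      exampleThreshold: 0.5   # hypothetical
```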

SchedulingProfile

A SchedulingProfile executes its plugins in order: Filters, Scorers, then Picker. Only plugins that implement the Filter, Scorer, or Picker interface can be referenced here. Profile handlers operate at the top level and are not part of individual profiles.

| Field | Type | Description |
| --- | --- | --- |
| name | string | Name specifies the name of this SchedulingProfile. |
| plugins | []SchedulingPlugin | Plugins is the list of plugins for this SchedulingProfile. They are assigned to the appropriate "slots" based on their type. |

SchedulingPlugin

| Field | Type | Description |
| --- | --- | --- |
| pluginRef | string | References a plugin from the top-level plugins list. The plugin must implement Filter, Scorer, or Picker. |
| weight | integer | Optional weight for Scorer plugins, controlling relative influence when aggregating scores. Defaults to 1 if omitted. Ignored for Filters and Pickers. |
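To illustrate how pluginRef and weight fit together, the following sketch wires four hypothetical plugins into a single profile. The plugin types and the profile name are placeholders, and each plugin is assumed to implement the interface its name suggests.

```yaml
# Fragment of an EndpointPickerConfig
plugins:
  - type: example-filter
  - type: example-queue-scorer
  - type: example-cache-scorer
  - type: example-picker
schedulingProfiles:
  - name: default
    plugins:
      - pluginRef: example-filter         # Filter: a weight here would be ignored
      - pluginRef: example-queue-scorer   # Scorer: weight omitted, so it defaults to 1
      - pluginRef: example-cache-scorer   # Scorer: weight 2 against the default of 1 above
        weight: 2
      - pluginRef: example-picker         # Picker: a weight here would be ignored
```

The order of entries in the profile does not need to mirror execution order, since plugins are slotted by type.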

SaturationDetector

| Field | Type | Description |
| --- | --- | --- |
| kvCacheUtilThreshold | float | KVCacheUtilThreshold defines the KV cache utilization (0.0 to 1.0) above which a pod is considered to have insufficient capacity. |
| metricsStalenessThreshold | string | MetricsStalenessThreshold defines how old a pod's metrics can be before they are considered stale. |
| queueDepthThreshold | integer | QueueDepthThreshold defines the backend waiting queue size above which a pod is considered to have insufficient capacity for new requests. |
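A saturationDetector block might be written as below. The values are illustrative only, not recommended or default settings, and the duration-string format for metricsStalenessThreshold is an assumption.

```yaml
# Fragment of an EndpointPickerConfig
saturationDetector:
  queueDepthThreshold: 5               # illustrative: waiting-queue size limit
  kvCacheUtilThreshold: 0.8            # illustrative: must lie between 0.0 and 1.0
  metricsStalenessThreshold: "200ms"   # illustrative: string field, duration format assumed
```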

DataLayerConfig

| Field | Type | Description |
| --- | --- | --- |
| sources | []DataLayerSource | Sources is the list of sources defined for the DataLayer. |

DataLayerSource

| Field | Type | Description |
| --- | --- | --- |
| extractors | []DataLayerExtractor | Extractors specifies the list of Plugin instances to be associated with this Source. |
| pluginRef | string | PluginRef specifies a particular Plugin instance to be associated with this Source. |

DataLayerExtractor

| Field | Type | Description |
| --- | --- | --- |
| pluginRef | string | PluginRef specifies a particular Plugin instance to be associated with this Extractor. |
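Putting DataLayerConfig, DataLayerSource, and DataLayerExtractor together, a data block could be sketched as follows. The plugin names and types are hypothetical; each pluginRef is assumed to resolve against the top-level plugins list in the same way as in a SchedulingProfile.

```yaml
# Fragment of an EndpointPickerConfig
plugins:
  - name: example-metrics-source       # hypothetical source plugin
    type: example-source
  - name: example-queue-extractor      # hypothetical extractor plugin
    type: example-extractor
data:
  sources:
    - pluginRef: example-metrics-source
      extractors:
        - pluginRef: example-queue-extractor
```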