Open
Description
Right now, the only annotation we use to configure autoscaling is queue.sidecar.serving.knative.dev/resourcePercentage,
and its value is set globally per environment.
As described here, we also cannot specify an autoscaling policy for the predictor (the model) and the transformer individually.
This issue tracks how to enable users to specify the autoscaling configuration for their models.