Xinference can be installed in a Kubernetes cluster via Helm.
You have a fully functional Kubernetes cluster.
GPU support is enabled in Kubernetes; refer to here.
Helm is correctly installed.
Add the Xinference Helm repo.
helm repo add xinference https://xorbitsai.github.io/xinference-helm-charts
Update the Xinference Helm repo index and query the available chart versions.
helm repo update xinference
helm search repo xinference/xinference --devel --versions
Install
helm install xinference xinference/xinference -n xinference --version <helm_charts_version>
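The install command above targets the xinference namespace. Helm does not create a namespace on its own, so on a fresh cluster the command typically fails if that namespace is missing. The sketch below assumes the namespace does not exist yet and that kubectl is configured for the same cluster. It adds Helm's --create-namespace flag and then lists the pods. The run helper and the APPLY variable are illustrative only and are not part of Helm or Xinference.

```shell
#!/bin/sh
# Placeholder from the install step; replace it with a version
# reported by `helm search repo`.
CHART_VERSION="<helm_charts_version>"

# Illustrative helper (an assumption, not part of Helm or Xinference):
# prints each command and executes it only when APPLY=1 is set.
run() {
  echo "+ $*"
  if [ "${APPLY:-0}" = "1" ]; then
    "$@"
  fi
}

# Install the chart and create the namespace if it is missing.
run helm install xinference xinference/xinference \
  -n xinference --create-namespace \
  --version "$CHART_VERSION"

# Check that the Xinference pods were scheduled.
run kubectl get pods -n xinference
```

Running the script as-is only prints the commands for review; running it with APPLY=1 executes them.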
The installation method mentioned above sets up a Xinference cluster similar to a single-machine setup, with only one worker and all startup parameters at their default values. However, this is usually not the desired setup.
Below are some common custom installation configurations.
I need to download models from ModelScope.
helm install xinference xinference/xinference -n xinference --version <helm_charts_version> --set config.model_src="modelscope"
I want to use the CPU image of Xinference (or any other version of the Xinference image).
helm install xinference xinference/xinference -n xinference --version <helm_charts_version> --set config.xinference_image="<xinference_docker_image>"
I want to have 4 Xinference workers, with each worker managing 4 GPUs.
helm install xinference xinference/xinference -n xinference --version <helm_charts_version> --set config.worker_num=4 --set config.gpu_per_worker="4"
The installation commands above use Helm's --set option. For more complex custom installations, such as multiple workers with shared storage, it is highly recommended to use your own values.yaml file with Helm's -f option.
The default values.yaml file is located here. Some examples can be found here.
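As a rough illustration of the -f approach, the sketch below writes a small values file and passes it to helm install. It uses only the keys that appear in the --set examples above (config.model_src, config.worker_num, config.gpu_per_worker), nested the way Helm expands dotted --set paths. The file name xinference-values.yaml, the run helper, and the APPLY variable are hypothetical. Shared-storage settings are not shown, because their key names are not given here; check the chart's default values.yaml for those.

```shell
#!/bin/sh
# Placeholder; replace it with a version reported by `helm search repo`.
CHART_VERSION="<helm_charts_version>"

# Hypothetical file name. The keys mirror the --set flags shown earlier.
cat > xinference-values.yaml <<'EOF'
config:
  model_src: "modelscope"   # download models from ModelScope
  worker_num: 4             # four Xinference workers
  gpu_per_worker: 4         # each worker manages four GPUs
EOF

# Illustrative helper (an assumption, not part of Helm or Xinference):
# prints each command and executes it only when APPLY=1 is set.
run() {
  echo "+ $*"
  if [ "${APPLY:-0}" = "1" ]; then
    "$@"
  fi
}

# Same install as before, with settings taken from the file instead of --set.
run helm install xinference xinference/xinference \
  -n xinference --version "$CHART_VERSION" \
  -f xinference-values.yaml
```

Keeping the settings in a file makes them easy to review and version, and the same file can be reused with helm upgrade.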
You can also install Xinference in Kubernetes using the third-party KubeBlocks. This method is not maintained by Xinference, so timely updates and availability are not guaranteed. Please refer to the documentation here.