Deploying models with Lightllm is very simple, requiring only two steps at minimum:

1. Prepare model weight files supported by Lightllm.
2. Start the model service from the command line.
3. (Optional) Test the model service.
Note
Before continuing with this tutorial, please ensure you have completed the Installation Guide.
1. Prepare Model Files

Download Qwen3-8B first. Below is example code for downloading the model:
(Optional) Create folder
$ mkdir -p ~/models && cd ~/models
Install huggingface_hub
$ pip install -U huggingface_hub
Download model files
$ huggingface-cli download Qwen/Qwen3-8B --local-dir Qwen3-8B
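If you prefer to script the download, the same weights can be fetched with the huggingface_hub Python API. A minimal sketch; the target directory is an assumption chosen to match the paths used in the rest of this tutorial:

# Alternative download via the huggingface_hub Python API
# (equivalent to the CLI command above; the target path is an
# assumption matching this tutorial's layout).
from pathlib import Path

from huggingface_hub import snapshot_download

local_dir = Path.home() / "models" / "Qwen3-8B"
snapshot_download(repo_id="Qwen/Qwen3-8B", local_dir=local_dir)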
2. Start Model Service

After downloading the Qwen3-8B model, run the following command in the terminal to deploy the API service:
$ python -m lightllm.server.api_server --model_dir ~/models/Qwen3-8B
Note
The --model_dir parameter in the command above needs to be changed to your actual local model path.
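Loading the model weights can take a while, so the service may not answer requests immediately after launch. A minimal readiness check, sketched in Python: it polls the /generate endpoint used in the test below (host and port taken from that example; the retry budget is an arbitrary assumption).

# Poll the /generate endpoint until the server answers; model loading
# can take a while after launch. Host/port mirror the test request
# below; the retry count is an arbitrary assumption.
import time

import requests

URL = "http://127.0.0.1:8000/generate"
payload = {"inputs": "ping", "parameters": {"max_new_tokens": 1}}

for _ in range(60):
    try:
        if requests.post(URL, json=payload, timeout=5).ok:
            print("server is ready")
            break
    except requests.RequestException:
        time.sleep(1)
else:
    raise RuntimeError("server did not become ready in time")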
3. (Optional) Test the Model Service

Once the service is running, you can send a test request with curl:

$ curl http://127.0.0.1:8000/generate \
     -H "Content-Type: application/json" \
     -d '{
          "inputs": "What is AI?",
          "parameters": {
            "max_new_tokens": 17,
            "frequency_penalty": 1
          }
        }'
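The same request can be issued from Python with the requests library. A minimal sketch; the endpoint and parameters mirror the curl example above:

# Send the same test request from Python; endpoint and parameters
# mirror the curl example above.
import requests

resp = requests.post(
    "http://127.0.0.1:8000/generate",
    json={
        "inputs": "What is AI?",
        "parameters": {"max_new_tokens": 17, "frequency_penalty": 1},
    },
)
resp.raise_for_status()
print(resp.text)  # print the raw response body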