Lightllm is a pure Python-based inference framework with operators written in Triton.
Environment Requirements

Operating System: Linux
Python: 3.9
GPU: Compute Capability 7.0 or higher (e.g., V100, T4, RTX20xx, A100, L4, H100, etc.)
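To check whether your GPU meets this requirement, you can query its compute capability with PyTorch (a quick check, assuming torch is available, as it will be after the installation steps below):

$ # Prints (major, minor); (7, 0) or higher is supported, e.g. (8, 0) on A100
$ python -c "import torch; print(torch.cuda.get_device_capability())"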
The easiest way to install Lightllm is using the official image. You can directly pull the official image and run it:
$ # Pull the official image
$ docker pull ghcr.io/modeltc/lightllm:main
$
$ # Run. The current LightLLM service relies heavily on shared memory.
$ # Before starting, please make sure that you have allocated enough shared memory
$ # in your Docker settings; otherwise, the service may fail to start properly.
$ #
$ # 1. For text-only services, it is recommended to allocate more than 2GB of shared memory.
$ #    If your system has sufficient RAM, allocating 16GB or more is recommended.
$ # 2. For multimodal services, it is recommended to allocate 16GB or more of shared memory.
$ #    You can adjust this value according to your specific requirements.
$ #
$ # If you do not have enough shared memory available, you can try lowering
$ # the --running_max_req_size parameter when starting the service.
$ # This will reduce the number of concurrent requests, but also decrease shared memory usage.
$ docker run -it --gpus all -p 8080:8080 \
$   --shm-size 2g -v your_local_path:/data/ \
$   ghcr.io/modeltc/lightllm:main /bin/bash
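Once inside the container, you can start the API server against the mounted model weights. A minimal sketch (the model path /data/your_model is a placeholder for whatever you mounted into /data/):

$ # Inside the container: start the LightLLM API server on port 8080
$ # (/data/your_model is a placeholder for your mounted model weights)
$ python -m lightllm.server.api_server --model_dir /data/your_model \
$   --host 0.0.0.0 --port 8080 --tp 1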
You can also manually build the image from source and run it:
$ # Manually build the image
$ docker build -t <image_name> .
$
$ # Run
$ docker run -it --gpus all -p 8080:8080 \
$   --shm-size 2g -v your_local_path:/data/ \
$   <image_name> /bin/bash
Alternatively, you can use the helper script to build and launch the image in one step:
$ # View script parameters
$ python tools/quick_launch_docker.py --help
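A hypothetical invocation might look like the following; the flag name below is purely illustrative and not taken from the script, so check the --help output above for the actual parameters:

$ # Illustrative only -- actual flag names may differ; see --help
$ python tools/quick_launch_docker.py --model_dir your_local_path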
Note
If you use multiple GPUs, you may need to increase the --shm-size parameter setting above. If you need to run DeepSeek models in EP mode, please use the image ghcr.io/modeltc/lightllm:main-deepep.
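For example, the EP-mode image is pulled the same way as the main image:

$ docker pull ghcr.io/modeltc/lightllm:main-deepep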
Installation from Source

You can also install Lightllm from source:
$ # (Recommended) Create a new conda environment
$ conda create -n lightllm python=3.9 -y
$ conda activate lightllm
$
$ # Download the latest Lightllm source code
$ git clone https://github.com/ModelTC/lightllm.git
$ cd lightllm
$
$ # Install Lightllm dependencies (CUDA 12.4)
$ pip install -r requirements.txt --extra-index-url https://download.pytorch.org/whl/cu124
$
$ # Install Lightllm
$ python setup.py install
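After installation, a quick sanity check is to import the package (a minimal check; it only confirms the install completed and the module is on your Python path):

$ python -c "import lightllm; print('lightllm imported OK')"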
Note
Lightllm has been tested on various GPUs, including V100, A100, A800, 4090, and H800. If you use an A100, A800, or similar GPU, installing triton==3.0.0 is recommended:
$ pip install triton==3.0.0 --no-deps
If you use an H800, V100, or similar GPU, installing triton-nightly is recommended:
$ pip install -U --index-url https://aiinfra.pkgs.visualstudio.com/PublicPackages/_packaging/Triton-Nightly/pypi/simple/ triton-nightly --no-deps
For the specific reasons, please refer to the related issue and fix PR.
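Whichever Triton build you install, you can confirm which version is actually active in your environment:

$ python -c "import triton; print(triton.__version__)"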