To get started, make sure you have the following installed on your system:
Python 3.x (preferably 3.11) with pip
Note
Prefer a video guide? Watch the step-by-step tutorial on YouTube
Warning
CUDA and ROCm aren't prerequisites because torch can install them for you. However, if this doesn't work (e.g. DLL load failed errors), install the CUDA toolkit or ROCm on your system.
Warning
On Windows, you may sometimes see an error stating that VS build tools needs to be installed. This means a dependency doesn't ship a prebuilt wheel for your Python version. You can install VS build tools 17.8 and build the wheel locally. In addition, open an issue stating that a dependency is building a wheel.
Clone this repository to your machine: git clone https://github.com/theroyallab/tabbyAPI
Navigate to the project directory: cd tabbyAPI
Run the appropriate start script (start.bat for Windows, start.sh for Linux).
The API should start with no model loaded. Please read more to see how to download a model.
Create and activate a virtual environment, then install the project with the extra that matches your GPU:
python -m venv venv
.\venv\Scripts\activate (Windows)
source venv/bin/activate (Linux)
pip install -U .[cu121] (CUDA 12.x)
pip install -U .[amd] (ROCm 6.0)
Alternatively, run start.bat/sh directly. The script will check if you're in a conda environment and skip venv checks. You can also run python main.py to start the API, but this won't automatically upgrade your dependencies.

TabbyAPI includes a built-in Hugging Face downloader that works via both the API and the terminal. You can use the following command to download a repository with a specific branch revision:
.\Start.bat download <repo name> --revision <branch>
Example with Turboderp's Qwen2.5-VL 7B Instruct quants:
.\Start.bat download turboderp/Qwen2.5-VL-7B-Instruct-exl2 --revision 4.0bpw
If a model is gated, you can provide a HuggingFace access token (most exl2 quants aren't private):
.\Start.bat download meta-llama/Llama-3.1-8B --token <token>
Alternatively, running main.py directly can also trigger the downloader. For additional options, run .\Start.bat download --help.
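As a sketch of the direct invocation, assuming main.py accepts the same download subcommand and flags as the start scripts (verify with python main.py download --help):

```shell
# Assumed to mirror the start-script syntax; repo and revision are
# taken from the example above.
python main.py download turboderp/Qwen2.5-VL-7B-Instruct-exl2 --revision 4.0bpw
```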
Running the API alone may not be your optimal use case. Therefore, a config.yml exists to tune initial launch parameters and other configuration options.
A config.yml file is required for overriding project defaults. If you are okay with the defaults, you don't need a config file!
If you do want a config file, copy config_sample.yml to config.yml. All the fields are commented, so make sure to read the descriptions and comment out or remove fields that you don't need.
In addition, if you want to manually set the API keys, copy api_keys_sample.yml to api_keys.yml and fill in the fields. However, doing this is less secure; autogenerated keys should be used instead.
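The copy steps above can be sketched as follows, run from the repo root (file names are taken from the text):

```shell
# Copy the sample configs into place before editing them.
cp config_sample.yml config.yml
cp api_keys_sample.yml api_keys.yml
```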
You can also access the configuration parameters under 2. Configuration in this wiki!
There are a couple ways to update TabbyAPI:
update_deps: Updates dependencies to their latest versions.
update_deps_and_pull: Updates dependencies and pulls the latest commit of the GitHub repository.

These scripts exit after running their respective tasks. To start TabbyAPI, run start.bat or start.sh.
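A typical update-and-restart cycle might look like this; the .sh extensions are an assumption based on the script names above (use the .bat equivalents on Windows):

```shell
# Pull the latest commit and update dependencies; the update scripts
# exit after finishing, so the server must be started separately.
./update_deps_and_pull.sh
./start.sh
```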
pip install -U .[cu121] = CUDA 12.x
pip install -U .[amd] = ROCm 6.0

If you don't want to update dependencies that come from wheels (torch, exllamav2, and flash attention 2), use pip install . or pass the --nowheel flag when invoking the start scripts.
Warning
These instructions are meant for advanced users.
Important
If you're installing a custom Exllamav2 wheel, make sure to use pip install . when updating! Otherwise, each update will overwrite your custom exllamav2 version.
NOTE: To use a different exllamav2 version, edit pyproject.toml locally, create an issue or PR, or install your version of exllamav2 after upgrades.

Here are ways to install exllamav2:
1. From a prebuilt wheel: in the wheel's filename, cu121 and cp311 correspond to CUDA 12.1 and Python 3.11.
2. pip install exllamav2: this is a JIT compiled extension, which means that the initial launch of TabbyAPI will take some time. The build may also fail due to improper environment configuration.

These are short-form instructions for other methods that users can use to install TabbyAPI.
Warning
Using methods other than venv may not play nice with startup scripts. Using these methods indicates that you're an advanced user and know what you're doing.
conda create -n tabbyAPI python=3.11
conda activate tabbyAPI
conda install -c "nvidia/label/cuda-12.4.1" cuda
conda install -k git
git clone https://github.com/theroyallab/tabbyAPI
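After the conda steps above, a minimal sketch of the remaining launch steps, assuming the start scripts are used as described earlier (the script skips its venv checks inside a conda environment):

```shell
# Enter the cloned repo and launch via the start script
# (start.bat on Windows).
cd tabbyAPI
./start.sh
```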
Note
If you are planning to use custom versions of dependencies such as dev ExllamaV2, make sure to build the Docker image yourself!
git clone https://github.com/theroyallab/tabbyAPI
cd tabbyAPI
Update the volumes in the docker/docker-compose.yml file:

```yml
volumes:
  # - /path/to/models:/app/models # Change me
  # - /path/to/config.yml:/app/config.yml # Change me
  # - /path/to/api_tokens.yml:/app/api_tokens.yml # Change me
```

To build the image from source instead of pulling it, update the following in docker/docker-compose.yml:

```yml
# Uncomment this to build a docker image from source
#build:
#  context: ..
#  dockerfile: ./docker/Dockerfile

# Comment this to build a docker image from source
image: ghcr.io/theroyallab/tabbyapi:latest
```
Run docker compose -f docker/docker-compose.yml up to build the Dockerfile and start the server.