pecca-rs

Pecca is starting as a Rust port of the excellent @karpathy llama2.c, itself a minimalistic adaptation of llama.cpp.

Compared to other Rust ports, Pecca leverages the ndarray crate, which has several advantages.

Going forward, Pecca will leverage Rust and its ecosystem whenever it makes sense, rather than attempting to avoid dependencies above all (like llama.cpp).

```shell
git clone https://github.com/rahoua/pecca-rs.git
cd pecca-rs
wget -P ./models/stories/ https://huggingface.co/karpathy/tinyllamas/resolve/main/stories15M.bin
cargo run --release generate ./models/stories/stories15M.bin
```

Pecca can be run similarly with larger tiny stories models (like the 110M one) or the llama2 models (only 7B recommended so far). For a full list of command line options, see the help output.

To get the llama2 models, follow the instructions for llama2.c; Pecca supports the same model format. As Pecca does not use memmap, loading and quantizing a model on the fly can take some time. To speed things up, a model can also be saved in quantized form using the `-f --write-model <path>` command line switch.
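To give a sense of what on-the-fly quantization involves, here is a minimal sketch of symmetric 8-bit quantization (an illustration only; it makes no claim about Pecca's actual on-disk format): each block of float weights is scaled so its largest magnitude maps to 127, and dequantization multiplies back by the stored scale.

```rust
/// Quantize a slice of f32 weights to i8 with a single per-slice scale.
fn quantize_q8(xs: &[f32]) -> (Vec<i8>, f32) {
    let max_abs = xs.iter().fold(0.0f32, |m, &v| m.max(v.abs()));
    // Guard against an all-zero slice to avoid dividing by zero.
    let scale = if max_abs == 0.0 { 1.0 } else { max_abs / 127.0 };
    let q = xs
        .iter()
        .map(|&v| (v / scale).round().clamp(-127.0, 127.0) as i8)
        .collect();
    (q, scale)
}

/// Recover approximate f32 weights from the quantized values.
fn dequantize_q8(q: &[i8], scale: f32) -> Vec<f32> {
    q.iter().map(|&v| v as f32 * scale).collect()
}

fn main() {
    let weights = [1.0f32, -2.0, 0.5, 0.0];
    let (q, scale) = quantize_q8(&weights);
    let back = dequantize_q8(&q, scale);
    // Round-trip error is bounded by half a quantization step.
    for (w, b) in weights.iter().zip(&back) {
        assert!((w - b).abs() <= scale / 2.0 + 1e-6);
    }
    println!("scale = {scale}, q = {q:?}");
}
```

Doing this once over billions of weights is the cost being described; writing the quantized tensors back to disk amortizes it across runs.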

For CodeLlama, the instructions are similar except for the tokenizer, which is slightly different. To make the process easier, the updated tokenizer is provided. To override the default tokenizer, run pecca with the `-k` command line option:

```shell
./target/release/pecca-rs generate /path/to/codellama-instr-7b.bin -k "./models/tokenizer-code.bin"
```

At the moment there is no formal benchmark; we just provide rough estimates to give a ballpark of overall performance.

Llama2 7B model on a Macbook Pro M2 Max:

A list of possible future developments for the project:
