Showing content from https://github.com/rbitr/llm.f90/issues/3 below:

Performance · Issue #3 · rbitr/llm.f90 · GitHub

Thank you so much for writing this. We are now working on compiling it with LFortran; it is a great example.

Regarding performance: on my Apple M1 Max with GFortran 11.3.0, I get about 240 tokens/s with the default gfortran options. With -O3 -march=native -ffast-math -funroll-loops, I get about 277 tokens/s. Finally, with gfortran -O3 -march=native -ffast-math -funroll-loops -fexternal-blas llama2.f90 -o llm -framework Accelerate, which should be the fastest, I still only get about 270 tokens/s. I think the model is too small; one would have to try a larger version to take advantage of the accelerated linear algebra.
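To put those three numbers side by side, here is a rough sketch that computes each build's throughput relative to the default build. The only inputs are the tokens/s figures quoted above; the dictionary labels and the `speedup` helper are made up for illustration and are not part of llm.f90.

```python
# Throughput figures quoted in the post above
# (tokens/s, Apple M1 Max, GFortran 11.3.0).
# The dictionary keys are illustrative labels, not names used by llm.f90.
throughput = {
    "default": 240.0,
    "O3 + native + fast-math + unroll": 277.0,
    "same flags + Accelerate BLAS": 270.0,
}


def speedup(tokens_per_s: float, baseline: float) -> float:
    """Return the throughput ratio relative to the baseline build."""
    return tokens_per_s / baseline


baseline = throughput["default"]
speedups = {name: speedup(rate, baseline) for name, rate in throughput.items()}

for name, ratio in speedups.items():
    print(f"{name}: {ratio:.3f}x")
```

By this measure the optimized build is roughly 15% faster than the default, and the Accelerate build roughly 12.5% faster, so linking the external BLAS does not help at this model size.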

