This is a port of Andrej Karpathy’s llm.c project (the CPU version). I toyed around with it for a couple of hours the day he released the initial version, but only continued with it on a train / plane trip I had last weekend (<2024-04-20 Sat>). The port started from commit a22c22b, which in particular means the tokenizer is not there at the moment. I might add it one of these days.
Performance is ~comparable to the C version.
Note: the port was done in a bit of a hurry, so who knows what bugs lurk compared to the original! :)
Differences to the C version

- An MView[T] type (which is just a ptr UncheckedArray[T] in Nim plus a few goodies, notably a {} accessor that does pointer arithmetic for a more ‘natural’ access to another buffer).
- The *Tensor fields are assigned based on fieldPairs, their order in the object, and the params/act_sizes input.
- Destructors for the GPT2 and DataLoader objects, so that we don’t have to free manually (copying these objects is disallowed).
- Instead of OpenMP's collapse primitive to fuse multiple nested loops, we use a custom Nim compile-time loop fusion macro, see ./fuse_loops.nim. The issue is that because Nim converts for loops into while statements, it doesn’t play nice with nested loops for OpenMP. :) So I wrote a macro that wraps around for loops and performs the loop fusion manually (note that it only works for for i in 0 ..< X style loops, i.e. lower index 0 and using ..<).

We have to compile with --exceptions:quirky, because otherwise the inserted Nim error checks break the OpenMP compilation. We could disable checks locally in the code, but for this project it’s fine.
See also: nim-lang/Nim#23311
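To give a rough idea of the MView[T] type and its {} accessor, here is a minimal sketch. It is not the code from this repo: the plain alias, the signature of {} and the helper toMView are assumptions made up for illustration.

```nim
type
  # Assumption: a plain alias; the real type may carry more "goodies".
  MView[T] = ptr UncheckedArray[T]

proc toMView[T](s: var seq[T]): MView[T] =
  ## Hypothetical helper: a view onto the first element of a non-empty seq.
  cast[MView[T]](addr s[0])

proc `{}`[T](v: MView[T], offset: int): MView[T] =
  ## Pointer arithmetic: a new view that starts `offset` elements into `v`.
  cast[MView[T]](addr v[offset])

var buf = newSeq[float32](8)
for i in 0 ..< buf.len:
  buf[i] = float32(i)

let whole = toMView(buf)
let tail = whole{4}   # same memory, shifted by 4 elements
tail[0] = 40'f32      # writes through to buf[4]
echo buf[4]
```

The point of such an accessor is that `whole{4}` reads a lot like `ptr + 4` in C, while plain `[]` stays reserved for element access.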
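The *Tensor field handling from the list above can be pictured roughly like this. Again a sketch under assumptions: the object DemoTensors, its three fields and the proc pointFields are made up. Only the idea is taken from the description: walk the fields with fieldPairs in declaration order and point each one at its slice of one big buffer, according to an array of sizes.

```nim
type
  FloatPtr = ptr UncheckedArray[float32]
  DemoTensors = object   # hypothetical; the real objects have many more fields
    wte, wpe, lnw: FloatPtr

proc pointFields[T: object](tensors: var T, sizes: openArray[int], buf: FloatPtr) =
  ## Points every field of `tensors` at its slice of `buf`.
  ## Field number i (in declaration order) gets `sizes[i]` elements.
  var offset = 0
  var i = 0
  for name, field in fieldPairs(tensors):
    field = cast[FloatPtr](addr buf[offset])
    offset += sizes[i]
    inc i

var storage = newSeq[float32](2 + 3 + 4)
for i in 0 ..< storage.len:
  storage[i] = float32(i)

var tensors: DemoTensors
pointFields(tensors, [2, 3, 4], cast[FloatPtr](addr storage[0]))
echo tensors.wpe[0]   # first element of the second slice
```

Because fieldPairs is unrolled at compile time, adding a field to the object is enough for it to be picked up, as long as the sizes array stays in the same order.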
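For the "no manual free, no copies" part, this is roughly how it can be done in Nim with lifetime hooks. A minimal sketch, assuming a made-up Owner object as a stand-in for GPT2 / DataLoader; the counter exists only so the sketch can show the hook firing, and none of these names come from the repo.

```nim
type
  Owner = object   # hypothetical stand-in for GPT2 / DataLoader
    data: ptr UncheckedArray[float32]
    size: int

var destroyed = 0   # demo only: counts how often a buffer was freed

# The `var` form works on Nim 1.6 and 2.x (2.x also accepts a non-var parameter).
proc `=destroy`(x: var Owner) =
  ## Frees the buffer when an Owner goes out of scope.
  if x.data != nil:
    dealloc(x.data)
    x.data = nil
    inc destroyed

# Any attempt to copy an Owner becomes a compile-time error.
proc `=copy`(dst: var Owner, src: Owner) {.error.}

proc initOwner(n: int): Owner =
  Owner(data: cast[ptr UncheckedArray[float32]](alloc0(n * sizeof(float32))),
        size: n)

proc runDemo(): float32 =
  var o = initOwner(4)   # moved in, not copied
  o.data[0] = 1.5
  result = o.data[0]
  # `o` is destroyed here, no manual free needed

echo runDemo()
```

Something like `var b = o` followed by further use of `o` would be rejected at compile time by the `=copy` hook, which is the whole point of disallowing copies for objects that own raw memory.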
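What "fusing" two nested loops means can be shown by hand. This is not the macro from ./fuse_loops.nim, just a hand-written sketch of the transformation I would expect such a macro to produce; the procs and the parameter names nB, nT, nC are made up for the example.

```nim
proc sumNested(data: seq[float32], nB, nT, nC: int): seq[float32] =
  ## Reference version: two nested outer loops over (b, t).
  result = newSeq[float32](nB * nT)
  for b in 0 ..< nB:
    for t in 0 ..< nT:
      var acc = 0'f32
      for c in 0 ..< nC:
        acc += data[(b * nT + t) * nC + c]
      result[b * nT + t] = acc

proc sumFused(data: seq[float32], nB, nT, nC: int): seq[float32] =
  ## Fused version: a single loop over nB * nT iterations. The original
  ## indices are recovered with integer division and modulo.
  result = newSeq[float32](nB * nT)
  for idx in 0 ..< nB * nT:
    let b = idx div nT
    let t = idx mod nT
    var acc = 0'f32
    for c in 0 ..< nC:
      acc += data[(b * nT + t) * nC + c]
    result[b * nT + t] = acc

var data = newSeq[float32](2 * 3 * 4)
for i in 0 ..< data.len:
  data[i] = float32(i mod 5)
echo sumNested(data, 2, 3, 4) == sumFused(data, 2, 3, 4)
```

With a single outer loop there is only one loop left to hand to OpenMP, which is what collapse would otherwise arrange on the C side. The restriction to loops starting at 0 makes the div / mod index recovery this simple.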
The important compilation arguments are defined in a local nim.cfg and at the top of the train_gpt2.nim file (fast-math and OpenMP related).
Otherwise just compile with:
nim c -d:danger -d:openmp -d:lto --passC:"-march=native" train_gpt2.nim
Apart from that, follow the CPU instructions from the original repo to get started: https://github.com/karpathy/llm.c?tab=readme-ov-file#quick-start-cpu
Similar to my Nim port of his llama2.c, I had time to kill on a trip! And doing such ‘dumb’ ports is kind of meditative… lol