
Showing content from https://github.com/itzmeanjan/ff-gpu below:

itzmeanjan/ff-gpu: Finite Field Operations on GPGPU

Finite Field Operations on GPGPU

Recently I've been interested in finite field operations, so I decided to implement a few fields in SYCL DPC++, targeting accelerators ( specifically GPGPUs ).

This repository currently holds implementations of arithmetic operations for two finite fields, along with relevant benchmarks on both CPU and GPGPU.
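To make "finite field arithmetic operations" concrete, here is a minimal host-side sketch in plain C++ of prime field arithmetic. It is an illustration only, not the code in this repository: the modulus `P` (the Mersenne prime 2^31 - 1) and every function name are assumptions picked so that products fit in 64 bits without overflow. Inputs are assumed to be already reduced, i.e. in the range [0, P).

```cpp
#include <cassert>
#include <cstdint>

using u64 = std::uint64_t;

// Hypothetical modulus, chosen for illustration only: the Mersenne prime 2^31 - 1.
// Any two reduced elements multiply to less than 2^62, so u64 never overflows.
constexpr u64 P = 2147483647ULL;

// Modular addition; both inputs are assumed to be in [0, P).
inline u64 ff_add(u64 a, u64 b) { return (a + b) % P; }

// Modular subtraction; adding P first keeps the intermediate value non-negative.
inline u64 ff_sub(u64 a, u64 b) { return (a + P - b) % P; }

// Modular multiplication.
inline u64 ff_mul(u64 a, u64 b) { return (a * b) % P; }

// Exponentiation by repeated squaring.
inline u64 ff_pow(u64 a, u64 e) {
  u64 r = 1;
  while (e > 0) {
    if (e & 1) r = ff_mul(r, a);
    a = ff_mul(a, a);
    e >>= 1;
  }
  return r;
}

// Multiplicative inverse via Fermat's little theorem: a^(P-2) = a^(-1) mod P.
// By convention in this sketch, the "inverse" of 0 comes out as 0.
inline u64 ff_inv(u64 a) { return ff_pow(a, P - 2); }

// Division is multiplication by the inverse.
inline u64 ff_div(u64 a, u64 b) { return ff_mul(a, ff_inv(b)); }

// Small sanity check of the field laws used above.
inline void ff_self_check() {
  assert(ff_add(P - 1, 1) == 0);
  assert(ff_mul(ff_inv(12345), 12345) == 1);
}
```

On an accelerator the same functions would be called from inside a kernel, one field element per work-item; that part is left out here because it depends on the SYCL setup.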

I've also written the following implementations, along with benchmark results on CPU and GPU.

$ lsb_release -d

Description:    Ubuntu 20.04.3 LTS
$ dpcpp --version

Intel(R) oneAPI DPC++/C++ Compiler 2022.0.0 (2022.0.0.20211123)
Target: x86_64-unknown-linux-gnu
Thread model: posix

or

$ clang++ --version

clang version 14.0.0 (https://github.com/intel/llvm dc9bd3fafdeacd28528eb4b1fef3ad9b76ef3b92)
Target: x86_64-unknown-linux-gnu
Thread model: posix

make # JIT kernel compilation on *default* device, for AOT read below
./run
DEVICE=cpu make   # still JIT, but use CPU at runtime
DEVICE=gpu make   # still JIT, but use GPU at runtime
DEVICE=host make  # still JIT, but use HOST at runtime

If you have other hardware, take a look at the AOT compilation guidelines and make the necessary changes in the Makefile.

Targeting Nvidia GPU with CUDA backend :

To target an Nvidia GPU, run DEVICE=gpu make cuda so that the benchmark suite is compiled for the CUDA backend. If you haven't set up your machine with an Nvidia GPU yet, I suggest you read this first.

I run the benchmark suite on both Intel CPU/GPU and Nvidia GPU, and keep the results 👇

You can run the basic test cases using

# set DEVICE to the runtime target device: one of cpu, gpu or host

DEVICE=cpu|gpu|host make test

There's another set of randomised test cases, which checks results obtained from my prime field implementation against those of another finite field implementation, a Python module named galois.

To run those, I suggest you first compile the shared object using

# set DEVICE to the runtime target device: one of cpu, gpu or host

DEVICE=cpu|gpu|host make genlib

After that, you can follow the next steps here.
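The idea behind the randomised cross-check can be sketched in plain C++: draw random operands, compute the result with the implementation under test, and compare it with an independently written reference. This is a hedged illustration only. The actual test suite compares against the Python galois module through the shared object. Here a slow double-and-add routine stands in as the reference, and the modulus `MOD` and all function names are hypothetical.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <random>

using u64 = std::uint64_t;

// Hypothetical modulus for illustration: the Mersenne prime 2^31 - 1.
constexpr u64 MOD = 2147483647ULL;

// Implementation "under test": a single multiply followed by a reduction.
inline u64 mul_under_test(u64 a, u64 b) { return (a * b) % MOD; }

// Independent reference: double-and-add, which never forms the full product.
inline u64 mul_reference(u64 a, u64 b) {
  u64 acc = 0;
  while (b > 0) {
    if (b & 1) acc = (acc + a) % MOD;
    a = (a + a) % MOD;
    b >>= 1;
  }
  return acc;
}

// Runs `rounds` random trials and returns how many disagreed.
// A fixed seed keeps any failure reproducible.
inline std::size_t count_mismatches(std::size_t rounds, u64 seed) {
  std::mt19937_64 gen(seed);
  std::uniform_int_distribution<u64> dist(0, MOD - 1);
  std::size_t mismatches = 0;
  for (std::size_t i = 0; i < rounds; ++i) {
    const u64 a = dist(gen);
    const u64 b = dist(gen);
    if (mul_under_test(a, b) != mul_reference(a, b)) ++mismatches;
  }
  return mismatches;
}

// Example entry point: any disagreement aborts via assert.
inline void run_cross_check() { assert(count_mismatches(1000, 42) == 0); }
```

Comparing against a separately written reference catches mistakes that hand-picked test vectors tend to miss, which is why a second implementation such as galois is useful here.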

