Showing content from https://github.com/js1010/cusim below:

js1010/cusim: Superfast CUDA implementation of Word2Vec and Latent Dirichlet Allocation (LDA)

This project aims to speed up various ML models (e.g. topic modeling, word embedding) with CUDA. It can be thought of as a GPU version of gensim. As a first step, I implemented the most widely used word embedding model, word2vec, and the most representative topic model, LDA (Latent Dirichlet Allocation).

# clone repo and submodules
git clone git@github.com:js1010/cusim.git && cd cusim && git submodule update --init

# install requirements
pip install -r requirements.txt

# generate proto
python -m grpc_tools.protoc --python_out cusim/ --proto_path cusim/proto/ config.proto

# install
python setup.py install
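
After the install step, a quick check that Python can locate the package may be useful. The following is a minimal sketch using only the standard library. It assumes the importable package is named `cusim` (inferred from the repository layout, not confirmed here), and `is_installed` is a hypothetical helper name.

```python
import importlib.util


def is_installed(package_name):
    """Return True if Python can locate the named top-level package."""
    return importlib.util.find_spec(package_name) is not None


if __name__ == "__main__":
    # Assumption: the package installs under the name "cusim".
    print("cusim importable:", is_installed("cusim"))
```

Run it from outside the cloned repository. Inside the repository, the local `cusim/` directory would be found even when the install did not succeed.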

| attr | 1 worker (gensim) | 2 workers (gensim) | 4 workers (gensim) | 8 workers (gensim) | NVIDIA T4 (cusim) |
|---|---|---|---|---|---|
| training time (sec) | 892.596 | 544.212 | 310.727 | 226.472 | 16.162 |
| pearson | 0.487832 | 0.487696 | 0.482821 | 0.487136 | 0.492101 |
| spearman | 0.500846 | 0.506214 | 0.501048 | 0.506718 | 0.479468 |

| attr | 1 worker (gensim) | 2 workers (gensim) | 4 workers (gensim) | 8 workers (gensim) | NVIDIA T4 (cusim) |
|---|---|---|---|---|---|
| training time (sec) | 586.545 | 340.489 | 220.804 | 146.23 | 33.9173 |
| pearson | 0.354448 | 0.353952 | 0.352398 | 0.352925 | 0.360436 |
| spearman | 0.369146 | 0.369365 | 0.370565 | 0.365822 | 0.355204 |

| attr | 1 worker (gensim) | 2 workers (gensim) | 4 workers (gensim) | 8 workers (gensim) | NVIDIA T4 (cusim) |
|---|---|---|---|---|---|
| training time (sec) | 250.135 | 155.121 | 103.57 | 73.8073 | 6.20787 |
| pearson | 0.309651 | 0.321803 | 0.324854 | 0.314255 | 0.480298 |
| spearman | 0.294047 | 0.308723 | 0.318293 | 0.300591 | 0.480971 |

| attr | 1 worker (gensim) | 2 workers (gensim) | 4 workers (gensim) | 8 workers (gensim) | NVIDIA T4 (cusim) |
|---|---|---|---|---|---|
| training time (sec) | 176.923 | 100.369 | 69.7829 | 49.9274 | 9.90391 |
| pearson | 0.18772 | 0.193152 | 0.204509 | 0.187924 | 0.368202 |
| spearman | 0.243975 | 0.24587 | 0.260531 | 0.237441 | 0.358042 |

| attr | gensim (8 vCPUs) | cusim (NVIDIA T4) |
|---|---|---|
| training time (sec) | 447.376 | 76.6972 |
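
To put the training times in perspective, the speedup can be computed as the ratio of the gensim time to the cusim time. The sketch below does this for the first benchmark reported above (gensim with 1 to 8 workers vs. cusim on an NVIDIA T4). The numbers are copied from that benchmark; `speedup` is a hypothetical helper name.

```python
# Training times in seconds, copied from the first benchmark above.
gensim_times = {1: 892.596, 2: 544.212, 4: 310.727, 8: 226.472}
cusim_time = 16.162


def speedup(baseline_sec, candidate_sec):
    """Return how many times faster the candidate is than the baseline."""
    return baseline_sec / candidate_sec


for workers, seconds in gensim_times.items():
    ratio = speedup(seconds, cusim_time)
    print(f"vs. gensim with {workers} worker(s): cusim is {ratio:.1f}x faster")
```

The same function applies to the other benchmarks by swapping in their training times.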
