Showing content from https://github.com/cuda-mode/lectures below:
gpu-mode/lectures: Material for gpu-mode lectures
Supplementary Material for Lectures
YouTube Channel
The PMPP Book: Programming Massively Parallel Processors: A Hands-on Approach (Amazon link)
Lecture 1: Profiling and Integrating CUDA kernels in PyTorch
Lecture 2: Recap Ch. 1-3 from the PMPP book
Lecture 3: Getting Started With CUDA
Lecture 4: Intro to Compute and Memory Architecture
Lecture 5: Going Further with CUDA for Python Programmers
Lecture 6: Optimizing PyTorch Optimizers
Lecture 7: Advanced Quantization
Lecture 8: CUDA Performance Checklist
Lecture 9: Reductions
Lecture 10: Build a Prod Ready CUDA Library
Lecture 11: Sparsity
Lecture 12: Flash Attention
Lecture 13: Ring Attention
Lecture 14: Practitioner's Guide to Triton
Lecture 15: CUTLASS
Lecture 16: Hands-On Profiling
Bonus Lecture: CUDA C++ llm.cpp
Lecture 17: GPU Collective Communication (NCCL)
Lecture 18: Fused Kernels
Lecture 19: Data Processing on GPUs
Lecture 20: Scan Algorithm
Lecture 21: Scan Algorithm Part 2
Lecture 22: Hacker's Guide to Speculative Decoding in vLLM
Lecture 23: Tensor Cores
- Speaker: Vijay Thakkar & Pradeep Ramani
- Slides
Lecture 24: Scan at the Speed of Light
- Speaker: Jake Hemstad & Georgii Evtushenko
Lecture 25: Speaking Composable Kernel
Lecture 26: SYCL MODE (Intel GPU)
Lecture 27: gpu.cpp
Lecture 28: Liger Kernel
Lecture 29: Triton Internals
Lecture 30: Quantized training
Lecture 31: Beginner's Guide to Metal Kernels
Lecture 32: Unsloth - LLM Systems Engineering
Lecture 33: BitBLAS
Lecture 34: Low Bit Triton Kernels
Lecture 35: SGLang Performance Optimization
Lecture 36: CUTLASS and Flash Attention 3
Lecture 37: Introduction to SASS & GPU Microarchitecture
Lecture 38: Lowbit kernels for ARM CPU
Lecture 39: TorchTitan
- Speaker: Mark Saroufim and Tianyu Liu
Lecture 40: Flash Infer
Lecture 41: CUDA Docs for Humans
Lecture 42: Mosaic GPU
Lecture 43:
- Speaker: Erik Schultheis
- Slides
Lecture 57: