# Attention Gym

Attention Gym is a collection of helpful tools and examples for working with flex-attention.
🎯 Features | 🚀 Getting Started | 💻 Usage | 🛠️ Dev | 🤝 Contributing | ⚖️ License
This repository aims to provide a playground for experimenting with various attention mechanisms using the FlexAttention API. It includes implementations of different attention variants, performance comparisons, and utility functions to help researchers and developers explore and optimize attention mechanisms in their models.
## 🚀 Getting Started

```bash
git clone https://github.com/pytorch-labs/attention-gym.git
cd attention-gym
pip install .
```
## 💻 Usage

There are two main ways to use Attention Gym:
**Run Example Scripts**: Many files in the project can be executed directly to demonstrate their functionality:
```bash
python attn_gym/masks/document_mask.py
```
These scripts often generate visualizations to help you understand the attention mechanisms.
**Import in Your Projects**: You can use Attention Gym components in your own work by importing them:
```python
import torch
from torch.nn.attention.flex_attention import flex_attention, create_block_mask
from attn_gym.masks import generate_sliding_window

# Example inputs: batch 1, 1 head, sequence length S, head dim 64
B, H, S, D = 1, 1, 2048, 64
device = "cuda"  # or "cpu"
query, key, value = (torch.randn(B, H, S, D, device=device) for _ in range(3))

# Build a sliding-window mask_mod and compile it into a BlockMask
sliding_window_mask_mod = generate_sliding_window(window_size=1024)
block_mask = create_block_mask(sliding_window_mask_mod, 1, 1, S, S, device=device)
out = flex_attention(query, key, value, block_mask=block_mask)
```
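BlockMasks control *which* positions may attend; `score_mod`s instead rewrite the attention scores themselves. As a minimal sketch of that other half of the FlexAttention API, reusing the tensors from the snippet above (the `relative_bias` function here is our illustration, not a function shipped by the repo):

```python
def relative_bias(score, b, h, q_idx, kv_idx):
    # Penalize attention to distant positions, ALiBi-style (illustrative).
    return score - (q_idx - kv_idx).abs()

out = flex_attention(query, key, value, score_mod=relative_bias)
```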
For comprehensive examples of using FlexAttention in real-world scenarios, explore the `examples/` directory. These end-to-end implementations showcase how to integrate various attention mechanisms into your models.
Attention Gym is under active development, and we do not currently offer any backward compatibility guarantees. APIs and functionalities may change between versions. We recommend pinning to a specific version in your projects and carefully reviewing changes when upgrading.
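One way to pin, assuming you install straight from GitHub (`<tag-or-commit>` below is a placeholder for the revision you want):

```bash
pip install "git+https://github.com/pytorch-labs/attention-gym.git@<tag-or-commit>"
```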
Attention Gym is organized for easy exploration of attention mechanisms:
- `attn_gym.masks`: Examples creating `BlockMask`s
- `attn_gym.mods`: Examples creating `score_mod`s
- `attn_gym.paged_attention`: Examples using PagedAttention
- `examples/`: Detailed implementations using FlexAttention

## 🛠️ Dev

Install dev requirements:
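A typical editable install (the `dev` extra name is an assumption about this project's packaging):

```bash
pip install -e ".[dev]"
```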
Install pre-commit hooks:
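Using pre-commit's standard hook setup:

```bash
pre-commit install
```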
## 🤝 Contributing

We welcome contributions to Attention Gym, especially new masks or score mods! Here's how you can contribute:
- Add your mask or score mod in the appropriate subdirectory.
- Update the corresponding `attn_gym/*/__init__.py` file to include your new function.

See CONTRIBUTING.md for more details.
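For flavor, a new mask could follow the same `generate_*` factory pattern as `generate_sliding_window`; everything below is illustrative, not code from the repo:

```python
def generate_dilated_window(window_size: int, dilation: int):
    """Returns a mask_mod for a dilated sliding window (illustrative)."""

    def dilated_window(b, h, q_idx, kv_idx):
        # Attend only to positions within the window that match the dilation stride.
        diff = (q_idx - kv_idx).abs()
        return (diff <= window_size * dilation) & (diff % dilation == 0)

    return dilated_window
```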
## ⚖️ License

attention-gym is released under the BSD 3-Clause License.