News | Quick Start | Usage Tips | Online Demos | Citation | License
🔥 News

flash-attn is now an optional dependency. Users can still install it for optimal performance.

OmniGen2 is a powerful and efficient unified multimodal model. Unlike OmniGen v1, OmniGen2 features two distinct decoding pathways for text and image modalities, utilizing unshared parameters and a decoupled image tokenizer. OmniGen2 achieves competitive performance across four primary capabilities: visual understanding, text-to-image generation, instruction-guided image editing, and in-context generation.
As an open-source project, OmniGen2 provides a powerful yet resource-efficient foundation for researchers and developers exploring the frontiers of controllable and personalized generative AI.
We will release the training code, dataset, and data construction pipeline soon. Stay tuned!
Demonstration of OmniGen2's overall capabilities.
Demonstration of OmniGen2's image editing capabilities.
Demonstration of OmniGen2's in-context generation capabilities.
git clone git@github.com:VectorSpaceLab/OmniGen2.git
cd OmniGen2
conda create -n omnigen2 python=3.11
conda activate omnigen2
pip install torch==2.6.0 torchvision --extra-index-url https://download.pytorch.org/whl/cu124
pip install -r requirements.txt
pip install flash-attn==2.7.4.post1 --no-build-isolation
🌏 For users in Mainland China
pip install torch==2.6.0 torchvision --index-url https://mirror.sjtu.edu.cn/pytorch-wheels/cu124
pip install -r requirements.txt -i https://pypi.tuna.tsinghua.edu.cn/simple
pip install flash-attn==2.7.4.post1 --no-build-isolation -i https://pypi.tuna.tsinghua.edu.cn/simple
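Before running anything heavier, a quick sanity check of the environment can save time. The snippet below uses only standard PyTorch calls (nothing OmniGen2-specific):

```python
# Quick sanity check for the freshly created omnigen2 environment.
import torch

print(torch.__version__)          # expect 2.6.0
print(torch.cuda.is_available())  # expect True on a CUDA-capable machine
print(torch.version.cuda)         # expect a CUDA 12.4-compatible build
```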
🧪 Run Examples
bash example_understanding.sh
bash example_t2i.sh
bash example_edit.sh
bash example_in_context_generation.sh
🌐 Gradio Demo
Online Demo: HF Spaces. Beyond Hugging Face Spaces, we are temporarily allocating additional GPU resources to ensure smooth access to the online demos. If you notice a long queue for a particular link, please try the other links.

To run the Gradio demo locally:

pip install gradio
python app.py

To create a publicly shareable link, run:

python app.py --share

For the chat-style demo:

python app_chat.py
To achieve optimal results with OmniGen2, you can adjust the following key hyperparameters based on your specific use case (a short usage sketch follows the list):

text_guidance_scale: Controls how strictly the output adheres to the text prompt (classifier-free guidance).
image_guidance_scale: Controls how closely the final image resembles the input reference image.
max_pixels: Automatically resizes an input image when its total pixel count (width × height) exceeds this limit, while maintaining its aspect ratio. This helps manage performance and memory usage.
max_input_image_side_length: Maximum side length for input images.
negative_prompt: Tells the model what you don't want to see in the image.
enable_model_cpu_offload: Reduces VRAM usage by nearly 50% with a negligible impact on speed.
enable_sequential_cpu_offload: Minimizes VRAM usage to less than 3GB, but at the cost of significantly slower performance.
cfg_range_start, cfg_range_end: Define the timestep range where CFG is applied. Per this paper, reducing cfg_range_end can significantly decrease inference time with a negligible impact on quality.
Some suggestions for improving generation quality and efficiency:
OmniGen2 natively requires an NVIDIA RTX 3090 or an equivalent GPU with approximately 17GB of VRAM. For devices with less VRAM, you can enable CPU Offload to run the model.
Performance Tip: To improve inference speed, consider decreasing the cfg_range_end parameter. Within a reasonable range, this has a negligible impact on output quality.
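For memory-constrained GPUs, here is a sketch of enabling the offload options described above, assuming the pipeline follows the standard diffusers convention of exposing them as methods on the pipeline object (the repo may instead expose them as CLI flags):

```python
# Hypothetical continuation of the earlier snippet: trade speed for VRAM.
# Whether OmniGen2 exposes the standard diffusers-style offload methods
# this way is an assumption.

pipe.enable_model_cpu_offload()        # ~50% less VRAM, negligible slowdown
# pipe.enable_sequential_cpu_offload() # <3GB VRAM, but significantly slower
```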
The following table details the inference performance of OmniGen2 on an A800 GPU:
Inference Efficiency of OmniGen2.
If you find this repository or our work useful, please consider giving it a star ⭐ and citing our report, which would be greatly appreciated:
@article{wu2025omnigen2,
title={OmniGen2: Exploration to Advanced Multimodal Generation},
author={Chenyuan Wu and Pengfei Zheng and Ruiran Yan and Shitao Xiao and Xin Luo and Yueze Wang and Wanli Li and Xiyan Jiang and Yexin Liu and Junjie Zhou and Ze Liu and Ziyi Xia and Chaofan Li and Haoge Deng and Jiahao Wang and Kun Luo and Bo Zhang and Defu Lian and Xinlong Wang and Zhongyuan Wang and Tiejun Huang and Zheng Liu},
journal={arXiv preprint arXiv:2506.18871},
year={2025}
}
License
This work is licensed under the Apache 2.0 License.