OmniGen2

News | Quick Start | Usage Tips | Online Demos | Citation | License

🔥 News

Introduction

OmniGen2 is a powerful and efficient unified multimodal model. Unlike OmniGen v1, OmniGen2 features two distinct decoding pathways for the text and image modalities, with unshared parameters and a decoupled image tokenizer. OmniGen2 delivers competitive performance across four primary capabilities: visual understanding, text-to-image generation, instruction-guided image editing, and in-context generation.

As an open-source project, OmniGen2 provides a powerful yet resource-efficient foundation for researchers and developers exploring the frontiers of controllable and personalized generative AI.

We will release the training code, dataset, and data construction pipeline soon. Stay tuned!


Demonstration of OmniGen2's overall capabilities.


Demonstration of OmniGen2's image editing capabilities.


Demonstration of OmniGen2's in-context generation capabilities.

📌 TODO

🚀 Quick Start

🛠️ Environment Setup

✅ Recommended Setup

# 1. Clone the repository
git clone git@github.com:VectorSpaceLab/OmniGen2.git
cd OmniGen2

# 2. (Optional) Create a clean Python environment
conda create -n omnigen2 python=3.11
conda activate omnigen2

# 3. Install dependencies
# 3.1 Install PyTorch (pick the wheel matching your CUDA version; cu124 shown here)
pip install torch==2.6.0 torchvision --extra-index-url https://download.pytorch.org/whl/cu124

# 3.2 Install the remaining Python packages
pip install -r requirements.txt

# 3.3 (Optional) Install flash-attn for faster attention
pip install flash-attn==2.7.4.post1 --no-build-isolation
๐ŸŒ For users in Mainland China

pip install torch==2.6.0 torchvision --index-url https://mirror.sjtu.edu.cn/pytorch-wheels/cu124


pip install -r requirements.txt -i https://pypi.tuna.tsinghua.edu.cn/simple




pip install flash-attn==2.7.4.post1 --no-build-isolation -i https://pypi.tuna.tsinghua.edu.cn/simple
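
After installation, a quick sanity check can confirm that PyTorch sees the GPU and that the optional flash-attn package is importable. The snippet below is a minimal illustrative script, not part of the official instructions; it only assumes the packages installed above.

# check_env.py -- illustrative post-install sanity check (not from the official repo)
import torch

print("torch version:", torch.__version__)            # expect 2.6.0
print("CUDA available:", torch.cuda.is_available())
if torch.cuda.is_available():
    print("GPU:", torch.cuda.get_device_name(0))

try:
    import flash_attn                                  # optional; only present if step 3.3 was run
    print("flash-attn version:", flash_attn.__version__)
except ImportError:
    print("flash-attn not installed (optional)")
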
🧪 Run Examples

# Visual understanding
bash example_understanding.sh

# Text-to-image generation
bash example_t2i.sh

# Instruction-guided image editing
bash example_edit.sh

# In-context generation
bash example_in_context_generation.sh
๐ŸŒ Gradio Demo ๐Ÿ’ก Usage Tips

To achieve optimal results with OmniGen2, adjust the key inference hyperparameters (for example, the guidance scales, the number of sampling steps, and the negative prompt) to your specific use case; an illustrative call is sketched after the suggestions below.

Some suggestions for improving generation quality:

  1. Use High-Quality Images
  2. Be Specific with Instructions
  3. Prioritize English: The model currently performs best with English prompts.
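
As a rough illustration of how these knobs fit together, the sketch below assumes a diffusers-style pipeline class named OmniGen2Pipeline and parameter names such as text_guidance_scale, image_guidance_scale, num_inference_steps, and negative_prompt. The import path, defaults, and exact signature are assumptions; check the repository's example scripts before relying on them.

# Illustrative sketch only -- verify the import path and argument names against the repo.
import torch
from omnigen2.pipelines.omnigen2.pipeline_omnigen2 import OmniGen2Pipeline  # assumed path

pipe = OmniGen2Pipeline.from_pretrained(
    "OmniGen2/OmniGen2",
    torch_dtype=torch.bfloat16,
).to("cuda")

result = pipe(
    prompt="A red panda reading a book under a cherry tree, golden hour lighting",
    negative_prompt="blurry, low resolution, watermark, distorted",
    num_inference_steps=50,       # more steps -> higher quality, slower
    text_guidance_scale=5.0,      # how strongly the text prompt is enforced
    image_guidance_scale=2.0,     # only meaningful when reference images are provided
    height=1024,
    width=1024,
    generator=torch.Generator("cuda").manual_seed(0),
)
result.images[0].save("output.png")

For editing and in-context generation, the example scripts additionally pass one or more input images; the same guidance parameters apply.
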
💻 Resource Requirements

OmniGen2 natively requires an NVIDIA RTX 3090 or an equivalent GPU with approximately 17GB of VRAM. For devices with less VRAM, you can enable CPU Offload to run the model.

Performance Tip: To improve inference speed, consider decreasing the cfg_range_end parameter. Within a reasonable range, this has a negligible impact on output quality.
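
Assuming the pipeline follows standard diffusers conventions, CPU offload and the cfg_range_end adjustment might look like the sketch below; enable_model_cpu_offload() is the usual diffusers hook for low-VRAM inference, and the cfg_range_end argument name is assumed from the tip above (the import reuses the OmniGen2Pipeline class from the previous sketch).

# Illustrative sketch only -- confirm the offload hook and argument names against the repo.
pipe = OmniGen2Pipeline.from_pretrained("OmniGen2/OmniGen2", torch_dtype=torch.bfloat16)
pipe.enable_model_cpu_offload()   # keep the pipeline on CPU; modules are moved to GPU on demand

result = pipe(
    prompt="A watercolor painting of a lighthouse at dawn",
    num_inference_steps=50,
    text_guidance_scale=5.0,
    cfg_range_end=0.6,            # stop classifier-free guidance early to save compute
)
result.images[0].save("output_offload.png")
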

The following table details the inference performance of OmniGen2 on an A800 GPU:


Inference Efficiency of OmniGen2.

โค๏ธ Citing Us

If you find this repository or our work useful, please consider giving it a star ⭐ and a citation 🦖, which would be greatly appreciated (the OmniGen2 report will be available as soon as possible):

@article{wu2025omnigen2,
  title={OmniGen2: Exploration to Advanced Multimodal Generation},
  author={Chenyuan Wu and Pengfei Zheng and Ruiran Yan and Shitao Xiao and Xin Luo and Yueze Wang and Wanli Li and Xiyan Jiang and Yexin Liu and Junjie Zhou and Ze Liu and Ziyi Xia and Chaofan Li and Haoge Deng and Jiahao Wang and Kun Luo and Bo Zhang and Defu Lian and Xinlong Wang and Zhongyuan Wang and Tiejun Huang and Zheng Liu},
  journal={arXiv preprint arXiv:2506.18871},
  year={2025}
}
License

This work is licensed under the Apache 2.0 license.

