A RetroSearch Logo

Home - News ( United States | United Kingdom | Italy | Germany ) - Football scores

Search Query:

Showing content from https://github.com/ModelTC/Qwen-Image-Lightning below:

ModelTC/Qwen-Image-Lightning: Qwen-Image-Lightning: Speed up Qwen-Image model with distillation

We are excited to release the distilled version of Qwen-Image. It preserves the capability of complex text rendering.

To assess the distilled models' performance characteristics, including their strengths and limitations, we compare the performance of the three models, i.e., Qwen-Image, Qwen-Image-Lightning-8steps-V1.1, and Qwen-Image-Lightning-4steps-V1.0, in different scenarios. The results can be reproduced following the section below.

Compared to the base model, the distilled models (8-step and 4-step) deliver a 12–25× speed improvement with no significant loss in performance in most cases.

Prompt Base NFE=100 8steps-V1.1 NFE=8 4steps-V1.0 NFE=4 一个会议室,墙上写着"3.14159265-358979-32384626-4338327950",一个小陀螺在桌上转动。 宫崎骏的动漫风格。平视角拍摄,阳光下的古街热闹非凡。一个穿着青衫、手里拿着写着“阿里云”卡片的逍遥派弟子站在中间。旁边两个小孩惊讶的看着他。左边有一家店铺挂着“云存储”的牌子,里面摆放着发光的服务器机箱,门口两个侍卫守护者。右边有两家店铺,其中一家挂着“云计算”的牌子,一个穿着旗袍的美丽女子正看着里面闪闪发光的电脑屏幕;另一家店铺挂着“云模型”的牌子,门口放着一个大酒缸,上面写着“千问”,一位老板娘正在往里面倒发光的代码溶液。 一副典雅庄重的对联悬挂于厅堂之中,房间是个安静古典的中式布置,桌子上放着一些青花瓷,对联上左书“义本生知人机同道善思新”,右书“通云赋智乾坤启数高志远”, 横批“智启通义”,字体飘逸,中间挂在一着一副中国风的画作,内容是岳阳楼。 A movie poster. The first row is the movie title, which reads “Imagination Unleashed”. The second row is the movie subtitle, which reads “Enter a world beyond your imagination”. The third row reads “Cast: Qwen-Image”. The fourth row reads “Director: The Collective Imagination of Humanity”. The central visual features a sleek, futuristic computer from which radiant colors, whimsical creatures, and dynamic, swirling patterns explosively emerge, filling the composition with energy, motion, and surreal creativity. The background transitions from dark, cosmic tones into a luminous, dreamlike expanse, evoking a digital fantasy realm. At the bottom edge, the text “Launching in the Cloud, August 2025” appears in bold, modern sans-serif font with a glowing, slightly transparent effect, evoking a high-tech, cinematic aesthetic. The overall style blends sci-fi surrealism with graphic design flair—sharp contrasts, vivid color grading, and layered visual depth—reminiscent of visionary concept art and digital matte painting, 32K resolution, ultra-detailed. 一张企业级高质量PPT页面图像,整体采用科技感十足的星空蓝为主色调,背景融合流动的发光科技线条与微光粒子特效,营造出专业、现代且富有信任感的品牌氛围;页面顶部左侧清晰展示橘红色Alibaba标志,色彩鲜明、辨识度高。主标题位于画面中央偏上位置,使用大号加粗白色或浅蓝色字体写着“通义千问视觉基础模型”,字体现代简洁,突出技术感;主标题下方紧接一行楷体中文文字:“原生中文·复杂场景·自动布局”,字体柔和优雅,形成科技与人文的融合。下方居中排布展示了四张与图片,分别是:一幅写实与水墨风格结合的梅花特写,枝干苍劲、花瓣清雅,背景融入淡墨晕染与飘雪效果,体现坚韧不拔的精神气质;上方写着黑色的楷体"梅傲"。一株生长于山涧石缝中的兰花,叶片修长、花朵素净,搭配晨雾缭绕的自然环境,展现清逸脱俗的文人风骨;上方写着黑色的楷体"兰幽"。一组迎风而立的翠竹,竹叶随风摇曳,光影交错,背景为青灰色山岩与流水,呈现刚柔并济、虚怀若谷的文化意象;上方写着黑色的楷体"竹清"。一片盛开于秋日庭院的菊花丛,花色丰富、层次分明,配以落叶与古亭剪影,传递恬然自适的生活哲学;上方写着黑色的楷体"菊淡"。所有图片采用统一尺寸与边框样式,呈横向排列。页面底部中央用楷体小字写明“2025年8月,敬请期待”,排版工整、结构清晰,整体风格统一且细节丰富,极具视觉冲击力与品牌调性。 - Dense or Small Text Rendering

In scenarios involving dense or small text, the base model is more likely to produce better results.

In scenes containing hair-like details, the base model demonstrates superior rendering fidelity, whereas the distilled models may yield outputs that appear either noticeably blurred or excessively sharpened.

In highly complex scenes, all three models may fail to produce satisfactory results.

- Inconsistencies in Model Rankings Across Test Cases

Test results may vary across different cases. In certain test instances, the base model may perform better, whereas in others, the distilled models may achieve superior results. Even for the same prompt at different resolutions, the relative performance ranking of the models may differ substantially.

Prompt Base NFE=100 8steps-V1.1 NFE=8 4steps-V1.0 NFE=4 A young girl wearing school uniform stands in a classroom, writing on a chalkboard. The text "Introducing Qwen-Image, a foundational image generation model that excels in complex text rendering and precise image editing" appears in neat white chalk at the center of the blackboard. Soft natural light filters through windows, casting gentle shadows. The scene is rendered in a realistic photography style with fine details, shallow depth of field, and warm tones. The girl's focused expression and chalk dust in the air add dynamism. Background elements include desks and educational posters, subtly blurred to emphasize the central action. Ultra-detailed 32K resolution, DSLR-quality, soft bokeh effect, documentary-style composition. ❌ ✅ ✅ A coffee shop entrance features a chalkboard sign reading "Qwen Coffee 😊 $2 per cup," with a neon light beside it displaying "通义千问". Next to it hangs a poster showing a beautiful Chinese woman, and beneath the poster is written "π≈3.1415926-53589793-23846264-33832795-02384197". ❌ ✅ ✅ A coffee shop entrance features a chalkboard sign reading "Qwen Coffee 😊 $2 per cup," with a neon light beside it displaying "通义千问". Next to it hangs a poster showing a beautiful Chinese woman, and beneath the poster is written "π≈3.1415926-53589793-23846264-33832795-02384197". ✅ ✅ ❌ 🚀 Run Evaluation and Test

Please follow Qwen-Image to install the Python Environment and download the Base Model.

Download models using huggingface-cli:

pip install "huggingface_hub[cli]"
huggingface-cli download lightx2v/Qwen-Image-Lightning --local-dir ./Qwen-Image-Lightning
# 8 steps, cfg 1.0
python generate_with_diffusers.py \
--prompt_list_file examples/prompt_list.txt \
--out_dir test_lora_8_step_results \
--lora_path Qwen-Image-Lightning/Qwen-Image-Lightning-8steps-V1.0.safetensors \
--base_seed 42 --steps 8 --cfg 1.0
# 4 steps, cfg 1.0
python generate_with_diffusers.py \
--prompt_list_file examples/prompt_list.txt \
--out_dir test_lora_4_step_results \
--lora_path Qwen-Image-Lightning/Qwen-Image-Lightning-4steps-V1.0.safetensors \
--base_seed 42 --steps 4 --cfg 1.0
# 50 steps, cfg 4.0
python generate_with_diffusers.py \
--prompt_list_file examples/prompt_list.txt \
--out_dir test_base_results \
--base_seed 42 --steps 50 --cfg 4.0

ComfyUI workflow is available in the workflows/ directory. The workflow is based on the Qwen-Image ComfyUI tutorial and has been verified with ComfyUI repository at commit ID 37d620a6b85f61b824363ed8170db373726ca45a.

  1. Install ComfyUI following the official instructions
  2. Download and place the Qwen-Image base model following the Qwen-Image ComfyUI tutorial (include UNet/CLIP/VAE files into proper ComfyUI folders)
  3. For 8-step workflow:
  4. For 4-step workflow:
  5. Run the workflow to generate images

The models in this repository are licensed under the Apache 2.0 License. We claim no rights over your generated contents, granting you the freedom to use them while ensuring that your usage complies with the provisions of this license. You are fully accountable for your use of the models, which must not involve sharing any content that violates applicable laws, causes harm to individuals or groups, disseminates personal information intended for harm, spreads misinformation, or targets vulnerable populations. For a complete list of restrictions and details regarding your rights, please refer to the full text of the license.

We built upon and reused code from the following projects: Qwen-Image, licensed under the Apache License 2.0.

The evaluation text prompts are from Qwen-Image, Qwen-Image Blog and Qwen-Image-Service.


RetroSearch is an open source project built by @garambo | Open a GitHub Issue

Search and Browse the WWW like it's 1997 | Search results from DuckDuckGo

HTML: 3.2 | Encoding: UTF-8 | Version: 0.7.4