Showing content from https://github.com/sgl-project/sglang/issues/9104 below:

[Feature] Support AWQ quantization on NPU · Issue #9104 · sgl-project/sglang

Checklist

Motivation

AWQ (Activation-aware Weight Quantization) is a 4-bit weight quantization method that offers a strong trade-off between efficiency and accuracy. Enabling AWQ quantization on the NPU backend in SGLang would allow 8-card NPU machines to run the DeepSeek 671B model. This feature follows the NPU roadmap (#8004).
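To make the memory argument concrete, here is a rough back-of-the-envelope sketch in Python. It only estimates weight storage and relies on assumptions that are not stated in this issue: every parameter is quantized, the AWQ layout uses a group size of 128 with one 16-bit scale and one 4-bit zero point per group, and the weights are split evenly across the 8 cards. KV cache, activations, and any layers kept in higher precision are ignored, so real usage would be higher.

```python
# Rough weight-memory estimate; all layout parameters below are assumptions.

def weight_bytes(num_params: float, bits_per_weight: float) -> float:
    """Bytes needed to store num_params weights at the given bit width."""
    return num_params * bits_per_weight / 8


def awq_bits_per_weight(weight_bits: int = 4,
                        group_size: int = 128,   # assumed group size
                        scale_bits: int = 16,    # assumed fp16 scale per group
                        zero_bits: int = 4) -> float:  # assumed 4-bit zero point
    """Effective bits per weight, including per-group scale and zero point."""
    return weight_bits + (scale_bits + zero_bits) / group_size


NUM_PARAMS = 671e9   # DeepSeek 671B, as named in the issue
NUM_CARDS = 8        # 8-card NPU machine, as named in the issue
GB = 1e9

for label, bits in [("16-bit", 16.0),
                    ("8-bit", 8.0),
                    ("AWQ 4-bit", awq_bits_per_weight())]:
    total_gb = weight_bytes(NUM_PARAMS, bits) / GB
    per_card_gb = total_gb / NUM_CARDS
    print(f"{label:>10}: {total_gb:7.1f} GB total, {per_card_gb:6.1f} GB per card")
```

Under these assumptions the 4-bit weights need roughly a quarter of the 16-bit footprint, which is the gap that decides whether the model fits on a single 8-card machine.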

Proposal

Related resources

No response
