Showing content from https://github.com/ModelCloud/GPTQModel/releases/tag/v1.8.1 below:

Release GPTQModel v1.8.1 · ModelCloud/GPTQModel · GitHub

What's Changed

DeepSeek v3/R1 model support.
⚡ New flexible weight packing: quantized weights can now be packed to [int32, int16, int8] dtypes. The Triton and Torch kernels support the full range of the new QuantizeConfig.pack_dtype.
⚡ Over 50% speedup for VL model quantization (Qwen 2.5-VL + Ovis).
⚡ New auto_gc: bool control in quantize(), which can reduce quantization time for small models that are at no risk of OOM.
⚡ New GPTQModel.push_to_hub() API for easy upload of quantized models to an HF repo.
⚡ New buffered_fwd: bool control in model.quantize().
🐛 Fixed bits=3 packing and group_size=-1 regression in v1.7.4.
🐛 Fixed Google Colab install requiring two install passes.
🐛 Fixed Python 3.10 compatibility.
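
To make the flexible weight packing item more concrete, here is a minimal sketch of what packing low-bit values into 32-, 16-, or 8-bit words can look like. It is an illustration only: the helper names `pack_words` and `unpack_words` are hypothetical, the low-bits-first layout is an assumption, and the bit order GPTQModel's kernels actually use may differ. The sketch also only handles bit widths that divide the word width evenly, so it does not cover the bits=3 case.

```python
def pack_words(values, bits, word_bits):
    """Pack unsigned `bits`-wide integers into `word_bits`-wide words.

    Hypothetical helper for illustration; values fill each word from the
    least significant bits upward.
    """
    if word_bits % bits != 0:
        raise ValueError("this sketch needs bits to divide word_bits evenly")
    per_word = word_bits // bits          # e.g. 8 four-bit values per 32-bit word
    mask = (1 << bits) - 1
    words = []
    for start in range(0, len(values), per_word):
        word = 0
        for slot, value in enumerate(values[start:start + per_word]):
            if not 0 <= value <= mask:
                raise ValueError(f"value {value} does not fit in {bits} bits")
            word |= value << (slot * bits)
        words.append(word)
    return words


def unpack_words(words, bits, word_bits, count):
    """Inverse of pack_words: recover the first `count` values."""
    per_word = word_bits // bits
    mask = (1 << bits) - 1
    values = []
    for word in words:
        for slot in range(per_word):
            values.append((word >> (slot * bits)) & mask)
    return values[:count]


# The same eight 4-bit values packed at three different word widths.
weights = [1, 2, 3, 4, 5, 6, 7, 8]
packed = {width: pack_words(weights, bits=4, word_bits=width) for width in (32, 16, 8)}
```

A narrower word stores the same total number of bits in more, smaller elements: the eight 4-bit values above occupy one 32-bit word, two 16-bit words, or four 8-bit words.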

Full Changelog: v1.7.4...v1.8.1

