Objective: Refactor the code to enhance the maintainability and extensibility of the quantization module.
Objective: Extend sglang's efficient inference capabilities to a broader range of hardware.
Objective: Optimize components beyond standard linear layers to further improve performance.
Objective: Stay current with cutting-edge quantization techniques and data formats.