Showing content from https://github.com/sgl-project/sglang/issues/8151 below:

[Roadmap] Kimi-K2 performance enhancement on H20 GPU · Issue #8151 · sgl-project/sglang · GitHub

[Proposal] Kimi-K2 performance enhancement on H20 GPU

Summary

Our current tests show that Kimi K2 performs very poorly under TP16. In the 3500-token input / 1500-token output scenario, meeting the SLO of TTFT < 5 s and TPOT < 50 ms limits total throughput to only 36 tokens/s per card. This plan therefore aims to quickly improve Kimi K2 performance on H20 hardware, fix the bugs found along the way, and document best practices.
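To put the numbers above in perspective, here is a rough back-of-the-envelope sketch in Python. It is not taken from the issue. It assumes that "total throughput" counts input and output tokens together, and that TP16 means 16 H20 cards serve one model instance. All helper names are hypothetical.

```python
# Figures quoted in the summary above.
GPUS = 16                    # TP16: assumed to mean 16 cards per model instance
INPUT_TOKENS = 3500          # prompt length per request
OUTPUT_TOKENS = 1500         # generated length per request
TTFT_SLO_S = 5.0             # time to first token must stay below 5 s
TPOT_SLO_S = 0.050           # time per output token must stay below 50 ms
PER_GPU_TOKENS_PER_S = 36    # measured total throughput per card


def meets_slo(ttft_s: float, tpot_s: float) -> bool:
    """Return True when a measured (TTFT, TPOT) pair satisfies both SLO limits."""
    return ttft_s < TTFT_SLO_S and tpot_s < TPOT_SLO_S


def slo_latency_bound(output_tokens: int, ttft_s: float, tpot_s: float) -> float:
    """Upper bound on end-to-end latency implied by the SLO: TTFT + N * TPOT."""
    return ttft_s + output_tokens * tpot_s


# Aggregate throughput of the 16-card instance.
cluster_tokens_per_s = GPUS * PER_GPU_TOKENS_PER_S

# Assumption: throughput counts input + output tokens, so each request costs 5000 tokens.
tokens_per_request = INPUT_TOKENS + OUTPUT_TOKENS
requests_per_s = cluster_tokens_per_s / tokens_per_request

print(f"cluster throughput: {cluster_tokens_per_s} tokens/s")
print(f"sustained request rate: {requests_per_s:.4f} requests/s")
print(f"SLO latency bound: {slo_latency_bound(OUTPUT_TOKENS, TTFT_SLO_S, TPOT_SLO_S):.1f} s")
```

Under these assumptions the 16-card instance sustains roughly one request every nine seconds, which illustrates why the issue describes the baseline as very poor.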

Roadmap


