
Showing content from https://github.com/sgl-project/sglang/issues/8177 below:

[Bug] PD disaggregation Abort Issue · Issue #8177 · sgl-project/sglang · GitHub

Describe the bug

During a stability stress test with 400 concurrent requests of varying lengths, the unbootstrap queue on the prefill side kept growing, and token_usage on the decode side approached 1. Our serving pipeline enforces a timeout (for example, it disconnects automatically when TTFT exceeds 60s), so the business layer actively closes its connection to the sglang HTTP server. The decode side, however, does not appear to release the resources held by these aborted requests. As a result, unbootstrapped tasks accumulate on the prefill side and the decode side eventually hangs.
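The client-side timeout described above can be sketched roughly as follows. This is only an illustration of the abort pattern, not sglang code: `first_token_or_abort` and `fake_stream` are hypothetical names, and the fake stream stands in for a streaming HTTP response from the server.

```python
import asyncio
from typing import AsyncGenerator, Optional


async def first_token_or_abort(
    stream: AsyncGenerator[str, None], ttft_timeout_s: float
) -> Optional[str]:
    """Return the first streamed token, or None if it does not arrive in time.

    On timeout the stream is closed, which mimics the business layer
    dropping its connection once TTFT exceeds the limit.
    """
    try:
        return await asyncio.wait_for(stream.__anext__(), timeout=ttft_timeout_s)
    except asyncio.TimeoutError:
        # Client-side abort: from here on the server is expected to
        # release whatever it allocated for this request.
        await stream.aclose()
        return None
    except StopAsyncIteration:
        # The stream ended without producing any token.
        return None


async def fake_stream(first_token_delay_s: float) -> AsyncGenerator[str, None]:
    """Hypothetical stand-in for a streaming response (assumption, not a real API)."""
    await asyncio.sleep(first_token_delay_s)
    yield "hello"
    yield "world"


if __name__ == "__main__":
    # Fast first token: returned normally.
    print(asyncio.run(first_token_or_abort(fake_stream(0.01), 1.0)))
    # Slow first token: the client gives up and closes the stream.
    print(asyncio.run(first_token_or_abort(fake_stream(1.0), 0.05)))
```

In the reported setup the limit would be 60s, and every such abort under load is a request whose resources the prefill and decode sides still need to free.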

Reproduction

See the description above.

Environment

H20
Prefill 2-node
Decode 4-node

