Showing content from https://github.com/sgl-project/sglang/issues/8210 below:
Roadmap of Distributed Serving Enhancement on 2025 H2
- P/D Disaggregated Serving @ShangmingCai
- Global KVCache Pool @ykwd
- RAS (Reliability, Availability, Serviceability)
  - Implement P/D health monitoring and fast reconfiguration leveraging disaggregated architecture. @ShangmingCai @whybeyoung
  - Introduce Elastic EP and cooperate with EPLB to tolerate partial GPU failures during inference. @UNIDY2002 @HanHan009527
- Implement fine-grained profiling for P/D with EP/DP/PP. @stmatengss