Showing content from https://github.com/sgl-project/sglang/issues/8199 below:
[Roadmap] Supporting multi frameworks on 2025 H2 · Issue #8199 · sgl-project/sglang · GitHub
Background
Last week we issued an RFC for supporting frameworks other than PyTorch, with priority on supporting MindSpore on Ascend NPU: #7941.
After some discussion, we are publishing a tentative roadmap for 2025 H2.
Overall Design
- [July] Proof of concept of PyTorch/MindSpore coexistence
- [July] Design Documentation
Inter-Framework Compatibility
- [August] Tensor memory sharing through DLPack
- [August] Compatibility of PyTorch/MindSpore distributed environment
- [August] Resource sharing of PyTorch/MindSpore (stream reuse, memory pool, etc.)
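As a rough illustration of the DLPack item above, here is a minimal sketch of zero-copy tensor sharing through the DLPack protocol. NumPy stands in for both the producer and the consumer framework; this is an assumption made so the example runs anywhere, and the roadmap does not specify the actual PyTorch/MindSpore exchange path on Ascend NPU. The helper name `share_via_dlpack` is hypothetical.

```python
import numpy as np


def share_via_dlpack(src):
    """Return a consumer-side array that aliases the memory of `src`.

    `src` can be any object implementing the DLPack protocol
    (`__dlpack__` / `__dlpack_device__`). NumPy is used here only as
    a stand-in consumer; it is not what the roadmap targets.
    """
    # np.from_dlpack (NumPy >= 1.22) imports the buffer without copying.
    return np.from_dlpack(src)


if __name__ == "__main__":
    producer = np.arange(6, dtype=np.float32)
    consumer = share_via_dlpack(producer)

    # Both arrays are backed by the same buffer.
    print(np.shares_memory(producer, consumer))

    # A write on the producer side is visible through the consumer view.
    producer[0] = 42.0
    print(consumer[0])
```

The point of the sketch is that no data is copied at the framework boundary: the consumer only receives a descriptor of the existing buffer, so ownership and lifetime have to be coordinated between the two sides.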
MindSpore Model Support
- [August] Radix Attention support on NPU
- [August] Qwen3 dense models
- [September] Qwen3 MoE model
- [September] DeepSeek V3/R1 model family
SGLang Features
- [September] Combinations of data/tensor/pipeline/expert parallelism
- [September] Speculative Decoding
- [September] PD disaggregation
- [September] Quantization
- [September] LoRA
CI on Ascend NPU
User / Developer Experience
- [August] Benchmark results and profiling tools
- [August] Docker image
- [September] Documentation (installation, quickstart, tutorials, etc.)
Long-Term Plans
- [Q4] Further optimizations and more MindSpore models
Comments and suggestions are welcome!