Showing content from https://github.com/sgl-project/sglang/releases below:

Releases · sgl-project/sglang · GitHub

v0.4.10 Highlights

This is a regular release with many new optimizations, features, and fixes. Please check out the following roadmaps and blogs

What's Changed

Release v0.4.8 Highlights

OpenAI-Compatible Server Refactor

Re-structured the OpenAI-compatible server to support production and enterprise environments. Key improvements include:

This work was a collaborative effort involving engineers from academic and industry institutions. Special thanks to the Oracle Cloud team and the SGLang team and community — including @slin1237, @CatherineSue, @key4ng, @JustinTong0323, @jhinpan, @yhyang201, @woodx9 and @whybeyoung — for their invaluable contributions.

DeepSeek R1 FP4 on Blackwell GPU

Added support for DeepSeek R1 with FP4 and MTP on NVIDIA Blackwell GPU.

Further optimization in progress. Special thanks to the FlashInfer, NVIDIA Enterprise Products, Novita AI, DataCrunch, Google Cloud, and SGLang teams — especially @Alcanderian and @pyc96 — for their critical contributions.

Breaking Change: OpenAI-Compatible API Module Moved

The sglang/srt/openai_api directory has been removed and replaced with sglang/srt/entrypoints/openai.

Update your imports to the new module path. For example:

- from sglang.srt.openai_api.protocol import Tool
+ from sglang.srt.entrypoints.openai.protocol import Tool
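
For code that has to run against SGLang versions on both sides of this move, one option is to try the new path first and fall back to the old one. The following is a minimal sketch, not part of SGLang: the helper name `import_first_available` and the constant `OPENAI_PROTOCOL_MODULES` are hypothetical, and only the two module paths and the `Tool` name are taken from the note above.

```python
import importlib

# Candidate module paths, newest first. Both paths come from the release note;
# everything else in this block is an illustrative assumption.
OPENAI_PROTOCOL_MODULES = (
    "sglang.srt.entrypoints.openai.protocol",  # new location (v0.4.8)
    "sglang.srt.openai_api.protocol",          # old location (removed in v0.4.8)
)


def import_first_available(module_names, attr):
    """Return `attr` from the first module in `module_names` that imports."""
    errors = []
    for name in module_names:
        try:
            module = importlib.import_module(name)
        except ImportError as exc:
            # Remember why this candidate failed, then try the next one.
            errors.append(f"{name}: {exc}")
            continue
        return getattr(module, attr)
    raise ImportError(
        "none of the candidate modules could be imported: " + "; ".join(errors)
    )


# Example usage (requires sglang to be installed):
# Tool = import_first_available(OPENAI_PROTOCOL_MODULES, "Tool")
```

If only v0.4.8 or later needs to be supported, the plain one-line import change shown in the diff is simpler and should be preferred.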

What's Changed

Release v0.4.7 Highlights

We extend our sincere thanks to the following contributors, listed in alphabetical order: Alibaba Cloud, AMD Team, Ant Group, Baseten Team, Cursor Team, Dynamo Team, EAGLE Team, FlashInfer Team, Google Vertex AI Team, iFlytek MaaS Team, Intel Team, LinkedIn Team, Meituan Team, Microsoft Copilot Team, Mooncake Team, NVIDIA Team, Oracle Team, Qwen Team, Voltage Park Team and open source community users. Your support and collaboration are deeply appreciated!

What's Changed

Release v0.4.6 Highlights

Thanks very much to the LinkedIn Team, Alibaba Cloud, Mooncake Team, NVIDIA Team, AMD Team, PyTorch Team, Ant Group, Baseten Team, Oracle Team, Meituan Team, iFlytek MaaS Team, and the open-source community users for their contributions!

We’re thrilled about these advancements and eager to hear your feedback! Join us on our Slack channel at slack.sglang.ai to connect and share your thoughts. Cheers!

Coming Soon

What's Changed

Release v0.4.5 Highlights

The SGLang team is excited to announce the release of v0.4.5! This version introduces several significant features, including Llama 4 support, a FlashAttention 3 backend, EAGLE3 speculative decoding, DeepEP integration, and disaggregated prefill and decoding.

New Features

Thanks very much to the NVIDIA team, LinkedIn team, EAGLE team, Oracle team, Meituan team, and our incredible open-source community for their invaluable contributions!

Coming Soon

We’re thrilled about these advancements and eager to hear your feedback! Join us on our Slack channel at slack.sglang.ai to connect and share your thoughts. Cheers!

What's Changed

Release v0.4.4 Highlights

The SGLang team is excited to announce the release of v0.4.4. We will keep improving DeepSeek V3/R1 performance. With the combination of FlashInfer, MTP, DeepGEMM, and Torch Compile optimizations on H200, SGLang can reach nearly 100 tokens/s on DeepSeek V3/R1, which is currently the fastest open-source implementation. Look out for new optimizations coming soon!

Thanks very much to the xAI Team, NVIDIA Team, AMD Team, LinkedIn Team, Baseten Team, Oracle Team, Meituan Team, and the open-source community users for their contributions!

Beyond the users mentioned in the announcement, teams such as Tencent and Ant Group also use SGLang to accelerate DeepSeek R1 inference. We are very happy to have earned recognition and adoption from these teams!

There will surely be bugs that we discover and quickly patch in the coming days, including today :) Let's build and ship. Please feel free to join our Slack channel at https://slack.sglang.ai/. Cheers!

Optimizations

Coming soon

What's Changed

v0.4.3 Highlights

The SGLang team is excited to announce the release of v0.4.3. We will keep improving DeepSeek V3/R1 performance. In the last six weeks, SGLang has been the fastest engine running DeepSeek V3/R1 among all open-source LLM inference engines. We stay ahead by integrating FlashInfer MLA and optimizing further. Look out for new optimizations coming soon! Please feel free to join our Slack channel at https://slack.sglang.ai. Cheers!

Performance Improvements

DeepSeek V3/R1 Optimizations

Architecture Enhancements

New Features

What's Changed

Release v0.4.1

Release v0.4.0

Release v0.3.6 Highlights

What's Changed
