
Red Hat AI Inference Server

Red Hat® AI Inference Server optimizes model inference across the hybrid cloud for faster, cost-effective model deployments. 

What is an inference server?

An inference server is the piece of software that allows artificial intelligence (AI) applications to communicate with large language models (LLMs) and generate a response based on data. This process is called inference. It’s where the business value happens and the end result is delivered.
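
As a concrete illustration, inference servers such as vLLM expose an OpenAI-compatible HTTP API that applications can call with a standard client. The sketch below is a minimal example assuming a server at a placeholder local address; the endpoint, key, and model name are illustrative, not values from this page.

    # Minimal sketch: an application sending an inference request to a
    # server that exposes an OpenAI-compatible API (as vLLM does).
    # The base_url, api_key, and model name are illustrative assumptions.
    from openai import OpenAI

    client = OpenAI(
        base_url="http://localhost:8000/v1",  # hypothetical server address
        api_key="EMPTY",  # local deployments often accept a placeholder key
    )

    response = client.chat.completions.create(
        model="meta-llama/Llama-3.1-8B-Instruct",  # whichever model the server loaded
        messages=[{"role": "user", "content": "What is an inference server?"}],
    )

    # The generated text is the inference result returned to the application.
    print(response.choices[0].message.content)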

To perform effectively, LLMs need extensive storage, memory, and infrastructure to run inference at scale, which is why inference can consume the majority of an AI budget. 

As part of the Red Hat AI platform, Red Hat AI Inference Server optimizes inference capabilities to drive down traditionally high costs and extensive infrastructure. 

Introduction to Red Hat AI Inference Server

How does Red Hat AI Inference Server work?

Red Hat AI Inference Server provides fast and cost-effective inference at scale. Its open source nature allows it to support any generative AI (gen AI) model, on any AI accelerator, in any cloud environment. 

Powered by vLLM, the inference server maximizes GPU utilization and enables faster response times. Combined with LLM Compressor, it increases inference efficiency without sacrificing performance. With cross-platform adaptability and a growing community of contributors, vLLM is emerging as the Linux® of gen AI inference. 
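
To make that concrete, here is a minimal sketch of batched generation with vLLM's Python API; the model name, prompts, and sampling settings are assumptions chosen for illustration.

    # Minimal sketch of batched text generation with the vLLM engine.
    # Model name and sampling values are illustrative assumptions.
    from vllm import LLM, SamplingParams

    prompts = [
        "Explain model inference in one sentence.",
        "Why does GPU utilization matter when serving LLMs?",
    ]
    params = SamplingParams(temperature=0.8, top_p=0.95, max_tokens=64)

    # vLLM batches requests continuously and pages the KV cache,
    # which is what drives the high GPU utilization described above.
    llm = LLM(model="facebook/opt-125m")

    for output in llm.generate(prompts, params):
        print(output.prompt, "->", output.outputs[0].text)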

Some customers who used LLM Compressor experienced 50% cost savings without sacrificing performance.* 
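
As a sketch of how LLM Compressor fits into that workflow: its one-shot flow applies a quantization recipe to a model and saves a compressed checkpoint that vLLM can then serve. The recipe shown (FP8 dynamic quantization), the model name, and the exact import paths are assumptions and may differ across llm-compressor versions.

    # Minimal sketch of one-shot model compression with llm-compressor.
    # Recipe, model name, and import paths are assumptions; check the
    # llm-compressor documentation for your installed version.
    from llmcompressor import oneshot
    from llmcompressor.modifiers.quantization import QuantizationModifier

    # Quantize Linear layers to FP8, keeping the output head in higher
    # precision to help preserve response accuracy.
    recipe = QuantizationModifier(
        targets="Linear",
        scheme="FP8_DYNAMIC",
        ignore=["lm_head"],
    )

    oneshot(
        model="meta-llama/Llama-3.1-8B-Instruct",
        recipe=recipe,
        output_dir="Llama-3.1-8B-Instruct-FP8-dynamic",
    )
    # The compressed checkpoint can then be loaded by vLLM for serving.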

Your models are your choice

Red Hat AI Inference Server supports all leading open source models and maintains flexible GPU portability. Use any gen AI model, or choose from our optimized collection of validated, open source, third-party models. 

Plus, as part of Red Hat AI, Red Hat AI Inference Server is certified for all Red Hat products. It can also be deployed across other Linux and Kubernetes platforms with support under Red Hat's third-party support policy.


Increased efficiency with vLLM

Optimize the deployment of any gen AI model, on any AI accelerator, with vLLM.


LLM Compressor

Compress models of any size to reduce compute utilization and related costs while maintaining high model response accuracy. 


Hybrid cloud flexibility

Maintain portability across different GPUs and run models on premises, in the cloud, or at the edge.


Red Hat AI repository

Validated and optimized third-party models are ready for inference deployment, helping you achieve faster time to value while keeping costs low.

Red Hat AI Support

As one of the largest commercial contributors to vLLM, we have a deep understanding of the technology. Our AI consultants have the vLLM expertise to help you achieve your enterprise AI goals. 

How to buy

Red Hat AI Inference Server is available as a standalone product, or as part of Red Hat AI. It is included in both Red Hat Enterprise Linux® AI and Red Hat OpenShift® AI. 

Deploy with partners

Experts and technologies are coming together so our customers can do more with AI. Explore all of the partners working with Red Hat to certify their interoperability with our solutions. 

Frequently asked questions

Do I need to purchase Red Hat AI to get Red Hat AI Inference Server?

No. You can purchase Red Hat AI Inference Server as a standalone Red Hat product. 

Do I need to purchase Red Hat AI Inference Server separately if I already have Red Hat Enterprise Linux AI or Red Hat OpenShift AI?

No. Red Hat AI Inference Server is included when you purchase Red Hat Enterprise Linux AI as well as Red Hat OpenShift AI. 

Can Red Hat AI Inference Server run on Red Hat Enterprise Linux and Red Hat OpenShift?

Yes, it can. It can also run on third-party Linux environments under our third-party support policy.

How is Red Hat AI Inference Server priced?

It is priced per accelerator.

Explore more AI resources

How to get started with AI in the enterprise

Get Red Hat Consulting for AI

Maximize AI innovation with open source models

Red Hat Consulting: AI Platform Foundation

Contact Sales: Talk to a Red Hatter about Red Hat AI

