
Deploy models for inference

With Amazon SageMaker AI, you can start getting predictions, or inferences, from your trained machine learning models. SageMaker AI provides a broad selection of ML infrastructure and model deployment options to help meet all your ML inference needs. With SageMaker AI Inference, you can scale your model deployment, manage models more effectively in production, and reduce operational burden. SageMaker AI provides you with various inference options, such as real-time endpoints for getting low latency inference, serverless endpoints for fully managed infrastructure and auto-scaling, and asynchronous endpoints for batches of requests. By leveraging the appropriate inference option for your use case, you can ensure efficient model deployment and inference.
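For a real-time endpoint, an application typically sends requests through the SageMaker runtime API. The sketch below builds the arguments for a `boto3` `invoke_endpoint` call locally; the endpoint name and payload shape are hypothetical placeholders, and the actual network call is shown commented out because it requires AWS credentials and a deployed endpoint:

```python
import json

def build_invoke_args(endpoint_name, payload):
    """Assemble keyword arguments for the SageMaker runtime invoke_endpoint call."""
    return {
        "EndpointName": endpoint_name,        # hypothetical endpoint name
        "ContentType": "application/json",    # format of the request body
        "Body": json.dumps(payload),          # serialized model input
    }

args = build_invoke_args("my-realtime-endpoint", {"inputs": [[0.5, 1.2, 3.4]]})

# With AWS credentials configured and the endpoint deployed, the request
# would be sent like this:
# import boto3
# runtime = boto3.client("sagemaker-runtime")
# response = runtime.invoke_endpoint(**args)
# result = json.loads(response["Body"].read())

print(args["EndpointName"])
```

The same argument dictionary works for asynchronous endpoints via `invoke_endpoint_async`, which takes an S3 input location instead of an inline body.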

Choosing a feature

There are several use cases for deploying ML models with SageMaker AI. This section describes those use cases, as well as the SageMaker AI feature we recommend for each use case.

Use cases

The following are the main use cases for deploying ML models with SageMaker AI.

Recommended features

The following table describes key considerations and tradeoffs for the SageMaker AI features that correspond to each use case.

Use case 1

SageMaker AI feature: Use JumpStart in Studio to accelerate your foundation model deployment.
Description: Use the Studio UI to deploy pre-trained models from a catalog to pre-configured inference endpoints. This option is ideal for citizen data scientists, or for anyone who wants to deploy a model without configuring complex settings.
Optimized for: Fast and streamlined deployments of popular open source models.
Considerations: Lacks customization for container settings and specific application needs.
Recommended environment: A SageMaker AI domain.

Use case 2

SageMaker AI feature: Deploy models using ModelBuilder from the SageMaker Python SDK.
Description: Use the ModelBuilder class from the Amazon SageMaker AI Python SDK to deploy your own model and configure deployment settings. This option is ideal for experienced data scientists, or for anyone who has their own model to deploy and requires fine-grained control.
Optimized for: Deploying your own models.
Considerations: No UI; requires that you're comfortable developing and maintaining Python code.
Recommended environment: A Python development environment configured with your AWS credentials and the SageMaker Python SDK installed, or a SageMaker AI IDE such as SageMaker JupyterLab.

Use case 3

SageMaker AI feature: Deploy and manage models at scale with AWS CloudFormation.
Description: Use AWS CloudFormation and Infrastructure as Code (IaC) for programmatic control and automation when deploying and managing SageMaker AI models. This option is ideal for advanced users who require consistent and repeatable deployments.
Optimized for: Ongoing management of models in production.
Considerations: Requires infrastructure management and organizational resources, as well as familiarity with the AWS SDK for Python (Boto3) or with AWS CloudFormation templates.
Recommended environment: The AWS CLI, a local development environment, and Infrastructure as Code (IaC) and CI/CD tools.

Additional options

SageMaker AI provides different options for your inference use cases, giving you choice over the technical breadth and depth of your deployments.
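As one illustration of the CloudFormation route described above, a minimal template for a real-time endpoint might look like the following sketch. The execution role ARN, container image URI, and S3 model artifact path are placeholders you would replace with your own values:

```yaml
AWSTemplateFormatVersion: "2010-09-09"
Description: Minimal sketch of a SageMaker real-time endpoint (placeholder values).
Resources:
  DemoModel:
    Type: AWS::SageMaker::Model
    Properties:
      ExecutionRoleArn: arn:aws:iam::111122223333:role/SageMakerExecutionRole  # placeholder
      PrimaryContainer:
        Image: 111122223333.dkr.ecr.us-east-1.amazonaws.com/demo-image:latest  # placeholder
        ModelDataUrl: s3://amzn-s3-demo-bucket/model.tar.gz                    # placeholder
  DemoEndpointConfig:
    Type: AWS::SageMaker::EndpointConfig
    Properties:
      ProductionVariants:
        - VariantName: AllTraffic
          ModelName: !GetAtt DemoModel.ModelName
          InstanceType: ml.m5.large
          InitialInstanceCount: 1
          InitialVariantWeight: 1.0
  DemoEndpoint:
    Type: AWS::SageMaker::Endpoint
    Properties:
      EndpointConfigName: !GetAtt DemoEndpointConfig.EndpointConfigName
```

A template like this can be deployed with `aws cloudformation deploy --template-file template.yaml --stack-name demo-endpoint`, which makes the endpoint's configuration versionable and repeatable across environments.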

