After you create a custom model, you can set up inference using one of the following options:
Purchase Provisioned Throughput – Purchase Provisioned Throughput for your model to set up dedicated compute capacity with guaranteed throughput for consistent performance and lower latency. A sketch of this workflow follows this list.
For more information about Provisioned Throughput, see Increase model invocation capacity with Provisioned Throughput in Amazon Bedrock. For more information about using custom models with Provisioned Throughput, see Purchase Provisioned Throughput for a custom model.
Deploy the custom model for on-demand inference (Amazon Nova models only) – To set up on-demand inference, deploy the model with a custom model deployment. After you deploy the model, you invoke it by using the ARN of the custom model deployment. With on-demand inference, you pay only for what you use, and you don't need to set up provisioned compute resources. A sketch of this workflow also follows this list.
For more information about deploying custom models for on-demand inference, see Deploy a custom model for on-demand inference.
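The following is a minimal sketch of the Provisioned Throughput option using boto3. The Region, account ID, model ARN, and provisioned model name are placeholders; the call to converse assumes the customized model supports the Converse API.

```python
import boto3

bedrock = boto3.client("bedrock", region_name="us-east-1")

# Purchase one model unit of no-commitment (hourly) Provisioned Throughput
# for a custom model. The model ARN below is a placeholder.
response = bedrock.create_provisioned_model_throughput(
    provisionedModelName="my-custom-model-pt",
    modelId="arn:aws:bedrock:us-east-1:111122223333:custom-model/my-model",
    modelUnits=1,
)
provisioned_model_arn = response["provisionedModelArn"]

# Invoke the model through the Provisioned Throughput by passing its ARN
# as the modelId.
runtime = boto3.client("bedrock-runtime", region_name="us-east-1")
result = runtime.converse(
    modelId=provisioned_model_arn,
    messages=[{"role": "user", "content": [{"text": "Hello"}]}],
)
print(result["output"]["message"]["content"][0]["text"])
```

Omitting commitmentDuration, as shown here, purchases no-commitment throughput that you can delete at any time; one- or six-month commitments lower the hourly rate but can't be canceled early.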
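The following is a minimal sketch of the on-demand option for a custom Amazon Nova model, assuming the CreateCustomModelDeployment API; the deployment name, ARNs, and exact parameter spellings shown are illustrative assumptions, so check the API reference before relying on them.

```python
import boto3

bedrock = boto3.client("bedrock", region_name="us-east-1")

# Create a custom model deployment for the customized Nova model
# (placeholder custom model ARN).
deployment = bedrock.create_custom_model_deployment(
    modelDeploymentName="my-nova-deployment",
    modelArn="arn:aws:bedrock:us-east-1:111122223333:custom-model/my-model",
)
deployment_arn = deployment["customModelDeploymentArn"]

# Invoke the deployed model on demand by passing the deployment ARN as the
# modelId; you pay per request rather than for provisioned capacity.
runtime = boto3.client("bedrock-runtime", region_name="us-east-1")
result = runtime.converse(
    modelId=deployment_arn,
    messages=[{"role": "user", "content": [{"text": "Hello"}]}],
)
print(result["output"]["message"]["content"][0]["text"])
```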