AWS SageMaker Inference Endpoint

AWS SageMaker Inference Endpoint for hosting an AI model

Deployments

42

Made by

Massdriver

Official

Yes

No

Compliance
Tags

aws-sagemaker-inference-endpoint

Amazon SageMaker Inference Endpoint is a fully managed AWS service that lets developers deploy machine learning models to serve predictions in real time. This bundle takes model data from S3 and an ECR image, such as one of Amazon’s Deep Learning Containers (DLC), creates an endpoint configuration, and then creates an endpoint.

Use Cases

Real-time Predictions

SageMaker Inference Endpoints make it easy to generate real-time predictions from your machine learning models.

Scalable Model Deployment

You can deploy your models with auto scaling capabilities to handle varying loads of inference requests.
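As a rough illustration of how auto scaling is commonly attached to an endpoint variant, the sketch below builds the request arguments for the Application Auto Scaling `register_scalable_target` and `put_scaling_policy` calls. The endpoint name, variant name, capacity limits, policy name, and target value are hypothetical, and this is not necessarily how the bundle itself configures scaling.

```python
def scalable_target_args(endpoint_name, variant_name, min_capacity, max_capacity):
    """Build kwargs for Application Auto Scaling register_scalable_target."""
    if min_capacity > max_capacity:
        raise ValueError("min_capacity must not exceed max_capacity")
    return {
        "ServiceNamespace": "sagemaker",
        # Endpoint variants are addressed as endpoint/<name>/variant/<variant>.
        "ResourceId": f"endpoint/{endpoint_name}/variant/{variant_name}",
        "ScalableDimension": "sagemaker:variant:DesiredInstanceCount",
        "MinCapacity": min_capacity,
        "MaxCapacity": max_capacity,
    }


def target_tracking_policy_args(endpoint_name, variant_name, invocations_per_instance):
    """Build kwargs for put_scaling_policy using a target-tracking policy."""
    return {
        # Hypothetical policy name, chosen only for this example.
        "PolicyName": f"{endpoint_name}-invocations-target",
        "ServiceNamespace": "sagemaker",
        "ResourceId": f"endpoint/{endpoint_name}/variant/{variant_name}",
        "ScalableDimension": "sagemaker:variant:DesiredInstanceCount",
        "PolicyType": "TargetTrackingScaling",
        "TargetTrackingScalingPolicyConfiguration": {
            "TargetValue": float(invocations_per_instance),
            "PredefinedMetricSpecification": {
                "PredefinedMetricType": "SageMakerVariantInvocationsPerInstance",
            },
        },
    }


# Hypothetical endpoint and variant names.
target = scalable_target_args("my-endpoint", "AllTraffic", 1, 4)
policy = target_tracking_policy_args("my-endpoint", "AllTraffic", 100)
```

With boto3 installed and AWS credentials configured, these dictionaries could be passed as `boto3.client("application-autoscaling").register_scalable_target(**target)` and `.put_scaling_policy(**policy)`.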

Integrated with your Applications

SageMaker Inference Endpoints are exposed through a secure REST API, which can be easily integrated into your applications.
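A minimal sketch of calling an endpoint from application code follows. It assumes the model container accepts and returns JSON, which depends on the image you deploy, and it takes the client as a parameter so that a `boto3.client("sagemaker-runtime")` instance (or a stub) can be passed in. The function name `predict` and the payload shape are hypothetical.

```python
import json


def predict(runtime_client, endpoint_name, payload):
    """Send a JSON payload to a SageMaker endpoint and decode the JSON reply.

    runtime_client is expected to behave like boto3.client("sagemaker-runtime").
    """
    response = runtime_client.invoke_endpoint(
        EndpointName=endpoint_name,
        ContentType="application/json",  # assumption: the container parses JSON
        Accept="application/json",
        Body=json.dumps(payload).encode("utf-8"),
    )
    # The response Body is a stream; read() returns the raw bytes.
    return json.loads(response["Body"].read())
```

In an application this might be called as `predict(boto3.client("sagemaker-runtime"), "my-endpoint", {"inputs": [1, 2, 3]})`, where the endpoint name is a placeholder.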

Design

This bundle accepts a SageMaker Model name as input and creates an Inference Endpoint. The model must be in the SageMaker Model Registry before it can be used to create an endpoint. The endpoint can be deployed to a variety of instance types, including CPU and GPU instances, and you can set its initial instance count. The endpoint must be created in the same region as the model.

SageMaker Model

A SageMaker Model is a representation of a machine learning model. It includes the S3 path where the model artifacts are stored and the Docker image used to serve inference requests.

Endpoint Configuration

An Endpoint Configuration specifies the ML compute instances that will be used for the inference endpoint.

Inference Endpoint

The Inference Endpoint is a hosted model that can be accessed through a REST API to get real-time predictions.
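The three resources above depend on each other in order: the endpoint configuration references a model by name, and the endpoint references a configuration by name. The sketch below shows that chain using the argument shapes of the SageMaker `create_model`, `create_endpoint_config`, and `create_endpoint` API calls. It is an illustration under assumptions, not the bundle's implementation: the helper names, the shared resource name, and the variant name `AllTraffic` are hypothetical.

```python
def model_args(name, ecr_image, execution_role_arn, model_data_url=None):
    """Build kwargs for create_model; model data in S3 is optional."""
    container = {"Image": ecr_image}
    if model_data_url is not None:
        container["ModelDataUrl"] = model_data_url
    return {
        "ModelName": name,
        "PrimaryContainer": container,
        "ExecutionRoleArn": execution_role_arn,
    }


def endpoint_config_args(name, model_name, instance_type, instance_count):
    """Build kwargs for create_endpoint_config with a single variant."""
    return {
        "EndpointConfigName": name,
        "ProductionVariants": [{
            "VariantName": "AllTraffic",  # hypothetical variant name
            "ModelName": model_name,
            "InstanceType": instance_type,
            "InitialInstanceCount": instance_count,
        }],
    }


def endpoint_args(name, config_name):
    """Build kwargs for create_endpoint."""
    return {"EndpointName": name, "EndpointConfigName": config_name}


def deploy(sagemaker_client, name, ecr_image, execution_role_arn,
           instance_type, instance_count, model_data_url=None):
    """Create model, endpoint configuration, and endpoint, in that order.

    sagemaker_client is expected to behave like boto3.client("sagemaker").
    The same name is reused for all three resources for simplicity.
    """
    sagemaker_client.create_model(
        **model_args(name, ecr_image, execution_role_arn, model_data_url))
    sagemaker_client.create_endpoint_config(
        **endpoint_config_args(name, name, instance_type, instance_count))
    return sagemaker_client.create_endpoint(**endpoint_args(name, name))
```

Keeping the argument builders separate from the calls makes the request shapes easy to inspect or test without AWS credentials.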

| Variable | Type | Description |
| --- | --- | --- |
| endpoint_config.instance_count | integer | Initial number of instances used for auto-scaling. |
| endpoint_config.instance_type | string | Instance type to use for the SageMaker endpoint. |
| endpoint_config.primary_container.ecr_image | string | The ECR Image URI (e.g. 763104351884.dkr.ecr.us-east-1.amazonaws.com/pytorch-inference:2.1.0-gpu-py310-cu118-ubuntu20.04-ec2). |
| endpoint_config.primary_container.model_data_config.enabled | boolean | Enabling this option will allow you to include model data for the SageMaker model. |
| environment_variables[].name | string | No description |
| environment_variables[].value | string | No description |
| monitoring.endpoint_log_retention | integer | No description |
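To make the dotted variable names concrete, the sketch below shows one way such parameters might nest, together with a small helper that flattens the structure back into dotted paths. The nesting is inferred from the variable names only. Apart from the ECR image URI quoted above, every value is hypothetical, including the instance type, the environment variable, and the log retention value, whose unit is not documented here.

```python
def dotted_paths(value, prefix=""):
    """Return the set of dotted paths to leaf values, using [] for list items."""
    if isinstance(value, dict):
        paths = set()
        for key, child in value.items():
            child_prefix = f"{prefix}.{key}" if prefix else key
            paths |= dotted_paths(child, child_prefix)
        return paths
    if isinstance(value, list):
        paths = set()
        for item in value:
            paths |= dotted_paths(item, prefix + "[]")
        return paths
    return {prefix}


# Hypothetical example values; only the ECR image URI comes from the text.
params = {
    "endpoint_config": {
        "instance_count": 1,
        "instance_type": "ml.g5.xlarge",
        "primary_container": {
            "ecr_image": (
                "763104351884.dkr.ecr.us-east-1.amazonaws.com/"
                "pytorch-inference:2.1.0-gpu-py310-cu118-ubuntu20.04-ec2"
            ),
            "model_data_config": {"enabled": True},
        },
    },
    "environment_variables": [{"name": "LOG_LEVEL", "value": "info"}],
    "monitoring": {"endpoint_log_retention": 7},
}

for path in sorted(dotted_paths(params)):
    print(path)
```

The printed paths should correspond one-to-one with the variable names listed above.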