feat(sagemaker): add Endpoint L2 construct

This is the third and final PR to complete the implementation of RFC 431: aws/aws-cdk-rfcs#431

closes aws#2809

Co-authored-by: Matt McClean <mmcclean@amazon.com>
Co-authored-by: Long Yao <yl1984108@gmail.com>
Co-authored-by: Drew Jetter <60628154+jetterdj@users.noreply.github.com>
Co-authored-by: Murali Ganesh <59461079+foxpro24@users.noreply.github.com>
Co-authored-by: Abilash Rangoju <988529+rangoju@users.noreply.github.com>
petermeansrock · Nov 11, 2022 · a139468
1 parent 0e97c15
commit a139468
Showing 21 changed files with 4,513 additions and 0 deletions.
diff --git a/packages/@aws-cdk/aws-sagemaker/README.md b/packages/@aws-cdk/aws-sagemaker/README.md
@@ -195,3 +195,60 @@ const endpointConfig = new sagemaker.EndpointConfig(this, 'EndpointConfig', {
   ]
 });
 ```
+
+### Endpoint
+
+When you create an endpoint from an `EndpointConfig`, Amazon SageMaker launches the ML compute
+instances and deploys the model or models specified in the configuration. To get inferences from
+the model, client applications send requests to the Amazon SageMaker Runtime HTTPS endpoint. For
+more information, see the
+[InvokeEndpoint](https://docs.aws.amazon.com/sagemaker/latest/dg/API_runtime_InvokeEndpoint.html)
+API reference. Defining an endpoint requires, at minimum, the associated endpoint configuration:
+
+```typescript
+import * as sagemaker from '@aws-cdk/aws-sagemaker';
+
+declare const endpointConfig: sagemaker.EndpointConfig;
+
+const endpoint = new sagemaker.Endpoint(this, 'Endpoint', { endpointConfig });
+```
+
+### AutoScaling
+
+To enable autoscaling on the production variant, use the `autoScaleInstanceCount` method:
+
+```typescript
+import * as sagemaker from '@aws-cdk/aws-sagemaker';
+
+declare const endpointConfig: sagemaker.EndpointConfig;
+
+const endpoint = new sagemaker.Endpoint(this, 'Endpoint', { endpointConfig });
+const productionVariant = endpoint.findInstanceProductionVariant('variantName');
+const instanceCount = productionVariant.autoScaleInstanceCount({
+  maxCapacity: 3
+});
+instanceCount.scaleOnInvocations('LimitRPS', {
+  maxRequestsPerSecond: 30,
+});
+```
+
+For guidance on load testing to determine the maximum requests per second per instance, see the
+[load testing documentation](https://docs.aws.amazon.com/sagemaker/latest/dg/endpoint-scaling-loadtest.html).
+
+### Metrics
+
+To monitor CloudWatch metrics for a production variant, use one or more of the metric convenience
+methods:
+
+```typescript
+import * as sagemaker from '@aws-cdk/aws-sagemaker';
+
+declare const endpointConfig: sagemaker.EndpointConfig;
+
+const endpoint = new sagemaker.Endpoint(this, 'Endpoint', { endpointConfig });
+const productionVariant = endpoint.findInstanceProductionVariant('variantName');
+productionVariant.metricModelLatency().createAlarm(this, 'ModelLatencyAlarm', {
+  threshold: 100000,
+  evaluationPeriods: 3,
+});
+```
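
The AutoScaling example in this diff passes `maxRequestsPerSecond: 30` and points to the load testing guide for choosing that number. As a rough sketch of the arithmetic that guide describes (peak requests per second, scaled by a safety factor, converted to a per-minute figure for the `SageMakerVariantInvocationsPerInstance` metric), the standalone helper below computes such a target. The function name `invocationsPerInstanceTarget` and the default safety factor of 0.5 are assumptions made for illustration; this diff does not show how `scaleOnInvocations` derives its target internally.

```typescript
// Hypothetical helper for illustration only; it is not part of @aws-cdk/aws-sagemaker.
// Converts a load-tested peak requests-per-second figure into a per-minute,
// per-instance invocation target, leaving headroom via a safety factor.
function invocationsPerInstanceTarget(
  maxRequestsPerSecond: number,
  safetyFactor: number = 0.5, // assumed default; tune to your own risk tolerance
): number {
  if (!(maxRequestsPerSecond > 0)) {
    throw new RangeError('maxRequestsPerSecond must be greater than 0');
  }
  if (!(safetyFactor > 0 && safetyFactor <= 1)) {
    throw new RangeError('safetyFactor must be greater than 0 and at most 1');
  }
  // The invocation metric is reported per minute, hence the factor of 60.
  return maxRequestsPerSecond * safetyFactor * 60;
}

// With the value used in the README example and the assumed default safety factor:
console.log(invocationsPerInstanceTarget(30)); // 900
```

A lower safety factor makes scaling kick in earlier relative to the load-tested peak, at the cost of running more instances.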