AI scale with APIM #40250

Draft
wants to merge 12 commits into
base: main
@@ -0,0 +1,52 @@
---
title: Scale Azure OpenAI for .NET with Azure API Management
description: Learn how to add load balancing with Azure API Management to your .NET chat app so that it can scale beyond the Azure OpenAI token and model quota limits.
ms.date: 03/28/2024
ms.topic: get-started
ms.custom: devx-track-dotnet, devx-track-dotnet-ai
# CustomerIntent: As a .NET developer new to Azure OpenAI, I want to scale my Azure OpenAI capacity to avoid rate limit errors with Azure API Management.
---

# Scale Azure OpenAI for .NET chat using RAG with Azure API Management

[!INCLUDE [aca-load-balancer-intro](~/azure-dev-docs-pr/articles/intro/includes/scaling-load-balancer-introduction-azure-api-management.md)]
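
As a rough illustration of the idea behind load balancing across Azure OpenAI endpoints, the following Python sketch picks the highest-priority backend that isn't currently rate limited and falls back to the next one when a request returns HTTP 429. It's a minimal simulation under stated assumptions, not the policy used by the sample: the `Backend` class, the `pick_backend` and `send` functions, the backend names, and the retry window are all hypothetical, and the real HTTP call is replaced by a stand-in function.

```python
from dataclasses import dataclass


@dataclass
class Backend:
    """A hypothetical Azure OpenAI endpoint behind the load balancer."""
    name: str
    priority: int                 # lower number = preferred
    throttled_until: float = 0.0  # time (seconds) before which the backend is skipped


def pick_backend(backends, now):
    """Return the highest-priority backend that isn't currently throttled."""
    available = [b for b in backends if b.throttled_until <= now]
    if not available:
        return None
    return min(available, key=lambda b: b.priority)


def send(backends, call, now):
    """Try backends in priority order until one accepts the request.

    `call(backend)` is a stand-in for the real HTTP request and returns
    a (status_code, retry_after_seconds) tuple.
    """
    while (backend := pick_backend(backends, now)) is not None:
        status, retry_after = call(backend)
        if status == 429:
            # Rate limited: skip this backend until its retry window ends.
            backend.throttled_until = now + retry_after
            continue
        return backend.name, status
    return None, 429  # every backend is throttled


def fake_call(backend):
    # Assumption for the demo: the preferred backend is out of quota.
    return (429, 10) if backend.name == "openai-east" else (200, 0)


backends = [Backend("openai-east", 1), Backend("openai-west", 2)]
result_first = send(backends, fake_call, now=0.0)
result_later = send(backends, fake_call, now=5.0)
print(result_first, result_later)
```

In this sketch, the second request at `now=5.0` goes straight to the lower-priority backend because the preferred one is still inside its retry window. That's the behavior that lets a chat app keep serving requests after a single deployment has exhausted its quota.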

## Prerequisites

* Azure subscription. [Create one for free](https://azure.microsoft.com/free/ai-services?azure-portal=true)

Check failure on line 16 in docs/ai/get-started-app-chat-scaling-with-azure-api-management.md (GitHub Actions / lint): Trailing spaces [Expected: 0 or 2; Actual: 1]

* Access granted to Azure OpenAI in the desired Azure subscription.

  Currently, access to this service is granted only by application. You can apply for access to Azure OpenAI by completing the form at [https://aka.ms/oai/access](https://aka.ms/oai/access).

Check failure on line 19 in docs/ai/get-started-app-chat-scaling-with-azure-api-management.md (GitHub Actions / lint): Bare URL used [Context: "https://aka.ms/oai/access"]

* [Dev containers](https://containers.dev/) are available for both samples, with all dependencies required to complete this article. You can run the dev containers in GitHub Codespaces (in a browser) or locally using Visual Studio Code.

#### [Codespaces (recommended)](#tab/github-codespaces)

Check failure on line 23 in docs/ai/get-started-app-chat-scaling-with-azure-api-management.md (GitHub Actions / lint): Headings must start at the beginning of the line [Context: " #### [Codespaces (recommen..."]

Check failure on line 24 in docs/ai/get-started-app-chat-scaling-with-azure-api-management.md (GitHub Actions / lint): Trailing spaces [Expected: 0 or 2; Actual: 4]

* Only a [GitHub account](https://www.github.com/login) is required to use Codespaces.

Check failure on line 25 in docs/ai/get-started-app-chat-scaling-with-azure-api-management.md (GitHub Actions / lint): Unordered list indentation [Expected: 2; Actual: 4]

Check failure on line 26 in docs/ai/get-started-app-chat-scaling-with-azure-api-management.md (GitHub Actions / lint): Trailing spaces [Expected: 0 or 2; Actual: 4]

#### [Visual Studio Code](#tab/visual-studio-code)

Check failure on line 27 in docs/ai/get-started-app-chat-scaling-with-azure-api-management.md (GitHub Actions / lint): Headings should be surrounded by blank lines [Expected: 1; Actual: 0; Below] [Context: "#### [Visual Studio Code](#tab/visual-studio-code)"]

Check failure on line 27 in docs/ai/get-started-app-chat-scaling-with-azure-api-management.md (GitHub Actions / lint): Headings must start at the beginning of the line [Context: " #### [Visual Studio Code](..."]

* [Azure Developer CLI](../azure-developer-cli/install-azd.md?tabs=winget-windows%2Cbrew-mac%2Cscript-linux&pivots=os-windows)

Check failure on line 28 in docs/ai/get-started-app-chat-scaling-with-azure-api-management.md (GitHub Actions / lint): Unordered list indentation [Expected: 2; Actual: 4]

* [Docker Desktop](https://www.docker.com/products/docker-desktop/) - start Docker Desktop if it's not already running

Check failure on line 29 in docs/ai/get-started-app-chat-scaling-with-azure-api-management.md (GitHub Actions / lint): Unordered list indentation [Expected: 2; Actual: 4]

* [Visual Studio Code](https://code.visualstudio.com/) with the [Dev Containers extension](https://marketplace.visualstudio.com/items?itemName=ms-vscode-remote.remote-containers)

---

[!INCLUDE [scaling-load-balancer-aca-procedure.md](../intro/includes/scaling-load-balancer-procedure-azure-api-management.md)]

[!INCLUDE [deployment-procedure](../intro/includes/redeploy-procedure-chat-azure-api-management.md)]

[!INCLUDE [capacity.md](../intro/includes/scaling-load-balancer-capacity.md)]

[!INCLUDE [py-apim-cleanup](../intro/includes/scaling-load-balancer-cleanup-azure-api-management.md)]

## Sample code

Samples used in this article include:

* [.NET chat app with RAG](https://github.com/Azure-Samples/azure-search-openai-demo-csharp)
* [Load Balancer with Azure API Management](https://github.com/Azure-Samples/openai-apim-lb)

## Next step

* [View Azure API Management diagnostic data in Azure Monitor](/azure/api-management/api-management-howto-use-azure-monitor#view-diagnostic-data-in-azure-monitor)
* Use [Azure Load Testing](/azure/load-testing/) to load test your chat app with