Create Azure OpenAI Instance

You can create an Azure OpenAI instance either with a script (ARM or Bicep templates) or manually.

note

The required Azure OpenAI service is available in the following regions:

  • Canada East
  • East US
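
Model availability in a candidate region can also be checked from the Azure CLI before creating anything. A sketch, assuming a recent Azure CLI with the az cognitiveservices model list command and an authenticated session (az login):

```shell
# List the OpenAI models offered in a candidate region (here: canadaeast).
# Requires an authenticated Azure CLI session (az login).
az cognitiveservices model list \
  --location canadaeast \
  --query "[?model.format=='OpenAI'].model.name" \
  --output tsv | sort -u
```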

Create using Script

  1. Download the script (ARM or Bicep) from this repo.

  2. Authenticate to the Azure CLI using az login

  3. [Optional] Skip this step if a Kubernetes cluster is already deployed in Azure. If the OpenAI instance needs to be created in a new resource group, create the resource group using the command below, replacing RESOURCE_GROUP_NAME and REGION with the correct values.

    az group create \
    --name "RESOURCE_GROUP_NAME" \
    --location "REGION"
  4. Create the OpenAI instance using the command below, replacing RESOURCE_GROUP_NAME and NAME_OF_OPEN_AI_INSTANCE with the correct values. Use the same managed resource group name that was used while deploying via the marketplace.

    az deployment group create \
    --name "Penfield-App-OpenAI" \
    --resource-group "RESOURCE_GROUP_NAME" \
    --template-file mainTemplate.json \
    --parameters accountName=NAME_OF_OPEN_AI_INSTANCE
  5. Retrieve the output and share it with Penfield securely, replacing RESOURCE_GROUP_NAME with the correct value.

    az deployment group show \
    --name "Penfield-App-OpenAI" \
    --resource-group "RESOURCE_GROUP_NAME" \
    --query "properties.outputs"
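
The same deployment command also accepts a Bicep template directly, since the Azure CLI compiles Bicep transparently, and the outputs from step 5 can be written straight to a file instead of being copied from the terminal. A sketch, assuming the placeholder resource group and template file names below are replaced with your values:

```shell
RESOURCE_GROUP_NAME="penfield-rg"   # placeholder, use your resource group

# Bicep variant of step 4: pass the .bicep template instead of mainTemplate.json
# (the template file name here is a placeholder for the file from the repo)
az deployment group create \
  --name "Penfield-App-OpenAI" \
  --resource-group "$RESOURCE_GROUP_NAME" \
  --template-file mainTemplate.bicep \
  --parameters accountName=NAME_OF_OPEN_AI_INSTANCE

# Step 5 outputs, saved to a local file for secure sharing
az deployment group show \
  --name "Penfield-App-OpenAI" \
  --resource-group "$RESOURCE_GROUP_NAME" \
  --query "properties.outputs" \
  --output json > openai-outputs.json
```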

Create Manually

  1. Go to Azure AI services >> Azure OpenAI
  2. Click Create and configure the details, including the region of the instance (where your ML model deployment is hosted).
  3. After the instance is created, open it by clicking it in the list, e.g. in the screenshot, click “ca-inst”.
  4. Click “Manage Deployment” in the left sidebar.
  5. Click “Create new deployment” to select a model. You need to create two deployments, one for each of the models listed below. Use the settings as per the screenshot. You can use any name for the deployments, but note the names down, as they are needed later when configuring the app.
    1. text-embedding-ada-002: use model name text-embedding-ada-002, model version 2, and deployment type Standard. The rate limit can be left at the default value but should be at least 350K TPM.
    2. gpt-4o: use model name gpt-4o, model version 2024-05-13, and deployment type GlobalStandard. The rate limit can be left at the default value but should be at least 200K TPM.
    Note: The availability of models varies by region and subscription.
  6. Once the model deployments are created (like the 2 deployments in the screenshot above), you will be able to access the models from an endpoint. Go back to the instance created in steps 2 and 3 and click “Keys and Endpoint” in the side panel to get API access.
  7. Keep the following information handy for the deployment in the next steps:
    1. AZURE_OPENAI_ENDPOINT: the endpoint URL from step 6.
    2. AZURE_OPENAI_DEPLOYMENT: the deployment name of the chat model (gpt-4o) from step 5.
    3. AZURE_OPENAI_MODEL: the deployed model name, e.g. gpt-4o, from step 5.
    4. AZURE_OPENAI_API_KEY: a key from step 6; only one of the two keys is needed.

All the required infrastructure components are now deployed. Please proceed to deploy the application.