Challenge 06: Container Deployment for AI

Estimated Time

45-60 min | Cost: ~$0.50 (container still bills to Azure) | Domain: Plan & Manage AI Solutions (20-25%)

Exam skills covered

  • Plan and implement container deployment for Azure AI services
  • Configure disconnected containers for edge scenarios
  • Manage container licensing and billing requirements

Overview

Azure AI services can be deployed as Docker containers, enabling you to run AI capabilities on-premises, at the edge, or in disconnected environments. While the containers run locally, they still require a connection to Azure for billing purposes unless configured for disconnected mode.

Container deployment is critical for scenarios requiring data residency, low latency, or intermittent connectivity. You'll work with language detection and sentiment analysis containers, learning how to pull images from Microsoft Container Registry, configure billing endpoints, and validate container health.

Understanding the billing model is essential: in connected mode, containers must send metering data to Azure even though they process data locally. The only exception is a disconnected container license, which you obtain through Microsoft's gating (approval) process.
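
The connected-mode requirement can be made concrete with a small sketch. The helper below (`build_connected_run_args` is a hypothetical name, not part of any Azure SDK) only assembles a `docker run` argument list, and it refuses to do so unless the billing endpoint and API key are supplied alongside `Eula=accept`. The image name, endpoint, and key in the usage lines are placeholders.

```python
from typing import List


def build_connected_run_args(image: str, billing_endpoint: str, api_key: str,
                             host_port: int = 5000) -> List[str]:
    """Return docker-run arguments for a connected (metered) container.

    Hypothetical helper for illustration: it builds the argument list only
    and does not start anything.
    """
    if not billing_endpoint.startswith("https://"):
        raise ValueError("Billing must be the HTTPS endpoint of the Azure resource")
    if not api_key:
        raise ValueError("ApiKey is required so usage can be metered")
    return [
        "docker", "run", "-d",
        "-p", f"{host_port}:5000",
        "-e", "Eula=accept",
        "-e", f"Billing={billing_endpoint}",
        "-e", f"ApiKey={api_key}",
        image,
    ]


# Placeholder values, not real resources.
args = build_connected_run_args(
    "example/image:tag",
    "https://example.cognitiveservices.azure.com/",
    "<key>",
)
print(" ".join(args))
```

Failing early in Python is cheaper than pulling an image and discovering at startup that a billing setting is missing.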

Architecture

The container deployment connects local Docker containers to Azure AI services for billing while processing data locally.

[Figure: Challenge 06 topology]

Prerequisites

  • Azure subscription with an Azure AI services multi-service resource
  • Docker Desktop installed and running
  • Azure CLI installed
  • At least 8 GB RAM available for containers
  • Endpoint and key from your Azure AI services resource

Implementation

Task 1: Create Azure AI Services Resource for Billing

from azure.identity import DefaultAzureCredential
from azure.mgmt.cognitiveservices import CognitiveServicesManagementClient
from azure.mgmt.cognitiveservices.models import Account, Sku

credential = DefaultAzureCredential()
client = CognitiveServicesManagementClient(credential, subscription_id="<your-subscription-id>")

# Create a multi-service resource for container billing
account = client.accounts.begin_create(
    resource_group_name="rg-ai102-challenge06",
    account_name="ai-containers-billing",
    account=Account(
        sku=Sku(name="S0"),
        kind="AIServices",
        location="eastus",
        properties={}
    )
).result()

print(f"Resource created: {account.name}")
print(f"Endpoint: {account.properties.endpoint}")

# Retrieve keys for container billing configuration
keys = client.accounts.list_keys(
    resource_group_name="rg-ai102-challenge06",
    account_name="ai-containers-billing"
)
print(f"Key1: {keys.key1}")

Task 2: Pull and Run Language Detection Container

import subprocess
import requests
import time
import os

ENDPOINT = os.environ["AZURE_AI_ENDPOINT"]
KEY = os.environ["AZURE_AI_KEY"]

# Pull the language detection container
subprocess.run([
    "docker", "pull",
    "mcr.microsoft.com/azure-cognitive-services/textanalytics/language:latest"
], check=True)

# Run the container with billing configuration
result = subprocess.run([
    "docker", "run", "-d",
    "--name", "language-detection",
    "-p", "5000:5000",
    "-e", "Eula=accept",
    "-e", f"Billing={ENDPOINT}",
    "-e", f"ApiKey={KEY}",
    "--memory", "4g",
    "--cpus", "2",
    "mcr.microsoft.com/azure-cognitive-services/textanalytics/language:latest"
], capture_output=True, text=True, check=True)

# `docker run -d` prints the new container's ID on stdout
print(f"Container started: {result.stdout.strip()}")

# Wait for container to be ready
time.sleep(30)

# Health check
response = requests.get("http://localhost:5000/status", timeout=10)
print(f"Container health: {response.json()}")
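
A fixed 30-second sleep can be too short on a slow machine and wasteful on a fast one. As a minimal sketch, assuming that polling is acceptable for this lab, the hypothetical helper `wait_until_ready` below retries a caller-supplied probe until it succeeds or a timeout elapses. The demonstration uses a fake probe so that it runs without Docker.

```python
import time
from typing import Callable


def wait_until_ready(probe: Callable[[], bool], timeout: float = 120.0,
                     interval: float = 5.0,
                     sleep: Callable[[float], None] = time.sleep) -> bool:
    """Call probe() until it returns True or the timeout elapses."""
    deadline = time.monotonic() + timeout
    while True:
        try:
            if probe():
                return True
        except OSError:
            # Connection refused/reset while the container is still starting.
            pass
        if time.monotonic() >= deadline:
            return False
        sleep(interval)


# Demonstration with a fake probe that succeeds on the third attempt.
attempts = iter([False, False, True])
ready = wait_until_ready(lambda: next(attempts), timeout=5, interval=0,
                         sleep=lambda seconds: None)
print(ready)
```

Against the container from Task 2, the probe could be `lambda: requests.get("http://localhost:5000/status", timeout=5).ok`; the exceptions raised by `requests` derive from `OSError`, so connection failures during startup are retried rather than raised.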

Task 3: Test Language Detection Container Locally

from azure.ai.textanalytics import TextAnalyticsClient
from azure.core.credentials import AzureKeyCredential

# Point the SDK client at the local container
# The container exposes the same API as the cloud service
client = TextAnalyticsClient(
    endpoint="http://localhost:5000",
    credential=AzureKeyCredential("not-needed-for-local")
)

documents = [
    "Hello, this is a test document in English.",
    "Bonjour, ceci est un document de test en français.",
    "Hola, este es un documento de prueba en español.",
    "Dies ist ein Testdokument auf Deutsch.",
    "これはテスト用のドキュメントです。"
]

response = client.detect_language(documents=documents)

# Results come back in input order but do not carry the input text,
# so pair each result with its source document.
for text, doc in zip(documents, response):
    if not doc.is_error:
        print(f"Text: '{text[:40]}...'")
        print(f" Language: {doc.primary_language.name}")
        print(f" ISO Code: {doc.primary_language.iso6391_name}")
        print(f" Confidence: {doc.primary_language.confidence_score:.2f}")
    else:
        print(f"Error: {doc.error.message}")
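
The SDK hides the request body it sends to the container. As a rough sketch, assuming the v3-style Text Analytics request shape (a `documents` array whose items carry an `id` and a `text`), the hypothetical helper below builds that body from plain strings. The route to POST it to depends on the API version the image serves, so it is deliberately left out.

```python
import json
from typing import Dict, List


def build_documents_payload(texts: List[str]) -> Dict[str, List[Dict[str, str]]]:
    """Wrap raw strings in the {"documents": [{"id", "text"}]} request shape."""
    return {
        "documents": [
            {"id": str(index), "text": text}
            for index, text in enumerate(texts, start=1)
        ]
    }


payload = build_documents_payload(["Hello, world.", "Bonjour tout le monde."])
print(json.dumps(payload, ensure_ascii=False, indent=2))
```

The `id` values are what let you match each item in the response back to the text that produced it.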

Task 4: Run Sentiment Analysis Container and Configure Docker Compose

import yaml
import os

ENDPOINT = os.environ["AZURE_AI_ENDPOINT"]
KEY = os.environ["AZURE_AI_KEY"]

# Generate docker-compose.yml for multi-container deployment
compose_config = {
    "version": "3.8",
    "services": {
        "language-detection": {
            "image": "mcr.microsoft.com/azure-cognitive-services/textanalytics/language:latest",
            "ports": ["5000:5000"],
            "environment": {
                "Eula": "accept",
                "Billing": ENDPOINT,
                "ApiKey": KEY
            },
            "deploy": {
                "resources": {
                    "limits": {"cpus": "2", "memory": "4G"},
                    "reservations": {"cpus": "1", "memory": "2G"}
                }
            },
            "healthcheck": {
                "test": ["CMD", "curl", "-f", "http://localhost:5000/status"],
                "interval": "30s",
                "timeout": "10s",
                "retries": 3,
                "start_period": "60s"
            }
        },
        "sentiment-analysis": {
            "image": "mcr.microsoft.com/azure-cognitive-services/textanalytics/sentiment:latest",
            "ports": ["5001:5000"],
            "environment": {
                "Eula": "accept",
                "Billing": ENDPOINT,
                "ApiKey": KEY
            },
            "deploy": {
                "resources": {
                    "limits": {"cpus": "2", "memory": "4G"},
                    "reservations": {"cpus": "1", "memory": "2G"}
                }
            },
            "healthcheck": {
                "test": ["CMD", "curl", "-f", "http://localhost:5000/status"],
                "interval": "30s",
                "timeout": "10s",
                "retries": 3,
                "start_period": "60s"
            }
        }
    }
}

with open("docker-compose.yml", "w") as f:
    yaml.dump(compose_config, f, default_flow_style=False)

print("docker-compose.yml generated")
print("Run: docker compose up -d")
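
The two service entries in the Task 4 script differ only in name, image, and host port, and the script writes the API key into `docker-compose.yml` in plain text. One possible alternative, sketched below, builds each entry with a hypothetical `text_analytics_service` helper and leaves the endpoint and key as Compose `${VAR}` references that `docker compose` resolves when it runs. The resource limits and health check are omitted for brevity, and the result is printed as JSON so that the sketch needs no third-party packages.

```python
import json
from typing import Any, Dict

REGISTRY = "mcr.microsoft.com/azure-cognitive-services/textanalytics"


def text_analytics_service(image_name: str, host_port: int) -> Dict[str, Any]:
    """Build one Compose service entry; secrets stay as ${VAR} references."""
    return {
        "image": f"{REGISTRY}/{image_name}:latest",
        "ports": [f"{host_port}:5000"],
        "environment": {
            "Eula": "accept",
            # Resolved by `docker compose` from the shell or a .env file.
            "Billing": "${AZURE_AI_ENDPOINT}",
            "ApiKey": "${AZURE_AI_KEY}",
        },
    }


compose_config = {
    "services": {
        "language-detection": text_analytics_service("language", 5000),
        "sentiment-analysis": text_analytics_service("sentiment", 5001),
    }
}

# Host ports must be unique, otherwise the second container fails to bind.
host_ports = [svc["ports"][0].split(":")[0]
              for svc in compose_config["services"].values()]
if len(host_ports) != len(set(host_ports)):
    raise ValueError("duplicate host port in compose configuration")

print(json.dumps(compose_config, indent=2))
```

The same dictionary can be passed to `yaml.dump` exactly as in Task 4; only the values written to disk change.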

Task 5: Configure Disconnected Container Mode

import subprocess
import os

# Note: Disconnected containers require a commitment plan and gating approval
# from Microsoft. This shows the configuration once approved.

# Download the disconnected container license
# (Obtained from Azure portal after gating approval)
LICENSE_PATH = "/path/to/license/file.lic"

# For disconnected mode, the container is configured with:
# - DownloadLicense=True to download the license on the first (connected) run
# - Mounts:License=<dir> pointing at a mounted volume so the license file persists
# Microsoft's documented commands pass these settings as arguments after the
# image name; the documented disconnected run also sets Mounts:Output for usage records.

IMAGE = "mcr.microsoft.com/azure-cognitive-services/textanalytics/language:latest"

# Step 1: Download license (requires initial connectivity)
print("Step 1: Downloading disconnected container license...")
subprocess.run([
    "docker", "run", "-d",
    "--name", "language-download-license",
    "-e", "Eula=accept",
    "-e", f"Billing={os.environ['AZURE_AI_ENDPOINT']}",
    "-e", f"ApiKey={os.environ['AZURE_AI_KEY']}",
    "-v", "license-data:/license",
    IMAGE,
    "DownloadLicense=True",
    "Mounts:License=/license",
], check=True)

# The container was started detached, so confirm the download in its logs
print("License download started. Check: docker logs language-download-license")

# Step 2: Run in disconnected mode (no internet required)
print("\nStep 2: Running in disconnected mode...")
subprocess.run([
    "docker", "run", "-d",
    "--name", "language-disconnected",
    "-p", "5000:5000",
    "-e", "Eula=accept",
    "-v", "license-data:/license",
    # --network none leaves the container with loopback only, so the published
    # port is not reachable from the host. Omit this flag and block egress
    # another way if the host needs to call the API.
    "--network", "none",
    IMAGE,
    "Mounts:License=/license",
], check=True)

print("Container running in disconnected mode (no network).")
print("License validity period: defined by commitment plan duration.")

Expected Output

// Language detection response from local container
{
  "documents": [
    {
      "id": "1",
      "detectedLanguage": {
        "name": "English",
        "iso6391Name": "en",
        "confidenceScore": 1.0
      }
    },
    {
      "id": "2",
      "detectedLanguage": {
        "name": "French",
        "iso6391Name": "fr",
        "confidenceScore": 1.0
      }
    }
  ]
}

// Container health check
{
  "service": "Language",
  "status": "ready",
  "apiStatus": "Valid",
  "apiStatusMessage": "Api Key is valid."
}
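
To check these responses in a script rather than by eye, a small sketch such as the following may help. It assumes the exact field names shown in the expected output above; `is_container_ready` and `languages_by_id` are hypothetical helper names, and the `//` comment lines are dropped because they are not valid JSON.

```python
import json
from typing import Dict

status_json = (
    '{"service": "Language", "status": "ready", '
    '"apiStatus": "Valid", "apiStatusMessage": "Api Key is valid."}'
)
detection_json = """
{"documents": [
  {"id": "1", "detectedLanguage": {"name": "English", "iso6391Name": "en", "confidenceScore": 1.0}},
  {"id": "2", "detectedLanguage": {"name": "French", "iso6391Name": "fr", "confidenceScore": 1.0}}
]}
"""


def is_container_ready(status: dict) -> bool:
    """True when the health payload reports a ready container and a valid key."""
    return status.get("status") == "ready" and status.get("apiStatus") == "Valid"


def languages_by_id(response: dict) -> Dict[str, str]:
    """Map each document id to its detected ISO 639-1 code."""
    return {doc["id"]: doc["detectedLanguage"]["iso6391Name"]
            for doc in response["documents"]}


print(is_container_ready(json.loads(status_json)))  # True
print(languages_by_id(json.loads(detection_json)))  # {'1': 'en', '2': 'fr'}
```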

Break & fix

Scenario | Symptom | Root Cause | Fix
Container won't start | Exit code 1, "Billing endpoint required" | Missing Billing environment variable | Add -e Billing=<endpoint> to docker run
Container starts but returns 401 | HTTP 401 on all requests | Invalid or missing API key | Verify ApiKey environment variable matches Azure resource key
Container crashes with OOM | Exit code 137 | Insufficient memory allocated | Increase --memory to at least 4g for text analytics containers
Billing validation fails | "Unable to reach billing endpoint" | Firewall blocking outbound HTTPS | Allow outbound to *.cognitiveservices.azure.com on port 443
Disconnected container license expired | "License expired" error on startup | Commitment plan period ended | Reconnect to Azure to renew license, or renew commitment plan
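
The table can also be read as a lookup from symptom to fix. The sketch below encodes it in a hypothetical `diagnose` function; the quoted log fragments are the ones listed in the table, and matching them by substring is an assumption, since real container logs may word these errors differently.

```python
from typing import Optional


def diagnose(exit_code: Optional[int] = None, http_status: Optional[int] = None,
             log_text: str = "") -> str:
    """Map an observed symptom to the fix suggested in the Break & fix table."""
    log = log_text.lower()
    if exit_code == 137:
        return "Out of memory: raise --memory to at least 4g"
    if "license expired" in log:
        return "License expired: reconnect to Azure or renew the commitment plan"
    if "unable to reach billing endpoint" in log:
        return "Firewall: allow outbound HTTPS (443) to *.cognitiveservices.azure.com"
    if "billing endpoint required" in log:
        return "Missing Billing setting: add -e Billing=<endpoint>"
    if http_status == 401:
        return "Invalid or missing ApiKey: compare it with the Azure resource key"
    return "No match in the table: inspect `docker logs <container>`"


print(diagnose(exit_code=137))
print(diagnose(http_status=401))
print(diagnose(exit_code=1, log_text='"Billing endpoint required"'))
```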

Knowledge Check

1. What is required for an Azure AI container to function in connected mode?

2. Which Docker parameter must be set to accept the container license agreement, and what value does it take?

3. What is the minimum RAM recommended for running the Text Analytics language detection container?

4. How does a disconnected Azure AI container handle billing?

5. Which endpoint can you use to verify that an Azure AI container is healthy and ready to accept requests?

Cleanup

# Stop and remove containers
docker stop language-detection sentiment-analysis language-disconnected language-download-license 2>/dev/null
docker rm language-detection sentiment-analysis language-disconnected language-download-license 2>/dev/null

# Remove Docker volume
docker volume rm license-data 2>/dev/null

# Remove Azure resources
az group delete --name rg-ai102-challenge06 --yes --no-wait
