Challenge 47: Azure Content Understanding — Multimodal Analysis

Estimated Time

45-60 min | Cost: ~$1.50 (AI Services transactions) | Domain: Knowledge Mining & Extraction (15-20%)

Service Status

Azure Content Understanding is a newer Azure AI capability, so check the documentation for the latest API versions and regional availability before you start. This challenge calls the REST API directly, using API version 2024-12-01-preview and patterns aligned with the documented interface.

Exam skills covered

Skill | Weight
Create an Azure Content Understanding analyzer | High
Summarize and classify documents | High
Extract entities from documents | Medium
Extract tables and key-value pairs | Medium
Process images and extract structured data | Medium

Overview

Azure Content Understanding provides multimodal analysis capabilities that go beyond traditional OCR and Document Intelligence:

Capability | Description
Document summarization | Generate concise summaries of long documents
Document classification | Classify documents into custom categories
Entity extraction | Extract custom entities with LLM understanding
Table extraction | Extract structured tables from documents
Image analysis | Understand and describe image content
Multi-format support | PDF, images, Office documents, audio, video

Key concepts

  • Analyzer: A configured pipeline that defines what to extract from content
  • Field: A specific piece of information to extract (with type and description)
  • Schema: The expected output structure for the analyzer
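
How the three concepts nest can be sketched in a few lines of Python. The helper names `make_field` and `make_analyzer` are hypothetical, introduced only for illustration; the dictionary layout mirrors the analyzer definition used in Task 2.

```python
from typing import Any, Dict


def make_field(field_type: str, description: str) -> Dict[str, Any]:
    """Return one field: a typed piece of information plus a description."""
    return {"type": field_type, "description": description}


def make_analyzer(description: str, scenario: str,
                  fields: Dict[str, Dict[str, Any]]) -> Dict[str, Any]:
    """Wrap the fields in a schema and attach it to an analyzer definition."""
    return {
        "description": description,
        "scenario": scenario,
        "fieldSchema": {"fields": fields},
    }


# One analyzer, holding one schema, holding one field.
analyzer = make_analyzer(
    "Minimal invoice analyzer",
    "document",
    {"VendorName": make_field("string", "The name of the vendor or sender")},
)
print(sorted(analyzer["fieldSchema"]["fields"]))
```

The analyzer is the outer object, the schema is its `fieldSchema` member, and each field is an entry inside the schema.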

Prerequisites

  • Azure subscription with Contributor role
  • Azure AI Services resource (multi-service account) in a supported region
  • Python 3.9+ with requests library (REST-based)
  • .NET 8 with HttpClient
  • Sample documents (PDF, images)

Implementation

Task 1: Create an Azure AI Services resource

RG="rg-ai102-content"
# Choose a region where Content Understanding is available (see Break & fix, scenario 3)
LOCATION="eastus"
AI_SERVICE="ai-content-$(openssl rand -hex 4)"

az group create --name "$RG" --location "$LOCATION"

# Create Azure AI Services (multi-service) resource
az cognitiveservices account create \
  --name "$AI_SERVICE" \
  --resource-group "$RG" \
  --location "$LOCATION" \
  --kind AIServices \
  --sku S0 \
  --yes

AI_ENDPOINT=$(az cognitiveservices account show \
  --name "$AI_SERVICE" --resource-group "$RG" \
  --query "properties.endpoint" -o tsv)

AI_KEY=$(az cognitiveservices account keys list \
  --name "$AI_SERVICE" --resource-group "$RG" \
  --query "key1" -o tsv)

# Export so the Python code in the following tasks can read them
export AI_ENDPOINT AI_KEY

Task 2: Create a document analyzer with custom fields

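import json
import os
import time

import requests

# AI_ENDPOINT and AI_KEY are the shell variables from Task 1;
# they must be exported (export AI_ENDPOINT AI_KEY) to be visible here
endpoint = os.environ["AI_ENDPOINT"].rstrip("/")
api_key = os.environ["AI_KEY"]
api_version = "2024-12-01-preview"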

# Create an analyzer for extracting invoice data
analyzer_definition = {
    "description": "Invoice analyzer with summarization and entity extraction",
    "scenario": "document",
    "fieldSchema": {
        "fields": {
            "Summary": {
                "type": "string",
                "description": "A brief summary of the document content"
            },
            "DocumentType": {
                "type": "string",
                "description": "The type of document (invoice, receipt, contract, etc.)"
            },
            "VendorName": {
                "type": "string",
                "description": "The name of the vendor or sender"
            },
            "TotalAmount": {
                "type": "number",
                "description": "The total monetary amount"
            },
            "Currency": {
                "type": "string",
                "description": "The currency code (USD, EUR, etc.)"
            },
            "KeyEntities": {
                "type": "array",
                "items": {
                    "type": "object",
                    "properties": {
                        "name": {"type": "string"},
                        "category": {"type": "string"}
                    }
                },
                "description": "Key entities mentioned in the document"
            }
        }
    }
}

# Create the analyzer
analyzer_id = "invoice-analyzer"
url = f"{endpoint}/contentunderstanding/analyzers/{analyzer_id}?api-version={api_version}"

response = requests.put(
    url,
    headers={
        "Ocp-Apim-Subscription-Key": api_key,
        "Content-Type": "application/json"
    },
    json=analyzer_definition
)

if response.status_code in [200, 201]:
    print(f"Analyzer '{analyzer_id}' created successfully")
    print(json.dumps(response.json(), indent=2))
else:
    print(f"Error: {response.status_code} - {response.text}")

Task 3: Analyze a document

# Submit document for analysis
analyze_url = f"{endpoint}/contentunderstanding/analyzers/{analyzer_id}:analyze?api-version={api_version}"

# Analyze from URL
analyze_request = {
    "url": "https://raw.githubusercontent.com/Azure/azure-sdk-for-python/main/sdk/documentintelligence/azure-ai-documentintelligence/samples/sample_forms/forms/Invoice_1.pdf"
}

response = requests.post(
    analyze_url,
    headers={
        "Ocp-Apim-Subscription-Key": api_key,
        "Content-Type": "application/json"
    },
    json=analyze_request
)

if response.status_code == 202:
    operation_url = response.headers["Operation-Location"]
    print(f"Analysis started. Polling: {operation_url}")

    # Poll for results, giving up after 60 attempts (about 5 minutes)
    for _ in range(60):
        time.sleep(5)
        poll_response = requests.get(
            operation_url,
            headers={"Ocp-Apim-Subscription-Key": api_key}
        )
        status_data = poll_response.json()
        # Lowercase so the comparison works whatever casing the service returns
        status = status_data.get("status", "unknown").lower()
        print(f"  Status: {status}")

        if status == "succeeded":
            result = status_data.get("result", {})
            print("\n=== Analysis Results ===")
            contents = result.get("contents", [])
            for content in contents:
                fields = content.get("fields", {})
                for field_name, field_data in fields.items():
                    value = field_data.get("value", "N/A")
                    confidence = field_data.get("confidence", 0)
                    print(f"  {field_name}: {value} (confidence: {confidence:.2%})")
            break
        elif status == "failed":
            print(f"Analysis failed: {status_data}")
            break
    else:
        print("Timed out waiting for the analysis to finish")
else:
    print(f"Error: {response.status_code} - {response.text}")
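
The polling step can also be factored into a reusable helper with an explicit timeout. The following is a minimal sketch, not part of the original challenge: `poll_until_done` and its parameters are hypothetical names, and the `fetch` callable stands in for `requests.get(operation_url, ...).json()` so the logic can be exercised without a network call.

```python
import time
from typing import Any, Callable, Dict


def poll_until_done(fetch: Callable[[], Dict[str, Any]],
                    interval: float = 5.0,
                    timeout: float = 300.0,
                    sleep: Callable[[float], None] = time.sleep) -> Dict[str, Any]:
    """Call fetch() until the status is terminal or the timeout elapses."""
    waited = 0.0
    while True:
        data = fetch()
        # Compare in lowercase so the check does not depend on service casing.
        status = str(data.get("status", "")).lower()
        if status in ("succeeded", "failed"):
            return data
        if waited >= timeout:
            raise TimeoutError(f"Operation still '{status}' after {waited:.0f}s")
        sleep(interval)
        waited += interval


# Canned responses replace the HTTP call for this demonstration.
_responses = iter([{"status": "Running"}, {"status": "Succeeded", "result": {}}])
final = poll_until_done(lambda: next(_responses), interval=0.0,
                        sleep=lambda seconds: None)
print(final["status"])
```

In the challenge code, `fetch` would wrap the GET request against `operation_url`. Injecting `sleep` keeps the helper testable without real delays.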

Task 4: Create an image analyzer

# Create analyzer for image content
image_analyzer = {
    "description": "Image content analyzer for product photos",
    "scenario": "image",
    "fieldSchema": {
        "fields": {
            "Description": {
                "type": "string",
                "description": "Detailed description of the image content"
            },
            "ObjectsDetected": {
                "type": "array",
                "items": {"type": "string"},
                "description": "List of objects visible in the image"
            },
            "DominantColors": {
                "type": "array",
                "items": {"type": "string"},
                "description": "Dominant colors in the image"
            },
            "TextContent": {
                "type": "string",
                "description": "Any text visible in the image"
            }
        }
    }
}

img_analyzer_id = "image-analyzer"
url = f"{endpoint}/contentunderstanding/analyzers/{img_analyzer_id}?api-version={api_version}"

response = requests.put(
    url,
    headers={
        "Ocp-Apim-Subscription-Key": api_key,
        "Content-Type": "application/json"
    },
    json=image_analyzer
)
print(f"Image analyzer created: {response.status_code}")
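
With both a document analyzer and an image analyzer defined, a caller has to route each file to the right one. The sketch below shows one way to do that by file extension. The extension table and the `pick_analyzer` helper are assumptions made for illustration; only the two analyzer IDs and the two scenario names come from this challenge.

```python
from pathlib import PurePosixPath

# Hypothetical mapping; extend it to match the formats you actually accept.
SCENARIO_BY_SUFFIX = {
    ".pdf": "document",
    ".png": "image",
    ".jpg": "image",
    ".jpeg": "image",
}

# Analyzer IDs created in Task 2 and Task 4.
ANALYZER_BY_SCENARIO = {
    "document": "invoice-analyzer",
    "image": "image-analyzer",
}


def pick_analyzer(filename: str) -> str:
    """Return the analyzer ID whose scenario matches the file extension."""
    suffix = PurePosixPath(filename).suffix.lower()
    scenario = SCENARIO_BY_SUFFIX.get(suffix)
    if scenario is None:
        raise ValueError(f"No analyzer configured for '{suffix}' files")
    return ANALYZER_BY_SCENARIO[scenario]


print(pick_analyzer("Invoice_1.pdf"))
print(pick_analyzer("photos/Shoe.JPG"))
```

Failing loudly on an unknown extension avoids silently sending a file to an analyzer built for a different scenario.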

Task 5: Manage analyzers (list, get, delete)

# List all analyzers
list_url = f"{endpoint}/contentunderstanding/analyzers?api-version={api_version}"
response = requests.get(list_url, headers={"Ocp-Apim-Subscription-Key": api_key})
analyzers = response.json().get("value", [])
print("Analyzers:")
for a in analyzers:
    print(f"  - {a['analyzerId']}: {a.get('description', 'N/A')}")

# Get analyzer details
get_url = f"{endpoint}/contentunderstanding/analyzers/{analyzer_id}?api-version={api_version}"
response = requests.get(get_url, headers={"Ocp-Apim-Subscription-Key": api_key})
print(f"\nAnalyzer details: {json.dumps(response.json(), indent=2)}")

# Delete an analyzer
delete_url = f"{endpoint}/contentunderstanding/analyzers/{analyzer_id}?api-version={api_version}"
response = requests.delete(delete_url, headers={"Ocp-Apim-Subscription-Key": api_key})
print(f"Delete status: {response.status_code}")
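
Every call in Tasks 2 through 5 builds its URL from the same pieces: the endpoint, the `/contentunderstanding/analyzers` path, an optional analyzer ID, an optional `:analyze` action, and the `api-version` query parameter. A small helper can keep that pattern in one place. `analyzer_url` is a hypothetical name and the example endpoint is a placeholder, not a real resource.

```python
from typing import Optional
from urllib.parse import quote


def analyzer_url(endpoint: str, api_version: str,
                 analyzer_id: Optional[str] = None,
                 action: Optional[str] = None) -> str:
    """Build a Content Understanding analyzer URL from its parts."""
    url = endpoint.rstrip("/") + "/contentunderstanding/analyzers"
    if analyzer_id:
        # Percent-encode the ID in case it contains unsafe characters.
        url += "/" + quote(analyzer_id, safe="")
    if action:
        url += ":" + action
    return f"{url}?api-version={api_version}"


base = "https://example.cognitiveservices.azure.com/"  # placeholder endpoint
version = "2024-12-01-preview"
print(analyzer_url(base, version))                               # list
print(analyzer_url(base, version, "invoice-analyzer"))           # get, put, delete
print(analyzer_url(base, version, "invoice-analyzer", "analyze"))  # analyze
```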

Expected Output

{
  "status": "succeeded",
  "result": {
    "contents": [
      {
        "fields": {
          "Summary": {
            "value": "Invoice #INV-001 from Contoso Ltd for consulting services totaling $3,800.00 USD, due February 15, 2024.",
            "confidence": 0.92
          },
          "DocumentType": {
            "value": "invoice",
            "confidence": 0.98
          },
          "VendorName": {
            "value": "Contoso Ltd",
            "confidence": 0.95
          },
          "TotalAmount": {
            "value": 3800.00,
            "confidence": 0.93
          },
          "Currency": {
            "value": "USD",
            "confidence": 0.97
          }
        }
      }
    ]
  }
}
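
Downstream code usually wants a flat dictionary of values plus a list of fields that need human review. The sketch below assumes the response has the shape shown above, with `value` and `confidence` keys on each field; the actual service response may use different key names, so adjust accordingly. `split_by_confidence` is a hypothetical helper, and the 0.40 confidence in the sample is invented to exercise the review path.

```python
from typing import Any, Dict, List, Tuple


def split_by_confidence(status_data: Dict[str, Any], threshold: float = 0.5
                        ) -> Tuple[Dict[str, Any], List[str]]:
    """Return (accepted values, names of fields below the threshold)."""
    accepted: Dict[str, Any] = {}
    needs_review: List[str] = []
    for content in status_data.get("result", {}).get("contents", []):
        for name, field in content.get("fields", {}).items():
            confidence = field.get("confidence") or 0.0
            if confidence >= threshold:
                accepted[name] = field.get("value")
            else:
                needs_review.append(name)
    return accepted, needs_review


sample = {"status": "succeeded", "result": {"contents": [{"fields": {
    "VendorName": {"value": "Contoso Ltd", "confidence": 0.95},
    "TotalAmount": {"value": 3800.00, "confidence": 0.93},
    "Currency": {"value": "USD", "confidence": 0.40},  # invented low score
}}]}}

accepted, needs_review = split_by_confidence(sample)
print(accepted)
print(needs_review)
```

The default threshold of 0.5 matches the "< 50% confidence" symptom in the Break & fix table; pick a stricter value if extraction errors are costly.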

Break & fix

# | Scenario | Symptom | Root Cause | Fix
1 | Analyzer creation fails | HTTP 400 with schema validation error | Field type not valid (e.g., using "int" instead of "number") | Use supported types: string, number, boolean, array, object
2 | Analysis returns low confidence | Fields extracted with < 50% confidence | Field description is too vague for the AI model | Write more specific field descriptions that guide the extraction
3 | "Resource not found" on analysis | HTTP 404 | Analyzer ID misspelled or region doesn't support Content Understanding | Verify analyzer ID and check regional availability
4 | Timeout on large documents | Operation never completes | Document exceeds size limits | Check file size limits; split large documents
5 | Empty results for image | All fields return null | Using "document" scenario for image content | Use "image" scenario for image files
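
Scenario 1 can be caught before the PUT request is sent. The following sketch walks a field dictionary and reports any type outside the supported list from the table. It is a local pre-check under that assumption, not the service's own validation, and `find_invalid_types` is a hypothetical helper name.

```python
from typing import Any, Dict, List

# Supported types as listed in the Break & fix table.
SUPPORTED_TYPES = {"string", "number", "boolean", "array", "object"}


def find_invalid_types(fields: Dict[str, Dict[str, Any]],
                       path: str = "") -> List[str]:
    """Return one message per field whose type is not supported."""
    problems: List[str] = []
    for name, spec in fields.items():
        here = f"{path}{name}"
        field_type = spec.get("type")
        if field_type not in SUPPORTED_TYPES:
            problems.append(f"{here}: unsupported type {field_type!r}")
        # Recurse into array items and object properties.
        if field_type == "array" and isinstance(spec.get("items"), dict):
            problems += find_invalid_types({"items": spec["items"]}, f"{here}.")
        if field_type == "object":
            problems += find_invalid_types(spec.get("properties", {}), f"{here}.")
    return problems


bad_fields = {
    "TotalAmount": {"type": "int"},
    "KeyEntities": {
        "type": "array",
        "items": {"type": "object", "properties": {
            "name": {"type": "string"},
            "count": {"type": "integer"},
        }},
    },
}
for problem in find_invalid_types(bad_fields):
    print(problem)
```

Running the check on `analyzer_definition["fieldSchema"]["fields"]` from Task 2 should return an empty list, since that definition uses only supported types.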

Knowledge Check

1. You need to extract custom fields from documents where the schema changes per client. Which Azure service lets you define extraction fields with natural language descriptions?

2. What is the primary difference between Azure Content Understanding and Azure Document Intelligence for document processing?

3. You create a Content Understanding analyzer with scenario 'document'. A user submits an image file. What happens?

4. How does Content Understanding determine what data to extract from a document?

5. What HTTP status code indicates the analysis operation has been accepted and is processing asynchronously?

Cleanup

az group delete --name rg-ai102-content --yes --no-wait