Challenge 20: Multi-Model Orchestration
45-60 min | Cost: ~$3.00 (estimated) | Domain: Generative AI Solutions (15-20%)
Exam skills covered
- Implement orchestration of multiple generative AI models
- Deploy models in containers for edge scenarios
- Implement function calling for tool use
Overview
Production AI systems rarely depend on a single model. Multi-model orchestration routes each request to a model chosen by task complexity, cost constraints, or capability requirements. For example, a router can send simple classification tasks to GPT-4o-mini (fast, cheap) and direct complex reasoning to GPT-4o (slower, more capable). This pattern optimizes the cost-quality trade-off across a portfolio of applications.
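To make the cost-quality trade-off concrete, the sketch below estimates the blended cost per 1,000 requests for a given routing split. The per-request prices are made-up placeholders, not Azure OpenAI list prices, and `blended_cost` is a hypothetical helper introduced only for this illustration.

```python
def blended_cost(complex_share: float, cost_complex: float, cost_simple: float,
                 requests: int = 1000) -> float:
    """Expected cost when `complex_share` of requests go to the larger model."""
    if not 0.0 <= complex_share <= 1.0:
        raise ValueError("complex_share must be between 0 and 1")
    per_request = complex_share * cost_complex + (1.0 - complex_share) * cost_simple
    return per_request * requests


# Hypothetical per-request prices in USD (placeholders for illustration only)
COST_COMPLEX = 0.010
COST_SIMPLE = 0.001

all_complex = blended_cost(1.0, COST_COMPLEX, COST_SIMPLE)
routed = blended_cost(0.2, COST_COMPLEX, COST_SIMPLE)
print(f"All requests on the larger model: ${all_complex:.2f} per 1,000 requests")
print(f"20% routed to the larger model:   ${routed:.2f} per 1,000 requests")
```

Under these assumed prices, sending only one request in five to the larger model cuts the bill by roughly 70%, which is the effect the router in Task 1 is after.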
Semantic Kernel is Microsoft's open-source orchestration SDK. It provides abstractions for AI services, plugins (functions the model can call), and planners that decompose complex tasks into steps. It supports both Python and C# and integrates natively with Azure OpenAI. Function calling (tool use) lets models invoke external tools such as APIs, databases, or custom code: the application describes the available functions, and the model decides when and how to call them.
For edge deployment scenarios, Azure AI containers package models so that inference runs locally with low latency. This suits factory floors, vehicles, or restricted networks where cloud access is limited. Standard containers still need to reach the Azure billing endpoint to report usage; fully offline operation requires the separately approved disconnected container offering.
Architecture
This challenge implements a model router, configures function calling with tools, builds a multi-step orchestration pipeline, and deploys a containerized model endpoint.
Prerequisites
- Azure OpenAI resource with GPT-4o and GPT-4o-mini deployed (the samples assume the deployment names gpt-4o and gpt-4o-mini)
- Python 3.9+ with the packages openai, semantic-kernel
- .NET 8 SDK with the NuGet packages Azure.AI.OpenAI, Microsoft.SemanticKernel
- Docker Desktop (for the container deployment task)
- Azure Container Registry (optional, for push)
Implementation
Task 1: Implement a Model Router (Complexity-Based)
- Python SDK
- C# SDK
- REST API
import os
import time
from openai import AzureOpenAI

client = AzureOpenAI(
    azure_endpoint=os.environ["AZURE_OPENAI_ENDPOINT"],
    api_key=os.environ["AZURE_OPENAI_KEY"],
    api_version="2024-10-21"
)


class ModelRouter:
    """Route requests to appropriate models based on complexity."""

    COMPLEX_MODEL = "gpt-4o"       # For complex reasoning tasks
    SIMPLE_MODEL = "gpt-4o-mini"   # For simple, fast tasks
    COMPLEXITY_INDICATORS = [
        "analyze", "compare", "evaluate", "synthesize",
        "design", "architect", "debug", "explain why",
        "multi-step", "trade-offs", "implications"
    ]

    def classify_complexity(self, message: str) -> str:
        """Determine if a request is simple or complex."""
        message_lower = message.lower()
        complexity_score = sum(
            1 for indicator in self.COMPLEXITY_INDICATORS
            if indicator in message_lower
        )
        # Also consider message length as a heuristic
        if len(message) > 500 or complexity_score >= 2:
            return "complex"
        return "simple"

    def route(self, messages: list, **kwargs) -> dict:
        """Route request to appropriate model."""
        # Classify on the most recent user message
        user_message = next(
            (m["content"] for m in reversed(messages) if m["role"] == "user"), ""
        )
        complexity = self.classify_complexity(user_message)
        model = self.COMPLEX_MODEL if complexity == "complex" else self.SIMPLE_MODEL
        start = time.time()
        response = client.chat.completions.create(
            model=model,
            messages=messages,
            **kwargs
        )
        latency = time.time() - start
        return {
            "response": response,
            "model_used": model,
            "complexity": complexity,
            "latency_ms": latency * 1000
        }


# Test the router
router = ModelRouter()

# Simple request → routes to GPT-4o-mini
result1 = router.route(
    [{"role": "user", "content": "What is Azure?"}],
    max_tokens=100
)
print(f"Simple: model={result1['model_used']}, "
      f"latency={result1['latency_ms']:.0f}ms")
print(f"  Response: {result1['response'].choices[0].message.content[:80]}...\n")

# Complex request → routes to GPT-4o
result2 = router.route(
    [{"role": "user", "content": "Analyze the trade-offs between using Azure Functions Consumption plan vs Premium plan. Compare cost implications, cold start behavior, and scaling characteristics for a multi-step data processing pipeline."}],
    max_tokens=300
)
print(f"Complex: model={result2['model_used']}, "
      f"latency={result2['latency_ms']:.0f}ms")
print(f"  Response: {result2['response'].choices[0].message.content[:80]}...")
using Azure;
using Azure.AI.OpenAI;
using OpenAI.Chat;
using System.Diagnostics;
string endpoint = Environment.GetEnvironmentVariable("AZURE_OPENAI_ENDPOINT")!;
string apiKey = Environment.GetEnvironmentVariable("AZURE_OPENAI_KEY")!;
AzureOpenAIClient azureClient = new(
new Uri(endpoint),
new AzureKeyCredential(apiKey));
var router = new ModelRouter(azureClient);

// Simple request
var (result1, model1, complexity1, latency1) = await router.RouteAsync(
    new ChatMessage[] { new UserChatMessage("What is Azure?") },
    new ChatCompletionOptions { MaxOutputTokenCount = 100 });
Console.WriteLine($"Simple: model={model1}, latency={latency1:F0}ms");
Console.WriteLine($"  Response: {result1.Content[0].Text[..Math.Min(80, result1.Content[0].Text.Length)]}...\n");

// Complex request
var (result2, model2, complexity2, latency2) = await router.RouteAsync(
    new ChatMessage[] { new UserChatMessage("Analyze the trade-offs between Azure Functions Consumption vs Premium plan. Compare cost, cold start, and scaling.") },
    new ChatCompletionOptions { MaxOutputTokenCount = 300 });
Console.WriteLine($"Complex: model={model2}, latency={latency2:F0}ms");
Console.WriteLine($"  Response: {result2.Content[0].Text[..Math.Min(80, result2.Content[0].Text.Length)]}...");

// Type declarations must come after the top-level statements in a C# program
class ModelRouter
{
    private readonly AzureOpenAIClient _client;
    private const string ComplexModel = "gpt-4o";
    private const string SimpleModel = "gpt-4o-mini";
    private static readonly string[] ComplexityIndicators =
        ["analyze", "compare", "evaluate", "synthesize", "design", "architect", "debug", "trade-offs"];

    public ModelRouter(AzureOpenAIClient client) => _client = client;

    public string ClassifyComplexity(string message)
    {
        string lower = message.ToLowerInvariant();
        int score = ComplexityIndicators.Count(i => lower.Contains(i));
        return (message.Length > 500 || score >= 2) ? "complex" : "simple";
    }

    public async Task<(ChatCompletion Result, string Model, string Complexity, double LatencyMs)>
        RouteAsync(ChatMessage[] messages, ChatCompletionOptions options)
    {
        // Classify on the most recent user message
        string userMessage = messages.OfType<UserChatMessage>().LastOrDefault()?.Content?
            .FirstOrDefault()?.Text ?? "";
        string complexity = ClassifyComplexity(userMessage);
        string model = complexity == "complex" ? ComplexModel : SimpleModel;
        ChatClient chatClient = _client.GetChatClient(model);
        var sw = Stopwatch.StartNew();
        ChatCompletion result = await chatClient.CompleteChatAsync(messages, options);
        sw.Stop();
        return (result, model, complexity, sw.Elapsed.TotalMilliseconds);
    }
}
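# Simple request → route to gpt-4o-mini
# AZURE_OPENAI_ENDPOINT already includes https:// (the SDK samples use the same value); %/ drops a trailing slash
echo "=== Simple Request (gpt-4o-mini) ==="
time curl -s -X POST "${AZURE_OPENAI_ENDPOINT%/}/openai/deployments/gpt-4o-mini/chat/completions?api-version=2024-10-21" \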
-H "Content-Type: application/json" \
-H "api-key: ${AZURE_OPENAI_KEY}" \
-d '{
"messages": [{"role": "user", "content": "What is Azure?"}],
"max_tokens": 100
}' | jq -r '.choices[0].message.content'
# Complex request → route to gpt-4o
echo ""
echo "=== Complex Request (gpt-4o) ==="
time curl -s -X POST "${AZURE_OPENAI_ENDPOINT%/}/openai/deployments/gpt-4o/chat/completions?api-version=2024-10-21" \
-H "Content-Type: application/json" \
-H "api-key: ${AZURE_OPENAI_KEY}" \
-d '{
"messages": [{"role": "user", "content": "Analyze the trade-offs between Azure Functions Consumption vs Premium plan for a multi-step data processing pipeline."}],
"max_tokens": 300
}' | jq -r '.choices[0].message.content'
Task 2: Implement Function Calling (Tool Use)
- Python SDK
- C# SDK
- REST API
import os
import json
from openai import AzureOpenAI

client = AzureOpenAI(
    azure_endpoint=os.environ["AZURE_OPENAI_ENDPOINT"],
    api_key=os.environ["AZURE_OPENAI_KEY"],
    api_version="2024-10-21"
)

# Define available tools (functions the model can call)
tools = [
    {
        "type": "function",
        "function": {
            "name": "get_weather",
            "description": "Get current weather for a location",
            "parameters": {
                "type": "object",
                "properties": {
                    "location": {
                        "type": "string",
                        "description": "City name, e.g., 'Seattle, WA'"
                    },
                    "unit": {
                        "type": "string",
                        "enum": ["celsius", "fahrenheit"],
                        "description": "Temperature unit"
                    }
                },
                "required": ["location"]
            }
        }
    },
    {
        "type": "function",
        "function": {
            "name": "search_documents",
            "description": "Search internal knowledge base for relevant documents",
            "parameters": {
                "type": "object",
                "properties": {
                    "query": {
                        "type": "string",
                        "description": "Search query"
                    },
                    "top_k": {
                        "type": "integer",
                        "description": "Number of results to return",
                        "default": 3
                    }
                },
                "required": ["query"]
            }
        }
    }
]


# Simulated tool implementations
def get_weather(location: str, unit: str = "celsius") -> str:
    # In production, call a real weather API
    return json.dumps({"location": location, "temperature": 18, "unit": unit, "condition": "partly cloudy"})


def search_documents(query: str, top_k: int = 3) -> str:
    # In production, call Azure AI Search
    return json.dumps({"results": [{"title": f"Doc about {query}", "snippet": f"Information about {query}..."}]})


# Function dispatch table
available_functions = {
    "get_weather": get_weather,
    "search_documents": search_documents
}

# Send request with tools
messages = [
    {"role": "system", "content": "You help users by calling tools when needed."},
    {"role": "user", "content": "What's the weather in Seattle and find docs about Azure Functions?"}
]
response = client.chat.completions.create(
    model="gpt-4o",
    messages=messages,
    tools=tools,
    tool_choice="auto"  # Let model decide when to call tools
)

# Process tool calls
response_message = response.choices[0].message
if response_message.tool_calls:
    messages.append(response_message)  # Add assistant's tool call message
    for tool_call in response_message.tool_calls:
        function_name = tool_call.function.name
        function_args = json.loads(tool_call.function.arguments)
        print(f"Calling: {function_name}({function_args})")
        function_response = available_functions[function_name](**function_args)
        # Add tool response to messages, linked to the call by its id
        messages.append({
            "tool_call_id": tool_call.id,
            "role": "tool",
            "name": function_name,
            "content": function_response
        })
    # Get final response with tool results
    final_response = client.chat.completions.create(
        model="gpt-4o",
        messages=messages
    )
    print(f"\nFinal answer: {final_response.choices[0].message.content}")
else:
    # The model answered directly without requesting a tool
    print(f"\nFinal answer: {response_message.content}")
using Azure;
using Azure.AI.OpenAI;
using OpenAI.Chat;
using System.Text.Json;
string endpoint = Environment.GetEnvironmentVariable("AZURE_OPENAI_ENDPOINT")!;
string apiKey = Environment.GetEnvironmentVariable("AZURE_OPENAI_KEY")!;
AzureOpenAIClient azureClient = new(
new Uri(endpoint),
new AzureKeyCredential(apiKey));
ChatClient chatClient = azureClient.GetChatClient("gpt-4o");
// Define tools
ChatTool weatherTool = ChatTool.CreateFunctionTool(
functionName: "get_weather",
functionDescription: "Get current weather for a location",
functionParameters: BinaryData.FromString("""
{
"type": "object",
"properties": {
"location": { "type": "string", "description": "City name" },
"unit": { "type": "string", "enum": ["celsius", "fahrenheit"] }
},
"required": ["location"]
}
"""));
ChatTool searchTool = ChatTool.CreateFunctionTool(
functionName: "search_documents",
functionDescription: "Search internal knowledge base",
functionParameters: BinaryData.FromString("""
{
"type": "object",
"properties": {
"query": { "type": "string", "description": "Search query" },
"top_k": { "type": "integer", "default": 3 }
},
"required": ["query"]
}
"""));
var options = new ChatCompletionOptions();
options.Tools.Add(weatherTool);
options.Tools.Add(searchTool);

var messages = new List<ChatMessage>
{
    new SystemChatMessage("You help users by calling tools when needed."),
    new UserChatMessage("What's the weather in Seattle and find docs about Azure Functions?")
};

// First call - model decides to use tools
ChatCompletion completion = await chatClient.CompleteChatAsync(messages, options);
if (completion.FinishReason == ChatFinishReason.ToolCalls)
{
    messages.Add(new AssistantChatMessage(completion));
    foreach (ChatToolCall toolCall in completion.ToolCalls)
    {
        // Simulated tool results; in production, call the real services here
        string result = toolCall.FunctionName switch
        {
            "get_weather" => """{"location":"Seattle","temperature":18,"condition":"cloudy"}""",
            "search_documents" => """{"results":[{"title":"Azure Functions Guide"}]}""",
            _ => """{"error":"Unknown function"}"""
        };
        Console.WriteLine($"Calling: {toolCall.FunctionName}({toolCall.FunctionArguments})");
        messages.Add(new ToolChatMessage(toolCall.Id, result));
    }

    // Second call with tool results
    ChatCompletion finalResult = await chatClient.CompleteChatAsync(messages, options);
    Console.WriteLine($"\nFinal answer: {finalResult.Content[0].Text}");
}
# Function calling with tools
curl -X POST "${AZURE_OPENAI_ENDPOINT%/}/openai/deployments/gpt-4o/chat/completions?api-version=2024-10-21" \
-H "Content-Type: application/json" \
-H "api-key: ${AZURE_OPENAI_KEY}" \
-d '{
"messages": [
{"role": "system", "content": "You help users by calling tools when needed."},
{"role": "user", "content": "What is the weather in Seattle?"}
],
"tools": [
{
"type": "function",
"function": {
"name": "get_weather",
"description": "Get current weather for a location",
"parameters": {
"type": "object",
"properties": {
"location": {"type": "string", "description": "City name"},
"unit": {"type": "string", "enum": ["celsius", "fahrenheit"]}
},
"required": ["location"]
}
}
}
],
"tool_choice": "auto"
}'
# Response will include tool_calls instead of content:
# "choices": [{"message": {"tool_calls": [{"id": "call_abc", "function": {"name": "get_weather", "arguments": "{\"location\":\"Seattle\"}"}}]}}]
# Follow up with tool result
curl -X POST "${AZURE_OPENAI_ENDPOINT%/}/openai/deployments/gpt-4o/chat/completions?api-version=2024-10-21" \
-H "Content-Type: application/json" \
-H "api-key: ${AZURE_OPENAI_KEY}" \
-d '{
"messages": [
{"role": "system", "content": "You help users by calling tools when needed."},
{"role": "user", "content": "What is the weather in Seattle?"},
{"role": "assistant", "tool_calls": [{"id": "call_abc", "type": "function", "function": {"name": "get_weather", "arguments": "{\"location\":\"Seattle\"}"}}]},
{"role": "tool", "tool_call_id": "call_abc", "content": "{\"temperature\": 18, \"condition\": \"cloudy\"}"}
]
}'
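The dispatch step in the Python sample assumes the model always returns a known function name and valid JSON arguments. A defensive variant, sketched below, hands an error payload back to the model instead of raising. `safe_dispatch`, `AVAILABLE_FUNCTIONS`, and the stub `get_weather` are hypothetical names for illustration; the sketch runs offline and does not call Azure OpenAI.

```python
import json


def get_weather(location: str, unit: str = "celsius") -> str:
    # Stub standing in for a real weather API call
    return json.dumps({"location": location, "temperature": 18, "unit": unit})


AVAILABLE_FUNCTIONS = {"get_weather": get_weather}


def safe_dispatch(name: str, raw_arguments: str) -> str:
    """Run a tool call and always return a JSON string for the tool message."""
    func = AVAILABLE_FUNCTIONS.get(name)
    if func is None:
        return json.dumps({"error": f"Unknown function: {name}"})
    try:
        args = json.loads(raw_arguments or "{}")
        return func(**args)
    except json.JSONDecodeError:
        return json.dumps({"error": "Arguments were not valid JSON"})
    except TypeError as exc:
        # Raised when arguments are missing or unexpected for the function signature
        return json.dumps({"error": f"Bad arguments: {exc}"})


print(safe_dispatch("get_weather", '{"location": "Seattle"}'))
print(safe_dispatch("delete_everything", "{}"))
print(safe_dispatch("get_weather", "{not json"))
```

Returning the error as tool content gives the model a chance to correct its call on the next turn, rather than crashing the application.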
Task 3: Build a Multi-Step Pipeline with Semantic Kernel
- Python SDK
- C# SDK
- REST API
import os
import asyncio
from semantic_kernel import Kernel
from semantic_kernel.connectors.ai.open_ai import (
    AzureChatCompletion,
    AzureChatPromptExecutionSettings,
)
from semantic_kernel.functions import kernel_function

# Initialize Semantic Kernel
kernel = Kernel()

# Add Azure OpenAI service
kernel.add_service(
    AzureChatCompletion(
        deployment_name="gpt-4o",
        endpoint=os.environ["AZURE_OPENAI_ENDPOINT"],
        api_key=os.environ["AZURE_OPENAI_KEY"],
    )
)


# Define plugins (native functions)
class TextAnalysisPlugin:
    """Plugin for text analysis tasks."""

    @kernel_function(description="Summarize text into key points")
    async def summarize(self, input: str) -> str:
        settings = AzureChatPromptExecutionSettings(max_tokens=200, temperature=0)
        result = await kernel.invoke_prompt(
            f"Summarize the following text into 3 bullet points:\n\n{input}",
            settings=settings
        )
        return str(result)

    @kernel_function(description="Extract action items from text")
    async def extract_actions(self, input: str) -> str:
        settings = AzureChatPromptExecutionSettings(max_tokens=200, temperature=0)
        result = await kernel.invoke_prompt(
            f"Extract action items from this text as a numbered list:\n\n{input}",
            settings=settings
        )
        return str(result)

    @kernel_function(description="Determine sentiment of text")
    async def analyze_sentiment(self, input: str) -> str:
        settings = AzureChatPromptExecutionSettings(max_tokens=50, temperature=0)
        result = await kernel.invoke_prompt(
            f"What is the sentiment of this text? Reply with: positive, negative, or neutral.\n\n{input}",
            settings=settings
        )
        return str(result)


# Register plugin
kernel.add_plugin(TextAnalysisPlugin(), plugin_name="TextAnalysis")


async def multi_step_pipeline(document: str):
    """Execute a multi-step analysis pipeline."""
    print("=== Multi-Step Analysis Pipeline ===\n")

    # Step 1: Summarize
    print("Step 1: Summarizing...")
    summary_fn = kernel.get_function("TextAnalysis", "summarize")
    summary = await kernel.invoke(summary_fn, input=document)
    print(f"Summary:\n{summary}\n")

    # Step 2: Extract actions
    print("Step 2: Extracting action items...")
    actions_fn = kernel.get_function("TextAnalysis", "extract_actions")
    actions = await kernel.invoke(actions_fn, input=document)
    print(f"Actions:\n{actions}\n")

    # Step 3: Sentiment analysis
    print("Step 3: Analyzing sentiment...")
    sentiment_fn = kernel.get_function("TextAnalysis", "analyze_sentiment")
    sentiment = await kernel.invoke(sentiment_fn, input=document)
    print(f"Sentiment: {sentiment}")

    return {"summary": str(summary), "actions": str(actions), "sentiment": str(sentiment)}


# Run the pipeline
document = """
Meeting notes from Q4 planning:
The team agreed to migrate the data pipeline to Azure Data Factory by end of January.
Performance has been excellent this quarter with 99.9% uptime. However, we need to
address the rising costs in the compute cluster. Sarah will investigate spot instances.
Mike will prepare the migration plan document by next Friday.
"""
asyncio.run(multi_step_pipeline(document))
using Microsoft.SemanticKernel;
using Microsoft.SemanticKernel.Connectors.AzureOpenAI;
using System.ComponentModel;
string endpoint = Environment.GetEnvironmentVariable("AZURE_OPENAI_ENDPOINT")!;
string apiKey = Environment.GetEnvironmentVariable("AZURE_OPENAI_KEY")!;
// Initialize Semantic Kernel
var builder = Kernel.CreateBuilder();
builder.AddAzureOpenAIChatCompletion(
deploymentName: "gpt-4o",
endpoint: endpoint,
apiKey: apiKey);
Kernel kernel = builder.Build();
// Register and execute pipeline
kernel.Plugins.AddFromObject(new TextAnalysisPlugin(kernel), "TextAnalysis");

string document = """
Meeting notes: Migrate data pipeline to Azure Data Factory by January.
99.9% uptime achieved. Rising compute costs need attention.
Sarah investigates spot instances. Mike prepares migration plan by Friday.
""";

Console.WriteLine("=== Multi-Step Pipeline ===\n");

var summary = await kernel.InvokeAsync("TextAnalysis", "Summarize",
    new KernelArguments { ["input"] = document });
Console.WriteLine($"Summary:\n{summary}\n");

var actions = await kernel.InvokeAsync("TextAnalysis", "ExtractActions",
    new KernelArguments { ["input"] = document });
Console.WriteLine($"Actions:\n{actions}\n");

var sentiment = await kernel.InvokeAsync("TextAnalysis", "AnalyzeSentiment",
    new KernelArguments { ["input"] = document });
Console.WriteLine($"Sentiment: {sentiment}");

// Define plugin class (type declarations must come after the top-level statements)
public class TextAnalysisPlugin
{
    private readonly Kernel _kernel;

    public TextAnalysisPlugin(Kernel kernel) => _kernel = kernel;

    [KernelFunction("Summarize"), Description("Summarize text into key points")]
    public async Task<string> SummarizeAsync(string input)
    {
        var result = await _kernel.InvokePromptAsync(
            $"Summarize into 3 bullet points:\n\n{input}",
            new KernelArguments(new AzureOpenAIPromptExecutionSettings
                { MaxTokens = 200, Temperature = 0 }));
        return result.ToString();
    }

    [KernelFunction("ExtractActions"), Description("Extract action items from text")]
    public async Task<string> ExtractActionsAsync(string input)
    {
        var result = await _kernel.InvokePromptAsync(
            $"Extract action items as a numbered list:\n\n{input}",
            new KernelArguments(new AzureOpenAIPromptExecutionSettings
                { MaxTokens = 200, Temperature = 0 }));
        return result.ToString();
    }

    [KernelFunction("AnalyzeSentiment"), Description("Analyze text sentiment")]
    public async Task<string> AnalyzeSentimentAsync(string input)
    {
        var result = await _kernel.InvokePromptAsync(
            $"Sentiment (positive/negative/neutral):\n\n{input}",
            new KernelArguments(new AzureOpenAIPromptExecutionSettings
                { MaxTokens = 50, Temperature = 0 }));
        return result.ToString();
    }
}
# Multi-step pipeline via sequential REST calls
# Step 1: Summarize
echo "=== Step 1: Summarize ==="
SUMMARY=$(curl -s -X POST "${AZURE_OPENAI_ENDPOINT%/}/openai/deployments/gpt-4o/chat/completions?api-version=2024-10-21" \
-H "Content-Type: application/json" \
-H "api-key: ${AZURE_OPENAI_KEY}" \
-d '{
"messages": [
{"role": "system", "content": "Summarize into 3 bullet points."},
{"role": "user", "content": "Meeting notes: Migrate pipeline to ADF by January. 99.9% uptime. Rising compute costs. Sarah investigates spot instances. Mike prepares migration plan by Friday."}
],
"temperature": 0,
"max_tokens": 200
}' | jq -r '.choices[0].message.content')
echo "$SUMMARY"
# Step 2: Extract actions from the summary
# Build the JSON body with jq so newlines and quotes in $SUMMARY are escaped correctly
echo ""
echo "=== Step 2: Extract Actions ==="
jq -n --arg summary "$SUMMARY" '{
  messages: [
    {role: "system", content: "Extract action items as a numbered list."},
    {role: "user", content: $summary}
  ],
  temperature: 0,
  max_tokens: 200
}' | curl -s -X POST "${AZURE_OPENAI_ENDPOINT%/}/openai/deployments/gpt-4o/chat/completions?api-version=2024-10-21" \
  -H "Content-Type: application/json" \
  -H "api-key: ${AZURE_OPENAI_KEY}" \
  -d @- | jq -r '.choices[0].message.content'
# Step 3: Sentiment (routed to the cheaper gpt-4o-mini deployment)
echo ""
echo "=== Step 3: Sentiment ==="
jq -n --arg summary "$SUMMARY" '{
  messages: [
    {role: "user", content: ("Sentiment (positive/negative/neutral): " + $summary)}
  ],
  temperature: 0,
  max_tokens: 10
}' | curl -s -X POST "${AZURE_OPENAI_ENDPOINT%/}/openai/deployments/gpt-4o-mini/chat/completions?api-version=2024-10-21" \
  -H "Content-Type: application/json" \
  -H "api-key: ${AZURE_OPENAI_KEY}" \
  -d @- | jq -r '.choices[0].message.content'
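When one step's output feeds the next request body, the text has to be JSON-escaped: pasting multi-line text or text containing quotes straight into a JSON string literal produces an invalid payload. The sketch below shows one way to build the Step 2 body safely in Python. `build_payload` and the sample summary are hypothetical and used only for illustration.

```python
import json


def build_payload(system_prompt: str, user_text: str, max_tokens: int = 200) -> str:
    """Serialize a chat completions request body; json.dumps escapes newlines and quotes."""
    body = {
        "messages": [
            {"role": "system", "content": system_prompt},
            {"role": "user", "content": user_text},
        ],
        "temperature": 0,
        "max_tokens": max_tokens,
    }
    return json.dumps(body)


# A summary with embedded quotes and line breaks, as a bullet list typically has
summary = '- Migrate to "ADF" by January\n- 99.9% uptime\n- Rising compute costs'
payload = build_payload("Extract action items as a numbered list.", summary)
print(payload)
```

The resulting string can be sent as the request body as-is, and parsing it back yields the original summary unchanged.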
Task 4: Deploy a Containerized Model Endpoint
- Python SDK
- C# SDK
- REST API
import os
import requests
from openai import AzureOpenAI

# Deploy a containerized Azure AI model for edge scenarios
# This example uses the Azure AI sentiment analysis container

# Step 1: Pull the container image
# docker pull mcr.microsoft.com/azure-cognitive-services/textanalytics/sentiment:latest

# Step 2: Run locally with configuration
container_config = {
    "image": "mcr.microsoft.com/azure-cognitive-services/textanalytics/sentiment:latest",
    "ports": {"5000/tcp": 5000},
    "environment": {
        "Eula": "accept",
        "Billing": os.environ["AZURE_AI_ENDPOINT"],
        "ApiKey": os.environ["AZURE_AI_KEY"]
    }
}

# Start container (equivalent Docker command shown; the key itself is not echoed)
print("Starting container...")
print("docker run -d -p 5000:5000 \\")
print("  -e Eula=accept \\")
print(f"  -e Billing={os.environ.get('AZURE_AI_ENDPOINT', '<endpoint>')} \\")
print('  -e ApiKey="$AZURE_AI_KEY" \\')
print("  mcr.microsoft.com/azure-cognitive-services/textanalytics/sentiment:latest")


# Step 3: Call the containerized endpoint
def analyze_sentiment_local(text: str, port: int = 5000) -> dict:
    """Call the local container endpoint."""
    response = requests.post(
        f"http://localhost:{port}/text/analytics/v3.1/sentiment",
        json={
            "documents": [
                {"id": "1", "language": "en", "text": text}
            ]
        },
        timeout=30
    )
    response.raise_for_status()
    return response.json()


# Test the container
# result = analyze_sentiment_local("Azure AI services are excellent and easy to use!")
# print(f"Sentiment: {result['documents'][0]['sentiment']}")

# Step 4: For Azure OpenAI proxy pattern (APIM or custom gateway)
# Custom base URL pointing to local container or edge gateway
edge_client = AzureOpenAI(
    azure_endpoint="http://localhost:8080",  # Local proxy
    api_key="local-key",
    api_version="2024-10-21"
)
print("\nEdge deployment pattern configured")
print("Inference runs locally inside the container")
print("Billing endpoint must stay reachable for usage metering")
using Azure;
using Azure.AI.TextAnalytics;
using System.Net.Http;
using System.Text.Json;
// Connect to containerized Azure AI service (local endpoint)
string containerEndpoint = "http://localhost:5000";
string billingKey = Environment.GetEnvironmentVariable("AZURE_AI_KEY")!;
// TextAnalyticsClient works with containerized endpoints
var client = new TextAnalyticsClient(
new Uri(containerEndpoint),
new AzureKeyCredential(billingKey));
// Analyze sentiment via local container
DocumentSentiment sentiment = await client.AnalyzeSentimentAsync(
"Azure AI containers enable edge deployment scenarios.");
Console.WriteLine($"Sentiment: {sentiment.Sentiment}");
Console.WriteLine($"Positive: {sentiment.ConfidenceScores.Positive:F2}");
Console.WriteLine($"Neutral: {sentiment.ConfidenceScores.Neutral:F2}");
Console.WriteLine($"Negative: {sentiment.ConfidenceScores.Negative:F2}");
// Health check for container
using var httpClient = new HttpClient();
var healthResponse = await httpClient.GetAsync($"{containerEndpoint}/status");
Console.WriteLine($"\nContainer health: {healthResponse.StatusCode}");
// Docker Compose for multi-container deployment
string dockerCompose = """
version: '3.8'
services:
  sentiment:
    image: mcr.microsoft.com/azure-cognitive-services/textanalytics/sentiment:latest
    ports:
      - "5000:5000"
    environment:
      - Eula=accept
      - Billing=${AZURE_AI_ENDPOINT}
      - ApiKey=${AZURE_AI_KEY}
  language:
    image: mcr.microsoft.com/azure-cognitive-services/textanalytics/language:latest
    ports:
      - "5001:5000"
    environment:
      - Eula=accept
      - Billing=${AZURE_AI_ENDPOINT}
      - ApiKey=${AZURE_AI_KEY}
""";
Console.WriteLine($"\nDocker Compose configuration:\n{dockerCompose}");
# Deploy Azure AI container for edge scenarios
# Step 1: Pull container image
docker pull mcr.microsoft.com/azure-cognitive-services/textanalytics/sentiment:latest
# Step 2: Run container locally
docker run -d --name ai-sentiment \
-p 5000:5000 \
-e Eula=accept \
-e Billing="${AZURE_AI_ENDPOINT}" \
-e ApiKey="${AZURE_AI_KEY}" \
mcr.microsoft.com/azure-cognitive-services/textanalytics/sentiment:latest
# Step 3: Wait for container to be ready (GET /ready succeeds once the container can serve queries)
echo "Waiting for container..."
until curl -sf http://localhost:5000/ready > /dev/null; do
  sleep 2
done
echo "Container ready!"
# Step 4: Call local container endpoint
curl -X POST "http://localhost:5000/text/analytics/v3.1/sentiment" \
-H "Content-Type: application/json" \
-d '{
"documents": [
{"id": "1", "language": "en", "text": "Azure AI containers enable offline deployment."}
]
}'
# Step 5: Push to Azure Container Registry for deployment
az acr create --name acrai102challenge20 \
--resource-group rg-ai102-challenge20 \
--sku Basic
# Tag and push custom orchestration container (authenticate Docker against the registry first)
az acr login --name acrai102challenge20
docker tag my-ai-orchestrator:latest acrai102challenge20.azurecr.io/ai-orchestrator:v1
docker push acrai102challenge20.azurecr.io/ai-orchestrator:v1
# Deploy to Azure Container Instances
az container create \
--resource-group rg-ai102-challenge20 \
--name ai-orchestrator \
--image acrai102challenge20.azurecr.io/ai-orchestrator:v1 \
--cpu 2 --memory 4 \
--ports 8080 \
--environment-variables \
AZURE_OPENAI_ENDPOINT="${AZURE_OPENAI_ENDPOINT}" \
--secure-environment-variables \
AZURE_OPENAI_KEY="${AZURE_OPENAI_KEY}"
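The readiness wait from the shell script can also be written in Python with an upper time bound, so a container that never becomes healthy does not block forever. This is a minimal sketch: `wait_until_ready`, `http_probe`, and `fake_probe` are hypothetical helpers, and the `/ready` path is an assumption about the container's readiness probe that you should adjust to your image.

```python
import time
import urllib.request
from typing import Callable


def wait_until_ready(probe: Callable[[], bool], timeout_s: float = 120.0,
                     interval_s: float = 2.0) -> bool:
    """Poll `probe` until it returns True or `timeout_s` elapses."""
    deadline = time.monotonic() + timeout_s
    while True:
        try:
            if probe():
                return True
        except OSError:
            # Connection refused or HTTP error while the container is still starting
            pass
        if time.monotonic() >= deadline:
            return False
        time.sleep(interval_s)


def http_probe(url: str = "http://localhost:5000/ready") -> bool:
    # Assumed readiness URL; urllib raises URLError (an OSError) on failure
    with urllib.request.urlopen(url, timeout=5) as resp:
        return resp.status == 200


# Offline demonstration with a fake probe that succeeds on the third attempt
attempts = {"count": 0}


def fake_probe() -> bool:
    attempts["count"] += 1
    if attempts["count"] < 3:
        raise ConnectionRefusedError("container not up yet")
    return True


print(wait_until_ready(fake_probe, timeout_s=5, interval_s=0.01))
```

Against a real container, the call would be `wait_until_ready(http_probe)`; a `False` result signals that the timeout was reached.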
Expected Output
Simple: model=gpt-4o-mini, latency=285ms
Response: Azure is Microsoft's cloud computing platform that provides a wide range of...
Complex: model=gpt-4o, latency=1250ms
Response: When comparing Azure Functions Consumption and Premium plans, several key trade...
Calling: get_weather({"location": "Seattle", "unit": "celsius"})
Calling: search_documents({"query": "Azure Functions"})
Final answer: The weather in Seattle is currently 18°C and partly cloudy. I also found
documentation about Azure Functions in our knowledge base...
=== Multi-Step Analysis Pipeline ===
Step 1: Summarizing...
Summary:
• Migration to Azure Data Factory planned for end of January
• Excellent Q4 performance with 99.9% uptime
• Rising compute costs need investigation (spot instances)
Step 2: Extracting action items...
Actions:
1. Sarah: Investigate spot instances for compute cluster
2. Mike: Prepare migration plan document by next Friday
Step 3: Analyzing sentiment...
Sentiment: positive
Break & Fix
| Scenario | Symptom | Root Cause | Fix |
|---|---|---|---|
| Function not called | Model answers directly without a tool call | Function description is unclear or irrelevant | Improve the function descriptions; use tool_choice: "required" |
| Infinite tool loop | Model keeps calling the same function | No termination condition | Limit tool call rounds; add "done" logic |
| Semantic Kernel plugin error | FunctionNotFound exception | Plugin not registered or wrong function name | Check the add_plugin() call and that the function name matches |
| Container fails to start | Missing Eula=accept error | EULA not accepted | Set the environment variable Eula=accept |
| Container billing error | Container stops after 10-15 min | Billing endpoint unreachable | Ensure the Billing URL is reachable; check the network |
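The fix for the infinite tool loop row can be sketched as a loop with a hard cap on tool rounds. To keep it runnable offline, the model is replaced by a hypothetical `fake_model` callable that returns either a tool request or a final answer; `run_with_tool_limit` and the simplified message shapes are illustrative and are not the SDK's types.

```python
from typing import Callable


def run_with_tool_limit(model: Callable[[list], dict], tools: dict,
                        messages: list, max_rounds: int = 3) -> str:
    """Call `model` until it returns a final answer or `max_rounds` tool rounds are used."""
    for _ in range(max_rounds):
        reply = model(messages)
        if "tool" not in reply:
            return reply["content"]
        result = tools[reply["tool"]](**reply.get("arguments", {}))
        messages.append({"role": "tool", "name": reply["tool"], "content": result})
    return "Stopped: tool call limit reached"


def fake_model(messages: list) -> dict:
    # Stand-in for a chat completion call: keeps requesting the same tool forever
    return {"tool": "get_weather", "arguments": {"location": "Seattle"}}


def get_weather(location: str) -> str:
    return f"18 C in {location}"


history = [{"role": "user", "content": "Weather in Seattle?"}]
outcome = run_with_tool_limit(fake_model, {"get_weather": get_weather}, history)
print(outcome)
```

With a real client, `model` would wrap the chat completions call, and the cap guarantees the application returns control even if the model never stops requesting tools.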
Knowledge Check
1. In Azure OpenAI function calling, what does the model return when it decides to use a tool?
2. What is the main advantage of using a model router in multi-model architectures?
3. Which environment variable is required for Azure AI containers to work correctly?
4. In Semantic Kernel, what is a 'plugin'?
5. When implementing function calling, what happens after the application executes the function and returns the results?
Cleanup
# Stop and remove containers
docker stop ai-sentiment && docker rm ai-sentiment
# Delete Azure resources
az group delete --name rg-ai102-challenge20 --yes --no-wait