Challenge 17: Prompt Engineering and Templates
45-60 min | Cost: ~$0.50 (estimated) | Domain: Generative AI Solutions (15-20%)
Exam skills covered
- Submit prompts to generative AI models
- Utilize prompt templates
- Apply prompt engineering techniques for optimal results
Overview
Prompt engineering is the practice of designing and refining the inputs sent to a language model to elicit the desired output. Azure OpenAI's chat completions API uses a structured message format with three roles: system (sets behavior and constraints), user (provides the request), and assistant (holds the model's earlier replies, which supply context for follow-up turns or act as worked examples). Configuring these messages well is fundamental to building reliable AI applications.
Key parameters control response generation: temperature (0-2, controls randomness), top_p (0-1, nucleus sampling threshold), max_tokens (output length limit), frequency_penalty (-2 to 2, penalizes tokens by how often they have already appeared, which reduces repetition), and presence_penalty (-2 to 2, penalizes any token that has already appeared, which encourages new topics). These parameters interact: temperature and top_p both affect randomness, so Microsoft recommends changing one or the other, not both.
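The parameter ranges above can be checked locally before a request is sent. The following is a minimal sketch rather than anything provided by the openai package: check_sampling_params and RANGES are hypothetical names, and top_p is bounded by 0 and 1 here because it is a probability threshold.

```python
from typing import Any, Dict, List

# Allowed ranges for the sampling parameters discussed above.
# top_p is a probability mass, so it is bounded by 0 and 1 here.
RANGES = {
    "temperature": (0.0, 2.0),
    "top_p": (0.0, 1.0),
    "frequency_penalty": (-2.0, 2.0),
    "presence_penalty": (-2.0, 2.0),
}


def check_sampling_params(params: Dict[str, Any]) -> List[str]:
    """Return a list of human-readable problems; empty means nothing was flagged."""
    problems = []
    for name, (low, high) in RANGES.items():
        value = params.get(name)
        if value is not None and not (low <= value <= high):
            problems.append(f"{name}={value} is outside [{low}, {high}]")
    max_tokens = params.get("max_tokens")
    if max_tokens is not None and max_tokens < 1:
        problems.append(f"max_tokens={max_tokens} must be at least 1")
    # Both knobs change randomness, so flag requests that set them together.
    if "temperature" in params and "top_p" in params:
        problems.append("temperature and top_p are both set; consider tuning only one")
    return problems


print(check_sampling_params({"temperature": 0.7, "frequency_penalty": 0.5}))
print(check_sampling_params({"temperature": 2.5, "top_p": 0.9}))
```

The helper only reports problems and leaves the decision to the caller, which keeps it usable as a pre-flight check in front of any of the SDK or REST calls in this challenge.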
Advanced techniques include few-shot prompting (providing worked examples in the prompt), chain-of-thought (requesting step-by-step reasoning), and structured output (setting response_format to JSON mode so the reply parses as JSON). JSON mode constrains the reply to syntactically valid JSON but not to a particular set of fields, so the prompt still has to describe the shape you want. Prompt templates are reusable prompt patterns with variable substitution, which keeps behavior consistent across application scenarios.
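Variable substitution in a prompt template can be sketched with Python's standard string.Template class. The template text, the variable names, and the render_messages helper below are illustrative assumptions, not part of the challenge or of any SDK.

```python
from string import Template
from typing import Dict, List

# Hypothetical templates; $name placeholders are filled in at request time.
SYSTEM_TEMPLATE = Template(
    "You are a $role. Respond in $language and keep answers under $max_words words."
)
USER_TEMPLATE = Template("Product: $product_name\nReview: $review_text")


def render_messages(
    system_vars: Dict[str, str], user_vars: Dict[str, str]
) -> List[Dict[str, str]]:
    """Fill both templates; substitute() raises KeyError if a variable is missing."""
    return [
        {"role": "system", "content": SYSTEM_TEMPLATE.substitute(system_vars)},
        {"role": "user", "content": USER_TEMPLATE.substitute(user_vars)},
    ]


messages = render_messages(
    {"role": "product review analyst", "language": "English", "max_words": "50"},
    {"product_name": "Azure AI Studio", "review_text": "Easy to use and very powerful."},
)
for message in messages:
    print(f"{message['role']}: {message['content']}")
```

Using substitute() rather than safe_substitute() is a deliberate choice in this sketch: a missing variable fails immediately instead of sending a prompt that still contains a literal placeholder.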
Architecture
This challenge explores prompt construction, parameter tuning, and template-based approaches for building consistent and reliable generative AI interactions.
Prerequisites
- Azure OpenAI resource with GPT-4o deployed (from Challenge 16)
- Python 3.9+ with the openai package installed
- .NET 8 SDK with the Azure.AI.OpenAI NuGet package
- Environment variables AZURE_OPENAI_ENDPOINT and AZURE_OPENAI_KEY configured
Implementation
Task 1: Configure System Prompts and Message Roles
- Python SDK
- C# SDK
- REST API
import os
from openai import AzureOpenAI
client = AzureOpenAI(
azure_endpoint=os.environ["AZURE_OPENAI_ENDPOINT"],
api_key=os.environ["AZURE_OPENAI_KEY"],
api_version="2024-10-21"
)
# System message defines the assistant's behavior and constraints
messages = [
{
"role": "system",
"content": (
"You are a technical documentation assistant for Azure services. "
"Always respond with accurate, concise technical information. "
"Format responses using markdown. "
"If you are unsure about something, say so rather than guessing. "
"Do not provide information about non-Azure cloud services."
)
},
{
"role": "user",
"content": "What is the maximum file size for Azure Blob Storage?"
}
]
response = client.chat.completions.create(
model="gpt-4o",
messages=messages,
max_tokens=200
)
print(response.choices[0].message.content)
# Multi-turn conversation with assistant role for context
messages.append({"role": "assistant", "content": response.choices[0].message.content})
messages.append({"role": "user", "content": "How does that compare to the block blob limit?"})
response2 = client.chat.completions.create(
model="gpt-4o",
messages=messages,
max_tokens=200
)
print(f"\nFollow-up: {response2.choices[0].message.content}")
using Azure;
using Azure.AI.OpenAI;
using OpenAI.Chat;
string endpoint = Environment.GetEnvironmentVariable("AZURE_OPENAI_ENDPOINT")!;
string apiKey = Environment.GetEnvironmentVariable("AZURE_OPENAI_KEY")!;
AzureOpenAIClient azureClient = new(
new Uri(endpoint),
new AzureKeyCredential(apiKey));
ChatClient chatClient = azureClient.GetChatClient("gpt-4o");
// System message defines behavior and constraints
var messages = new List<ChatMessage>
{
new SystemChatMessage(
"You are a technical documentation assistant for Azure services. " +
"Always respond with accurate, concise technical information. " +
"Format responses using markdown. " +
"If you are unsure about something, say so rather than guessing. " +
"Do not provide information about non-Azure cloud services."),
new UserChatMessage("What is the maximum file size for Azure Blob Storage?")
};
ChatCompletion completion = await chatClient.CompleteChatAsync(
messages,
new ChatCompletionOptions { MaxOutputTokenCount = 200 });
Console.WriteLine(completion.Content[0].Text);
// Multi-turn with assistant context
messages.Add(new AssistantChatMessage(completion.Content[0].Text));
messages.Add(new UserChatMessage("How does that compare to the block blob limit?"));
ChatCompletion followUp = await chatClient.CompleteChatAsync(
messages,
new ChatCompletionOptions { MaxOutputTokenCount = 200 });
Console.WriteLine($"\nFollow-up: {followUp.Content[0].Text}");
# Chat completions with system, user, and assistant roles
curl -X POST "${AZURE_OPENAI_ENDPOINT%/}/openai/deployments/gpt-4o/chat/completions?api-version=2024-10-21" \
-H "Content-Type: application/json" \
-H "api-key: ${AZURE_OPENAI_KEY}" \
-d '{
"messages": [
{
"role": "system",
"content": "You are a technical documentation assistant for Azure services. Always respond with accurate, concise technical information. Format responses using markdown."
},
{
"role": "user",
"content": "What is the maximum file size for Azure Blob Storage?"
}
],
"max_tokens": 200
}'
# Multi-turn conversation
curl -X POST "${AZURE_OPENAI_ENDPOINT%/}/openai/deployments/gpt-4o/chat/completions?api-version=2024-10-21" \
-H "Content-Type: application/json" \
-H "api-key: ${AZURE_OPENAI_KEY}" \
-d '{
"messages": [
{"role": "system", "content": "You are a technical documentation assistant."},
{"role": "user", "content": "What is the max blob size?"},
{"role": "assistant", "content": "The maximum size for a single block blob is approximately 190.7 TiB."},
{"role": "user", "content": "How does that compare to the block blob limit?"}
],
"max_tokens": 200
}'
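The multi-turn pattern in this task comes down to appending each reply to the message history before sending the next request. The sketch below shows how the list grows without calling the API. add_turn is a hypothetical helper, and the assistant text is a stand-in for a real model reply.

```python
from typing import Dict, List

Message = Dict[str, str]
VALID_ROLES = {"system", "user", "assistant"}


def add_turn(history: List[Message], role: str, content: str) -> List[Message]:
    """Return a new history with one more message (hypothetical helper)."""
    if role not in VALID_ROLES:
        raise ValueError(f"unsupported role: {role!r}")
    return history + [{"role": role, "content": content}]


history: List[Message] = [
    {"role": "system", "content": "You are a technical documentation assistant."}
]
history = add_turn(history, "user", "What is the max blob size?")
# In a real run this content would come from response.choices[0].message.content.
history = add_turn(history, "assistant", "Approximately 190.7 TiB for a block blob.")
history = add_turn(history, "user", "How does that compare to the block blob limit?")

print([m["role"] for m in history])
```

Because the API is stateless, the full history has to be sent on every call; the model only sees the earlier turns that are present in the messages array.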
Task 2: Experiment with Temperature and Sampling Parameters
- Python SDK
- C# SDK
- REST API
import os
from openai import AzureOpenAI
client = AzureOpenAI(
azure_endpoint=os.environ["AZURE_OPENAI_ENDPOINT"],
api_key=os.environ["AZURE_OPENAI_KEY"],
api_version="2024-10-21"
)
prompt = [
{"role": "system", "content": "You are a creative writing assistant."},
{"role": "user", "content": "Write a one-sentence description of cloud computing."}
]
# Temperature 0: Deterministic, consistent output
print("=== Temperature 0 (Deterministic) ===")
for i in range(3):
    response = client.chat.completions.create(
        model="gpt-4o",
        messages=prompt,
        temperature=0,
        max_tokens=50
    )
    print(f" Run {i+1}: {response.choices[0].message.content}")
# Temperature 1.5: High creativity, more variation
print("\n=== Temperature 1.5 (Creative) ===")
for i in range(3):
    response = client.chat.completions.create(
        model="gpt-4o",
        messages=prompt,
        temperature=1.5,
        max_tokens=50
    )
    print(f" Run {i+1}: {response.choices[0].message.content}")
# Frequency and presence penalties for diversity
print("\n=== With Penalties (Reduce Repetition) ===")
response = client.chat.completions.create(
model="gpt-4o",
messages=[
{"role": "system", "content": "List 5 benefits of cloud computing."},
{"role": "user", "content": "Go."}
],
temperature=0.7,
frequency_penalty=0.5, # Reduce word repetition
presence_penalty=0.5, # Encourage new topics
max_tokens=200
)
print(response.choices[0].message.content)
using Azure;
using Azure.AI.OpenAI;
using OpenAI.Chat;
string endpoint = Environment.GetEnvironmentVariable("AZURE_OPENAI_ENDPOINT")!;
string apiKey = Environment.GetEnvironmentVariable("AZURE_OPENAI_KEY")!;
AzureOpenAIClient azureClient = new(
new Uri(endpoint),
new AzureKeyCredential(apiKey));
ChatClient chatClient = azureClient.GetChatClient("gpt-4o");
var prompt = new ChatMessage[]
{
new SystemChatMessage("You are a creative writing assistant."),
new UserChatMessage("Write a one-sentence description of cloud computing.")
};
// Temperature 0: Deterministic output
Console.WriteLine("=== Temperature 0 (Deterministic) ===");
for (int i = 0; i < 3; i++)
{
var result = await chatClient.CompleteChatAsync(prompt,
new ChatCompletionOptions { Temperature = 0f, MaxOutputTokenCount = 50 });
Console.WriteLine($" Run {i + 1}: {result.Value.Content[0].Text}");
}
// Temperature 1.5: High creativity
Console.WriteLine("\n=== Temperature 1.5 (Creative) ===");
for (int i = 0; i < 3; i++)
{
var result = await chatClient.CompleteChatAsync(prompt,
new ChatCompletionOptions { Temperature = 1.5f, MaxOutputTokenCount = 50 });
Console.WriteLine($" Run {i + 1}: {result.Value.Content[0].Text}");
}
// With penalties
Console.WriteLine("\n=== With Penalties (Reduce Repetition) ===");
var penaltyResult = await chatClient.CompleteChatAsync(
new ChatMessage[]
{
new SystemChatMessage("List 5 benefits of cloud computing."),
new UserChatMessage("Go.")
},
new ChatCompletionOptions
{
Temperature = 0.7f,
FrequencyPenalty = 0.5f,
PresencePenalty = 0.5f,
MaxOutputTokenCount = 200
});
Console.WriteLine(penaltyResult.Value.Content[0].Text);
# Temperature 0: Deterministic
curl -X POST "${AZURE_OPENAI_ENDPOINT%/}/openai/deployments/gpt-4o/chat/completions?api-version=2024-10-21" \
-H "Content-Type: application/json" \
-H "api-key: ${AZURE_OPENAI_KEY}" \
-d '{
"messages": [
{"role": "system", "content": "You are a creative writing assistant."},
{"role": "user", "content": "Write a one-sentence description of cloud computing."}
],
"temperature": 0,
"max_tokens": 50
}'
# Temperature 1.5: Creative
curl -X POST "${AZURE_OPENAI_ENDPOINT%/}/openai/deployments/gpt-4o/chat/completions?api-version=2024-10-21" \
-H "Content-Type: application/json" \
-H "api-key: ${AZURE_OPENAI_KEY}" \
-d '{
"messages": [
{"role": "system", "content": "You are a creative writing assistant."},
{"role": "user", "content": "Write a one-sentence description of cloud computing."}
],
"temperature": 1.5,
"max_tokens": 50
}'
# With frequency and presence penalties
curl -X POST "${AZURE_OPENAI_ENDPOINT%/}/openai/deployments/gpt-4o/chat/completions?api-version=2024-10-21" \
-H "Content-Type: application/json" \
-H "api-key: ${AZURE_OPENAI_KEY}" \
-d '{
"messages": [
{"role": "system", "content": "List 5 benefits of cloud computing."},
{"role": "user", "content": "Go."}
],
"temperature": 0.7,
"frequency_penalty": 0.5,
"presence_penalty": 0.5,
"max_tokens": 200
}'
Task 3: Implement Few-Shot Prompting
- Python SDK
- C# SDK
- REST API
import os
from openai import AzureOpenAI
client = AzureOpenAI(
azure_endpoint=os.environ["AZURE_OPENAI_ENDPOINT"],
api_key=os.environ["AZURE_OPENAI_KEY"],
api_version="2024-10-21"
)
# Few-shot prompting: provide examples to guide the model
messages = [
{
"role": "system",
"content": "You classify customer support tickets into categories. Respond with only the category name."
},
# Example 1
{"role": "user", "content": "I can't log into my account, it says password is wrong"},
{"role": "assistant", "content": "Authentication"},
# Example 2
{"role": "user", "content": "My monthly bill seems higher than expected"},
{"role": "assistant", "content": "Billing"},
# Example 3
{"role": "user", "content": "The app crashes when I try to upload a file larger than 10MB"},
{"role": "assistant", "content": "Bug Report"},
# Actual request
{"role": "user", "content": "I want to add another user to my team subscription"}
]
response = client.chat.completions.create(
model="gpt-4o",
messages=messages,
temperature=0,
max_tokens=10
)
print(f"Classification: {response.choices[0].message.content}")
# Chain-of-thought prompting
cot_messages = [
{
"role": "system",
"content": "You are a math tutor. Show your reasoning step by step before giving the final answer."
},
{
"role": "user",
"content": "A store has 3 shelves. Each shelf has 4 boxes. Each box contains 6 items. If 15 items are sold, how many remain?"
}
]
response = client.chat.completions.create(
model="gpt-4o",
messages=cot_messages,
temperature=0,
max_tokens=300
)
print(f"\nChain-of-thought:\n{response.choices[0].message.content}")
using Azure;
using Azure.AI.OpenAI;
using OpenAI.Chat;
string endpoint = Environment.GetEnvironmentVariable("AZURE_OPENAI_ENDPOINT")!;
string apiKey = Environment.GetEnvironmentVariable("AZURE_OPENAI_KEY")!;
AzureOpenAIClient azureClient = new(
new Uri(endpoint),
new AzureKeyCredential(apiKey));
ChatClient chatClient = azureClient.GetChatClient("gpt-4o");
// Few-shot prompting with examples
var fewShotMessages = new ChatMessage[]
{
new SystemChatMessage("You classify customer support tickets into categories. Respond with only the category name."),
new UserChatMessage("I can't log into my account, it says password is wrong"),
new AssistantChatMessage("Authentication"),
new UserChatMessage("My monthly bill seems higher than expected"),
new AssistantChatMessage("Billing"),
new UserChatMessage("The app crashes when I try to upload a file larger than 10MB"),
new AssistantChatMessage("Bug Report"),
new UserChatMessage("I want to add another user to my team subscription")
};
var result = await chatClient.CompleteChatAsync(fewShotMessages,
new ChatCompletionOptions { Temperature = 0f, MaxOutputTokenCount = 10 });
Console.WriteLine($"Classification: {result.Value.Content[0].Text}");
// Chain-of-thought prompting
var cotMessages = new ChatMessage[]
{
new SystemChatMessage("You are a math tutor. Show your reasoning step by step before giving the final answer."),
new UserChatMessage("A store has 3 shelves. Each shelf has 4 boxes. Each box contains 6 items. If 15 items are sold, how many remain?")
};
var cotResult = await chatClient.CompleteChatAsync(cotMessages,
new ChatCompletionOptions { Temperature = 0f, MaxOutputTokenCount = 300 });
Console.WriteLine($"\nChain-of-thought:\n{cotResult.Value.Content[0].Text}");
# Few-shot prompting with examples
curl -X POST "${AZURE_OPENAI_ENDPOINT%/}/openai/deployments/gpt-4o/chat/completions?api-version=2024-10-21" \
-H "Content-Type: application/json" \
-H "api-key: ${AZURE_OPENAI_KEY}" \
-d '{
"messages": [
{"role": "system", "content": "You classify customer support tickets. Respond with only the category name."},
{"role": "user", "content": "I cannot log into my account"},
{"role": "assistant", "content": "Authentication"},
{"role": "user", "content": "My bill seems too high"},
{"role": "assistant", "content": "Billing"},
{"role": "user", "content": "I want to add another user to my team subscription"}
],
"temperature": 0,
"max_tokens": 10
}'
# Chain-of-thought prompting
curl -X POST "${AZURE_OPENAI_ENDPOINT%/}/openai/deployments/gpt-4o/chat/completions?api-version=2024-10-21" \
-H "Content-Type: application/json" \
-H "api-key: ${AZURE_OPENAI_KEY}" \
-d '{
"messages": [
{"role": "system", "content": "You are a math tutor. Show your reasoning step by step."},
{"role": "user", "content": "A store has 3 shelves with 4 boxes each, 6 items per box. 15 items are sold. How many remain?"}
],
"temperature": 0,
"max_tokens": 300
}'
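The few-shot examples in this task all follow the same shape: a system instruction, alternating user and assistant pairs, then the real query. A small builder can keep that shape consistent as examples are added. This is a sketch under that assumption; build_few_shot_messages is a hypothetical name, and nothing here calls the API.

```python
from typing import Dict, List, Sequence, Tuple


def build_few_shot_messages(
    instruction: str,
    examples: Sequence[Tuple[str, str]],
    query: str,
) -> List[Dict[str, str]]:
    """Turn (input, expected_output) pairs into alternating user/assistant messages."""
    messages = [{"role": "system", "content": instruction}]
    for user_text, assistant_text in examples:
        messages.append({"role": "user", "content": user_text})
        messages.append({"role": "assistant", "content": assistant_text})
    # The real request always goes last so the model answers it, not an example.
    messages.append({"role": "user", "content": query})
    return messages


examples = [
    ("I can't log into my account, it says password is wrong", "Authentication"),
    ("My monthly bill seems higher than expected", "Billing"),
    ("The app crashes when I try to upload a file larger than 10MB", "Bug Report"),
]
messages = build_few_shot_messages(
    "You classify customer support tickets into categories. Respond with only the category name.",
    examples,
    "I want to add another user to my team subscription",
)
print(len(messages))
```

The returned list can be passed as the messages argument of a chat completions call. Keeping the examples as data makes it easy to check that every label uses the same format.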
Task 4: Use JSON Mode for Structured Output
- Python SDK
- C# SDK
- REST API
import os
import json
from openai import AzureOpenAI
client = AzureOpenAI(
azure_endpoint=os.environ["AZURE_OPENAI_ENDPOINT"],
api_key=os.environ["AZURE_OPENAI_KEY"],
api_version="2024-10-21"
)
# JSON mode: ensures valid JSON output
response = client.chat.completions.create(
model="gpt-4o",
messages=[
{
"role": "system",
"content": "You extract structured data from text. Always respond with valid JSON."
},
{
"role": "user",
"content": "Extract entities: 'John Smith from Microsoft called on March 15, 2024 about Azure OpenAI pricing for their Seattle office.'"
}
],
response_format={"type": "json_object"},
temperature=0,
max_tokens=300
)
result = json.loads(response.choices[0].message.content)
print(json.dumps(result, indent=2))
# Prompt template with variable substitution
def create_analysis_prompt(product_name: str, review_text: str) -> list:
    """Reusable prompt template for product review analysis."""
    return [
        {
            "role": "system",
            "content": (
                "You are a product review analyst. Analyze the review and return JSON with: "
                "sentiment (positive/negative/neutral), key_points (array), "
                "rating_estimate (1-5), and summary (one sentence)."
            )
        },
        {
            "role": "user",
            "content": f"Product: {product_name}\nReview: {review_text}"
        }
    ]
# Use the template
review = "The new Azure AI Studio is fantastic! Easy to use and very powerful."
messages = create_analysis_prompt("Azure AI Studio", review)
response = client.chat.completions.create(
model="gpt-4o",
messages=messages,
response_format={"type": "json_object"},
temperature=0,
max_tokens=200
)
analysis = json.loads(response.choices[0].message.content)
print(f"\nSentiment: {analysis.get('sentiment')}")
print(f"Rating: {analysis.get('rating_estimate')}/5")
print(f"Summary: {analysis.get('summary')}")
using Azure;
using Azure.AI.OpenAI;
using OpenAI.Chat;
using System.Text.Json;
string endpoint = Environment.GetEnvironmentVariable("AZURE_OPENAI_ENDPOINT")!;
string apiKey = Environment.GetEnvironmentVariable("AZURE_OPENAI_KEY")!;
AzureOpenAIClient azureClient = new(
new Uri(endpoint),
new AzureKeyCredential(apiKey));
ChatClient chatClient = azureClient.GetChatClient("gpt-4o");
// JSON mode for structured output
var jsonResult = await chatClient.CompleteChatAsync(
new ChatMessage[]
{
new SystemChatMessage("You extract structured data from text. Always respond with valid JSON."),
new UserChatMessage("Extract entities: 'John Smith from Microsoft called on March 15, 2024 about Azure OpenAI pricing for their Seattle office.'")
},
new ChatCompletionOptions
{
ResponseFormat = ChatResponseFormat.CreateJsonObjectFormat(),
Temperature = 0f,
MaxOutputTokenCount = 300
});
var jsonDoc = JsonDocument.Parse(jsonResult.Value.Content[0].Text);
Console.WriteLine(JsonSerializer.Serialize(jsonDoc, new JsonSerializerOptions { WriteIndented = true }));
// Prompt template method
ChatMessage[] CreateAnalysisPrompt(string productName, string reviewText) =>
new ChatMessage[]
{
new SystemChatMessage(
"You are a product review analyst. Analyze the review and return JSON with: " +
"sentiment (positive/negative/neutral), key_points (array), " +
"rating_estimate (1-5), and summary (one sentence)."),
new UserChatMessage($"Product: {productName}\nReview: {reviewText}")
};
// Use the template
var analysisResult = await chatClient.CompleteChatAsync(
CreateAnalysisPrompt("Azure AI Studio",
"The new Azure AI Studio is fantastic! Easy to use and very powerful."),
new ChatCompletionOptions
{
ResponseFormat = ChatResponseFormat.CreateJsonObjectFormat(),
Temperature = 0f,
MaxOutputTokenCount = 200
});
var analysis = JsonDocument.Parse(analysisResult.Value.Content[0].Text);
Console.WriteLine($"\nAnalysis: {JsonSerializer.Serialize(analysis, new JsonSerializerOptions { WriteIndented = true })}");
# JSON mode for structured output
curl -X POST "${AZURE_OPENAI_ENDPOINT%/}/openai/deployments/gpt-4o/chat/completions?api-version=2024-10-21" \
-H "Content-Type: application/json" \
-H "api-key: ${AZURE_OPENAI_KEY}" \
-d '{
"messages": [
{"role": "system", "content": "You extract structured data from text. Always respond with valid JSON."},
{"role": "user", "content": "Extract entities: John Smith from Microsoft called on March 15, 2024 about Azure OpenAI pricing for their Seattle office."}
],
"response_format": {"type": "json_object"},
"temperature": 0,
"max_tokens": 300
}'
# Prompt template pattern (variables replaced in the request)
curl -X POST "${AZURE_OPENAI_ENDPOINT%/}/openai/deployments/gpt-4o/chat/completions?api-version=2024-10-21" \
-H "Content-Type: application/json" \
-H "api-key: ${AZURE_OPENAI_KEY}" \
-d '{
"messages": [
{"role": "system", "content": "Analyze the product review and return JSON with: sentiment, key_points, rating_estimate (1-5), summary."},
{"role": "user", "content": "Product: Azure AI Studio\nReview: The new Azure AI Studio is fantastic! Easy to use and very powerful."}
],
"response_format": {"type": "json_object"},
"temperature": 0,
"max_tokens": 200
}'
Expected Output
Classification: Account Management
Chain-of-thought:
Step 1: Calculate total items
- 3 shelves × 4 boxes = 12 boxes
- 12 boxes × 6 items = 72 items total
Step 2: Subtract sold items
- 72 items - 15 sold = 57 items remaining
**Answer: 57 items remain**
{
"person": "John Smith",
"company": "Microsoft",
"date": "March 15, 2024",
"topic": "Azure OpenAI pricing",
"location": "Seattle"
}
Sentiment: positive
Rating: 5/5
Summary: The reviewer finds Azure AI Studio excellent for its ease of use and power.
Break & fix
| Scenario | Symptom | Root Cause | Fix |
|---|---|---|---|
| JSON mode returns error | response_format is not supported | Model or API version doesn't support JSON mode | Use GPT-4o with API version 2024-10-21+ |
| JSON output is invalid | Partial JSON or truncated | max_tokens too low for response | Increase max_tokens or simplify requested schema |
| Few-shot not working | Model ignores examples | Examples not consistent or too few | Ensure 3+ diverse examples with consistent format |
| Temperature 0 varies | Slightly different outputs at temp=0 | Expected behavior: floating-point arithmetic during inference is not bit-for-bit reproducible across runs | Set the seed parameter for best-effort (not guaranteed) reproducibility |
| System prompt ignored | Model doesn't follow constraints | User message overrides system message | Reinforce constraints in system message; use more explicit instructions |
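The truncated-JSON row in the table above can be guarded against in code by checking the finish reason before parsing. The sketch below assumes the reply text and finish reason have already been read from the response. parse_json_reply and the required key list are hypothetical, and the sample reply is a stand-in rather than real model output.

```python
import json
from typing import Any, Dict, Iterable


def parse_json_reply(
    content: str, finish_reason: str, required_keys: Iterable[str]
) -> Dict[str, Any]:
    """Parse a JSON-mode reply, failing loudly on truncation or missing keys."""
    # A finish_reason of "length" means the reply stopped at max_tokens.
    if finish_reason == "length":
        raise ValueError("reply hit max_tokens and is probably truncated; raise the limit")
    try:
        data = json.loads(content)
    except json.JSONDecodeError as exc:
        raise ValueError(f"reply is not valid JSON: {exc}") from exc
    if not isinstance(data, dict):
        raise ValueError("expected a JSON object at the top level")
    # JSON mode does not enforce a schema, so check the fields the prompt asked for.
    missing = [key for key in required_keys if key not in data]
    if missing:
        raise ValueError(f"reply is missing keys: {missing}")
    return data


# Stand-in values; in a real run they would come from
# response.choices[0].message.content and response.choices[0].finish_reason.
reply = '{"sentiment": "positive", "rating_estimate": 5, "summary": "Easy and powerful."}'
analysis = parse_json_reply(reply, "stop", ["sentiment", "rating_estimate", "summary"])
print(analysis["sentiment"], analysis["rating_estimate"])
```

Raising a single exception type for every failure mode is a simplification for the sketch; a production caller might prefer to retry with a larger max_tokens when the reply is truncated.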
Knowledge Check
1. What is the recommended approach when you want both low randomness and high quality in Azure OpenAI responses?
2. In the chat completions API, what is the purpose of including assistant messages in the messages array?
3. What must be true when using response_format: {type: 'json_object'} with Azure OpenAI?
4. Which parameter reduces the likelihood of the model repeating the same words or phrases it has already used?
5. What is the key difference between zero-shot and few-shot prompting?
Cleanup
az group delete --name rg-ai102-challenge17 --yes --no-wait