
Challenge 17: Prompt Engineering and Templates

Estimated Time

45-60 min | Cost: ~$0.50 (estimated) | Domain: Generative AI Solutions (15-20%)

Exam skills covered

  • Submit prompts to generative AI models
  • Utilize prompt templates
  • Apply prompt engineering techniques for optimal results

Overview

Prompt engineering is the practice of designing and refining inputs to language models to elicit the desired outputs. Azure OpenAI's chat completions API uses a structured message format with three roles: system (sets behavior and constraints), user (provides the request), and assistant (holds earlier model replies, which supply conversation context or serve as worked examples). Configuring these messages well is fundamental to building reliable AI applications.

Key parameters control response generation: temperature (0-2, controls randomness), top_p (0-1, nucleus sampling threshold), max_tokens (output length limit), frequency_penalty (-2 to 2, positive values reduce repetition), and presence_penalty (-2 to 2, positive values encourage new topics). These parameters interact. For example, temperature and top_p both affect randomness, so Microsoft recommends adjusting one or the other, not both.
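To make the overlap between temperature and top_p concrete, the sketch below reshapes a toy token distribution in plain Python. The logits, token names, and the helpers `apply_temperature` and `top_p_filter` are hypothetical illustrations of the general techniques (temperature-scaled softmax and nucleus filtering); this challenge does not describe how the service implements sampling internally.

```python
import math

def apply_temperature(logits: dict[str, float], temperature: float) -> dict[str, float]:
    """Softmax over logits divided by temperature (must be > 0 in this sketch)."""
    if temperature <= 0:
        raise ValueError("this sketch needs a positive temperature")
    scaled = {tok: value / temperature for tok, value in logits.items()}
    peak = max(scaled.values())  # subtract the max for numerical stability
    exps = {tok: math.exp(value - peak) for tok, value in scaled.items()}
    total = sum(exps.values())
    return {tok: e / total for tok, e in exps.items()}

def top_p_filter(probs: dict[str, float], top_p: float) -> dict[str, float]:
    """Keep the most likely tokens until their cumulative probability reaches top_p, then renormalize."""
    kept = {}
    cumulative = 0.0
    for tok, p in sorted(probs.items(), key=lambda item: item[1], reverse=True):
        kept[tok] = p
        cumulative += p
        if cumulative >= top_p:
            break
    total = sum(kept.values())
    return {tok: p / total for tok, p in kept.items()}

# Hypothetical logits for three candidate next tokens
toy_logits = {"cloud": 2.0, "server": 1.0, "banana": 0.0}

sharp = apply_temperature(toy_logits, 0.5)   # low temperature: mass concentrates on "cloud"
flat = apply_temperature(toy_logits, 1.5)    # high temperature: mass spreads out
nucleus = top_p_filter(apply_temperature(toy_logits, 1.0), 0.9)  # drops the unlikely tail

print(f"T=0.5: {sharp}")
print(f"T=1.5: {flat}")
print(f"top_p=0.9: {nucleus}")
```

Both knobs narrow or widen the set of tokens that can plausibly be sampled, which is why changing them together makes the effect of either one hard to attribute.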

Advanced techniques include few-shot prompting (providing examples in the prompt), chain-of-thought (requesting step-by-step reasoning), and structured outputs (using JSON mode, enabled through the response_format parameter, to get parseable output). Prompt templates define reusable prompt patterns with variable substitution, enabling consistent behavior across application scenarios.
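As a rough sketch of how a template and few-shot examples can combine, the snippet below builds a messages list with the standard library's `string.Template`. The template wording, the `build_few_shot_messages` helper, and the sample tickets are assumptions made for illustration; no API call is made.

```python
from string import Template

# Hypothetical template text; $placeholders are filled in at call time.
SYSTEM_TEMPLATE = Template(
    "You classify $item_type into one of these categories: $categories. "
    "Respond with only the category name."
)

def build_few_shot_messages(item_type, categories, examples, query):
    """Return a chat messages list: system prompt, example pairs, then the real query."""
    messages = [{
        "role": "system",
        "content": SYSTEM_TEMPLATE.substitute(
            item_type=item_type, categories=", ".join(categories)
        ),
    }]
    # Each example becomes a user message followed by the assistant reply we want imitated.
    for user_text, label in examples:
        messages.append({"role": "user", "content": user_text})
        messages.append({"role": "assistant", "content": label})
    messages.append({"role": "user", "content": query})
    return messages

messages = build_few_shot_messages(
    item_type="customer support tickets",
    categories=["Authentication", "Billing", "Bug Report"],
    examples=[
        ("I can't log into my account", "Authentication"),
        ("My monthly bill seems higher than expected", "Billing"),
    ],
    query="The app crashes when I upload a file",
)

for message in messages:
    print(f"{message['role']}: {message['content']}")
```

`Template.substitute` raises `KeyError` when a placeholder is left unfilled, which surfaces template mistakes before a request is sent.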

Architecture

This challenge explores prompt construction, parameter tuning, and template-based approaches for building consistent and reliable generative AI interactions.

Challenge 17 topology

Prerequisites

  • Azure OpenAI resource with GPT-4o deployed (from Challenge 16)
  • Python 3.9+ with openai package installed
  • .NET 8 SDK with Azure.AI.OpenAI NuGet package
  • Environment variables AZURE_OPENAI_ENDPOINT and AZURE_OPENAI_KEY configured

Implementation

Task 1: Configure System Prompts and Message Roles

import os
from openai import AzureOpenAI

client = AzureOpenAI(
    azure_endpoint=os.environ["AZURE_OPENAI_ENDPOINT"],
    api_key=os.environ["AZURE_OPENAI_KEY"],
    api_version="2024-10-21"
)

# System message defines the assistant's behavior and constraints
messages = [
    {
        "role": "system",
        "content": (
            "You are a technical documentation assistant for Azure services. "
            "Always respond with accurate, concise technical information. "
            "Format responses using markdown. "
            "If you are unsure about something, say so rather than guessing. "
            "Do not provide information about non-Azure cloud services."
        )
    },
    {
        "role": "user",
        "content": "What is the maximum file size for Azure Blob Storage?"
    }
]

response = client.chat.completions.create(
    model="gpt-4o",  # deployment name
    messages=messages,
    max_tokens=200
)

print(response.choices[0].message.content)

# Multi-turn conversation: append the assistant's reply so the follow-up has context
messages.append({"role": "assistant", "content": response.choices[0].message.content})
messages.append({"role": "user", "content": "How does that compare to the block blob limit?"})

response2 = client.chat.completions.create(
    model="gpt-4o",
    messages=messages,
    max_tokens=200
)

print(f"\nFollow-up: {response2.choices[0].message.content}")
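Because every turn is appended to the same list, the request grows with each exchange. One possible way to bound it is sketched below: keep the system message and only the most recent turns. The `trim_history` helper and the sample history are hypothetical, and the cutoff counts messages rather than tokens, so it is only a coarse limit.

```python
def trim_history(messages: list[dict], max_messages: int) -> list[dict]:
    """Keep every system message plus the most recent max_messages other messages."""
    system = [m for m in messages if m["role"] == "system"]
    others = [m for m in messages if m["role"] != "system"]
    recent = others[-max_messages:] if max_messages > 0 else []
    return system + recent

# Hypothetical conversation history
history = [
    {"role": "system", "content": "You are a technical documentation assistant for Azure services."},
    {"role": "user", "content": "First question"},
    {"role": "assistant", "content": "First answer"},
    {"role": "user", "content": "Second question"},
    {"role": "assistant", "content": "Second answer"},
    {"role": "user", "content": "Third question"},
]

trimmed = trim_history(history, max_messages=3)
print([m["role"] for m in trimmed])  # ['system', 'user', 'assistant', 'user']
```

Dropping older turns means the model can no longer refer back to them, so the limit is a trade-off between cost and context.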

Task 2: Experiment with Temperature and Sampling Parameters

import os
from openai import AzureOpenAI

client = AzureOpenAI(
    azure_endpoint=os.environ["AZURE_OPENAI_ENDPOINT"],
    api_key=os.environ["AZURE_OPENAI_KEY"],
    api_version="2024-10-21"
)

prompt = [
    {"role": "system", "content": "You are a creative writing assistant."},
    {"role": "user", "content": "Write a one-sentence description of cloud computing."}
]

# Temperature 0: most consistent output (small run-to-run differences can still occur)
print("=== Temperature 0 (Deterministic) ===")
for i in range(3):
    response = client.chat.completions.create(
        model="gpt-4o",
        messages=prompt,
        temperature=0,
        max_tokens=50
    )
    print(f" Run {i+1}: {response.choices[0].message.content}")

# Temperature 1.5: High creativity, more variation
print("\n=== Temperature 1.5 (Creative) ===")
for i in range(3):
    response = client.chat.completions.create(
        model="gpt-4o",
        messages=prompt,
        temperature=1.5,
        max_tokens=50
    )
    print(f" Run {i+1}: {response.choices[0].message.content}")

# Frequency and presence penalties for diversity
print("\n=== With Penalties (Reduce Repetition) ===")
response = client.chat.completions.create(
    model="gpt-4o",
    messages=[
        {"role": "system", "content": "List 5 benefits of cloud computing."},
        {"role": "user", "content": "Go."}
    ],
    temperature=0.7,
    frequency_penalty=0.5,  # Reduce word repetition
    presence_penalty=0.5,   # Encourage new topics
    max_tokens=200
)
print(response.choices[0].message.content)
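When experimenting with many combinations, it can help to validate settings before sending them. The sketch below checks values against the ranges given in the Overview (plus the 0-1 range for top_p) and refuses to set temperature and top_p together. `build_sampling_kwargs` is a hypothetical helper, and rejecting both at once is a deliberate choice of this sketch that mirrors the recommendation; the API itself accepts both.

```python
def build_sampling_kwargs(temperature=None, top_p=None,
                          frequency_penalty=0.0, presence_penalty=0.0,
                          max_tokens=200):
    """Validate sampling settings and return keyword arguments for a chat completion call."""
    if temperature is not None and top_p is not None:
        raise ValueError("set temperature or top_p, not both")
    if temperature is not None and not 0 <= temperature <= 2:
        raise ValueError("temperature must be between 0 and 2")
    if top_p is not None and not 0 <= top_p <= 1:
        raise ValueError("top_p must be between 0 and 1")
    for name, value in (("frequency_penalty", frequency_penalty),
                        ("presence_penalty", presence_penalty)):
        if not -2 <= value <= 2:
            raise ValueError(f"{name} must be between -2 and 2")
    if max_tokens <= 0:
        raise ValueError("max_tokens must be positive")

    kwargs = {
        "max_tokens": max_tokens,
        "frequency_penalty": frequency_penalty,
        "presence_penalty": presence_penalty,
    }
    # Only include the randomness control that was actually chosen.
    if temperature is not None:
        kwargs["temperature"] = temperature
    if top_p is not None:
        kwargs["top_p"] = top_p
    return kwargs

settings = build_sampling_kwargs(temperature=0.7, frequency_penalty=0.5, presence_penalty=0.5)
print(settings)

# Possible usage with the client from this task:
# response = client.chat.completions.create(model="gpt-4o", messages=prompt, **settings)
```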

Task 3: Implement Few-Shot Prompting

import os
from openai import AzureOpenAI

client = AzureOpenAI(
    azure_endpoint=os.environ["AZURE_OPENAI_ENDPOINT"],
    api_key=os.environ["AZURE_OPENAI_KEY"],
    api_version="2024-10-21"
)

# Few-shot prompting: provide examples to guide the model
messages = [
    {
        "role": "system",
        "content": "You classify customer support tickets into categories. Respond with only the category name."
    },
    # Example 1
    {"role": "user", "content": "I can't log into my account, it says password is wrong"},
    {"role": "assistant", "content": "Authentication"},
    # Example 2
    {"role": "user", "content": "My monthly bill seems higher than expected"},
    {"role": "assistant", "content": "Billing"},
    # Example 3
    {"role": "user", "content": "The app crashes when I try to upload a file larger than 10MB"},
    {"role": "assistant", "content": "Bug Report"},
    # Actual request
    {"role": "user", "content": "I want to add another user to my team subscription"}
]

response = client.chat.completions.create(
    model="gpt-4o",
    messages=messages,
    temperature=0,
    max_tokens=10
)

print(f"Classification: {response.choices[0].message.content}")

# Chain-of-thought prompting
cot_messages = [
    {
        "role": "system",
        "content": "You are a math tutor. Show your reasoning step by step before giving the final answer."
    },
    {
        "role": "user",
        "content": "A store has 3 shelves. Each shelf has 4 boxes. Each box contains 6 items. If 15 items are sold, how many remain?"
    }
]

response = client.chat.completions.create(
    model="gpt-4o",
    messages=cot_messages,
    temperature=0,
    max_tokens=300
)

print(f"\nChain-of-thought:\n{response.choices[0].message.content}")
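Even with the instruction to respond with only the category name, a reply may carry stray whitespace, punctuation, or an unexpected label. A minimal post-processing sketch follows, assuming a fixed list of allowed categories; `ALLOWED_CATEGORIES`, `normalize_label`, and the "Other" fallback are hypothetical names chosen for this example.

```python
# Hypothetical set of categories the application accepts
ALLOWED_CATEGORIES = ("Authentication", "Billing", "Bug Report", "Account Management")

def normalize_label(raw: str, allowed=ALLOWED_CATEGORIES, fallback: str = "Other") -> str:
    """Map a raw model reply onto a known category, ignoring case and stray punctuation."""
    cleaned = raw.strip().strip(".\"'").casefold()
    for label in allowed:
        if label.casefold() == cleaned:
            return label
    # Anything that is not an exact category name is treated as unclassified.
    return fallback

print(normalize_label(" billing.\n"))
print(normalize_label("Account Management"))
print(normalize_label("I think this is about billing"))
```

In the task above, the value passed in would be `response.choices[0].message.content`. Falling back to a neutral label keeps downstream code from acting on free-form text.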

Task 4: Use JSON Mode for Structured Output

import os
import json
from openai import AzureOpenAI

client = AzureOpenAI(
    azure_endpoint=os.environ["AZURE_OPENAI_ENDPOINT"],
    api_key=os.environ["AZURE_OPENAI_KEY"],
    api_version="2024-10-21"
)

# JSON mode: ensures valid JSON output (the prompt must mention JSON)
response = client.chat.completions.create(
    model="gpt-4o",
    messages=[
        {
            "role": "system",
            "content": "You extract structured data from text. Always respond with valid JSON."
        },
        {
            "role": "user",
            "content": "Extract entities: 'John Smith from Microsoft called on March 15, 2024 about Azure OpenAI pricing for their Seattle office.'"
        }
    ],
    response_format={"type": "json_object"},
    temperature=0,
    max_tokens=300
)

result = json.loads(response.choices[0].message.content)
print(json.dumps(result, indent=2))

# Prompt template with variable substitution
def create_analysis_prompt(product_name: str, review_text: str) -> list:
    """Reusable prompt template for product review analysis."""
    return [
        {
            "role": "system",
            "content": (
                "You are a product review analyst. Analyze the review and return JSON with: "
                "sentiment (positive/negative/neutral), key_points (array), "
                "rating_estimate (1-5), and summary (one sentence)."
            )
        },
        {
            "role": "user",
            "content": f"Product: {product_name}\nReview: {review_text}"
        }
    ]

# Use the template
review = "The new Azure AI Studio is fantastic! Easy to use and very powerful."
messages = create_analysis_prompt("Azure AI Studio", review)

response = client.chat.completions.create(
    model="gpt-4o",
    messages=messages,
    response_format={"type": "json_object"},
    temperature=0,
    max_tokens=200
)

analysis = json.loads(response.choices[0].message.content)
print(f"\nSentiment: {analysis.get('sentiment')}")
print(f"Rating: {analysis.get('rating_estimate')}/5")
print(f"Summary: {analysis.get('summary')}")
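Calling `json.loads` directly fails with an opaque error if the reply was cut off by max_tokens. The sketch below shows one defensive approach: check the choice's `finish_reason` first, then confirm the expected keys are present. `parse_json_reply` and `REQUIRED_KEYS` are hypothetical names; the keys mirror the fields requested in the review-analysis template, and the sample strings stand in for real model replies.

```python
import json

# Keys requested by the review-analysis system prompt
REQUIRED_KEYS = {"sentiment", "key_points", "rating_estimate", "summary"}

def parse_json_reply(content: str, finish_reason: str, required_keys=REQUIRED_KEYS) -> dict:
    """Parse a JSON-mode reply, rejecting truncated or incomplete results."""
    if finish_reason == "length":
        # The model stopped because it hit max_tokens, so the JSON is probably incomplete.
        raise ValueError("response truncated by max_tokens; increase it or simplify the schema")
    try:
        data = json.loads(content)
    except json.JSONDecodeError as exc:
        raise ValueError(f"reply is not valid JSON: {exc}") from exc
    if not isinstance(data, dict):
        raise ValueError("expected a JSON object at the top level")
    missing = set(required_keys) - set(data)
    if missing:
        raise ValueError(f"reply is missing keys: {sorted(missing)}")
    return data

# Stand-in for response.choices[0].message.content
sample_reply = (
    '{"sentiment": "positive", "key_points": ["easy to use", "powerful"], '
    '"rating_estimate": 5, "summary": "The reviewer likes the product."}'
)
analysis = parse_json_reply(sample_reply, finish_reason="stop")
print(analysis["sentiment"], analysis["rating_estimate"])

# Possible usage with a real response:
# choice = response.choices[0]
# analysis = parse_json_reply(choice.message.content, choice.finish_reason)
```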

Expected Output

Classification: Account Management

Chain-of-thought:
Step 1: Calculate total items
- 3 shelves × 4 boxes = 12 boxes
- 12 boxes × 6 items = 72 items total

Step 2: Subtract sold items
- 72 items - 15 sold = 57 items remaining

**Answer: 57 items remain**

{
  "person": "John Smith",
  "company": "Microsoft",
  "date": "March 15, 2024",
  "topic": "Azure OpenAI pricing",
  "location": "Seattle"
}

Sentiment: positive
Rating: 5/5
Summary: The reviewer finds Azure AI Studio excellent for its ease of use and power.

Break & fix

| Scenario | Symptom | Root Cause | Fix |
|---|---|---|---|
| JSON mode returns error | response_format is not supported | Model or API version doesn't support JSON mode | Use GPT-4o with API version 2024-10-21+ |
| JSON output is invalid | Partial or truncated JSON | max_tokens too low for the response | Increase max_tokens or simplify the requested schema |
| Few-shot not working | Model ignores examples | Examples inconsistent or too few | Provide 3+ diverse examples in a consistent format |
| Temperature 0 varies | Slightly different outputs at temp=0 | Expected behavior due to floating-point nondeterminism | Use the seed parameter for best-effort reproducibility |
| System prompt ignored | Model doesn't follow constraints | User message overrides system message | Reinforce constraints in the system message; use more explicit instructions |

Knowledge Check

1. What is the recommended approach when you want both low randomness and high quality in Azure OpenAI responses?

2. In the chat completions API, what is the purpose of including assistant messages in the messages array?

3. What must be true when using response_format: {type: 'json_object'} with Azure OpenAI?

4. Which parameter reduces the likelihood of the model repeating the same words or phrases it has already used?

5. What is the key difference between zero-shot and few-shot prompting?

Cleanup

az group delete --name rg-ai102-challenge17 --yes --no-wait

Learn More