
Challenge 27: Log Analytics & KQL deep dive

Estimated Time and Cost

75-90 minutes | Estimated cost: ~$0.15 | Exam Weight: 10-15%

Scenario

Contoso Ltd. needs centralized logging with powerful query capabilities to meet both compliance and operational needs. The operations team must collect logs from VMs, Azure resources, and applications into a single Log Analytics workspace, then write KQL queries to analyze performance, detect anomalies, and create visualizations in workbooks.

Exam skills covered

  • Create and configure Log Analytics workspace
  • Configure log settings in Azure Monitor
  • Configure data collection rules (DCR)
  • Install and configure Azure Monitor Agent (AMA)
  • Query and analyze logs using KQL (where, summarize, join, render)
  • Create saved queries and functions
  • Create Azure Monitor Workbooks with visualizations
  • Configure diagnostic settings for Azure resources

Sysadmin ↔ Azure reference

On-Prem / Traditional              Azure Equivalent
Splunk / ELK Stack / Graylog       Log Analytics workspace
rsyslog / syslog-ng / Fluentd      Azure Monitor Agent (AMA)
Log rotation / retention policies  Workspace retention settings
SQL queries on log databases       Kusto Query Language (KQL)
Grafana dashboards                 Azure Monitor Workbooks
collectd / Telegraf config files   Data Collection Rules (DCR)
Windows Event Viewer forwarding    Windows Event Log collection via DCR

Setup

# Variables
RG="rg-az104-challenge27"
LOCATION="eastus"

# Create resource group
az group create --name $RG --location $LOCATION

Tasks

Task 1: create a Log Analytics workspace

# Create Log Analytics workspace
az monitor log-analytics workspace create \
--resource-group $RG \
--workspace-name law-contoso-ops \
--location $LOCATION \
--retention-time 30 \
--sku PerGB2018

# Verify workspace
az monitor log-analytics workspace show \
--resource-group $RG \
--workspace-name law-contoso-ops \
--query "{Name:name, SKU:sku.name, Retention:retentionInDays, DailyCapGB:workspaceCapping.dailyQuotaGb}" -o table

# Get the workspace ID (customerId GUID); AMA authenticates with managed identity, so no workspace key is needed
WORKSPACE_ID=$(az monitor log-analytics workspace show \
--resource-group $RG \
--workspace-name law-contoso-ops \
--query "customerId" -o tsv)

echo "Workspace ID: $WORKSPACE_ID"

Workspace Pricing

The PerGB2018 SKU charges per GB ingested. For the exam, know these options:

  • Free tier: 500 MB/day limit, 7-day retention
  • Per-GB: Pay per GB ingested, 30-730 day configurable retention
  • Commitment tiers: Discounted rates for a committed daily volume, starting at 100 GB/day (100, 200, 300, 400, 500, and higher)
  • Daily cap: Can set a daily ingestion cap to control costs
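
The retention range in the list above can be checked before a value is passed to `--retention-time`. The following is a minimal sketch; `valid_retention_days` is a hypothetical helper (not part of the Azure CLI), and the 30-730 bounds are simply the Per-GB range quoted above.

```shell
# Hypothetical helper (not an Azure CLI command): succeeds only when the
# argument is an integer inside the 30-730 day range quoted for Per-GB.
valid_retention_days() {
  local days="$1"
  # Reject empty input and anything that is not a plain non-negative integer.
  case "$days" in
    ''|*[!0-9]*) return 1 ;;
  esac
  [ "$days" -ge 30 ] && [ "$days" -le 730 ]
}

# Example use (commented out so the sketch has no side effects):
# valid_retention_days 60 && az monitor log-analytics workspace update \
#   --resource-group "$RG" --workspace-name law-contoso-ops --retention-time 60
```

A guard like this fails fast in a script instead of waiting for the service to reject the request.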

Task 2: deploy target VMs for monitoring

# Create a VNet
az network vnet create \
--resource-group $RG \
--name vnet-monitored \
--address-prefix 10.0.0.0/16 \
--subnet-name subnet-vms \
--subnet-prefix 10.0.1.0/24

# Create a Linux VM
az vm create \
--resource-group $RG \
--name vm-linux-web \
--image Ubuntu2204 \
--size Standard_B1s \
--vnet-name vnet-monitored \
--subnet subnet-vms \
--public-ip-address vm-linux-pip \
--admin-username azureuser \
--generate-ssh-keys

# Create a Windows VM
az vm create \
--resource-group $RG \
--name vm-win-app \
--image Win2022Datacenter \
--size Standard_B2s \
--vnet-name vnet-monitored \
--subnet subnet-vms \
--public-ip-address vm-win-pip \
--admin-username azureuser \
--admin-password 'C0nt0so!Pass2024'

# Install a web server on Linux to generate logs
az vm run-command invoke \
--resource-group $RG \
--name vm-linux-web \
--command-id RunShellScript \
--scripts "sudo apt-get update && sudo apt-get install -y nginx && sudo systemctl start nginx"

Task 3: create data collection rules (DCR)

# Get workspace resource ID
WORKSPACE_RESOURCE_ID=$(az monitor log-analytics workspace show \
--resource-group $RG \
--workspace-name law-contoso-ops \
--query "id" -o tsv)

# Create a DCR for Linux performance and syslog
az monitor data-collection rule create \
--resource-group $RG \
--name dcr-linux-perf-syslog \
--location $LOCATION \
--data-flows '[{"streams":["Microsoft-Perf","Microsoft-Syslog"],"destinations":["law-destination"]}]' \
--log-analytics "[{\"name\":\"law-destination\",\"workspace-resource-id\":\"$WORKSPACE_RESOURCE_ID\"}]" \
--performance-counters '[{"name":"perfCounters","streams":["Microsoft-Perf"],"sampling-frequency":60,"counter-specifiers":["\\Processor(*)\\% Processor Time","\\Memory\\Available Bytes","\\LogicalDisk(*)\\% Free Space","\\Network(*)\\Total Bytes Transmitted"]}]' \
--syslog '[{"name":"syslogCollection","streams":["Microsoft-Syslog"],"facility-names":["auth","authpriv","daemon","kern","syslog"],"log-levels":["Warning","Error","Critical","Alert","Emergency"]}]'

# Create a DCR for Windows events and performance
az monitor data-collection rule create \
--resource-group $RG \
--name dcr-windows-events \
--location $LOCATION \
--data-flows '[{"streams":["Microsoft-Perf","Microsoft-Event"],"destinations":["law-destination"]}]' \
--log-analytics "[{\"name\":\"law-destination\",\"workspace-resource-id\":\"$WORKSPACE_RESOURCE_ID\"}]" \
--performance-counters '[{"name":"winPerfCounters","streams":["Microsoft-Perf"],"sampling-frequency":60,"counter-specifiers":["\\Processor(*)\\% Processor Time","\\Memory\\% Committed Bytes In Use","\\LogicalDisk(*)\\% Free Space"]}]' \
--windows-event-logs '[{"name":"winEvents","streams":["Microsoft-Event"],"x-path-queries":["Application!*[System[(Level=1 or Level=2 or Level=3)]]","System!*[System[(Level=1 or Level=2 or Level=3)]]","Security!*[System[(band(Keywords,13510798882111488))]]"]}]'

# List DCRs
az monitor data-collection rule list --resource-group $RG -o table
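
Listing the rules confirms they exist, but not what they forward. As a rough sketch, the helpers below print the streams configured in a DCR's data flows and test for one of them. `dcr_streams` and `dcr_has_stream` are hypothetical names, and the JMESPath path assumes the rule's JSON exposes `dataFlows[].streams[]`.

```shell
# Hypothetical helper: prints the streams a DCR forwards, one per line.
# Assumes the rule JSON exposes dataFlows[].streams[].
dcr_streams() {
  local rg="$1" dcr="$2"
  az monitor data-collection rule show \
    --resource-group "$rg" \
    --name "$dcr" \
    --query "dataFlows[].streams[]" \
    --output tsv
}

# Hypothetical helper: succeeds only if the DCR forwards the given stream.
dcr_has_stream() {
  local rg="$1" dcr="$2" stream="$3"
  # -x matches the whole line, -F treats the stream name as a literal string.
  dcr_streams "$rg" "$dcr" | grep -qxF -- "$stream"
}

# Example use:
# dcr_has_stream "$RG" dcr-linux-perf-syslog Microsoft-Syslog && echo "syslog is collected"
```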

Task 4: install Azure Monitor agent and associate DCRs

# Install Azure Monitor agent on Linux VM
az vm extension set \
--resource-group $RG \
--vm-name vm-linux-web \
--name AzureMonitorLinuxAgent \
--publisher Microsoft.Azure.Monitor \
--version 1.0 \
--enable-auto-upgrade true

# Install Azure Monitor agent on Windows VM
az vm extension set \
--resource-group $RG \
--vm-name vm-win-app \
--name AzureMonitorWindowsAgent \
--publisher Microsoft.Azure.Monitor \
--version 1.0 \
--enable-auto-upgrade true

# Associate Linux DCR with the Linux VM
LINUX_VM_ID=$(az vm show -g $RG -n vm-linux-web --query "id" -o tsv)
DCR_LINUX_ID=$(az monitor data-collection rule show \
--resource-group $RG \
--name dcr-linux-perf-syslog \
--query "id" -o tsv)

az monitor data-collection rule association create \
--name "linux-dcr-association" \
--resource $LINUX_VM_ID \
--rule-id $DCR_LINUX_ID

# Associate Windows DCR with the Windows VM
WIN_VM_ID=$(az vm show -g $RG -n vm-win-app --query "id" -o tsv)
DCR_WIN_ID=$(az monitor data-collection rule show \
--resource-group $RG \
--name dcr-windows-events \
--query "id" -o tsv)

az monitor data-collection rule association create \
--name "windows-dcr-association" \
--resource $WIN_VM_ID \
--rule-id $DCR_WIN_ID

# Verify associations
az monitor data-collection rule association list --resource $LINUX_VM_ID -o table
az monitor data-collection rule association list --resource $WIN_VM_ID -o table

Azure Monitor Agent vs Legacy Agents

AMA replaces the legacy Log Analytics agent (MMA/OMS) and the Diagnostics extension:

  • AMA: Uses Data Collection Rules (DCR), supports multi-homing, uses managed identity
  • Legacy MMA: Uses workspace configuration, deprecated as of August 2024
  • For the AZ-104 exam, focus on AMA + DCR (the modern approach)
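
Before moving on, it can help to confirm that the agent extensions from Task 4 actually provisioned. This is a minimal sketch built on `az vm extension show`; `ama_state` and `ama_ready` are hypothetical helper names, and treating "Succeeded" as the healthy state is an assumption.

```shell
# Hypothetical helper: prints the provisioning state of a VM extension.
ama_state() {
  local rg="$1" vm="$2" ext="$3"
  az vm extension show \
    --resource-group "$rg" \
    --vm-name "$vm" \
    --name "$ext" \
    --query "provisioningState" \
    --output tsv
}

# Hypothetical helper: succeeds only when the extension reports "Succeeded".
ama_ready() {
  [ "$(ama_state "$@")" = "Succeeded" ]
}

# Example use:
# ama_ready "$RG" vm-linux-web AzureMonitorLinuxAgent || echo "Linux agent not ready"
# ama_ready "$RG" vm-win-app AzureMonitorWindowsAgent || echo "Windows agent not ready"
```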

Task 5: configure diagnostic settings for Azure resources

# Enable diagnostic settings for the VNet (sending to Log Analytics)
VNET_ID=$(az network vnet show -g $RG -n vnet-monitored --query "id" -o tsv)

az monitor diagnostic-settings create \
--name "vnet-diagnostics" \
--resource $VNET_ID \
--workspace $WORKSPACE_RESOURCE_ID \
--metrics '[{"category":"AllMetrics","enabled":true}]'

# Create a storage account as an optional archival destination
# (pass it to diagnostic-settings create with --storage-account)
DIAG_STORAGE="diagstorage$RANDOM"
az storage account create \
--resource-group $RG \
--name $DIAG_STORAGE \
--sku Standard_LRS

# List available diagnostic categories for a resource type
az monitor diagnostic-settings categories list \
--resource $VNET_ID -o table

Portal Steps:

  1. Navigate to any Azure resource > Diagnostic settings
  2. Click Add diagnostic setting
  3. Select log categories and metrics to collect
  4. Choose destinations: Log Analytics workspace, Storage account, Event Hub
  5. Click Save
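
Step 4 allows more than one destination per setting. As an illustrative sketch, the wrapper below sends the same metrics to a Log Analytics workspace and a storage account in one call. `create_dual_diag` is a hypothetical function name; the flags are those of `az monitor diagnostic-settings create`.

```shell
# Hypothetical wrapper: one diagnostic setting with two destinations
# (Log Analytics workspace for querying, storage account for archival).
create_dual_diag() {
  local name="$1" resource_id="$2" workspace_id="$3" storage_account="$4"
  az monitor diagnostic-settings create \
    --name "$name" \
    --resource "$resource_id" \
    --workspace "$workspace_id" \
    --storage-account "$storage_account" \
    --metrics '[{"category":"AllMetrics","enabled":true}]'
}

# Example use with the variables from this task:
# create_dual_diag "vnet-diagnostics-archive" "$VNET_ID" "$WORKSPACE_RESOURCE_ID" "$DIAG_STORAGE"
```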

Task 6: write KQL queries

KQL Basics

Logs can take 10-15 minutes to appear after you configure data collection. Use the Logs query editor in the portal for interactive testing.

Portal: Navigate to Log Analytics workspace > Logs
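
Instead of waiting a fixed time, a script can poll until the first Heartbeat rows arrive. The sketch below assumes the `log-analytics` CLI extension (which provides `az monitor log-analytics query`) is installed and that the first result row exposes a `Count` field. `heartbeat_count` and `wait_for_heartbeat` are hypothetical helper names.

```shell
# Hypothetical helper: counts Heartbeat rows from the last 15 minutes.
# Assumes result rows expose a Count field (an assumption about the output shape).
heartbeat_count() {
  local workspace_guid="$1"
  az monitor log-analytics query \
    --workspace "$workspace_guid" \
    --analytics-query "Heartbeat | where TimeGenerated > ago(15m) | count" \
    --query "[0].Count" \
    --output tsv
}

# Hypothetical helper: polls until rows appear or the attempts run out.
# Arguments: workspace GUID, max attempts (default 15), delay in seconds (default 60).
wait_for_heartbeat() {
  local workspace_guid="$1" attempts="${2:-15}" delay="${3:-60}"
  local i=1 count
  while [ "$i" -le "$attempts" ]; do
    count="$(heartbeat_count "$workspace_guid" 2>/dev/null)"
    # A non-numeric or empty count is treated as "no data yet".
    if [ "${count:-0}" -gt 0 ] 2>/dev/null; then
      echo "Heartbeat rows found after $i attempt(s): $count"
      return 0
    fi
    sleep "$delay"
    i=$((i + 1))
  done
  echo "No Heartbeat rows after $attempts attempt(s)" >&2
  return 1
}

# Example use with the GUID captured in Task 1:
# wait_for_heartbeat "$WORKSPACE_ID" 15 60
```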

// Basic query: Find all heartbeat records from the last hour
Heartbeat
| where TimeGenerated > ago(1h)
| project Computer, TimeGenerated, OSType, Version

// Filter (where): Find errors in syslog
Syslog
| where TimeGenerated > ago(24h)
| where SeverityLevel in ("err", "crit", "alert", "emerg")
| project TimeGenerated, Computer, Facility, SeverityLevel, SyslogMessage
| order by TimeGenerated desc

// Summarize: Count events by severity
Syslog
| where TimeGenerated > ago(24h)
| summarize Count=count() by SeverityLevel
| order by Count desc

// Summarize with time bins: CPU usage over time
Perf
| where TimeGenerated > ago(1h)
| where ObjectName == "Processor" and CounterName == "% Processor Time"
| where InstanceName == "_Total"
| summarize AvgCPU=avg(CounterValue) by bin(TimeGenerated, 5m), Computer
| order by TimeGenerated asc

// Join: Correlate performance with events
Perf
| where TimeGenerated > ago(1h)
| where ObjectName == "Processor" and CounterName == "% Processor Time"
| where InstanceName == "_Total"
| summarize AvgCPU=avg(CounterValue) by bin(TimeGenerated, 5m), Computer
| join kind=leftouter (
    Syslog
    | where TimeGenerated > ago(1h)
    | summarize ErrorCount=count() by bin(TimeGenerated, 5m), Computer
) on TimeGenerated, Computer
| project TimeGenerated, Computer, AvgCPU, ErrorCount=coalesce(ErrorCount, 0)

// Render: Create a time chart
Perf
| where TimeGenerated > ago(1h)
| where ObjectName == "Processor" and CounterName == "% Processor Time"
| where InstanceName == "_Total"
| summarize AvgCPU=avg(CounterValue) by bin(TimeGenerated, 5m), Computer
| render timechart

// Advanced: Find VMs with high memory usage
Perf
| where TimeGenerated > ago(1h)
| where ObjectName == "Memory"
| where CounterName == "Available Bytes" or CounterName == "% Committed Bytes In Use"
| summarize AvgValue=avg(CounterValue) by Computer, CounterName
| evaluate pivot(CounterName, any(AvgValue))

Task 7: create saved queries and functions

# Save a query via the portal:
# 1. Run the query in Log Analytics > Logs
# 2. Click "Save" > "Save as query"
# 3. Name: "High CPU VMs", Category: "Performance"

# Create a saved query via CLI
# (to expose it as a reusable function, also pass --func-alias with the name to call it by)
# The threshold is applied to the 5-minute average only; filtering raw samples
# above 80 first would skew the average and make the final filter always true
az monitor log-analytics workspace saved-search create \
--resource-group $RG \
--workspace-name law-contoso-ops \
--name "HighCPUAlerts" \
--display-name "High CPU VMs" \
--category "Performance" \
--saved-query "Perf | where ObjectName == 'Processor' and CounterName == '% Processor Time' and InstanceName == '_Total' | summarize AvgCPU=avg(CounterValue) by Computer, bin(TimeGenerated, 5m) | where AvgCPU > 80"

# List saved queries
az monitor log-analytics workspace saved-search list \
--resource-group $RG \
--workspace-name law-contoso-ops -o table
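
A saved query can also be run from the CLI by reading its text back and passing it to the query command. This is only a sketch: `run_saved_search` is a hypothetical helper, it assumes the saved search JSON exposes its KQL under a `query` field, and it needs the `log-analytics` CLI extension for `az monitor log-analytics query`.

```shell
# Hypothetical helper: fetches a saved search's KQL text and runs it.
# Arguments: resource group, workspace name, saved search name, workspace GUID.
run_saved_search() {
  local rg="$1" workspace_name="$2" search_name="$3" workspace_guid="$4"
  local kql
  # Assumes the saved search JSON has a top-level "query" field.
  kql="$(az monitor log-analytics workspace saved-search show \
    --resource-group "$rg" \
    --workspace-name "$workspace_name" \
    --name "$search_name" \
    --query "query" \
    --output tsv)" || return 1
  # Stop if nothing came back rather than running an empty query.
  [ -n "$kql" ] || return 1
  az monitor log-analytics query \
    --workspace "$workspace_guid" \
    --analytics-query "$kql" \
    --output table
}

# Example use:
# run_saved_search "$RG" law-contoso-ops HighCPUAlerts "$WORKSPACE_ID"
```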

Task 8: create a workbook with visualizations

Portal Steps (Workbooks require Portal):

  1. Navigate to Azure Monitor > Workbooks
  2. Click New
  3. Add the following elements:

Section 1 | VM Health Overview (Grid):

Heartbeat
| where TimeGenerated > ago(1h)
| summarize LastHeartbeat=max(TimeGenerated) by Computer, OSType
| extend Status = iff(LastHeartbeat > ago(5m), "Healthy", "Unhealthy")
| project Computer, OSType, LastHeartbeat, Status

Section 2 | CPU Usage Over Time (Line Chart):

Perf
| where TimeGenerated > ago(4h)
| where ObjectName == "Processor" and CounterName == "% Processor Time"
| where InstanceName == "_Total"
| summarize AvgCPU=avg(CounterValue) by bin(TimeGenerated, 5m), Computer
| render timechart

Section 3 | Error Summary (Pie Chart):

Syslog
| where TimeGenerated > ago(24h)
| where SeverityLevel in ("err", "crit", "alert", "emerg")
| summarize Count=count() by Facility
| render piechart

Section 4 | Top Talkers (Bar Chart):

Perf
| where TimeGenerated > ago(1h)
| where ObjectName == "Network" and CounterName == "Total Bytes Transmitted"
| summarize TotalBytes=sum(CounterValue) by Computer
| top 10 by TotalBytes desc
| render barchart

  4. Click Save and name the workbook "Contoso Operations Dashboard"

Task 9: configure workspace settings

# Set daily ingestion cap (cost control)
az monitor log-analytics workspace update \
--resource-group $RG \
--workspace-name law-contoso-ops \
--quota 1

# Update retention period
az monitor log-analytics workspace update \
--resource-group $RG \
--workspace-name law-contoso-ops \
--retention-time 60

# Configure table-level retention (different retention per table)
# Some tables may need longer retention for compliance
az monitor log-analytics workspace table update \
--resource-group $RG \
--workspace-name law-contoso-ops \
--name Syslog \
--retention-time 90

# Show workspace configuration
az monitor log-analytics workspace show \
--resource-group $RG \
--workspace-name law-contoso-ops \
--query "{Name:name, Retention:retentionInDays, DailyCapGB:workspaceCapping.dailyQuotaGb}" -o table

Success criteria

  • Log Analytics workspace created with appropriate SKU and retention
  • Data Collection Rules created for Linux (perf + syslog) and Windows (perf + events)
  • Azure Monitor Agent installed on both Linux and Windows VMs
  • DCR associations created (VMs linked to their respective DCRs)
  • Diagnostic settings configured for Azure resources
  • KQL queries written and tested (where, summarize, join, render)
  • Saved queries or functions created
  • Workbook created with multiple visualizations
  • Workspace daily cap and retention configured

Break & fix scenarios

Scenario A: no data appearing in Log Analytics

# Check if AMA extension is installed and healthy
az vm extension list --resource-group $RG --vm-name vm-linux-web -o table

# Check if DCR association exists
az monitor data-collection rule association list \
--resource $LINUX_VM_ID -o table

# Check DCR configuration
az monitor data-collection rule show \
--resource-group $RG \
--name dcr-linux-perf-syslog

# Common causes:
# 1. AMA extension not installed or failed
# 2. DCR not associated with the VM
# 3. Workspace ID mismatch in DCR
# 4. Ingestion delay (data takes 5-15 minutes to appear)
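
Cause 2 is easy to script. The sketch below counts the DCR associations on a resource and fails when there are none. `dcr_association_count` and `has_dcr_association` are hypothetical helper names, and the count relies on the JMESPath `length(@)` function applied to the list output.

```shell
# Hypothetical helper: prints how many DCR associations a resource has.
dcr_association_count() {
  local resource_id="$1"
  az monitor data-collection rule association list \
    --resource "$resource_id" \
    --query "length(@)" \
    --output tsv
}

# Hypothetical helper: succeeds only if at least one association exists.
has_dcr_association() {
  local n
  n="$(dcr_association_count "$1")" || return 1
  # An empty or non-numeric count is treated as "no association".
  [ "${n:-0}" -gt 0 ] 2>/dev/null
}

# Example use:
# has_dcr_association "$LINUX_VM_ID" || echo "vm-linux-web has no DCR association"
```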

Scenario B: KQL query returns no results

// Common mistake: Wrong table name
// "Perf" not "PerformanceCounters" or "perf"

// Common mistake: Wrong time range
// Use ago(1h), ago(24h), not specific dates that may be in the future

// Debug: Check what tables have data
search *
| where TimeGenerated > ago(1h)
| summarize Count=count() by $table
| order by Count desc

Scenario C: daily cap reached

# Symptom: Data stops flowing into the workspace
# Check the cap configuration and ingestion status (this does not show usage in GB)
az monitor log-analytics workspace show \
--resource-group $RG \
--workspace-name law-contoso-ops \
--query "workspaceCapping"

# Fix: increase or remove the daily cap
az monitor log-analytics workspace update \
--resource-group $RG \
--workspace-name law-contoso-ops \
--quota -1
# (-1 removes the cap)
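
When scripting this check, the cap value can be turned into a readable status. This is a small sketch; `cap_status` is a hypothetical helper, and it assumes the CLI reports an uncapped workspace as `-1` or `-1.0` in `workspaceCapping.dailyQuotaGb`.

```shell
# Hypothetical helper: describes a dailyQuotaGb value in plain words.
# Assumes -1 (or -1.0) means that no daily cap is set.
cap_status() {
  local quota="$1"
  case "$quota" in
    -1|-1.0) echo "no daily cap" ;;
    ''|None) echo "cap unknown" ;;
    *)       echo "capped at ${quota} GB/day" ;;
  esac
}

# Example use:
# cap_status "$(az monitor log-analytics workspace show \
#   --resource-group "$RG" --workspace-name law-contoso-ops \
#   --query "workspaceCapping.dailyQuotaGb" --output tsv)"
```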

Knowledge check

1. What is the difference between Data Collection Rules and Diagnostic Settings?

Feature         Data Collection Rules (DCR)  Diagnostic Settings
Source          VMs (via AMA agent)          Azure platform resources
Data types      Perf counters, logs, custom  Platform metrics, resource logs
Agent required  Yes (AMA)                    No (built-in)
Configuration   Centralized, reusable        Per-resource
Filtering       Yes (at collection time)     Category-based

Use DCRs for VM/compute data. Use Diagnostic Settings for PaaS/platform data.

2. What are the key KQL operators for the AZ-104 exam?

Operator   Purpose                Example
where      Filter rows            where CounterValue > 80
project    Select columns         project Computer, TimeGenerated
summarize  Aggregate              summarize avg(CounterValue) by Computer
bin()      Time bucketing         bin(TimeGenerated, 5m)
join       Combine tables         Table1 | join Table2 on Column
render     Visualize              render timechart
extend     Add calculated column  extend GB = Bytes / 1073741824
order by   Sort                   order by Count desc
top        Take N highest         top 10 by Value desc
count      Count rows             count

3. What is the difference between Azure Monitor Agent (AMA) and the legacy Log Analytics Agent (MMA)?

Feature         AMA (New)                                          MMA/OMS (Legacy)
Configuration   Data Collection Rules                              Workspace settings
Multi-homing    Native (multiple DCRs)                             Limited
Authentication  Managed Identity                                   Workspace key
Filtering       At source (DCR)                                    At workspace
Status          Current, recommended                               Deprecated (Aug 2024)
Extension name  AzureMonitorLinuxAgent / AzureMonitorWindowsAgent  OmsAgentForLinux / MicrosoftMonitoringAgent

4. How does workspace retention work?

  • Default retention: 30 days (included in Per-GB price)
  • Configurable: 30-730 days (additional cost beyond 30 days)
  • Table-level retention: Override workspace default per table
  • Archive tier: Data past its interactive retention is kept in the cheaper archive tier only if the table's total retention is set longer; archived data needs a search job or restore to query
  • Interactive retention: Data queryable immediately
  • Compliance: Some regulations require 1+ year retention
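
The archive bullet above can be made concrete with a little arithmetic. The sketch assumes the archived period equals a table's total retention minus its interactive retention. `archive_days` is a hypothetical helper, and the commented command shows how both values might be set on a table.

```shell
# Hypothetical helper: days a table's data spends in archive, assuming
# archive period = total retention - interactive retention.
archive_days() {
  local interactive="$1" total="$2"
  if [ "$total" -le "$interactive" ]; then
    # Total retention no longer than interactive retention: nothing is archived.
    echo 0
  else
    echo $((total - interactive))
  fi
}

# Example: 90 days interactive, 365 days total.
archive_days 90 365

# Setting both values on a table might look like this:
# az monitor log-analytics workspace table update \
#   --resource-group "$RG" --workspace-name law-contoso-ops \
#   --name Syslog --retention-time 90 --total-retention-time 365
```

With these inputs the helper prints 275, meaning the final 275 days of the year-long window are archive-only.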

Cleanup

# Delete all resources
az group delete --name $RG --yes --no-wait

echo "Resources are being deleted in the background."

Learning resources