Challenge 48: GitHub monitoring and alerts
Exam skills covered
- Configure monitoring in GitHub, including enabling insights and creating and configuring charts
- Configure alerts for events in GitHub Actions and Azure Pipelines
Scenario
Contoso Ltd's engineering manager wants visibility into team velocity, workflow efficiency, and deployment patterns without leaving GitHub. Currently, no one knows the average CI build time, which workflows fail most often, or how frequently teams deploy. The manager also wants proactive alerts when critical workflows fail so the team does not discover broken builds hours later. You must configure GitHub monitoring and alerting to provide actionable engineering metrics.
Prerequisites
- GitHub organization with multiple repositories
- GitHub Actions workflows that have run at least a few times (for historical data)
- GitHub Projects board
- GitHub CLI installed and authenticated
- A Slack workspace or Microsoft Teams channel (for alert notifications)
Tasks
Task 1: Enable and explore GitHub repository insights
GitHub provides built-in repository insights for traffic, contributions, and community health.
# View repository traffic (requires push access)
gh api repos/contoso/webapp/traffic/views --jq '{
totalViews: .count,
uniqueVisitors: .uniques,
daily: [.views[] | {date: .timestamp, views: .count, unique: .uniques}]
}'
# View clone statistics
gh api repos/contoso/webapp/traffic/clones --jq '{
totalClones: .count,
uniqueCloners: .uniques,
daily: [.clones[] | {date: .timestamp, clones: .count, unique: .uniques}]
}'
# View top referral sources
gh api repos/contoso/webapp/traffic/popular/referrers --jq '.[] | {referrer, count, uniques}'
# View popular content paths
gh api repos/contoso/webapp/traffic/popular/paths --jq '.[] | {path, title, count, uniques}'
# View contributor statistics
gh api repos/contoso/webapp/stats/contributors --jq '.[] | {
author: .author.login,
totalCommits: .total,
lastWeekCommits: (.weeks[-1].c)
}'
# View commit activity (weekly)
gh api repos/contoso/webapp/stats/commit_activity --jq '.[-4:] | .[] | {
week: (.week | todate),
totalCommits: .total,
dailyBreakdown: .days
}'
Insights available in the GitHub UI (repository > Insights tab):
- Pulse: recent activity summary
- Contributors: commit frequency per contributor
- Community: community health files (README, CONTRIBUTING, CODE_OF_CONDUCT)
- Traffic: page views, clones, referrers
- Commits: commit frequency over time
- Code frequency: additions and deletions per week
- Dependency graph: dependencies and dependents
- Network: fork network visualization
- Forks: list of forks with activity
Task 2: Configure GitHub Actions workflow insights
# List the repository's workflows with their IDs and states
gh api repos/contoso/webapp/actions/workflows --jq '.workflows[] | {
name: .name,
id: .id,
state: .state
}'
# Get recent runs for a specific workflow with timing
gh run list --workflow deploy.yml --limit 20 --json status,conclusion,startedAt,updatedAt \
--jq '.[] | {
status,
conclusion,
started: .startedAt,
duration: ((.updatedAt | fromdateiso8601) - (.startedAt | fromdateiso8601) | tostring + "s")
}'
# Calculate success rate for the last 100 runs
gh run list --workflow deploy.yml --limit 100 --json conclusion \
--jq '{
total: length,
success: [.[] | select(.conclusion == "success")] | length,
failure: [.[] | select(.conclusion == "failure")] | length,
cancelled: [.[] | select(.conclusion == "cancelled")] | length,
successRate: (([.[] | select(.conclusion == "success")] | length) * 100 / length | tostring + "%")
}'
# Get average workflow duration (last 50 successful runs)
gh run list --workflow deploy.yml --limit 50 --status completed --json startedAt,updatedAt,conclusion \
--jq '[.[] | select(.conclusion == "success") | ((.updatedAt | fromdateiso8601) - (.startedAt | fromdateiso8601))] | (add / length | floor | tostring + " seconds average")'
# Usage minutes consumed
gh api orgs/contoso/settings/billing/actions --jq '{
totalMinutesUsed: .total_minutes_used,
includedMinutes: .included_minutes,
paidMinutesUsed: .total_paid_minutes_used
}'
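The success-rate and average-duration queries above both divide by the number of runs, so they fail when a workflow has no matching runs. The following is a minimal sketch of two guarded helpers; the names `success_rate` and `average_duration` are hypothetical and not part of the GitHub CLI, and the inputs are assumed to be counts and durations in seconds that were already extracted with `gh run list`.

```shell
#!/usr/bin/env bash

# Hypothetical helper: success percentage with one decimal place.
# Prints "n/a" instead of dividing by zero when there are no runs.
success_rate() {
  local success="$1" total="$2"
  if [ "$total" -eq 0 ]; then
    echo "n/a"
    return 0
  fi
  awk -v s="$success" -v t="$total" 'BEGIN { printf "%.1f\n", s * 100 / t }'
}

# Hypothetical helper: average a list of durations given in whole seconds
# and format the result as "<minutes>m <seconds>s" (integer division).
average_duration() {
  if [ "$#" -eq 0 ]; then
    echo "no runs"
    return 0
  fi
  local total=0 d
  for d in "$@"; do
    total=$((total + d))
  done
  local avg=$((total / $#))
  echo "$((avg / 60))m $((avg % 60))s"
}

success_rate 47 50            # prints 94.0
average_duration 120 180 245  # prints 3m 1s
```

Keeping the arithmetic in small functions makes the zero-run case explicit and lets the same logic be reused for every workflow that is measured.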
Task 3: Create custom charts in GitHub Projects
GitHub Projects (v2) support custom charts for tracking work items.
# List projects in the organization
gh api graphql -f query='
{
organization(login: "contoso") {
projectsV2(first: 10) {
nodes {
id
title
number
}
}
}
}' --jq '.data.organization.projectsV2.nodes[]'
To create charts in GitHub Projects:
- Navigate to the Project board
- Click the "Insights" tab (chart icon)
- Create charts:
Chart 1: Burn-down chart
- Type: Line chart
- X-axis: Time
- Y-axis: Count of items
- Filter: Status != Done
- Group by: None
Chart 2: Items by assignee
- Type: Bar chart
- X-axis: Assignee
- Y-axis: Count
- Filter: Status = In Progress
- Group by: Priority
Chart 3: Cycle time
- Type: Line chart
- X-axis: Closed date
- Y-axis: Duration (days from created to closed)
- Filter: Status = Done
Chart 4: Distribution by label
- Type: Column chart (Projects charts offer bar, column, line, and stacked layouts; there is no pie layout)
- X-axis: Labels
- Filter: Status != Done
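Cycle time, as described for Chart 3, is the number of days between an item's creation and its closure. The sketch below shows that arithmetic for a single item; `cycle_time_days` is a hypothetical helper, it assumes GNU `date`, and the timestamps are assumed to be ISO 8601 values such as the `createdAt` and `closedAt` fields returned by `gh issue list --json createdAt,closedAt`.

```shell
#!/usr/bin/env bash

# Hypothetical helper: whole days between two ISO 8601 timestamps.
# Requires GNU date (the -d option); partial days are truncated.
cycle_time_days() {
  local created="$1" closed="$2"
  local start end
  start=$(date -u -d "$created" +%s) || return 1
  end=$(date -u -d "$closed" +%s) || return 1
  echo $(( (end - start) / 86400 ))
}

cycle_time_days "2024-03-01T09:00:00Z" "2024-03-04T17:30:00Z"  # prints 3
```

Computing the value outside the Projects UI is useful when the numbers need to feed a report or a digest rather than a chart.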
Task 4: Set up workflow failure notifications
# .github/workflows/notify-on-failure.yml
name: CI Pipeline
on:
  push:
    branches: [main, develop]
  pull_request:
    branches: [main]
jobs:
  build:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - run: npm ci
      - run: npm test
      - run: npm run build
  notify-failure:
    runs-on: ubuntu-latest
    needs: [build]
    # Run only when a job listed in needs has failed
    if: failure()
    steps:
      - name: Send Slack notification
        uses: slackapi/slack-github-action@v1.27.0
        with:
          payload: |
            {
              "text": "CI Pipeline Failed",
              "blocks": [
                {
                  "type": "header",
                  "text": {
                    "type": "plain_text",
                    "text": "Pipeline Failure: ${{ github.workflow }}"
                  }
                },
                {
                  "type": "section",
                  "fields": [
                    {"type": "mrkdwn", "text": "*Repository:*\n${{ github.repository }}"},
                    {"type": "mrkdwn", "text": "*Branch:*\n${{ github.ref_name }}"},
                    {"type": "mrkdwn", "text": "*Commit:*\n${{ github.sha }}"},
                    {"type": "mrkdwn", "text": "*Author:*\n${{ github.actor }}"}
                  ]
                },
                {
                  "type": "actions",
                  "elements": [
                    {
                      "type": "button",
                      "text": {"type": "plain_text", "text": "View Run"},
                      "url": "${{ github.server_url }}/${{ github.repository }}/actions/runs/${{ github.run_id }}"
                    }
                  ]
                }
              ]
            }
        env:
          SLACK_WEBHOOK_URL: ${{ secrets.SLACK_WEBHOOK_URL }}
          SLACK_WEBHOOK_TYPE: INCOMING_WEBHOOK
      - name: Send Teams notification
        # Send to Teams even if the Slack step above failed
        if: ${{ !cancelled() }}
        env:
          TEAMS_WEBHOOK_URL: ${{ secrets.TEAMS_WEBHOOK_URL }}
        run: |
          # --fail makes the step fail on an HTTP error instead of passing silently
          curl --fail -sS -X POST "$TEAMS_WEBHOOK_URL" \
            -H "Content-Type: application/json" \
            --data '{
              "@type": "MessageCard",
              "@context": "http://schema.org/extensions",
              "summary": "CI Pipeline Failed",
              "themeColor": "FF0000",
              "title": "Pipeline Failure: ${{ github.workflow }}",
              "sections": [{
                "facts": [
                  {"name": "Repository", "value": "${{ github.repository }}"},
                  {"name": "Branch", "value": "${{ github.ref_name }}"},
                  {"name": "Author", "value": "${{ github.actor }}"},
                  {"name": "Commit", "value": "${{ github.sha }}"}
                ]
              }],
              "potentialAction": [{
                "@type": "OpenUri",
                "name": "View Run",
                "targets": [{"os": "default", "uri": "${{ github.server_url }}/${{ github.repository }}/actions/runs/${{ github.run_id }}"}]
              }]
            }'
Task 5: Configure branch deployment activity dashboard
# .github/workflows/deployment-tracker.yml
name: Track deployments
on:
  deployment_status:
jobs:
  track:
    runs-on: ubuntu-latest
    if: github.event.deployment_status.state == 'success'
    permissions:
      deployments: read
    steps:
      - name: Record deployment metrics
        run: |
          echo "Deployment to ${{ github.event.deployment.environment }} succeeded"
          echo "SHA: ${{ github.event.deployment.sha }}"
          echo "Created: ${{ github.event.deployment.created_at }}"
          # Log how many deployments are recorded for this environment.
          # per_page=100 raises the default page size of 30; larger histories need pagination.
          gh api "repos/${{ github.repository }}/deployments?per_page=100" \
            --jq '[.[] | select(.environment == "${{ github.event.deployment.environment }}")] | length' \
            | xargs -I {} echo "Total deployments to ${{ github.event.deployment.environment }}: {}"
        env:
          GH_TOKEN: ${{ secrets.GITHUB_TOKEN }}
Query deployment history:
# List the 20 most recent deployments
gh api 'repos/contoso/webapp/deployments?per_page=20' --jq '.[] | {
  id: .id,
  environment: .environment,
  sha: .sha[:7],
  creator: .creator.login,
  created: .created_at,
  description: .description
}'
# Deployment frequency (last 30 days)
# per_page=100 raises the default page size of 30; more than 100 deployments need pagination
gh api 'repos/contoso/webapp/deployments?per_page=100' --jq '[
  .[] | select((.created_at | fromdateiso8601) > (now - 2592000))
] | group_by(.environment) | .[] | {
  environment: .[0].environment,
  count: length,
  frequency: (length / 30 * 100 | round / 100 | tostring + " per day")
}'
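Deployment frequency is simply the deployment count divided by the length of the window in days. The sketch below performs that division outside jq so the window can vary; `deploy_frequency` is a hypothetical helper, and the count is assumed to come from a deployments query like the one above.

```shell
#!/usr/bin/env bash

# Hypothetical helper: deployments per day over a window, to two decimals.
deploy_frequency() {
  local count="$1" days="$2"
  if [ "$days" -le 0 ]; then
    echo "window must be at least one day" >&2
    return 1
  fi
  awk -v c="$count" -v d="$days" 'BEGIN { printf "%.2f per day\n", c / d }'
}

deploy_frequency 12 30  # prints 0.40 per day
```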
Task 6: Azure Pipelines alerts
Configure notifications in Azure DevOps:
# Azure DevOps notification settings (via web UI):
# 1. Project Settings > Notifications
# 2. Create subscription:
# - Category: Build
# - Event: A build completes > with status Failed
# - Deliver to: Team email or custom email
# - Filter: Definition name = "Production-Deploy"
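# Alternatively, create the subscription through the REST API.
# Subscriptions are created at the organization level. The subscriber id must be the
# GUID of a user or team identity (placeholder below), not an email address.
# Check the field names against the Notification Subscriptions REST reference before use.
curl -X POST \
  "https://dev.azure.com/contoso/_apis/notification/subscriptions?api-version=7.1-preview.1" \
  -u ":$PAT" \
  -H "Content-Type: application/json" \
  -d '{
    "description": "Notify on production pipeline failure",
    "filter": {
      "type": "Expression",
      "eventType": "ms.vss-build.build-completed-event",
      "criteria": {
        "clauses": [
          {"fieldName": "Definition name", "operator": "=", "value": "Production-Deploy"},
          {"logicalOperator": "And", "fieldName": "Status", "operator": "=", "value": "Failed"}
        ]
      }
    },
    "channel": {
      "type": "EmailHtml",
      "address": "ops-team@contoso.com"
    },
    "subscriber": {
      "id": "<team-or-user-identity-guid>"
    }
  }'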
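Azure DevOps REST calls that authenticate with a personal access token use HTTP Basic authentication with an empty user name and the PAT as the password. The sketch below builds that header by hand; `basic_auth_header` is a hypothetical helper, and `abc` stands in for a real token. Passing `-u ":$PAT"` to curl produces the same header.

```shell
#!/usr/bin/env bash

# Hypothetical helper: build the Basic auth header for a PAT.
# printf avoids a trailing newline in the encoded value, and tr removes
# the line wraps that some base64 implementations add to long input.
basic_auth_header() {
  local pat="$1"
  printf 'Authorization: Basic %s\n' "$(printf ':%s' "$pat" | base64 | tr -d '\n')"
}

basic_auth_header "abc"  # prints Authorization: Basic OmFiYw==
```

Stripping line wraps matters because a long token can otherwise produce a multi-line header value, which the server rejects.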
Pipeline retention warnings:
# azure-pipelines.yml - Warn before artifacts expire
schedules:
  - cron: "0 9 * * 1"
    displayName: Weekly retention audit
    branches:
      include:
        - main
    always: true  # run every week, even when main has no new commits
steps:
  - task: PowerShell@2
    inputs:
      targetType: 'inline'
      script: |
        $headers = @{
          Authorization = "Basic $([Convert]::ToBase64String([Text.Encoding]::ASCII.GetBytes(":$(System.AccessToken)")))"
        }
        # Assuming 30-day retention, builds that finished 25 to 30 days ago expire within 5 days.
        # UTC timestamps ending in Z avoid a "+" offset that would need URL encoding.
        $fmt = "yyyy-MM-dd'T'HH:mm:ss'Z'"
        $minTime = (Get-Date).ToUniversalTime().AddDays(-30).ToString($fmt)
        $maxTime = (Get-Date).ToUniversalTime().AddDays(-25).ToString($fmt)
        $builds = Invoke-RestMethod -Uri "https://dev.azure.com/contoso/ContosoWeb/_apis/build/builds?api-version=7.1&minTime=$minTime&maxTime=$maxTime" -Headers $headers
        if ($builds.count -gt 0) {
          Write-Host "##vso[task.logissue type=warning]$($builds.count) builds have artifacts expiring within 5 days"
        }
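The retention audit depends on choosing the right time window. Builds that expire within the warning period are the ones that finished between `retention` days ago and `retention - warning` days ago. The sketch below computes that window; it assumes 30-day retention and a 5-day warning as example values, requires GNU `date`, and `retention_window` is a hypothetical helper.

```shell
#!/usr/bin/env bash

# Hypothetical helper: print "<minTime> <maxTime>" in UTC for builds that
# will expire within warn_days, given a retention period in days.
# The optional third argument is "now" as a Unix epoch (defaults to the current time).
retention_window() {
  local retention_days="$1" warn_days="$2" now="${3:-$(date -u +%s)}"
  local fmt='+%Y-%m-%dT%H:%M:%SZ'
  local min_epoch=$(( now - retention_days * 86400 ))
  local max_epoch=$(( now - (retention_days - warn_days) * 86400 ))
  echo "$(date -u -d "@$min_epoch" "$fmt") $(date -u -d "@$max_epoch" "$fmt")"
}

# 1719705600 is 2024-06-30T00:00:00Z
retention_window 30 5 1719705600  # prints 2024-05-31T00:00:00Z 2024-06-05T00:00:00Z
```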
Task 7: Build a workflow that sends a weekly metrics digest
# .github/workflows/weekly-metrics.yml
name: Weekly engineering metrics
on:
  schedule:
    - cron: '0 9 * * 1' # Monday 9 AM UTC
  workflow_dispatch:
# GITHUB_TOKEN scopes needed by the gh queries below
permissions:
  actions: read
  contents: read
  issues: read
  pull-requests: read
jobs:
  metrics:
    runs-on: ubuntu-latest
    steps:
      - name: Collect metrics
        id: metrics
        run: |
          # Percentage with one decimal; prints 0 when there are no runs
          rate() { if [ "$2" -eq 0 ]; then echo 0; else echo "scale=1; $1 * 100 / $2" | bc; fi; }
          # Workflow success rates (last 50 runs of each workflow)
          DEPLOY_RUNS=$(gh run list --workflow deploy.yml --limit 50 --json conclusion)
          DEPLOY_SUCCESS=$(echo "$DEPLOY_RUNS" | jq '[.[] | select(.conclusion == "success")] | length')
          DEPLOY_TOTAL=$(echo "$DEPLOY_RUNS" | jq 'length')
          DEPLOY_RATE=$(rate "$DEPLOY_SUCCESS" "$DEPLOY_TOTAL")
          CI_RUNS=$(gh run list --workflow ci.yml --limit 50 --json conclusion)
          CI_SUCCESS=$(echo "$CI_RUNS" | jq '[.[] | select(.conclusion == "success")] | length')
          CI_TOTAL=$(echo "$CI_RUNS" | jq 'length')
          CI_RATE=$(rate "$CI_SUCCESS" "$CI_TOTAL")
          # PR metrics (604800 seconds = 7 days)
          PRS_MERGED=$(gh pr list --state merged --limit 100 --json mergedAt \
            --jq '[.[] | select((.mergedAt | fromdateiso8601) > (now - 604800))] | length')
          # --limit overrides the default of 30 results so the open counts are not capped
          PRS_OPEN=$(gh pr list --state open --limit 1000 --json number --jq 'length')
          # Issues
          ISSUES_CLOSED=$(gh issue list --state closed --limit 100 --json closedAt \
            --jq '[.[] | select((.closedAt | fromdateiso8601) > (now - 604800))] | length')
          ISSUES_OPEN=$(gh issue list --state open --limit 1000 --json number --jq 'length')
          # Store metrics as step outputs
          {
            echo "deploy-rate=$DEPLOY_RATE"
            echo "ci-rate=$CI_RATE"
            echo "prs-merged=$PRS_MERGED"
            echo "prs-open=$PRS_OPEN"
            echo "issues-closed=$ISSUES_CLOSED"
            echo "issues-open=$ISSUES_OPEN"
          } >> "$GITHUB_OUTPUT"
        env:
          GH_TOKEN: ${{ secrets.GITHUB_TOKEN }}
          # There is no checkout step, so tell gh which repository to query
          GH_REPO: ${{ github.repository }}
      - name: Send digest to Slack
        uses: slackapi/slack-github-action@v1.27.0
        with:
          payload: |
            {
              "blocks": [
                {
                  "type": "header",
                  "text": {"type": "plain_text", "text": "Weekly Engineering Metrics - ${{ github.repository }}"}
                },
                {
                  "type": "section",
                  "text": {
                    "type": "mrkdwn",
                    "text": "*Deployment Pipeline:* ${{ steps.metrics.outputs.deploy-rate }}% success rate\n*CI Pipeline:* ${{ steps.metrics.outputs.ci-rate }}% success rate\n*PRs Merged:* ${{ steps.metrics.outputs.prs-merged }} this week\n*PRs Open:* ${{ steps.metrics.outputs.prs-open }}\n*Issues Closed:* ${{ steps.metrics.outputs.issues-closed }} this week\n*Issues Open:* ${{ steps.metrics.outputs.issues-open }}"
                  }
                }
              ]
            }
        env:
          SLACK_WEBHOOK_URL: ${{ secrets.SLACK_WEBHOOK_URL }}
          SLACK_WEBHOOK_TYPE: INCOMING_WEBHOOK
## Break and fix
### Break scenario 1: Workflow insights show 0% success rate but pipelines are passing
The GitHub Actions insights page shows a 0% success rate for a workflow, but developers confirm that the builds are green.
**Cause:** The workflow has a required job that is being skipped due to a path filter or conditional. Skipped jobs are counted as neutral, but the overall workflow conclusion may show as "failure" if a downstream job fails because a dependency was skipped.
**Diagnosis:**
```bash
gh run list --workflow ci.yml --limit 10 --json conclusion,status \
  --jq '.[] | {conclusion, status}'
```
**Fix:** Ensure that conditional jobs handle skip conditions correctly:
jobs:
  test:
    runs-on: ubuntu-latest
    needs: [build]
    # always() keeps this job from being skipped when build is skipped;
    # the step then checks needs.build.result explicitly instead of relying on the default success()
    if: always()
    steps:
      - run: echo "Tests running"
        if: needs.build.result == 'success'
### Break scenario 2: Slack notifications not being delivered
The notify-failure job runs successfully, but no Slack message appears.
**Cause:** The Slack webhook URL has expired or the webhook was deleted from the Slack workspace.
**Diagnosis:**
# Test the webhook directly
curl -X POST "$SLACK_WEBHOOK_URL" \
-H "Content-Type: application/json" \
-d '{"text": "test message"}'
# A working webhook responds with "ok". A response such as "invalid_token" or "channel_not_found" means the webhook is broken.
**Fix:** Generate a new webhook URL in Slack (Apps > Incoming Webhooks > Add new) and update the repository secret:
gh secret set SLACK_WEBHOOK_URL --body "https://hooks.slack.com/services/NEW/WEBHOOK/URL"
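The webhook test from the diagnosis step can be turned into a scripted check by classifying the response body. The sketch below covers only the responses named in this scenario plus the `ok` body that a working incoming webhook returns; `classify_slack_response` is a hypothetical helper, and any other body is reported as unknown rather than guessed.

```shell
#!/usr/bin/env bash

# Hypothetical helper: map a Slack webhook response body to a health status.
classify_slack_response() {
  case "$1" in
    ok) echo "healthy" ;;
    invalid_token|channel_not_found) echo "broken" ;;
    *) echo "unknown" ;;
  esac
}

classify_slack_response "ok"                 # prints healthy
classify_slack_response "channel_not_found"  # prints broken
```

In practice the argument would be the output of the `curl` test call, for example `classify_slack_response "$(curl -s -X POST "$SLACK_WEBHOOK_URL" -H "Content-Type: application/json" -d '{"text": "test message"}')"`.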
Knowledge check
1. Contoso's engineering manager wants to know the deployment frequency for the last 30 days without leaving the terminal. Which approach provides this data?
2. A GitHub Actions workflow needs to send a notification to Microsoft Teams only when the production deployment job fails. What is the correct approach?
3. Contoso wants to track sprint progress with burn-down charts and cycle time metrics. Where should they configure these visualizations?
4. An Azure DevOps pipeline should send a notification when a build fails and also when a build succeeds after a previous failure (recovery notification). How should this be configured?
Cleanup
# Remove workflow files
rm -f .github/workflows/notify-on-failure.yml
rm -f .github/workflows/deployment-tracker.yml
rm -f .github/workflows/weekly-metrics.yml
# Remove notification subscriptions in Azure DevOps
# Project Settings > Notifications > Delete custom subscriptions
# Remove Slack webhook secret
gh secret delete SLACK_WEBHOOK_URL
gh secret delete TEAMS_WEBHOOK_URL
git add -A && git commit -m "cleanup: remove challenge 48 monitoring workflows" && git push