Challenge 32: Design High Availability for Non-Relational Data
60-90 min | Estimated cost: $15-30 | Exam Weight: 15-20%
Introduction
BattleForge Games is a global mobile gaming company with 25 million daily active players across North America, Europe, and Asia-Pacific. Their flagship game stores player profiles, inventory, progression data, and real-time match state in Azure Cosmos DB (NoSQL API) with multi-region writes. Game assets (textures, audio, 3D models totaling 5 TB) are served from Azure Blob Storage through Azure CDN for fast loading.
The gaming industry demands extreme availability: if players cannot access their profiles or game assets, they switch to a competitor within minutes. BattleForge has three requirements: player profiles must be writable from any region with less than 100 ms latency, game assets must remain available even if an entire Azure region goes down, and match state updates must be consistent for all players in a match, regardless of their geographic location.
The primary technical challenge is balancing consistency against availability in Cosmos DB. Multi-region writes provide the lowest latency but introduce conflict resolution complexity. The storage layer must provide continuous access to 5 TB of game assets even during regional failures, without adding loading delays for players. BattleForge has a budget of $8,000/month for its data tier (excluding compute).
Exam skills covered
- Recommend a high availability solution for semi-structured and unstructured data
Design tasks
Part 1: Cosmos DB multi-region write configuration
- Design the Cosmos DB deployment for player profiles:
  - Account deployed to 3 regions: East US, West Europe, Japan East
  - Multi-region writes enabled (players write to nearest region)
  - Document the conflict resolution policy options:
    - Last Writer Wins (LWW) - automatic, uses timestamp
    - Custom conflict resolution - stored procedure
  - Which is appropriate for player profiles?
- Evaluate the five Cosmos DB consistency levels and select the appropriate one for each workload:
| Consistency Level | Player Profiles | Match State | Leaderboards |
|---|---|---|---|
| Strong | ? | ? | ? |
| Bounded Staleness | ? | ? | ? |
| Session | ? | ? | ? |
| Consistent Prefix | ? | ? | ? |
| Eventual | ? | ? | ? |
- Justify your consistency choice considering:
  - Session consistency for player profiles: a player sees their own writes immediately, others see them eventually
  - Strong consistency for match state: all players must see the same game state
  - Limitation: Strong consistency is NOT available with multi-region writes
  - What alternative achieves match consistency without strong consistency?
- Configure the Cosmos DB account with multi-region writes:
# Create Cosmos DB account with multi-region writes
az cosmosdb create \
--resource-group rg-battleforge \
--name cosmos-battleforge \
--locations regionName=eastus failoverPriority=0 isZoneRedundant=true \
--locations regionName=westeurope failoverPriority=1 isZoneRedundant=true \
--locations regionName=japaneast failoverPriority=2 isZoneRedundant=true \
--enable-multiple-write-locations true \
--default-consistency-level Session
Part 2: Cosmos DB availability and failover
- Analyze the availability characteristics of the multi-region write configuration:
  - What SLA does multi-region write Cosmos DB provide? (99.999% read and write)
  - What happens when one region fails? (Other regions continue serving reads AND writes)
  - How does zone redundancy within each region add further protection?
- Compare single-region write vs. multi-region write for the match state use case:
| Aspect | Single-Region Write | Multi-Region Write |
|---|---|---|
| Write latency from remote regions | High (cross-region round-trip) | Low (local write) |
| Conflict resolution | No conflicts | Must handle conflicts |
| Consistency options | All 5 levels including Strong | Strong NOT available |
| Write availability during region failure | Failover to another region required before writes resume | Automatic (other regions continue) |
| Cost | Lower (no write replication fee) | Higher (multi-master RU charges) |
- Design the match state architecture considering the strong consistency limitation:
  - Option A: Single-region write with bounded staleness (no write conflicts, predictable lag)
  - Option B: Multi-region writes with custom conflict resolution (complex but fastest)
  - Option C: Use a different service for match state (e.g., Azure SignalR for real-time sync)
  - Recommend and justify your choice
Part 3: Storage account redundancy for game assets
- Design the storage redundancy for 5 TB of game assets across these options:
| Redundancy | Copies | Regions | Read During Outage | Cost Multiplier |
|---|---|---|---|---|
| LRS | 3 in 1 zone | 1 | No | 1x |
| ZRS | 3 across zones | 1 | Zone failure: Yes | ~1.25x |
| GRS | 6 (3+3) | 2 | No (write failover needed) | ~2x |
| GZRS | 6 (3 ZRS + 3 LRS) | 2 | Zone failure: Yes | ~2.25x |
| RA-GRS | 6 (3+3) | 2 | Yes (secondary read-only) | ~2x + read ops |
| RA-GZRS | 6 (3 ZRS + 3 LRS) | 2 | Yes (zone + region) | ~2.5x |
- Select the appropriate redundancy for game assets considering:
  - Assets must be available even if a full region fails
  - Read access is needed immediately (cannot wait for failover)
  - RA-GZRS provides the highest availability but at highest cost
  - Is RA-GRS sufficient given CDN caching covers most read scenarios?
- Configure the storage account with the selected redundancy and the CDN for global distribution:
# Create storage account with RA-GRS (CDN edge caching absorbs most reads)
az storage account create \
--resource-group rg-battleforge \
--name stbattleforgeassets \
--location eastus \
--sku Standard_RAGRS \
--kind StorageV2 \
--access-tier Hot
Part 4: CDN and edge availability
- Design the CDN architecture for game asset delivery:
  - Azure CDN (or Azure Front Door with caching) as the primary delivery mechanism
  - Configure cache rules: game assets are immutable (versioned URLs), cache for 30 days
  - Origin failover: if primary storage is unavailable, CDN serves from cache or secondary origin
  - Calculate: with 30-day cache TTL and 5 TB of assets, what percentage is typically cached at edge?
- Design the fallback strategy when CDN cache misses during a primary region outage:
  - Configure CDN origin group with primary (East US) and secondary (RA-GRS secondary endpoint)
  - Health probe on origin to detect failure
  - Automatic origin failover within CDN configuration
- Calculate the total monthly cost for the data tier:
  - Cosmos DB: 3 regions, multi-region write, estimated RU consumption
  - Storage: 5 TB with RA-GRS redundancy
  - CDN: bandwidth costs for global delivery
  - Verify total fits within $8,000/month budget
Success criteria
- ⬜ Cosmos DB configured with multi-region writes and zone redundancy in each region
- ⬜ Consistency level selected and justified for each workload (profiles, match state, leaderboards)
- ⬜ Conflict resolution strategy designed for multi-region writes
- ⬜ Storage account redundancy selected (RA-GRS or RA-GZRS) with justification
- ⬜ CDN configured with origin failover for continuous asset delivery
- ⬜ Total data tier cost estimated and validated against $8K/month budget
Hints
Hint 1: Cosmos DB Consistency and Multi-Region Writes
Critical limitation: Strong consistency is NOT available when multi-region writes are enabled. This is because strong consistency requires synchronous replication to all replicas before acknowledging a write, which is impractical across geographically distant regions (latency would be hundreds of milliseconds).
For multi-region write accounts, Bounded Staleness is the strongest level available. The four usable levels, strongest first:
- Bounded Staleness: guarantees reads are no more than K versions or T seconds behind writes
- Session: guarantees a single client session sees its own writes (most popular choice)
- Consistent Prefix: guarantees reads never see out-of-order writes
- Eventual: no ordering guarantees, lowest latency
For player profiles: Session consistency is ideal (players see their own changes immediately). For match state: Consider a single-write-region approach with Strong consistency for the match database, or use an external coordination mechanism.
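To make the Session guarantee concrete, here is a toy model of read-your-own-writes. It is not the Cosmos DB SDK and does not reflect how the service implements session tokens internally: the `Region` class, the `session_read` function, and the integer token are all invented for illustration. The idea is only that a client carrying a token for its last write is never served by a replica that has not yet applied that write.

```python
from dataclasses import dataclass, field


@dataclass
class Region:
    """Toy replica: stores documents plus the last write sequence number applied."""
    name: str
    applied_seq: int = 0
    docs: dict = field(default_factory=dict)

    def apply(self, seq: int, key: str, value: str) -> None:
        self.docs[key] = value
        self.applied_seq = seq


def session_read(regions, session_token: int, key: str):
    """Return (region name, value) from the first region, in preference order,
    that has caught up to the client's session token."""
    for region in regions:
        if region.applied_seq >= session_token:
            return region.name, region.docs.get(key)
    raise RuntimeError("no region has caught up to the session token")


# The player writes in West Europe; replication to East US has not happened yet.
west_europe = Region("westeurope")
east_us = Region("eastus")
west_europe.apply(seq=1, key="player42", value="level 10")
session_token = 1  # hypothetical token: sequence number of the client's last write

# The writer's own read skips East US because it is behind the token.
print(session_read([east_us, west_europe], session_token, "player42"))

# Another player with no token can be served by East US and see nothing yet.
print(session_read([east_us, west_europe], 0, "player42"))
```

The second read is the "others see eventually" half of the guarantee: it is allowed to return stale data until replication catches up.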
Hint 2: Cosmos DB Multi-Region Write Conflict Resolution
When two regions write to the same document simultaneously, a conflict occurs. Resolution options:
Last Writer Wins (LWW):
- Default policy, automatic
- Uses _ts (timestamp) or a custom path to determine the winner
- Simpler but may lose data (losing write is discarded)
- Good for: player profiles where latest state is what matters
Custom conflict resolution (stored procedure):
- Your code decides how to merge conflicting writes
- Can implement custom merge logic (e.g., combine inventory changes)
- More complex but preserves both writes
- Good for: game inventory where both additions should be kept
Conflict feed (manual resolution):
- Conflicts are written to a conflict feed for application-level resolution
- Application reads and resolves conflicts asynchronously
- Most flexible but highest latency for resolution
For BattleForge player profiles: LWW with _ts is appropriate. If a player updates their profile from two devices simultaneously, the latest update wins. For inventory, a custom merge (combining both inventory changes) prevents item loss.
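The difference between the two policies is easiest to see on a small example. The sketch below simulates both in plain Python; in Cosmos DB a custom policy is implemented as a JavaScript stored procedure, so this is only a model of the logic. The document shape (an `items` list) and the `merge_inventory` function are assumptions made for illustration.

```python
def last_writer_wins(a: dict, b: dict) -> dict:
    """Keep the version with the higher _ts; the other write is discarded."""
    return a if a["_ts"] >= b["_ts"] else b


def merge_inventory(a: dict, b: dict) -> dict:
    """Hypothetical custom merge: keep the union of items from both writes."""
    merged = dict(last_writer_wins(a, b))
    merged["items"] = sorted(set(a["items"]) | set(b["items"]))
    return merged


# Two regions update the same inventory document at nearly the same time.
east_us = {"id": "player42", "_ts": 1700000000, "items": ["sword", "shield"]}
japan_east = {"id": "player42", "_ts": 1700000001, "items": ["sword", "potion"]}

print(last_writer_wins(east_us, japan_east)["items"])  # the shield is lost
print(merge_inventory(east_us, japan_east)["items"])   # all three items are kept
```

A union merge like this only handles additions. Removals (a consumed potion, a traded item) would need explicit operation records, which is part of why custom resolution is the more complex option.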
Hint 3: RA-GRS vs CDN for Asset Availability
Both provide read availability during outages, but serve different purposes:
RA-GRS (Read-Access Geo-Redundant Storage):
- Secondary endpoint always available for reads: stbattleforgeassets-secondary.blob.core.windows.net
- RPO: up to 15 minutes (async replication lag)
- No caching - every read goes to storage
- Full 5 TB available from secondary at all times
- Use as CDN origin failover, not as direct player endpoint
Azure CDN:
- Cached at global edge locations (150+ PoPs worldwide)
- Sub-50ms latency to most players globally
- Serves from cache even if origin is completely down (until TTL expires)
- With 30-day TTL and versioned URLs: 95%+ cache hit rate for game assets
- Missing assets (cache miss) need a healthy origin - this is where RA-GRS secondary helps
Recommended: CDN as primary delivery with RA-GRS secondary as failover origin.
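The origin failover decision can be sketched as a priority list with health checks. This is a simplified model of what a CDN origin group does, not its actual configuration or API: `blob_endpoints`, `pick_origin`, and the probe-result dictionary are hypothetical names. The only fact taken from the text is the `-secondary` suffix on the RA-GRS read endpoint.

```python
from typing import Dict, List, Optional


def blob_endpoints(account: str) -> List[str]:
    """Primary endpoint first, then the RA-GRS read-only secondary."""
    return [
        f"https://{account}.blob.core.windows.net",
        f"https://{account}-secondary.blob.core.windows.net",
    ]


def pick_origin(origins: List[str], healthy: Dict[str, bool]) -> Optional[str]:
    """Return the first origin whose last health probe succeeded, else None."""
    for origin in origins:
        if healthy.get(origin, False):
            return origin
    return None


origins = blob_endpoints("stbattleforgeassets")

# Simulated probe results during a primary region outage.
probe_results = {origins[0]: False, origins[1]: True}
print(pick_origin(origins, probe_results))
```

This path is only exercised on a cache miss. Requests for assets already at the edge never reach either origin.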
Hint 4: Cosmos DB Pricing for Multi-Region
Cosmos DB multi-region write cost considerations:
- Write RU cost: billed per region that participates in writes (effectively multiplied by region count)
- Example: 10,000 RU/s provisioned, 3 write regions = 30,000 RU/s billed
- Read RUs: billed per region where reads occur
- Alternative: use autoscale to avoid over-provisioning (you set a max RU/s and are billed each hour for the highest RU/s the system scaled to, with a floor of 10% of max)
Cost estimate for BattleForge:
- Player profile operations: ~5,000 RU/s average (peaks to 15,000 during events)
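- 3 write regions: 5,000 RU/s x 3 = 15,000 RU/s billed
- At $0.008 per 100 RU/s/hour: 15,000/100 x $0.008 x 730 hours = ~$876/month (this is the single-write-region list rate; multi-region write accounts are billed at a higher rate per 100 RU/s, so confirm against the current pricing page)
- With autoscale (max 50,000 RU/s): the floor is 10% of max, so 5,000 RU/s when idle = 5,000/100 x $0.008 x 730 = ~$292/month per region at the same rate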
Storage: $0.25/GB/month for data, replicated to 3 regions = $0.75/GB/month effective
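The arithmetic above is easy to re-run with different assumptions. The sketch below reproduces the figures from this hint using the same rates; the function names are made up for illustration, and the rates are the ones quoted in this hint rather than current list prices, so check the pricing page before relying on the totals.

```python
HOURS_PER_MONTH = 730


def provisioned_cost(ru_per_sec: float, regions: int,
                     rate_per_100_ru_hour: float = 0.008) -> float:
    """Monthly cost of provisioned throughput that is billed in every region."""
    billed_ru = ru_per_sec * regions
    return billed_ru / 100 * rate_per_100_ru_hour * HOURS_PER_MONTH


def storage_cost(gb: float, regions: int, rate_per_gb: float = 0.25) -> float:
    """Monthly storage cost when data is replicated to every region."""
    return gb * regions * rate_per_gb


# 5,000 RU/s average across 3 write regions.
print(round(provisioned_cost(5_000, 3), 2))

# Autoscale floor: 10% of a 50,000 RU/s max, for a single region.
print(round(provisioned_cost(50_000 * 0.10, 1), 2))

# Effective storage price per GB across 3 regions.
print(round(storage_cost(1, 3), 2))
```

Passing a different `rate_per_100_ru_hour` or region count shows how quickly the throughput line moves relative to the $8,000/month budget.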
Hint 5: Cosmos DB 99.999% SLA Requirements
To achieve the 99.999% SLA (five nines, about 5.3 minutes of downtime per year, or roughly 26 seconds per month), ALL of the following must be configured:
- Multi-region writes enabled (distributes writes, no single point of failure)
- At least 2 regions configured (minimum for geo-redundancy)
- Zone redundancy enabled in each region (isZoneRedundant=true)
Without multi-region writes: the write SLA is at most 99.995% (99.99% without zone redundancy), and reads reach 99.999% only when the account spans multiple regions.
With multi-region writes: SLA is 99.999% for both reads and writes.
This is the highest SLA of any Azure database service. Compare:
- Azure SQL Business Critical zone-redundant: 99.995%
- Azure SQL General Purpose zone-redundant: 99.995%
- Cosmos DB single-region zone-redundant: 99.995%
- Cosmos DB multi-region multi-write: 99.999%
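SLA percentages are easier to compare when converted into a downtime allowance. The short calculation below does that for the tiers listed here, assuming a 365-day year and a 730-hour month; the function name is illustrative.

```python
def allowed_downtime_seconds(sla_percent: float, period_hours: float) -> float:
    """Maximum downtime, in seconds, that still meets the SLA over the period."""
    return (100 - sla_percent) / 100 * period_hours * 3600


for sla in (99.99, 99.995, 99.999):
    per_year = allowed_downtime_seconds(sla, 365 * 24)
    per_month = allowed_downtime_seconds(sla, 730)
    print(f"{sla}%: {per_year / 60:.1f} min/year, {per_month:.0f} s/month")
```

Five nines works out to roughly 5.3 minutes per year, or about 26 seconds per month, which is a tenth of the allowance at 99.99%.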
Learning resources
- Distribute data globally with Azure Cosmos DB
- Consistency levels in Azure Cosmos DB
- Conflict resolution in Azure Cosmos DB
- Azure Storage redundancy
- Azure CDN overview
- High availability for Azure Cosmos DB
Knowledge check
1. BattleForge needs all players in a multiplayer match to see the same game state. Why can't they use Strong consistency with multi-region writes, and what is the recommended alternative?
Strong consistency is not available when multi-region writes are enabled in Cosmos DB. Strong consistency requires synchronous acknowledgment from all replicas before completing a write, which creates unacceptable latency across geographically distant regions. The recommended alternative for match state is to use a single write region with Strong consistency for the match database specifically (match sessions are typically regional). Bounded Staleness is the strongest level that works with multi-region writes, but on accounts spanning multiple regions the minimum staleness window is 100,000 operations or 300 seconds, so it bounds the lag rather than giving every player the same view. Alternatively, use Azure SignalR Service for real-time state synchronization, with Cosmos DB only for persistence.
2. A Cosmos DB account with multi-region writes and zone redundancy provides 99.999% SLA. What does this mean in practical downtime terms, and what scenario could still cause unavailability?
99.999% availability means at most about 5.3 minutes of downtime per year (roughly 26 seconds per month). This is achieved because writes can succeed in any of the configured regions - a full region failure simply means writes land in other regions. Scenarios that could still cause unavailability include: (1) Multiple regions failing simultaneously (extremely unlikely), (2) Azure-wide platform issues affecting the Cosmos DB control plane globally, (3) Client-side network issues (not covered by SLA), (4) Exceeding provisioned throughput causing 429 throttling errors (not a true availability failure but impacts users similarly). Proper RU capacity planning and autoscale mitigate scenario 4.
3. BattleForge uses RA-GRS for their 5 TB game asset storage. During a primary region outage, what is the maximum staleness of data players might read from the secondary?
Up to 15 minutes (but typically much less). RA-GRS replicates data asynchronously to the secondary region. Microsoft targets an RPO of 15 minutes (no SLA guarantee on exact lag). In practice, replication is usually seconds behind. For game assets that are written once and read many times (immutable, versioned files), this staleness is irrelevant - assets uploaded 15+ minutes ago are fully replicated. The only risk is very recently uploaded assets (new game update) not yet replicated. Mitigation: upload new assets at least 30 minutes before making them referenced by game clients, or use CDN cache warming.
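The "upload early" mitigation can be expressed as a simple publish gate. Geo-redundant storage accounts report a Last Sync Time, and writes made before that time are available on the secondary. The sketch below assumes that value has already been fetched; the function names and the 30-minute default mirror the suggestion above and are otherwise hypothetical.

```python
from datetime import datetime, timedelta, timezone


def replicated_to_secondary(uploaded_at: datetime, last_sync_time: datetime) -> bool:
    """Writes made at or before the last sync time are readable from the secondary."""
    return uploaded_at <= last_sync_time


def safe_to_publish(uploaded_at: datetime, last_sync_time: datetime, now: datetime,
                    min_age: timedelta = timedelta(minutes=30)) -> bool:
    """Publish only when the asset is replicated and at least min_age old."""
    return (replicated_to_secondary(uploaded_at, last_sync_time)
            and now - uploaded_at >= min_age)


now = datetime(2024, 1, 1, 12, 0, tzinfo=timezone.utc)
last_sync = now - timedelta(minutes=2)

old_asset = now - timedelta(hours=1)
fresh_asset = now - timedelta(minutes=1)

print(safe_to_publish(old_asset, last_sync, now))    # replicated and old enough
print(safe_to_publish(fresh_asset, last_sync, now))  # uploaded after the last sync
```

Gating the client manifest on a check like this means a regional failover never points players at an asset version the secondary does not have yet.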
4. Two players in different regions simultaneously purchase the same limited-edition item in BattleForge (only 1 available). With Last Writer Wins conflict resolution, what happens?
Both writes initially succeed locally (each player sees the purchase confirmed), but LWW conflict resolution will keep only the write with the later timestamp, effectively "canceling" the other player's purchase after replication. This creates a poor user experience - one player thinks they bought the item but it later disappears. For inventory with limited quantities, LWW is inappropriate. Better options: (1) Use a single-write region for the inventory service, so all purchases are ordered in one place and a conditional update can reject the second buyer, (2) Use custom conflict resolution with a stored procedure that checks quantity before resolving, (3) Use an external coordination mechanism (distributed lock via Redis) for limited-quantity operations.
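Option (1) relies on a conditional write: the purchase succeeds only if the item has not changed since the buyer read it. The toy store below models that with an integer version standing in for an ETag. `ItemStore` and its methods are invented for illustration and are not the Cosmos DB API.

```python
class ItemStore:
    """Toy single-write-region store with ETag-style optimistic concurrency."""

    def __init__(self, stock: int):
        self.stock = stock
        self.etag = 0
        self.buyers = []

    def read(self):
        """Return the current stock and the version the caller should echo back."""
        return self.stock, self.etag

    def purchase(self, player: str, expected_etag: int) -> bool:
        """Succeed only if nobody changed the item since the caller read it."""
        if expected_etag != self.etag or self.stock == 0:
            return False
        self.stock -= 1
        self.etag += 1
        self.buyers.append(player)
        return True


store = ItemStore(stock=1)
_, etag_seen_by_a = store.read()
_, etag_seen_by_b = store.read()  # both players read before either buys

print(store.purchase("player_a", etag_seen_by_a))  # first conditional write wins
print(store.purchase("player_b", etag_seen_by_b))  # stale version, purchase rejected
```

The second player gets an immediate rejection instead of a confirmation that is later reversed, which is the user-experience problem LWW creates.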
Validation lab
Deploy a minimal proof-of-concept to validate your design:
- Create a resource group for this lab:
az group create --name rg-az305-challenge32 --location eastus
- Deploy a Cosmos DB account with multi-region writes enabled:
az cosmosdb create \
--resource-group rg-az305-challenge32 \
--name cosmos-challenge32-$RANDOM \
--locations regionName=eastus failoverPriority=0 isZoneRedundant=false \
--locations regionName=westus failoverPriority=1 isZoneRedundant=false \
--enable-multiple-write-locations true \
--default-consistency-level Session
- Create a database and container with a partition key:
COSMOS_NAME=$(az cosmosdb list --resource-group rg-az305-challenge32 --query "[0].name" -o tsv)
az cosmosdb sql database create \
--resource-group rg-az305-challenge32 \
--account-name $COSMOS_NAME \
--name gamedb
az cosmosdb sql container create \
--resource-group rg-az305-challenge32 \
--account-name $COSMOS_NAME \
--database-name gamedb \
--name profiles \
--partition-key-path "/userId" \
--throughput 400
- Verify multi-region write is enabled and regions are active:
az cosmosdb show \
--resource-group rg-az305-challenge32 \
--name $COSMOS_NAME \
--query "{MultiRegionWrites:enableMultipleWriteLocations, Regions:writeLocations[].locationName}" -o json
- Confirm the account exposes write endpoints in both regions:
az cosmosdb show \
--resource-group rg-az305-challenge32 \
--name $COSMOS_NAME \
--query "writeLocations[].[locationName, documentEndpoint]" -o table
This mini-deployment validates your design decisions with real Azure resources. It is optional but recommended.
Cleanup
az group delete --name rg-az305-challenge32 --yes --no-wait
Next: Challenge 33: Design a Highly Available Multi-Region Application