Challenge 12: Mono-repo vs. multi-repo
Exam skills
- Design a strategy for scaling and optimizing a Git repository, including Scalar and cross-repository sharing
Scenario
Contoso Ltd operates 15 microservices that make up its e-commerce platform: user-service, catalog-service, order-service, payment-service, shipping-service, notification-service, search-service, analytics-service, auth-service, inventory-service, review-service, recommendation-service, admin-portal, customer-portal, and shared-libs. Some teams advocate a mono-repo (easier cross-service refactoring, a single CI pipeline, atomic changes). Others want separate repositories (clear ownership, independent deploys, smaller clone sizes). The repository has grown to 8 GB with 5 years of history and 50,000 commits. Clone time is 25 minutes. The CTO wants a data-driven recommendation with implementation details for whichever approach is chosen.
Tasks
Task 1: Mono-repo advantages and disadvantages
Document the trade-offs for Contoso's specific situation:
Mono-repo analysis for Contoso's e-commerce platform
Advantages:
- Atomic cross-service changes (rename a shared type and update all 15 services in a single commit)
- Single source of truth for shared libraries (no version drift between services)
- Unified CI/CD pipeline configuration
- Easier code discovery and cross-team collaboration
- Consistent tooling and linting across all services
- Simplified dependency management (all services use the same versions)
- Refactoring across service boundaries is straightforward
Disadvantages:
- Repository size (8 GB) makes cloning slow (25 min)
- Every push from any of the 50 developers triggers CI for the whole repository unless path filters are configured
- Permission granularity is limited (harder to restrict access per service)
- Single point of failure (a repo outage affects every team)
- Merge conflicts in shared files (package.json, CI configuration)
- Git operations slow down as history grows
- All teams must agree on a branching strategy
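To put the clone-time figure in context, the scenario's numbers can be turned into an average transfer rate. This is a rough sketch that assumes the 8 GB figure means 8 GiB transferred over the full 25 minutes. The 500 MiB transfer size is a purely hypothetical input for illustration, not a measurement.

```shell
#!/usr/bin/env bash
# Average clone throughput in KiB/s (integer division).
# Arguments: repository size in GiB, clone duration in minutes.
clone_throughput_kib() {
  local size_gib="$1" minutes="$2"
  echo $(( size_gib * 1024 * 1024 / (minutes * 60) ))
}

# Estimated seconds to transfer SIZE_MIB at RATE_KIB KiB/s (integer division).
estimated_clone_seconds() {
  local size_mib="$1" rate_kib="$2"
  echo $(( size_mib * 1024 / rate_kib ))
}

rate="$(clone_throughput_kib 8 25)"
echo "Average rate: ${rate} KiB/s"
# 500 MiB is a hypothetical transfer size, not a measured value.
echo "Hypothetical 500 MiB transfer: $(estimated_clone_seconds 500 "$rate") s"
```

Real clone times also depend on server-side packing and local checkout cost, so a figure like this is only a starting point for the comparison.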
Example mono-repo structure:
contoso-platform/
├── services/
│ ├── user-service/
│ │ ├── src/
│ │ ├── tests/
│ │ ├── Dockerfile
│ │ └── package.json
│ ├── order-service/
│ │ ├── src/
│ │ ├── tests/
│ │ ├── Dockerfile
│ │ └── package.json
│ └── payment-service/
│ └── ...
├── libs/
│ ├── shared-types/
│ ├── common-utils/
│ └── auth-middleware/
├── infrastructure/
│ ├── terraform/
│ └── kubernetes/
├── tools/
│ ├── scripts/
│ └── generators/
├── .github/workflows/
├── package.json (workspace root)
└── nx.json (or turbo.json)
Task 2: Multi-repo advantages and disadvantages
Multi-repo analysis for Contoso's e-commerce platform
Advantages:
- Clear ownership boundaries (each team owns its repo)
- Independent release cycles and versioning
- Granular access control per repository
- Smaller repositories are fast to clone and operate on
- Teams can choose their own tools and languages
- Failures are isolated (CI problems in one repo do not block others)
- Scales well with organizational growth
Disadvantages:
- Cross-service changes require coordinated PRs across repos
- Versioning shared libraries creates diamond dependency problems
- Inconsistent tooling and practices across repos
- Discovery is harder (where does this service live?)
- Integration tests require checking out multiple repos
- Dependency updates must be propagated to each repo separately
- Refactoring across service boundaries is painful
Example multi-repo structure:
# GitHub organization: contoso
contoso/user-service (team: identity)
contoso/catalog-service (team: catalog)
contoso/order-service (team: commerce)
contoso/payment-service (team: commerce)
contoso/shipping-service (team: fulfillment)
contoso/notification-service (team: platform)
contoso/search-service (team: catalog)
contoso/analytics-service (team: data)
contoso/auth-service (team: identity)
contoso/inventory-service (team: fulfillment)
contoso/review-service (team: catalog)
contoso/recommendation-service (team: data)
contoso/admin-portal (team: platform)
contoso/customer-portal (team: frontend)
contoso/shared-libs (team: platform)
Task 3: Implement Scalar for large-repository optimization
Scalar (maintained by Microsoft, bundled with Git since version 2.38) optimizes performance for large repositories:
# Register the repository with Scalar (enables all optimizations)
scalar register
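# What Scalar configures on a registered repository:
# - Filesystem monitor (FSMonitor for faster git status, where supported)
# - Commit-graph (faster git log and traversal)
# - Multi-pack index (faster object lookups)
# - Background maintenance (prefetch, commit-graph, loose-objects, incremental-repack)
# Note: `scalar register` does not turn an existing full clone into a partial clone;
# partial clone and sparse-checkout come from `scalar clone`.
# Clone a large repo with Scalar (partial clone + sparse checkout)
scalar clone https://github.com/contoso/platform-monorepo.git
# By default, scalar clone places the working tree in a src/ subdirectory
cd platform-monorepo/src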
# Verify Scalar configuration
scalar list
# Output: C:/repos/platform-monorepo
# Check what optimizations are active
git config --list | grep -E "(core.fsmonitor|core.multipackindex|fetch.writeCommitGraph|maintenance)"
# Output:
# core.fsmonitor=true
# core.multipackindex=true
# fetch.writecommitgraph=false   (commit-graph updates are left to background maintenance)
# maintenance.auto=false
# maintenance.strategy=incremental
# Run all maintenance tasks now (scalar run requires a task name)
scalar run all
# Available tasks: config, commit-graph, fetch, loose-objects, pack-files
# Manual Scalar commands
scalar diagnose # Generate diagnostic zip for troubleshooting
scalar cache-server --set https://cache.contoso.internal # Use a cache server (Microsoft's Git fork only, not upstream Git)
scalar unregister # Remove Scalar from this repo
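Because Scalar is bundled only from Git 2.38 onward, a setup script may want to check the installed version before calling it. The helper names below (`version_at_least`, `git_supports_scalar`) are hypothetical and not part of Git or Scalar. The sketch compares only the major and minor components of the version string.

```shell
#!/usr/bin/env bash
# version_at_least VERSION MIN_MAJOR MIN_MINOR
# Succeeds when VERSION (e.g. "2.43.0" or "2.43.0.windows.1") is >= MIN_MAJOR.MIN_MINOR.
version_at_least() {
  local version="$1" min_major="$2" min_minor="$3"
  local major rest minor
  major="${version%%.*}"     # text before the first dot
  rest="${version#*.}"       # text after the first dot
  minor="${rest%%.*}"        # second numeric component
  # Reject anything that is not purely numeric.
  [[ "$major" =~ ^[0-9]+$ && "$minor" =~ ^[0-9]+$ ]] || return 2
  (( major > min_major || (major == min_major && minor >= min_minor) ))
}

# Hypothetical wrapper: reads the third field of `git version` output.
git_supports_scalar() {
  local version
  version="$(git version 2>/dev/null | awk '{print $3}')"
  version_at_least "$version" 2 38
}
```

A script could then guard the call with `git_supports_scalar && command -v scalar >/dev/null && scalar register`, since some Git packages may not install the `scalar` binary even on a recent version.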
Configure Scalar for CI runners:
# .github/workflows/ci-with-scalar.yml
name: CI with Scalar optimization
on: [push, pull_request]
jobs:
  build:
    runs-on: ubuntu-latest
    steps:
      - name: Scalar clone (partial + sparse)
        run: |
          scalar clone https://github.com/contoso/platform-monorepo.git repo
          # scalar clone puts the working tree under <enlistment>/src
          cd repo/src
          # Only check out the paths needed for the changed service
          git sparse-checkout set services/order-service libs/shared-types
Task 4: Configure sparse-checkout for access to a subset of the mono-repo
Let developers work only on their team's service inside the mono-repo:
# Initialize sparse-checkout in cone mode (faster than pattern mode)
git sparse-checkout init --cone
# Only check out the order-service and shared libraries
git sparse-checkout set services/order-service libs/shared-types libs/common-utils
# View what's included
git sparse-checkout list
# Output:
# services/order-service
# libs/shared-types
# libs/common-utils
# The working directory now only shows those paths:
ls services/
# Output: order-service/
# Add another service temporarily (e.g., for cross-service debugging)
git sparse-checkout add services/payment-service
# Remove a path from sparse checkout
git sparse-checkout set services/order-service libs/shared-types
# (payment-service files disappear from working directory)
# Disable sparse-checkout (get everything back)
git sparse-checkout disable
# Combine with partial clone for maximum speed
git clone --filter=blob:none --sparse https://github.com/contoso/platform-monorepo.git
cd platform-monorepo
git sparse-checkout set services/user-service libs/auth-middleware
# Blobs are fetched on demand for the sparse paths; commits and trees for the full history are still downloaded
Create team-specific sparse-checkout profiles:
#!/bin/bash
# scripts/sparse-profiles/commerce-team.sh
git sparse-checkout set \
services/order-service \
services/payment-service \
services/inventory-service \
libs/shared-types \
libs/common-utils \
infrastructure/kubernetes/order-service \
infrastructure/kubernetes/payment-service
#!/bin/bash
# scripts/sparse-profiles/frontend-team.sh
git sparse-checkout set \
services/customer-portal \
services/admin-portal \
libs/shared-types \
libs/ui-components
#!/bin/bash
# scripts/sparse-profiles/data-team.sh
git sparse-checkout set \
services/analytics-service \
services/recommendation-service \
libs/shared-types \
libs/data-utils \
infrastructure/terraform/analytics
Task 5: Git submodules for cross-repository dependencies
Configure submodules when using multi-repo to share common libraries:
# Add shared-libs as a submodule in the order-service repo
cd order-service
git submodule add -b main https://github.com/contoso/shared-libs.git libs/shared
git commit -m "chore: add shared-libs as submodule"
# The .gitmodules file tracks submodule configuration
cat .gitmodules
# [submodule "libs/shared"]
# path = libs/shared
# url = https://github.com/contoso/shared-libs.git
# branch = main
# Pin to a specific version/tag of shared-libs
cd libs/shared
git checkout v2.3.0
cd ../..
git add libs/shared
git commit -m "chore: pin shared-libs to v2.3.0"
# Clone a repo with submodules
git clone --recurse-submodules https://github.com/contoso/order-service.git
# If already cloned without submodules, initialize them
git submodule init
git submodule update
# Update submodule to latest commit on its tracked branch
git submodule update --remote libs/shared
git add libs/shared
git commit -m "chore: update shared-libs to latest"
# Update all submodules
git submodule update --remote --merge
# Run a command in all submodules
git submodule foreach 'git checkout main && git pull'
# Remove a submodule
git submodule deinit libs/shared
git rm libs/shared
rm -rf .git/modules/libs/shared
git commit -m "chore: remove shared-libs submodule"
Task 6: Multi-repository checkout in Azure DevOps Pipelines
Configure Azure Pipelines to check out multiple repositories:
# azure-pipelines.yml - Multi-repo checkout
trigger:
  branches:
    include:
      - main
resources:
  repositories:
    - repository: shared-libs
      type: git
      name: Contoso-Platform/shared-libs
      ref: refs/tags/v2.3.0
    - repository: infrastructure
      type: git
      name: Contoso-Platform/infrastructure
      ref: refs/heads/main
    - repository: order-service
      type: github
      name: contoso/order-service
      endpoint: github-service-connection
pool:
  vmImage: 'ubuntu-latest'
steps:
  # Check out the primary repo (self)
  - checkout: self
    path: s/payment-service
    fetchDepth: 1
  # Check out additional repos
  - checkout: shared-libs
    path: s/shared-libs
    fetchDepth: 1
  - checkout: infrastructure
    path: s/infrastructure
    fetchDepth: 1
  - script: |
      echo "Directory structure:"
      ls -la $(Pipeline.Workspace)/s/
      # Output:
      # payment-service/
      # shared-libs/
      # infrastructure/
    displayName: 'Verify multi-repo checkout'
  - script: |
      cd $(Pipeline.Workspace)/s/payment-service
      npm ci
      # Reference shared libs from adjacent checkout
      npm link ../shared-libs
      npm run build
      npm test
    displayName: 'Build with shared dependencies'
  - script: |
      cd $(Pipeline.Workspace)/s/infrastructure
      terraform init
      terraform plan -var-file=environments/prod.tfvars
    displayName: 'Validate infrastructure'
Task 7: Multi-repository checkout in GitHub Actions
Configure GitHub Actions to work with multiple repositories:
# .github/workflows/ci-multi-repo.yml
name: CI with multi-repo dependencies
on:
  push:
    branches: [main]
  pull_request:
    branches: [main]
jobs:
  build:
    runs-on: ubuntu-latest
    steps:
      # Check out the primary repo
      - uses: actions/checkout@v4
        with:
          path: order-service
      # Check out shared libraries (public repo)
      - uses: actions/checkout@v4
        with:
          repository: contoso/shared-libs
          ref: v2.3.0
          path: shared-libs
      # Check out private repo (requires PAT or GitHub App token)
      - uses: actions/checkout@v4
        with:
          repository: contoso/infrastructure
          token: ${{ secrets.CROSS_REPO_TOKEN }}
          path: infrastructure
      - name: Build with dependencies
        working-directory: order-service
        run: |
          npm ci
          # Symlink the shared libs. Use an absolute target: a relative symlink
          # target is resolved from the link's own directory, not the current one.
          mkdir -p node_modules/@contoso
          ln -sfn "$GITHUB_WORKSPACE/shared-libs/packages/common" node_modules/@contoso/common
          npm run build
      - name: Run integration tests
        run: |
          cd order-service
          npm run test:integration -- --config ../infrastructure/test-config.json
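When linking a sibling checkout into node_modules, note that `ln -s` stores the target string as given, and a relative target is later resolved from the directory that contains the link. The helper below is a minimal sketch that sidesteps this by resolving the source to an absolute path first. The name `link_package` is hypothetical.

```shell
#!/usr/bin/env bash
# link_package SRC_DIR DEST_LINK
# Creates (or replaces) a symlink at DEST_LINK pointing at the absolute path of SRC_DIR.
link_package() {
  local src="$1" dest="$2" abs_src
  # Resolve SRC_DIR to an absolute path; fail if it does not exist.
  abs_src="$(cd "$src" && pwd)" || return 1
  # Make sure the parent directory of the link (e.g. node_modules/@contoso) exists.
  mkdir -p "$(dirname "$dest")"
  ln -sfn "$abs_src" "$dest"
}
```

From inside order-service, a call such as `link_package ../shared-libs/packages/common node_modules/@contoso/common` would then work regardless of how deep the link sits.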
Task 8: Build optimization with path-based triggers
Build and test only the services that actually changed:
# .github/workflows/ci-path-triggers.yml
name: Mono-repo path-based CI
on:
  push:
    branches: [main]
  pull_request:
    branches: [main]
jobs:
  detect-changes:
    runs-on: ubuntu-latest
    outputs:
      order-service: ${{ steps.changes.outputs.order-service }}
      payment-service: ${{ steps.changes.outputs.payment-service }}
      shared-libs: ${{ steps.changes.outputs.shared-libs }}
      user-service: ${{ steps.changes.outputs.user-service }}
    steps:
      - uses: actions/checkout@v4
      - uses: dorny/paths-filter@v3
        id: changes
        with:
          filters: |
            order-service:
              - 'services/order-service/**'
              - 'libs/shared-types/**'
              - 'libs/common-utils/**'
            payment-service:
              - 'services/payment-service/**'
              - 'libs/shared-types/**'
            shared-libs:
              - 'libs/**'
            user-service:
              - 'services/user-service/**'
              - 'libs/auth-middleware/**'
  build-order-service:
    needs: detect-changes
    if: needs.detect-changes.outputs.order-service == 'true'
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
        with:
          sparse-checkout: |
            services/order-service
            libs/shared-types
            libs/common-utils
      - name: Build order-service
        working-directory: services/order-service
        run: |
          npm ci
          npm run build
          npm test
  build-payment-service:
    needs: detect-changes
    if: needs.detect-changes.outputs.payment-service == 'true'
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
        with:
          sparse-checkout: |
            services/payment-service
            libs/shared-types
      - name: Build payment-service
        working-directory: services/payment-service
        run: |
          npm ci
          npm run build
          npm test
  # If shared libs change, rebuild ALL dependent services
  build-all-on-shared-change:
    needs: detect-changes
    if: needs.detect-changes.outputs.shared-libs == 'true'
    runs-on: ubuntu-latest
    strategy:
      matrix:
        service:
          - order-service
          - payment-service
          - user-service
          - catalog-service
          - shipping-service
    steps:
      - uses: actions/checkout@v4
      - name: Build ${{ matrix.service }}
        working-directory: services/${{ matrix.service }}
        run: |
          npm ci
          npm run build
          npm test
Equivalent in Azure Pipelines with path-based triggers:
# azure-pipelines.yml - Path-based triggers
trigger:
  branches:
    include:
      - main
  paths:
    include:
      - services/order-service/**
      - libs/shared-types/**
pool:
  vmImage: 'ubuntu-latest'
steps:
  - task: NodeTool@0
    inputs:
      versionSpec: '20.x'
  - script: |
      cd services/order-service
      npm ci
      npm run build
      npm test
    displayName: 'Build and test order-service'
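The same path-to-service mapping can be approximated in a plain shell step, which may be handy on CI systems without a paths-filter equivalent. This sketch reads changed file paths on stdin and prints the affected services. The function name `affected_services` is hypothetical, and the library-to-service mapping simply mirrors the filters in the GitHub Actions workflow of this task; it is not derived from real dependency data.

```shell
#!/usr/bin/env bash
# affected_services: reads changed file paths (one per line) on stdin
# and prints the unique, sorted list of services that should be rebuilt.
affected_services() {
  local path
  while IFS= read -r path; do
    case "$path" in
      services/*/*)
        # services/<name>/... -> <name>
        path="${path#services/}"
        echo "${path%%/*}" ;;
      libs/shared-types/*)
        echo order-service
        echo payment-service ;;
      libs/common-utils/*)
        echo order-service ;;
      libs/auth-middleware/*)
        echo user-service ;;
    esac
  done | sort -u
}
```

A typical invocation would be `git diff --name-only origin/main...HEAD | affected_services`, assuming `origin/main` is the comparison base.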
Break-and-fix exercises
Scenario 1: Sparse-checkout is missing files required for the build
A developer configured sparse-checkout for services/order-service only, but the build fails because it imports from libs/shared-types, which was not included in the checkout.
# Error during build:
npm run build
# ERROR: Cannot find module '@contoso/shared-types'
# Module not found: libs/shared-types/index.ts
# Check what's currently included
git sparse-checkout list
# Output: services/order-service (missing libs!)
Fix: Add the missing dependency paths to the sparse-checkout:
# Add the shared library paths
git sparse-checkout add libs/shared-types libs/common-utils
# Verify the files are now available
ls libs/shared-types/
# Output: index.ts package.json src/ ...
# Re-run the build
cd services/order-service
npm run build
# Success!
# Document dependencies in a sparse profile for the team (run from the repo root)
cd ../..
mkdir -p .sparse-profiles
cat > .sparse-profiles/order-service.txt << 'EOF'
services/order-service
libs/shared-types
libs/common-utils
infrastructure/kubernetes/order-service
EOF
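A profile file like the one above can be fed back into sparse-checkout. The sketch below assumes one path per line and treats blank lines and lines starting with `#` as ignorable. That file format and the function names are assumptions for illustration, not a Git convention.

```shell
#!/usr/bin/env bash
# read_profile FILE: print one sparse path per line,
# skipping blank lines and '#' comments and trimming surrounding whitespace.
read_profile() {
  local line
  while IFS= read -r line || [[ -n "$line" ]]; do
    line="${line#"${line%%[![:space:]]*}"}"   # trim leading whitespace
    line="${line%"${line##*[![:space:]]}"}"   # trim trailing whitespace
    [[ -z "$line" || "$line" == \#* ]] && continue
    printf '%s\n' "$line"
  done < "$1"
}

# apply_profile FILE: replace the current sparse-checkout paths with the profile's paths.
apply_profile() {
  read_profile "$1" | git sparse-checkout set --stdin
}
```

Running `apply_profile .sparse-profiles/order-service.txt` from the repository root would then reproduce the team's checkout on a fresh clone.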
Scenario 2: Submodule is stuck on an old commit after pull
After running git pull, the submodule directory still shows the old version even though the parent repository now records a newer submodule commit.
# The submodule shows as modified but content is old
git status
# Output:
# modified: libs/shared (new commits)
git diff
# Shows submodule pointer changed but local copy is behind
# The submodule directory has the old code
cd libs/shared
git log --oneline -1
# abc1234 (HEAD) old commit from 2 weeks ago
Fix: Update the submodule to match what the parent repository expects:
# Update submodule to the commit specified by the parent (run from the repo root)
cd ../..
git submodule update --init --recursive
# Verify it's now at the correct commit
cd libs/shared
git log --oneline -1
# def5678 (HEAD) latest pinned commit
# If you want to update to the latest on the tracked branch instead:
cd ../..
git submodule update --remote libs/shared
git add libs/shared
git commit -m "chore: update shared-libs submodule to latest"
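This kind of drift can be spotted before a build by reading the one-character prefix that `git submodule status` puts in front of each line: `-` for an uninitialized submodule, `+` when the checked-out commit differs from the one recorded in the parent, and `U` for merge conflicts. The classifier below is a small sketch with a hypothetical function name. It assumes submodule paths contain no spaces.

```shell
#!/usr/bin/env bash
# classify_submodule_status: reads `git submodule status` lines on stdin
# and prints "<path>: <state>" for each submodule.
classify_submodule_status() {
  local line flag path
  while IFS= read -r line; do
    [[ -z "$line" ]] && continue
    flag="${line:0:1}"
    # Remaining fields: <sha> <path> [(<describe output>)]
    read -r _ path _ <<< "${line:1}"
    case "$flag" in
      '-') echo "$path: not initialized" ;;
      '+') echo "$path: out of sync" ;;
      'U') echo "$path: merge conflict" ;;
      *)   echo "$path: ok" ;;
    esac
  done
}
```

For example, `git submodule status | classify_submodule_status | grep -v ': ok$'` lists only the submodules that need attention.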
Knowledge check
1. Contoso has a mono-repo with 15 microservices. A developer works only on order-service and needs to clone the repo quickly. Which combination of Git features provides the fastest clone with minimal disk usage?
2. What does Scalar's 'scalar register' command enable for a Git repository?
3. In a multi-repo setup, team A updates 'shared-libs' from v2.3.0 to v2.4.0 with a breaking change. What is the main challenge this creates?
4. An Azure Pipelines YAML file uses 'trigger.paths.include' to build only when specific paths change. A developer modifies 'libs/shared-types/index.ts'. Which pipeline behavior is correct?
Cleanup
# Remove Scalar registration
scalar unregister 2>/dev/null
# Reset sparse-checkout
git sparse-checkout disable 2>/dev/null
# Remove submodules added during testing
git submodule deinit --all -f 2>/dev/null
rm -rf .git/modules/* 2>/dev/null
# Remove test directories and files
rm -rf services/ libs/ infrastructure/ tools/ 2>/dev/null
rm -f .gitmodules nx.json turbo.json 2>/dev/null
rm -rf .sparse-profiles/ 2>/dev/null
# Remove workflow files created during this challenge
rm -f .github/workflows/ci-path-triggers.yml
rm -f .github/workflows/ci-multi-repo.yml
rm -f .github/workflows/ci-with-scalar.yml
# Clean up any partial clone filter config
git config --unset remote.origin.promisor 2>/dev/null
git config --unset remote.origin.partialclonefilter 2>/dev/null
# Verify clean state
git status
git config --list | grep -E "(scalar|sparse|fsmonitor|multipack)"