Computer Vision Solutions
This domain covers building applications that can see and understand images, documents, and video using Azure AI Vision services. It represents 10–15% of the AI-102 exam.
You'll work with the Image Analysis 4.0 API for general-purpose vision tasks, Custom Vision for domain-specific classification and detection, the Read API for OCR, the Face API for facial recognition scenarios, and Video Indexer for video content analysis. Each challenge focuses on one capability and grounds it in a real-world scenario.
The exam tests both SDK usage patterns and your understanding of when to use each vision service. Know the difference between prebuilt models (Image Analysis) and custom-trained models (Custom Vision), and know when plain OCR is enough versus when structured extraction with Document Intelligence is the better fit.
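The service choice described above can be sketched as a small rule-of-thumb helper. This is only an illustration, not an official decision tree: `VisionTask`, its fields, and `suggest_service` are hypothetical names invented for this example, and the order of the checks is an assumption about which requirement usually dominates.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class VisionTask:
    """Hypothetical description of a vision requirement (illustrative only)."""
    is_video: bool = False
    involves_faces: bool = False
    needs_structured_fields: bool = False  # e.g. invoice totals, form key-value pairs
    needs_text_extraction: bool = False    # raw printed or handwritten text
    needs_custom_labels: bool = False      # domain-specific classes or objects


def suggest_service(task: VisionTask) -> str:
    """Return a likely first service to evaluate for the task (rule of thumb)."""
    if task.is_video:
        return "Video Indexer"
    if task.involves_faces:
        return "Face API"
    if task.needs_structured_fields:
        return "Document Intelligence"
    if task.needs_text_extraction:
        return "Read API (OCR)"
    if task.needs_custom_labels:
        return "Custom Vision"
    # No special requirement: the prebuilt general-purpose model is the default.
    return "Image Analysis"


if __name__ == "__main__":
    print(suggest_service(VisionTask(needs_custom_labels=True)))
    print(suggest_service(VisionTask(needs_text_extraction=True,
                                     needs_structured_fields=True)))
```

Real scenarios often combine services, so treat the returned name as a starting point for the comparison rather than a final answer.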
What You'll Learn
- Analyze images with Azure AI Vision (captions, tags, objects, people)
- Train and deploy custom image classification models
- Implement custom object detection with bounding boxes
- Extract text from images with OCR (Read API)
- Detect and analyze faces (with responsible AI considerations)
- Analyze video content with Video Indexer
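As a rough illustration of the first item in this list, the sketch below builds (but does not send) an Image Analysis 4.0 REST request using only the standard library. The endpoint, key, and image URL are placeholders, and `build_analyze_request` is a hypothetical helper; the route and `api-version=2023-10-01` are the values I believe apply to the GA REST API, so check them against the current reference before relying on them.

```python
import json
from urllib.parse import urlencode
from urllib.request import Request


def build_analyze_request(endpoint, key, image_url,
                          features=("caption", "tags", "objects", "people")):
    """Build a POST request asking Image Analysis for the given visual features.

    Hypothetical helper: nothing is sent over the network here.
    """
    # Keep commas literal so the features list stays readable in the URL.
    query = urlencode(
        {"api-version": "2023-10-01", "features": ",".join(features)},
        safe=",",
    )
    url = f"{endpoint.rstrip('/')}/computervision/imageanalysis:analyze?{query}"
    body = json.dumps({"url": image_url}).encode("utf-8")
    return Request(
        url,
        data=body,
        method="POST",
        headers={
            "Ocp-Apim-Subscription-Key": key,
            "Content-Type": "application/json",
        },
    )


if __name__ == "__main__":
    # Placeholder values; substitute your own resource endpoint and key.
    req = build_analyze_request(
        "https://example.cognitiveservices.azure.com/",
        "fake-key",
        "https://example.com/photo.jpg",
    )
    print(req.get_method(), req.full_url)
```

Passing the returned object to `urllib.request.urlopen` would send it. In the challenges you will more likely use the SDK client, which wraps this same call.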
Skills Measured
- Analyze images using Azure AI Vision Image Analysis
- Implement custom image classification and object detection
- Read and extract text from images and documents
- Implement face detection, verification, and identification
- Analyze video with Azure AI Video Indexer
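For the object detection skill, one detail that is easy to miss is that Custom Vision predictions report bounding boxes as normalized values (`left`, `top`, `width`, `height` between 0 and 1) rather than pixels. A minimal sketch of the conversion follows; `PixelBox` and `to_pixels` are hypothetical names used only for this example.

```python
from typing import NamedTuple


class PixelBox(NamedTuple):
    """Bounding box in pixel coordinates (illustrative type)."""
    left: int
    top: int
    width: int
    height: int


def to_pixels(left: float, top: float, width: float, height: float,
              image_width: int, image_height: int) -> PixelBox:
    """Scale a normalized (0..1) bounding box to the image's pixel dimensions."""
    for value in (left, top, width, height):
        if not 0.0 <= value <= 1.0:
            raise ValueError("normalized box values must lie in [0, 1]")
    return PixelBox(
        left=round(left * image_width),
        top=round(top * image_height),
        width=round(width * image_width),
        height=round(height * image_height),
    )


if __name__ == "__main__":
    # A box covering the lower-middle part of an 800x600 image.
    print(to_pixels(0.25, 0.5, 0.5, 0.25, image_width=800, image_height=600))
```

Horizontal values scale by the image width and vertical values by the image height, which is why the helper needs both dimensions.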
Challenges
| # | Title | Key Topics |
|---|---|---|
| 24 | Image Analysis with AI Vision | Captions, tags, objects, smart crops |
| 25 | Advanced Image Analysis | Dense captions, people detection, background removal |
| 26 | Custom Image Classification | Training data, iterations, publishing, prediction |
| 27 | Custom Object Detection | Bounding boxes, region tagging, evaluation metrics |
| 28 | OCR with Read API | Printed/handwritten text, multi-language, async operation |
| 29 | Face Detection & Analysis | Face attributes, verification, identification, liveness |
| 30 | Video Analysis with Video Indexer | Scenes, transcripts, faces in video, insights |
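The "async operation" topic in challenge 28 refers to the Read API's submit-then-poll pattern: the initial request returns an operation URL, and the client polls it until the reported status is `succeeded` or `failed`. The sketch below shows only the polling loop. `wait_for_read_result` and the injected `get_status` callable are hypothetical; in practice `get_status` would perform a GET against the operation URL and return the parsed JSON.

```python
import time
from typing import Callable

# Statuses at which the Read operation stops changing.
TERMINAL_STATUSES = {"succeeded", "failed"}


def wait_for_read_result(get_status: Callable[[], dict],
                         interval_s: float = 1.0,
                         max_attempts: int = 30,
                         sleep: Callable[[float], None] = time.sleep) -> dict:
    """Poll until the operation reaches a terminal status or attempts run out.

    `get_status` is an assumed callable returning the operation's JSON as a dict.
    """
    for _ in range(max_attempts):
        result = get_status()
        if result.get("status") in TERMINAL_STATUSES:
            return result
        sleep(interval_s)  # still notStarted or running; wait before retrying
    raise TimeoutError(f"Read operation not finished after {max_attempts} attempts")


if __name__ == "__main__":
    # Simulated responses standing in for real HTTP calls.
    fake = iter([
        {"status": "notStarted"},
        {"status": "running"},
        {"status": "succeeded", "analyzeResult": {"readResults": []}},
    ])
    final = wait_for_read_result(lambda: next(fake), sleep=lambda _s: None)
    print(final["status"])
```

Injecting `sleep` keeps the loop easy to test without real delays. The SDK clients offer their own helpers for this, so a hand-written loop like this one is mainly useful for understanding the REST flow.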
Prerequisites
- Completed Domain 1 (Plan & Manage) or equivalent knowledge
- Azure AI Services multi-service resource provisioned
- Python or C# programming skills
- Basic understanding of image formats and HTTP file uploads