Challenge 10: Image Classification
25-35 min | Cost: Free | Domain: Computer Vision on Azure (15-20%)
Exam skills covered
- Identify features of image classification solutions
- Describe single-label and multi-label image classification
- Understand confidence scores in classification results
- Identify Azure services for image classification
Overview
Image classification is a computer vision technique that answers the question: "What is in this image?" Given an image, the model assigns one or more category labels with confidence scores. It's like showing a photo to someone and asking "what is this?" — except the AI responds with probabilities.
Think of image classification like a nature guide identifying birds. You show them a photo, and they say "I'm 95% sure that's a cardinal, 3% blue jay, 2% robin." They've learned to recognize hundreds of species from thousands of examples. Similarly, an image classification model learns from labeled training images to categorize new images it has never seen.
There are two types: single-label classification assigns exactly one category (this is EITHER a cat OR a dog), while multi-label classification can assign multiple categories (this image contains BOTH a beach AND a sunset AND people).
Explore
Task 1: Understand image classification types
| Type | Output | Example |
|---|---|---|
| Single-label | One category per image | "This is a cat" (not a dog, not a bird) |
| Multi-label | Multiple categories per image | "This contains: outdoor, beach, people, sunset" |
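The difference between the two types comes down to how scores are turned into labels. The sketch below is a minimal illustration, not how any Azure service is implemented: the function names `single_label` and `multi_label`, the 0.5 default threshold, and the score values are all made up for this example.

```python
def single_label(scores: dict[str, float]) -> str:
    """Single-label: pick exactly one category, the highest-scoring one."""
    return max(scores, key=scores.get)


def multi_label(scores: dict[str, float], threshold: float = 0.5) -> list[str]:
    """Multi-label: keep every category that clears the threshold, best first."""
    kept = [name for name, score in scores.items() if score >= threshold]
    return sorted(kept, key=scores.get, reverse=True)


# Single-label: the classes compete with each other, so only one wins.
pet_scores = {"cat": 0.90, "dog": 0.07, "bird": 0.03}
print(single_label(pet_scores))

# Multi-label: each tag is judged on its own, so several can be kept at once.
photo_scores = {"outdoor": 0.98, "beach": 0.91, "people": 0.76, "snow": 0.04}
print(multi_label(photo_scores))
```

In the single-label case the categories are mutually exclusive, so the scores are compared against each other. In the multi-label case each tag is compared against the threshold independently.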
Confidence scores: Every prediction comes with a score between 0.0 and 1.0 that expresses how sure the model is about that prediction:
- 0.95 = 95% confident → usually safe to accept automatically
- 0.60 = 60% confident → uncertain, might need human review
- Threshold: Applications typically accept only predictions above a chosen confidence (e.g., > 0.7) and discard or escalate the rest
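A threshold rule like the one above might be coded as follows. This is a sketch under stated assumptions: the `decide` helper and the "accept"/"review" outcomes are hypothetical names, and 0.7 is only the example value from this challenge, not a recommended setting.

```python
def decide(scores: dict[str, float], accept_above: float = 0.7) -> tuple[str, str]:
    """Return ("accept", label) or ("review", label) for the top prediction."""
    label, confidence = max(scores.items(), key=lambda item: item[1])
    if confidence > accept_above:
        return ("accept", label)
    # Below the threshold: do not trust the label, send it to a person.
    return ("review", label)


print(decide({"cat": 0.95, "dog": 0.05}))  # confident prediction
print(decide({"cat": 0.60, "dog": 0.40}))  # uncertain prediction
print(decide({"cat": 0.45, "dog": 0.42}))  # two low, nearly tied scores
```

The right threshold depends on the cost of a wrong answer. A photo-tagging app can tolerate a lower value than a medical or quality-control system.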
Task 2: Try Azure AI Vision image analysis
- Visit the Azure AI Vision demo
- Select or upload a sample image
- Observe the results:
  - Tags: categories/labels assigned to the image
  - Confidence scores: how certain the model is for each tag
  - Notice that multiple tags can be returned (multi-label)
- Try different types of images (landscapes, animals, food, objects) and observe how tags change
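If you later call the service from code instead of the demo page, the tags come back as structured data that you can filter yourself. The sketch below only parses a response; it does not call Azure. The JSON shape (`tagsResult` containing a `values` list of `name`/`confidence` pairs) is an assumption modeled on the Image Analysis REST response, and the sample tags and scores are invented for illustration. The helper `tags_above` is a hypothetical name.

```python
import json

# Invented sample shaped like an image-tagging response (assumed structure).
SAMPLE_RESPONSE = """
{
  "tagsResult": {
    "values": [
      {"name": "outdoor", "confidence": 0.99},
      {"name": "beach", "confidence": 0.93},
      {"name": "sky", "confidence": 0.88},
      {"name": "dog", "confidence": 0.41}
    ]
  }
}
"""


def tags_above(response: dict, min_confidence: float = 0.7) -> list[tuple[str, float]]:
    """Return (tag name, confidence) pairs at or above min_confidence."""
    values = response.get("tagsResult", {}).get("values", [])
    return [
        (tag["name"], tag["confidence"])
        for tag in values
        if tag["confidence"] >= min_confidence
    ]


response = json.loads(SAMPLE_RESPONSE)
for name, confidence in tags_above(response):
    print(f"{name}: {confidence:.0%}")
```

Several tags pass the filter for one image, which is the multi-label behavior you observed in the demo.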
Task 3: Custom Vision vs pre-built Vision
Azure offers two approaches to image classification:
| Approach | When to use | How it works |
|---|---|---|
| Azure AI Vision (pre-built) | General image understanding | Pre-trained on millions of images; works immediately for common objects/scenes |
| Custom Vision | Domain-specific classification | You train with YOUR images and YOUR categories (e.g., "defective" vs "good" products on your assembly line) |
Custom Vision workflow:
- Upload labeled training images (at least 15 per category recommended)
- Train the model (Custom Vision handles the ML)
- Test with new images
- Deploy and use via API
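Before the upload step it can help to check that every category has enough labeled images. The sketch below is a local pre-check only and does not use the Custom Vision SDK. The helper name `categories_needing_images` is hypothetical, the label list is invented, and the figure of 15 is the one quoted in this challenge.

```python
from collections import Counter

MIN_IMAGES_PER_CATEGORY = 15  # figure quoted in this challenge


def categories_needing_images(
    labels: list[str], minimum: int = MIN_IMAGES_PER_CATEGORY
) -> dict[str, int]:
    """Map each under-represented category to the number of images still needed."""
    counts = Counter(labels)
    return {
        category: minimum - count
        for category, count in counts.items()
        if count < minimum
    }


# One label per training image, e.g. collected from folder names.
training_labels = ["good"] * 20 + ["defective"] * 9
print(categories_needing_images(training_labels))
```

An empty result means every category meets the minimum. A balanced set with more images per category generally trains a better model.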
Task 4: Image classification in the real world
| Industry | Use case | Classification type |
|---|---|---|
| Manufacturing | Defect detection (good/defective parts) | Single-label binary |
| Retail | Product categorization from photos | Multi-class single-label |
| Healthcare | Skin lesion classification (benign/malignant) | Single-label binary |
| Agriculture | Crop disease identification | Multi-class single-label |
| Social media | Content moderation (appropriate/inappropriate) | Single-label binary |
| Photography | Auto-tagging photos (beach, people, sunset...) | Multi-label |
The exam distinguishes between:
- Image classification: "What is this?" → assigns label(s) to the whole image
- Object detection: "What and WHERE?" → finds objects with bounding boxes
- OCR: "What text is here?" → extracts text from images
Know which is which!
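One way to remember the three tasks is by the shape of what each returns. The sketch below uses hypothetical Python types to show this; the class names, fields, and pixel box format are illustrative assumptions and do not match any specific Azure SDK.

```python
from dataclasses import dataclass

Box = tuple[int, int, int, int]  # x, y, width, height in pixels (assumed format)


@dataclass
class Classification:
    """Image classification: a label for the whole image, no location."""
    label: str
    confidence: float


@dataclass
class Detection:
    """Object detection: a label plus where the object is."""
    label: str
    confidence: float
    box: Box


@dataclass
class TextLine:
    """OCR: the text that was read plus where it appears."""
    text: str
    box: Box


def task_for(result: object) -> str:
    """Name the computer vision task that produces this kind of result."""
    if isinstance(result, Classification):
        return "image classification"
    if isinstance(result, Detection):
        return "object detection"
    if isinstance(result, TextLine):
        return "OCR"
    raise TypeError(f"unknown result type: {type(result).__name__}")


print(task_for(Classification("dog", 0.97)))
print(task_for(Detection("dog", 0.94, (40, 60, 200, 180))))
print(task_for(TextLine("EXIT", (10, 10, 80, 30))))
```

If an exam scenario mentions locations or bounding boxes, it is describing object detection. If it mentions reading text, it is describing OCR.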
Key Concepts
| Concept | Definition |
|---|---|
| Image classification | Assigning category labels to an entire image |
| Single-label classification | Each image gets exactly one category (mutually exclusive classes) |
| Multi-label classification | Each image can get multiple categories (non-exclusive tags) |
| Confidence score | Probability (0-1) indicating how certain the model is about a prediction |
| Training images | Labeled examples used to teach the model what each category looks like |
| Custom Vision | Azure service for training custom image classification models with your own data |
| Azure AI Vision | Pre-built service for general image analysis (tagging, description, categorization) |
| Threshold | Minimum confidence score required to accept a prediction |
Common Misconceptions
| Misconception | Reality |
|---|---|
| "Image classification tells you WHERE objects are in the image" | Classification only tells you WHAT is in the image (the whole image). Object detection tells you WHERE (with bounding boxes). These are different tasks |
| "You need thousands of images to train a custom classifier" | Azure Custom Vision can work with as few as 15 images per category for basic classification. More images improve accuracy, but you can start small |
| "A 90% confidence score means the model is 90% accurate" | Confidence is per-prediction — it means the model is 90% sure about THIS specific image. Overall model accuracy is measured separately across many test images |
| "Pre-built Azure AI Vision can classify anything" | Pre-built models handle common objects and scenes. For domain-specific categories (your product types, specific defects), you need Custom Vision with your own training data |
| "Multi-label means the model is uncertain" | Multi-label means the image legitimately contains multiple things. An image with a dog on a beach correctly gets both "dog" and "beach" tags — this isn't uncertainty |
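The confidence-versus-accuracy row can be made concrete with a few numbers. In the sketch below the test results are invented for illustration, and `accuracy` and `mean_confidence` are hypothetical helper names. Confidence is attached to each prediction, while accuracy is computed by comparing predictions with the true labels across the whole test set.

```python
# Each entry: (predicted label, confidence, true label). Invented test data.
results = [
    ("cat", 0.90, "cat"),
    ("dog", 0.90, "cat"),  # confident but wrong
    ("dog", 0.95, "dog"),
    ("cat", 0.55, "cat"),  # unsure but right
]


def accuracy(rows: list[tuple[str, float, str]]) -> float:
    """Fraction of predictions that match the true label."""
    correct = sum(1 for predicted, _, truth in rows if predicted == truth)
    return correct / len(rows)


def mean_confidence(rows: list[tuple[str, float, str]]) -> float:
    """Average of the per-prediction confidence scores."""
    return sum(confidence for _, confidence, _ in rows) / len(rows)


print(f"accuracy: {accuracy(results):.2f}")
print(f"mean confidence: {mean_confidence(results):.3f}")
```

Two of the predictions carry the same 0.90 confidence, yet one is right and one is wrong, so a single confidence score says nothing about how accurate the model is overall.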
Knowledge Check
1. A photo-sharing app needs to automatically tag uploaded photos with relevant labels like "outdoor", "food", "people", and "sunset" — an image can have multiple tags. What type of classification is this?
2. An image classification model returns a confidence score of 0.45 for "cat" and 0.42 for "dog". What should the application do?
3. A manufacturing company needs to classify products on their assembly line as "pass" or "fail" based on photos. The categories are specific to their products. Which Azure service is most appropriate?
4. What is the minimum number of training images recommended per category when using Azure Custom Vision?
5. What is the key difference between image classification and object detection?