Challenge 06: Classification in Machine Learning

Estimated Time

25-35 min | Cost: Free | Domain: Machine Learning on Azure (15-20%)

Exam skills covered

Identify classification machine learning scenarios
Describe binary classification vs multi-class classification
Understand training and evaluation of classification models
Identify appropriate evaluation metrics for classification

Overview

Classification is the machine learning technique used to predict a category (also called a class or label). Whenever the question is "which group does this belong to?", you're looking at a classification problem. Spam/not-spam, disease/healthy, cat/dog/bird: these are all classification problems.

Think of classification like a mail sorter at the post office. Letters arrive, and the sorter puts each one into the correct bin based on features (zip code, size, weight). The sorter learned the rules by seeing thousands of previously sorted letters (training data). Now it can classify new letters it has never seen before.

There are two types: binary classification has exactly two possible outcomes (yes/no, true/false, spam/not-spam). Multi-class classification has three or more possible outcomes (cat/dog/bird, or categorizing support tickets into billing/technical/shipping/other).

Explore

Task 1: Binary vs multi-class classification

Type	Number of classes	Examples
Binary	Exactly 2	Spam or not spam, fraud or legitimate, pass or fail, positive or negative sentiment
Multi-class	3 or more	Animal species (cat/dog/bird/fish), product category, language detection, disease type

Key rule: If the output is one of TWO possible categories → binary. If THREE or more → multi-class.
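The key rule can be sketched as a small Python check that counts the distinct labels in a dataset. The function name `classification_type` and the sample labels are made up for illustration; they are not part of any Azure API.

```python
def classification_type(labels):
    """Return 'binary' or 'multi-class' based on the number of distinct labels."""
    distinct = set(labels)
    if len(distinct) < 2:
        raise ValueError("classification needs at least two distinct classes")
    return "binary" if len(distinct) == 2 else "multi-class"


# Hypothetical label columns, for illustration only.
print(classification_type(["spam", "not spam", "spam"]))    # two distinct labels
print(classification_type(["cat", "dog", "bird", "fish"]))  # four distinct labels
```

The check looks only at the output column: the number of features plays no role in deciding between binary and multi-class.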

Task 2: Identify classification scenarios

Scenario	Type	Why
Is this credit card transaction fraudulent?	Binary	Two outcomes: fraud / not fraud
What language is this text written in?	Multi-class	Many possible languages
Will this customer churn (leave)?	Binary	Two outcomes: yes / no
What type of iris flower is this?	Multi-class	Three species: setosa, versicolor, virginica
Does this X-ray show pneumonia?	Binary	Two outcomes: pneumonia / normal
Which department should handle this ticket?	Multi-class	Multiple departments (billing, tech, shipping...)

Task 3: Explore Automated ML for classification

Azure Machine Learning's Automated ML can build classification models with minimal effort:

1. Visit Azure Machine Learning Studio
2. Review the Automated ML concept:
- You provide a labeled dataset (features + known categories)
- Automated ML tries multiple algorithms and settings
- It returns the best-performing model automatically
3. For exam purposes, understand these Automated ML capabilities:
- Data guardrails: Automatically checks for data quality issues
- Algorithm selection: Tests multiple algorithms (logistic regression, decision trees, etc.)
- Hyperparameter tuning: Optimizes model settings automatically
- Feature engineering: Can create new features from existing data
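The "try several candidates, keep the best" idea behind Automated ML can be sketched in plain Python. This is a toy, not the Azure Machine Learning API: the data is made up, the two "algorithms" are simple hand-written rules (`links_rule`, `exclaim_rule`) rather than trained models, and the threshold stands in for a hyperparameter. A real Automated ML run also trains each candidate on training data before scoring it.

```python
from itertools import product

# Made-up validation data: (number_of_links, exclamation_marks) -> label (1 = spam).
validation = [
    ((0, 0), 0), ((1, 0), 0), ((0, 2), 0), ((1, 1), 0),
    ((4, 1), 1), ((5, 0), 1), ((3, 6), 1), ((6, 4), 1),
]


def links_rule(threshold):
    """Hypothetical 'algorithm' 1: flag as spam when the link count reaches the threshold."""
    return lambda x: 1 if x[0] >= threshold else 0


def exclaim_rule(threshold):
    """Hypothetical 'algorithm' 2: flag as spam when the exclamation count reaches the threshold."""
    return lambda x: 1 if x[1] >= threshold else 0


def accuracy(model, data):
    """Fraction of examples the model labels correctly."""
    return sum(model(x) == y for x, y in data) / len(data)


def select_best(families, settings, data):
    """Score every (algorithm, setting) pair and keep the first one with the top score."""
    best = None
    for (name, make_model), setting in product(families.items(), settings):
        score = accuracy(make_model(setting), data)
        if best is None or score > best[2]:
            best = (name, setting, score)
    return best


candidate_families = {"links_rule": links_rule, "exclaim_rule": exclaim_rule}
thresholds = [1, 2, 3, 4]  # the "hyperparameter" values to try

best_name, best_threshold, best_score = select_best(candidate_families, thresholds, validation)
print(best_name, best_threshold, best_score)
```

The loop over rule families mirrors algorithm selection, and the loop over thresholds mirrors hyperparameter tuning; the ranking metric here is accuracy, but any metric from Task 4 could be swapped in.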

Task 4: Understand classification evaluation metrics

Metric	What it measures	Simple explanation
Accuracy	Overall correctness	"What % of predictions were correct?"
Precision	Quality of positive predictions	"When it says 'spam', how often is it right?"
Recall	Completeness of positive detection	"Of all actual spam, what % did it catch?"
F1 Score	Balance of precision and recall	Harmonic mean of precision and recall; useful when classes are imbalanced
AUC	Model's ability to distinguish classes	1.0 = perfect, 0.5 = random guessing

Example: If a spam filter has high precision but low recall, then when it flags something as spam it's usually right, but it misses a lot of actual spam.
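To make the high-precision, low-recall case concrete, here is a minimal Python sketch that computes accuracy, precision, recall, and F1 from the four confusion-matrix counts. The function names and the ten made-up emails are assumptions for illustration only.

```python
def confusion_counts(actual, predicted, positive):
    """Count true positives, false positives, false negatives, and true negatives."""
    tp = fp = fn = tn = 0
    for a, p in zip(actual, predicted):
        if p == positive and a == positive:
            tp += 1
        elif p == positive:
            fp += 1
        elif a == positive:
            fn += 1
        else:
            tn += 1
    return tp, fp, fn, tn


def classification_metrics(actual, predicted, positive):
    """Compute accuracy, precision, recall, and F1 for the chosen positive class."""
    tp, fp, fn, tn = confusion_counts(actual, predicted, positive)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    return {
        "accuracy": (tp + tn) / (tp + fp + fn + tn),
        "precision": precision,
        "recall": recall,
        "f1": f1,
    }


# Made-up data: 5 spam and 5 legitimate emails. The filter flags only 2 emails,
# and both of them really are spam.
actual = ["spam"] * 5 + ["not spam"] * 5
predicted = ["spam", "spam", "not spam", "not spam", "not spam"] + ["not spam"] * 5

metrics = classification_metrics(actual, predicted, positive="spam")
print(metrics)
```

In this made-up case precision is 2/2 = 1.0 while recall is 2/5 = 0.4: every flag is correct, yet three of the five spam emails slip through.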

Exam insight

The exam loves questions about when accuracy alone is misleading. If 99% of emails are legitimate and 1% are spam, a model that says "not spam" for everything has 99% accuracy but catches ZERO spam. This is why precision, recall, and AUC matter.
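The 99% / 1% scenario above can be checked with a few lines of Python. The dataset size of 1,000 emails is an assumption chosen to match the split; the "model" is simply a list that says "not spam" every time.

```python
# 1,000 made-up emails matching the 99% legitimate / 1% spam split.
actual = ["not spam"] * 990 + ["spam"] * 10

# A "model" that predicts the majority class for every email.
predicted = ["not spam"] * len(actual)

# Accuracy: share of all predictions that match the true label.
accuracy = sum(a == p for a, p in zip(actual, predicted)) / len(actual)

# Recall for the spam class: share of actual spam that was flagged as spam.
spam_caught = sum(a == "spam" and p == "spam" for a, p in zip(actual, predicted))
recall = spam_caught / actual.count("spam")

print(f"accuracy={accuracy:.2f}, spam recall={recall:.2f}")
```

Accuracy comes out at 990/1000 while spam recall is 0/10, which is the gap the exam questions are probing.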

Key Concepts

Concept	Definition
Classification	ML technique that predicts which category/class an item belongs to
Binary classification	Classification with exactly two possible outcomes
Multi-class classification	Classification with three or more possible outcomes
Logistic regression	Common algorithm for binary classification (despite the name, it classifies)
Confusion matrix	Table showing true positives, true negatives, false positives, and false negatives
Precision	Of all items predicted as positive, what percentage actually are positive
Recall (Sensitivity)	Of all actual positive items, what percentage did the model correctly identify
AUC (Area Under Curve)	Measures how well the model separates the classes. Values range from 0 to 1: 1.0 is perfect, 0.5 is no better than random guessing
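One way to build intuition for AUC is to compute it as the share of (positive, negative) pairs in which the positive example gets the higher score, with ties counting as half. The sketch below takes that pairwise view; the function name `auc` and the scores are made up for illustration, and real libraries typically use faster, sort-based methods.

```python
def auc(labels, scores):
    """AUC as the fraction of positive/negative pairs ranked correctly (ties count 0.5)."""
    positives = [s for label, s in zip(labels, scores) if label == 1]
    negatives = [s for label, s in zip(labels, scores) if label == 0]
    if not positives or not negatives:
        raise ValueError("AUC needs at least one positive and one negative example")
    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(positives) * len(negatives))


labels = [1, 1, 0, 0]  # 1 = spam, 0 = not spam (made-up examples)

# Every spam email scores higher than every legitimate one: perfect separation.
perfect = auc(labels, [0.9, 0.8, 0.3, 0.1])

# Every email gets the same score: the model cannot tell the classes apart.
uninformative = auc(labels, [0.5, 0.5, 0.5, 0.5])

print(perfect, uninformative)
```

Because AUC looks at how examples are ranked rather than at a single cut-off, it does not depend on the threshold used to turn scores into class labels.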

Common Misconceptions

Misconception	Reality
"Classification and regression are interchangeable"	Classification predicts categories (spam/not-spam). Regression predicts numbers ($500, 73 degrees). The output type determines which technique to use
"Binary classification can only output 'yes' or 'no'"	Binary means two classes, but they can be anything: spam/ham, malignant/benign, approved/denied. It's always exactly two outcomes
"Logistic regression is a regression technique"	Despite its name, logistic regression is used for classification. It outputs a probability (0 to 1), which is then converted to a class label, typically by comparing it to a threshold such as 0.5
"Higher accuracy always means a better model"	With imbalanced datasets, accuracy is misleading. A model that always predicts the majority class can have high accuracy yet be useless for detecting the minority class
"You need thousands of examples to classify"	While more data generally helps, the required amount depends on the problem complexity. Some problems work well with hundreds of examples per class
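The logistic regression row above says the model outputs a probability that is then converted to a class label. A minimal sketch of that last step follows, assuming a model whose weights are already known. The weights, bias, feature meanings, and the 0.5 threshold are all hypothetical values picked for illustration, not learned from data.

```python
import math


def sigmoid(z):
    """Squash any real number into the range (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))


def predict_label(features, weights, bias, threshold=0.5):
    """Return (probability of spam, class label) for one example."""
    z = sum(w * x for w, x in zip(weights, features)) + bias
    probability = sigmoid(z)
    label = "spam" if probability >= threshold else "not spam"
    return probability, label


# Hypothetical model: features are (number_of_links, exclamation_marks).
weights = [0.8, 0.5]
bias = -3.0

prob_a, label_a = predict_label([5, 2], weights, bias)  # many links
prob_b, label_b = predict_label([1, 0], weights, bias)  # one link

print(f"{prob_a:.2f} -> {label_a}")
print(f"{prob_b:.2f} -> {label_b}")
```

Raising the threshold makes the model flag fewer items as spam, which tends to increase precision and reduce recall; lowering it does the opposite.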

Knowledge Check

1. A hospital wants to predict whether a tumor is malignant or benign based on cell measurements. What type of machine learning problem is this?

2. An image recognition system needs to identify whether a photo contains a cat, dog, bird, or fish. What type of classification is this?

3. A spam detection model has high precision but low recall. What does this mean in practice?

4. Which Azure Machine Learning capability automatically tries multiple algorithms and selects the best classification model?

5. What is the key difference between a classification problem and a regression problem?
