Challenge 31: Text Analytics - Key Phrases, Entities, Sentiment

Estimated Time

45 min | Cost: $1-3 (estimated) | Domain: Implement NLP Solutions (15-20%)

Exam skills covered

Extract key phrases from text
Recognize named entities and linked entities
Determine sentiment with opinion mining
Detect language

Overview

Azure AI Language (Text Analytics) provides NLP capabilities:

Feature	Description
Sentiment Analysis	Positive/neutral/negative with confidence + opinion mining
Key Phrase Extraction	Identify main talking points
Named Entity Recognition (NER)	Detect entities (Person, Location, Organization, DateTime, etc.)
Entity Linking	Link entities to Wikipedia knowledge base
Language Detection	Identify language of text

The client supports batch operations: each call accepts a list of documents, so you can send several in one request instead of one request per document.
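As a rough sketch of the batching idea, the helper below splits a list of texts into fixed-size groups so that each group can go out as one request. The name `batched` and the default size of 10 are assumptions for illustration, not service constants; check the current service limits for the per-request document cap of the feature you call.

```python
from typing import Iterator, List, Sequence


def batched(documents: Sequence[str], size: int = 10) -> Iterator[List[str]]:
    """Yield consecutive slices of `documents` with at most `size` items each.

    `size` is a hypothetical default; the real cap depends on the feature.
    """
    if size < 1:
        raise ValueError("size must be at least 1")
    for start in range(0, len(documents), size):
        yield list(documents[start:start + size])


if __name__ == "__main__":
    texts = [f"review {n}" for n in range(23)]
    for batch in batched(texts):
        # Each batch would be passed to one client call, e.g. analyze_sentiment.
        print(len(batch))
```

Keeping the batching outside the client call makes it easy to retry a single failed batch without resending the others.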

Prerequisites

Azure subscription
Azure AI Language resource (or multi-service)
Python 3.9+ or .NET 8
Package: azure-ai-textanalytics (v5.3+)

Implementation

Task 1: Create Language Resource

az group create --name rg-ai102-nlp --location eastus2

az cognitiveservices account create \
  --name language-ai102 \
  --resource-group rg-ai102-nlp \
  --kind TextAnalytics \
  --sku S \
  --location eastus2

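ENDPOINT=$(az cognitiveservices account show --name language-ai102 --resource-group rg-ai102-nlp --query properties.endpoint -o tsv)
KEY=$(az cognitiveservices account keys list --name language-ai102 --resource-group rg-ai102-nlp --query key1 -o tsv)

# The SDK samples in the following tasks read these environment variables
export AZURE_AI_ENDPOINT="$ENDPOINT"
export AZURE_AI_KEY="$KEY"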

Task 2: Analyze Sentiment with Opinion Mining

The same analysis follows in three forms, in this order: Python SDK, C# SDK, REST API.

import os
from azure.ai.textanalytics import TextAnalyticsClient
from azure.core.credentials import AzureKeyCredential

client = TextAnalyticsClient(
    endpoint=os.environ["AZURE_AI_ENDPOINT"],
    credential=AzureKeyCredential(os.environ["AZURE_AI_KEY"])
)

documents = [
    "The hotel room was clean and spacious, but the service was slow and unfriendly.",
    "I absolutely love this product! Fast delivery and excellent quality.",
    "The meeting was scheduled for 3 PM."
]

# Sentiment analysis with opinion mining
results = client.analyze_sentiment(
    documents,
    show_opinion_mining=True,
    language="en"
)

for idx, result in enumerate(results):
    if result.is_error:
        print(f"Doc {idx}: Error - {result.error.message}")
        continue
    
    print(f"Document {idx}: '{documents[idx][:50]}...'")
    print(f"  Overall: {result.sentiment} "
          f"(pos={result.confidence_scores.positive:.3f}, "
          f"neu={result.confidence_scores.neutral:.3f}, "
          f"neg={result.confidence_scores.negative:.3f})")
    
    for sentence in result.sentences:
        print(f"  Sentence: '{sentence.text[:40]}...' → {sentence.sentiment}")
        
        # Opinion mining - aspect-based sentiment
        for mined_opinion in sentence.mined_opinions:
            target = mined_opinion.target
            print(f"    Target: '{target.text}' ({target.sentiment})")
            for assessment in mined_opinion.assessments:
                print(f"      Assessment: '{assessment.text}' ({assessment.sentiment})")
    print()

using Azure;
using Azure.AI.TextAnalytics;

var client = new TextAnalyticsClient(
    new Uri(Environment.GetEnvironmentVariable("AZURE_AI_ENDPOINT")),
    new AzureKeyCredential(Environment.GetEnvironmentVariable("AZURE_AI_KEY")));

var documents = new List<string>
{
    "The hotel room was clean and spacious, but the service was slow.",
    "I absolutely love this product! Fast delivery and excellent quality."
};

var options = new AnalyzeSentimentOptions { IncludeOpinionMining = true };
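// AnalyzeSentimentBatch returns a Response<AnalyzeSentimentResultCollection>; enumerate its Value
var results = client.AnalyzeSentimentBatch(documents, "en", options).Value;

foreach (var result in results)
{
    // Check each document for errors before reading its sentiment
    if (result.HasError)
    {
        Console.WriteLine($"Error: {result.Error.Message}");
        continue;
    }

    Console.WriteLine($"Sentiment: {result.DocumentSentiment.Sentiment}");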
    foreach (var sentence in result.DocumentSentiment.Sentences)
    {
        Console.WriteLine($"  '{sentence.Text}' -> {sentence.Sentiment}");
        foreach (var opinion in sentence.Opinions)
        {
            Console.WriteLine($"    Target: '{opinion.Target.Text}' ({opinion.Target.Sentiment})");
            foreach (var assessment in opinion.Assessments)
                Console.WriteLine($"      '{assessment.Text}' ({assessment.Sentiment})");
        }
    }
}

ENDPOINT="https://<resource>.cognitiveservices.azure.com"
KEY="<your-key>"

curl -s "${ENDPOINT}/language/:analyze-text?api-version=2023-04-01" \
  -H "Ocp-Apim-Subscription-Key: ${KEY}" \
  -H "Content-Type: application/json" \
  -d '{
    "kind": "SentimentAnalysis",
    "parameters": {"opinionMining": true},
    "analysisInput": {
      "documents": [
        {"id": "1", "language": "en", "text": "The hotel was great but the food was terrible."}
      ]
    }
  }' | jq '.results.documents[0]'
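In the REST response, each opinion target does not embed its assessments directly; it points at them through `relations` entries whose `ref` is a JSON-pointer-style path. The sketch below resolves those references in Python. The function names are hypothetical, the sample payload is hand-written to mirror the response shape (it is not captured service output), and the pointer is assumed to be relative to the `results` object.

```python
from typing import Any, Dict, List, Tuple


def resolve_ref(results: Dict[str, Any], ref: str) -> Any:
    """Follow a '#/documents/0/sentences/0/assessments/0' style pointer."""
    node: Any = results
    for part in ref.lstrip("#/").split("/"):
        # List levels are indexed by integer, object levels by key.
        node = node[int(part)] if isinstance(node, list) else node[part]
    return node


def opinions(results: Dict[str, Any]) -> List[Tuple[str, str, str]]:
    """Return (target text, assessment text, assessment sentiment) rows."""
    rows = []
    for doc in results["documents"]:
        for sentence in doc["sentences"]:
            for target in sentence.get("targets", []):
                for relation in target.get("relations", []):
                    if relation.get("relationType") != "assessment":
                        continue
                    assessment = resolve_ref(results, relation["ref"])
                    rows.append((target["text"], assessment["text"],
                                 assessment["sentiment"]))
    return rows


# Hand-written sample in the assumed shape of the `results` object.
SAMPLE = {"documents": [{"id": "1", "sentences": [{
    "targets": [
        {"text": "hotel", "sentiment": "positive", "relations": [
            {"relationType": "assessment",
             "ref": "#/documents/0/sentences/0/assessments/0"}]},
        {"text": "food", "sentiment": "negative", "relations": [
            {"relationType": "assessment",
             "ref": "#/documents/0/sentences/0/assessments/1"}]},
    ],
    "assessments": [
        {"text": "great", "sentiment": "positive"},
        {"text": "terrible", "sentiment": "negative"},
    ],
}]}]}

if __name__ == "__main__":
    for row in opinions(SAMPLE):
        print(row)
```

The SDKs perform this resolution for you, which is why the Python and C# samples can iterate assessments directly under each opinion.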

Task 3: Extract Key Phrases and Named Entities

The Python SDK version comes first, followed by the REST API calls.

documents = [
    "Microsoft CEO Satya Nadella announced Azure AI updates at the Build 2024 conference in Seattle on May 21.",
    "The quarterly revenue increased by 15% to $62 billion, driven by cloud services growth."
]

# Key phrase extraction
key_phrases_results = client.extract_key_phrases(documents, language="en")
print("=== KEY PHRASES ===")
for idx, result in enumerate(key_phrases_results):
    if not result.is_error:
        print(f"Doc {idx}: {result.key_phrases}")

# Named Entity Recognition
ner_results = client.recognize_entities(documents, language="en")
print("\n=== NAMED ENTITIES ===")
for idx, result in enumerate(ner_results):
    if not result.is_error:
        print(f"Doc {idx}:")
        for entity in result.entities:
            print(f"  '{entity.text}' → {entity.category}"
                  f"{f'/{entity.subcategory}' if entity.subcategory else ''}"
                  f" (confidence: {entity.confidence_score:.3f})")

# Entity Linking (to Wikipedia)
linked_results = client.recognize_linked_entities(documents, language="en")
print("\n=== LINKED ENTITIES ===")
for idx, result in enumerate(linked_results):
    if not result.is_error:
        for entity in result.entities:
            print(f"  '{entity.name}' → {entity.url}")
            print(f"    Data source: {entity.data_source}, ID: {entity.data_source_entity_id}")

# Key phrases
curl -s "${ENDPOINT}/language/:analyze-text?api-version=2023-04-01" \
  -H "Ocp-Apim-Subscription-Key: ${KEY}" \
  -H "Content-Type: application/json" \
  -d '{
    "kind": "KeyPhraseExtraction",
    "analysisInput": {
      "documents": [{"id": "1", "language": "en", "text": "Microsoft announced Azure AI updates at Build 2024 in Seattle."}]
    }
  }' | jq '.results.documents[0].keyPhrases'

# Named entities
curl -s "${ENDPOINT}/language/:analyze-text?api-version=2023-04-01" \
  -H "Ocp-Apim-Subscription-Key: ${KEY}" \
  -H "Content-Type: application/json" \
  -d '{
    "kind": "EntityRecognition",
    "analysisInput": {
      "documents": [{"id": "1", "language": "en", "text": "Microsoft CEO Satya Nadella announced Azure AI updates at Build 2024 in Seattle on May 21."}]
    }
  }' | jq '.results.documents[0].entities[] | {text, category, confidenceScore}'

Task 4: Language Detection

Python SDK

# Language detection
multilingual_docs = [
    "Hello, how are you today?",
    "Bonjour, comment allez-vous?",
    "こんにちは、元気ですか？",
    "Hola, ¿cómo estás?"
]

lang_results = client.detect_language(multilingual_docs)
print("=== LANGUAGE DETECTION ===")
for idx, result in enumerate(lang_results):
    if not result.is_error:
        lang = result.primary_language
        print(f"  '{multilingual_docs[idx][:30]}...' → {lang.name} ({lang.iso6391_name}) "
              f"confidence: {lang.confidence_score:.3f}")
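One common use of language detection is routing: group texts by detected language, then pass each group to the other calls with a matching `language` value. The sketch below shows that grouping step on plain tuples, so it runs without a service call. The function name, the 0.8 threshold, and the `"und"` bucket for low-confidence results are all assumptions for illustration.

```python
from collections import defaultdict
from typing import Dict, Iterable, List, Tuple

# (text, ISO 639-1 code, confidence score), e.g. built from
# result.primary_language.iso6391_name and .confidence_score
Detection = Tuple[str, str, float]


def group_by_language(detections: Iterable[Detection],
                      min_confidence: float = 0.8) -> Dict[str, List[str]]:
    """Group texts by language code; low-confidence texts go under 'und'."""
    groups: Dict[str, List[str]] = defaultdict(list)
    for text, code, confidence in detections:
        # 'und' (undetermined) is a hypothetical bucket for manual review.
        key = code if confidence >= min_confidence else "und"
        groups[key].append(text)
    return dict(groups)


if __name__ == "__main__":
    sample = [
        ("Hello, how are you today?", "en", 1.0),
        ("Bonjour, comment allez-vous?", "fr", 1.0),
        ("ok", "en", 0.4),
    ]
    for code, texts in group_by_language(sample).items():
        print(code, texts)
```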

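Expected Output

Representative output; exact scores and labels vary with the model version. `mixed` is a document-level label only, so each sentence is labeled positive, neutral, or negative.

Document 0: 'The hotel room was clean and spacious, but the ser...'
  Overall: mixed (pos=0.450, neu=0.100, neg=0.450)
  Sentence: 'The hotel room was clean and spacious, b...' → negative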
    Target: 'room' (positive)
      Assessment: 'clean' (positive)
      Assessment: 'spacious' (positive)
    Target: 'service' (negative)
      Assessment: 'slow' (negative)
      Assessment: 'unfriendly' (negative)

=== KEY PHRASES ===
Doc 0: ['Microsoft CEO Satya Nadella', 'Azure AI updates', 'Build 2024 conference', 'Seattle']

=== NAMED ENTITIES ===
Doc 0:
  'Microsoft' → Organization (confidence: 0.990)
  'Satya Nadella' → Person (confidence: 0.985)
  'Azure AI' → Product (confidence: 0.920)
  'Build 2024' → Event (confidence: 0.880)
  'Seattle' → Location (confidence: 0.995)
  'May 21' → DateTime/Date (confidence: 0.970)

=== LANGUAGE DETECTION ===
  'Hello, how are you today?...' → English (en) confidence: 1.000
  'Bonjour, comment allez-vous?...' → French (fr) confidence: 1.000
  'こんにちは、元気ですか？...' → Japanese (ja) confidence: 1.000
  'Hola, ¿cómo estás?...' → Spanish (es) confidence: 1.000

Break & fix

Scenario	Symptom	Root Cause	Fix
Mixed results on clear text	Unexpected `mixed` sentiment	Document contains both positive and negative sentences, so the document-level label is `mixed`	Use sentence-level sentiment for granularity
Empty key phrases	No phrases returned	Text too short or generic	Provide substantive text (10+ words recommended)
Entity category `Unknown`	Unrecognized entities	Domain-specific terms not in model	Use custom NER model for specialized entities
Batch error on one doc	`InvalidDocument` in results	Document exceeds 5,120 characters	Split long documents; check `is_error` per document
Wrong language detection	Incorrect language	Mixed-language text confuses detection	Separate text by language; use longer samples
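For the "split long documents" fix in the table above, a minimal sketch of a splitter that breaks text on whitespace so that no chunk exceeds the limit. The function name is hypothetical. Note two assumptions: `len()` counts Unicode code points, which may differ from how the service measures length, and splitting on whitespace collapses runs of spaces and newlines, so character offsets in results refer to the chunk, not the original text.

```python
from typing import List

MAX_CHARS = 5120  # per-document limit quoted in the table above


def split_document(text: str, limit: int = MAX_CHARS) -> List[str]:
    """Split `text` on whitespace into chunks of at most `limit` characters."""
    if limit < 1:
        raise ValueError("limit must be at least 1")
    chunks: List[str] = []
    current = ""
    for word in text.split():
        # A single word longer than the limit is cut hard into pieces.
        while len(word) > limit:
            if current:
                chunks.append(current)
                current = ""
            chunks.append(word[:limit])
            word = word[limit:]
        candidate = f"{current} {word}" if current else word
        if len(candidate) <= limit:
            current = candidate
        else:
            # The word does not fit: close the current chunk, start a new one.
            chunks.append(current)
            current = word
    if current:
        chunks.append(current)
    return chunks


if __name__ == "__main__":
    long_text = "word " * 3000  # about 15,000 characters
    parts = split_document(long_text)
    print(len(parts), max(len(p) for p in parts))
```

Splitting on sentence boundaries instead of arbitrary whitespace would likely give better sentiment results, at the cost of a more involved splitter.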

Knowledge Check

1. What does opinion mining add to standard sentiment analysis?

2. What is the maximum document size for a single text analytics request?

3. What is the difference between Named Entity Recognition (NER) and Entity Linking?

4. How should you handle errors in batch text analytics results?

5. What confidence score format does language detection return?

Cleanup

az group delete --name rg-ai102-nlp --yes --no-wait
