Challenge 37: Conversational Language Understanding (CLU)

Estimated Time

60 min | Cost: $2-5 (estimated) | Domain: Implement NLP Solutions (15-20%)

Exam skills covered

  • Create intents and entities for conversational language understanding
  • Add utterances to train the model
  • Train, evaluate, and deploy a CLU model
  • Query a deployed CLU model

Overview

Conversational Language Understanding (CLU) is the successor to LUIS (Language Understanding). It classifies user utterances into intents and extracts entities from them:

Concept | Description
Intent | The user's goal (e.g., "BookFlight", "GetWeather")
Entity | Key information extracted from the text (e.g., destination, date)
Utterance | Example text mapped to intents/entities for training

Entity types:

  • Learned: learned by machine learning from labeled examples
  • List: a defined set of values with synonyms
  • Prebuilt: pre-trained types (datetime, number, temperature, etc.)
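To make the List type more concrete, here is a minimal local sketch of how a synonym list can resolve a phrase to a canonical key. It only illustrates the idea and does not show how the CLU service is implemented. The names `TRAVEL_CLASS_SYNONYMS` and `resolve_list_entity` are hypothetical, and the values mirror the TravelClass sublists defined later in this lab.

```python
import re

# Illustrative only: a local lookup that mimics what a List entity expresses.
# Keys play the role of "listKey"; values are the synonyms for that key.
TRAVEL_CLASS_SYNONYMS = {
    "economy": ["economy", "coach", "standard"],
    "business": ["business", "business class", "premium"],
    "first": ["first", "first class", "luxury"],
}


def resolve_list_entity(text, synonyms=TRAVEL_CLASS_SYNONYMS):
    """Return (list_key, matched_phrase) for the longest synonym found, or None."""
    lowered = text.lower()
    best = None
    for list_key, phrases in synonyms.items():
        for phrase in phrases:
            # Match whole words only, so "first" does not match inside another word.
            found = re.search(r"\b" + re.escape(phrase) + r"\b", lowered)
            if found and (best is None or len(phrase) > len(best[1])):
                best = (list_key, phrase)
    return best


print(resolve_list_entity("I need a business class ticket to Tokyo"))
print(resolve_list_entity("Hello there"))
```

Preferring the longest match is a design choice of this sketch: it makes "business class" win over the shorter synonym "business" when both are present.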

CLU uses the Language service endpoint at https://{endpoint}.cognitiveservices.azure.com/language/, where {endpoint} is the name of your Language resource.

Prerequisites

  • Azure subscription
  • Azure AI Language resource
  • Python 3.9+ with requests and azure-ai-language-conversations
  • Training data (utterances with intent/entity labels)

Implementation

Task 1: Create a CLU Project via REST

ENDPOINT="https://<resource>.cognitiveservices.azure.com"
KEY="<your-key>"
PROJECT_NAME="travel-assistant"
API_VERSION="2023-04-01"

# Create the project (PATCH creates it if it does not exist yet).
# The authoring REST reference lists application/merge-patch+json as the
# media type for this call.
curl -s "${ENDPOINT}/language/authoring/analyze-conversations/projects/${PROJECT_NAME}?api-version=${API_VERSION}" \
  -X PATCH \
  -H "Ocp-Apim-Subscription-Key: ${KEY}" \
  -H "Content-Type: application/merge-patch+json" \
  -d '{
    "projectKind": "Conversation",
    "projectName": "travel-assistant",
    "language": "en",
    "description": "Travel booking assistant for AI-102 lab",
    "multilingual": false
  }' | jq .

Task 2: Define Intents, Entities, and Utterances

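import os
import time

import requests

# Shared settings reused by the remaining tasks (same project as Task 1).
# The environment variable names match the ones used in Task 4.
endpoint = os.environ["AZURE_AI_ENDPOINT"].rstrip("/")
project_name = "travel-assistant"
api_version = "2023-04-01"
headers = {"Ocp-Apim-Subscription-Key": os.environ["AZURE_AI_KEY"]}

# Import training data as a batch (assets)
import_url = f"{endpoint}/language/authoring/analyze-conversations/projects/{project_name}/:import?api-version={api_version}"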

training_data = {
    "projectFileVersion": "2022-10-01-preview",
    "stringIndexType": "Utf16CodeUnit",
    "metadata": {
        "projectKind": "Conversation",
        "projectName": project_name,
        "multilingual": False,
        "language": "en"
    },
    "assets": {
        "projectKind": "Conversation",
        "intents": [
            {"category": "BookFlight"},
            {"category": "GetWeather"},
            {"category": "Cancel"},
            {"category": "None"}
        ],
        "entities": [
            {
                "category": "Destination",
                "compositionSetting": "combineComponents",
                "list": None,
                "prebuilts": None
            },
            {
                "category": "DepartureDate",
                "compositionSetting": "combineComponents",
                "list": None,
                "prebuilts": [{"category": "DateTime"}]
            },
            {
                "category": "TravelClass",
                "compositionSetting": "combineComponents",
                "list": {
                    "sublists": [
                        {"listKey": "economy", "synonyms": [{"language": "en", "values": ["economy", "coach", "standard"]}]},
                        {"listKey": "business", "synonyms": [{"language": "en", "values": ["business", "business class", "premium"]}]},
                        {"listKey": "first", "synonyms": [{"language": "en", "values": ["first", "first class", "luxury"]}]}
                    ]
                }
            }
        ],
        "utterances": [
            {
                "text": "Book a flight to London next Friday",
                "intent": "BookFlight",
                "language": "en",
                "entities": [
                    {"category": "Destination", "offset": 17, "length": 6},
                    {"category": "DepartureDate", "offset": 24, "length": 11}
                ]
            },
            {
                "text": "I need a business class ticket to Tokyo",
                "intent": "BookFlight",
                "language": "en",
                "entities": [
                    {"category": "TravelClass", "offset": 9, "length": 14},
                    {"category": "Destination", "offset": 34, "length": 5}
                ]
            },
            {
                "text": "What is the weather like in Paris tomorrow",
                "intent": "GetWeather",
                "language": "en",
                "entities": [
                    {"category": "Destination", "offset": 28, "length": 5},
                    {"category": "DepartureDate", "offset": 34, "length": 8}
                ]
            },
            {
                "text": "Cancel my booking",
                "intent": "Cancel",
                "language": "en",
                "entities": []
            },
            {
                "text": "Never mind, cancel that",
                "intent": "Cancel",
                "language": "en",
                "entities": []
            },
            {
                "text": "Hello there",
                "intent": "None",
                "language": "en",
                "entities": []
            }
        ]
    }
}

response = requests.post(import_url, headers=headers, json=training_data)
operation_url = response.headers.get("operation-location")
print(f"Import started: {response.status_code}")

# Poll until the import job reaches a terminal state
while True:
    status_response = requests.get(operation_url, headers=headers)
    status = status_response.json()["status"]
    print(f" Import status: {status}")
    if status in ["succeeded", "failed", "cancelled"]:
        break
    time.sleep(2)
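Counting offsets by hand is error-prone, so the entity labels above can also be computed. The following is a minimal sketch, assuming each labeled phrase occurs exactly once in its utterance; `utf16_len` and `label_entity` are hypothetical helper names, not part of any Azure SDK. Offsets are counted in UTF-16 code units because the import payload declares "stringIndexType": "Utf16CodeUnit".

```python
def utf16_len(s):
    """Length of s in UTF-16 code units (the unit used by Utf16CodeUnit)."""
    return len(s.encode("utf-16-le")) // 2


def label_entity(text, phrase, category):
    """Build an entity label for the first occurrence of phrase in text."""
    start = text.index(phrase)  # raises ValueError if the phrase is absent
    return {
        "category": category,
        "offset": utf16_len(text[:start]),
        "length": utf16_len(phrase),
    }


utterance = "Book a flight to London next Friday"
print(label_entity(utterance, "London", "Destination"))
print(label_entity(utterance, "next Friday", "DepartureDate"))
```

For plain ASCII text the result equals the Python string index. The two differ only when the text contains characters outside the Basic Multilingual Plane, such as most emoji, which occupy two UTF-16 code units each.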

Task 3: Train and Deploy the Model

# Train the model
train_url = f"{endpoint}/language/authoring/analyze-conversations/projects/{project_name}/:train?api-version={api_version}"

train_body = {
    "modelLabel": "model-v1",
    "trainingMode": "standard",
    "trainingConfigVersion": "2022-09-01",
    # Hold out 20% of the utterances for evaluation
    "evaluationOptions": {
        "kind": "percentage",
        "testingSplitPercentage": 20,
        "trainingSplitPercentage": 80
    }
}

response = requests.post(train_url, headers=headers, json=train_body)
operation_url = response.headers.get("operation-location")
print(f"Training started: {response.status_code}")

# Poll until the training job reaches a terminal state
while True:
    status_response = requests.get(operation_url, headers=headers)
    result = status_response.json()
    print(f" Training status: {result['status']}")
    if result["status"] in ["succeeded", "failed", "cancelled"]:
        break
    time.sleep(10)

# Deploy the model
deploy_url = f"{endpoint}/language/authoring/analyze-conversations/projects/{project_name}/deployments/production?api-version={api_version}"

deploy_body = {"trainedModelLabel": "model-v1"}

response = requests.put(deploy_url, headers=headers, json=deploy_body)
operation_url = response.headers.get("operation-location")
print(f"Deployment started: {response.status_code}")

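# Poll until the deployment job reaches a terminal state
while True:
    status_response = requests.get(operation_url, headers=headers)
    result = status_response.json()
    print(f" Deploy status: {result['status']}")
    if result["status"] in ["succeeded", "failed", "cancelled"]:
        break
    time.sleep(5)

# Only report success when the deployment job actually succeeded
if result["status"] == "succeeded":
    print("Model deployed to 'production' slot!")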
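The import, training, and deployment steps all repeat the same polling loop, and none of them stops if the job never finishes. A reusable helper with a timeout is one way to tidy this up. The sketch below is an assumption-laden illustration: `wait_for_operation` and `TERMINAL_STATES` are hypothetical names, and the set of terminal states is taken from the statuses checked in this lab plus `cancelled`. The status source is passed in as a callable so the helper can be exercised without calling the service.

```python
import time

# Assumed terminal job states; adjust if your API version reports others.
TERMINAL_STATES = {"succeeded", "failed", "cancelled"}


def wait_for_operation(get_status, interval=5.0, timeout=600.0,
                       sleep=time.sleep, clock=time.monotonic):
    """Call get_status() until it returns a terminal state or the timeout elapses."""
    deadline = clock() + timeout
    while True:
        status = get_status()
        if status in TERMINAL_STATES:
            return status
        if clock() >= deadline:
            raise TimeoutError(f"operation still '{status}' after {timeout} seconds")
        sleep(interval)


# Offline demonstration with canned statuses instead of HTTP calls.
fake_statuses = iter(["notStarted", "running", "succeeded"])
print(wait_for_operation(lambda: next(fake_statuses), interval=0))
```

Against the real service, the callable could wrap the request used in the loops above, for example `lambda: requests.get(operation_url, headers=headers).json()["status"]`.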

Task 4: Query the Deployed Model

import os

from azure.ai.language.conversations import ConversationAnalysisClient
from azure.core.credentials import AzureKeyCredential

# Query using the SDK
client = ConversationAnalysisClient(
    endpoint=os.environ["AZURE_AI_ENDPOINT"],
    credential=AzureKeyCredential(os.environ["AZURE_AI_KEY"])
)

# Same project as in the previous tasks
project_name = "travel-assistant"

# Test queries
test_queries = [
    "I want to fly to New York next Monday in first class",
    "What's the weather forecast for Seattle?",
    "Cancel my reservation please"
]

for query in test_queries:
    result = client.analyze_conversation(
        task={
            "kind": "Conversation",
            "analysisInput": {
                "conversationItem": {
                    "id": "1",
                    "participantId": "user1",
                    "text": query
                }
            },
            "parameters": {
                "projectName": project_name,
                "deploymentName": "production"
            }
        }
    )

    prediction = result["result"]["prediction"]
    top_intent = prediction["topIntent"]
    # The first entry of "intents" is expected to be the top-scoring one
    confidence = prediction["intents"][0]["confidenceScore"]

    print(f"\nQuery: '{query}'")
    print(f" Intent: {top_intent} (confidence: {confidence:.4f})")
    print(f" Entities:")
    for entity in prediction.get("entities", []):
        print(f" [{entity['category']}] '{entity['text']}' "
              f"(confidence: {entity['confidenceScore']:.3f})")
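In an application, the prediction is usually reduced to a small structure before acting on it. The sketch below shows one possible shape, assuming the response fields read in the loop above (`topIntent`, `intents`, `entities`). The function name `summarize_prediction`, the `min_confidence` threshold of 0.5, and the hand-written `sample` dictionary are all hypothetical; the sample is not real service output.

```python
def summarize_prediction(result, min_confidence=0.5):
    """Extract the top intent, its score, and the entities from a CLU result."""
    prediction = result["result"]["prediction"]
    top_intent = prediction["topIntent"]
    # Look the score up by category instead of relying on list order.
    scores = {i["category"]: i["confidenceScore"] for i in prediction["intents"]}
    confidence = scores.get(top_intent, 0.0)
    if confidence < min_confidence:
        top_intent = "None"  # treat low-confidence predictions as unrecognized
    entities = [(e["category"], e["text"]) for e in prediction.get("entities", [])]
    return {"intent": top_intent, "confidence": confidence, "entities": entities}


# Hand-written example with made-up scores, used only to exercise the function.
sample = {
    "result": {
        "prediction": {
            "topIntent": "BookFlight",
            "intents": [
                {"category": "BookFlight", "confidenceScore": 0.9},
                {"category": "Cancel", "confidenceScore": 0.1},
            ],
            "entities": [
                {"category": "Destination", "text": "New York", "confidenceScore": 0.9},
            ],
        }
    }
}

print(summarize_prediction(sample))
```

Falling back to the None intent below a threshold is a common pattern for asking the user to rephrase rather than acting on a weak guess. The right threshold depends on your data.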

Expected Output

Project created: 201
Import started: 202
Import status: running
Import status: succeeded
Training started: 202
Training status: running
Training status: succeeded
Deployment started: 202
Deploy status: running
Deploy status: succeeded
Model deployed to 'production' slot!

Query: 'I want to fly to New York next Monday in first class'
Intent: BookFlight (confidence: 0.9723)
Entities:
[Destination] 'New York' (confidence: 0.945)
[DepartureDate] 'next Monday' (confidence: 0.988)
[TravelClass] 'first class' (confidence: 0.992)

Query: 'What's the weather forecast for Seattle?'
Intent: GetWeather (confidence: 0.9456)
Entities:
[Destination] 'Seattle' (confidence: 0.967)

Query: 'Cancel my reservation please'
Intent: Cancel (confidence: 0.9812)
Entities:

Break & Fix

Scenario | Symptom | Root Cause | Fix
Low intent confidence | Wrong intent predicted | Too few training utterances per intent | Add 10-15+ varied utterances per intent
Entity not extracted | Entities missing from the response | Incorrect entity offset/length in the training data | Verify that character offsets match the exact positions in the text
Training fails | Validation errors | Duplicate utterances or invalid entity spans | Check the training data for overlapping entities and duplicates
Deploy fails | 409 Conflict | Deployment name already exists with a different model | Delete the existing deployment or use a different name
None intent matches everything | Over-triggering | None intent has examples too similar to other intents | Make the None intent examples clearly unrelated to all other intents
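Several rows in the table come down to bad training data, which can be checked locally before importing. The following is a minimal sketch of such a check, covering duplicate utterances, spans that fall outside the text, and overlapping spans. `validate_utterances` is a hypothetical helper, not an Azure API, and whether the service rejects each of these cases is not confirmed here. It compares offsets against Python string lengths, so it assumes text without characters outside the Basic Multilingual Plane.

```python
def validate_utterances(utterances):
    """Return a list of human-readable problems found in labeled utterances."""
    problems = []
    seen = set()
    for idx, utt in enumerate(utterances):
        text = utt["text"]
        key = text.strip().lower()
        if key in seen:
            problems.append(f"utterance {idx}: duplicate text {text!r}")
        seen.add(key)

        spans = []
        for ent in utt.get("entities", []):
            start = ent["offset"]
            end = start + ent["length"]
            if ent["length"] <= 0 or start < 0 or end > len(text):
                problems.append(
                    f"utterance {idx}: {ent['category']} span {start}-{end} is outside the text"
                )
                continue
            spans.append((start, end, ent["category"]))

        # After sorting by start, a span overlaps its predecessor if it begins
        # before the predecessor ends.
        spans.sort()
        for (s1, e1, c1), (s2, e2, c2) in zip(spans, spans[1:]):
            if s2 < e1:
                problems.append(f"utterance {idx}: {c1} and {c2} overlap")
    return problems


example = [
    {"text": "Cancel my booking", "entities": []},
    {"text": "cancel my booking", "entities": []},
]
print(validate_utterances(example))
```

Running it over the "utterances" list of the import payload before posting gives a quick sanity check; an empty list means none of these three problems was found.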

Knowledge Check

1. What is the relationship between intents and utterances in CLU?

2. What are the three entity types in CLU?

3. What replaced LUIS (Language Understanding) in Azure AI?

4. What is the purpose of the 'None' intent?

5. How do you specify character positions for entity labels in training utterances?

Cleanup

az group delete --name rg-ai102-nlp --yes --no-wait

Learn More