Challenge 42: Search Queries — Syntax and Filters

Estimated Time

45-60 min | Cost: ~$0.10 (queries against existing index) | Domain: Knowledge Mining & Extraction (15-20%)

Exam skills covered

Skill	Weight
Query an index using simple syntax	High
Query an index using full Lucene syntax	High
Apply filters with OData expressions	High
Implement sorting, paging, and field selection	Medium
Implement faceted navigation	Medium
Use wildcards and fuzzy search	Medium

Overview

Azure AI Search supports two query parsers:

Parser	Syntax	Use case
Simple (default)	`+term -term "phrase" prefix*`	User-facing search boxes
Full Lucene	`field:term~2 /regex/ term^boost`	Advanced developer queries

Key query parameters:

search: The search text (simple or Lucene syntax)
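$filter: OData Boolean expression that includes or excludes documents by exact (unanalyzed, case-sensitive) field values; it does not affect relevance scores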
$orderby: Sort results
$select: Choose which fields to return
$top / $skip: Pagination
$count: Include total count in response
facets: Aggregate field values for navigation
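
As a rough illustration of how these parameters fit together, the sketch below assembles the JSON body for a POST to `/docs/search`. The helper name `build_search_body` is hypothetical (it is not part of any SDK); the body keys mirror the ones used in the REST examples in this challenge.

```python
from typing import Any, Optional


def build_search_body(
    search: str = "*",
    *,
    query_type: str = "simple",
    filter: Optional[str] = None,
    orderby: Optional[list] = None,
    select: Optional[list] = None,
    top: Optional[int] = None,
    skip: Optional[int] = None,
    count: bool = False,
    facets: Optional[list] = None,
) -> dict:
    """Hypothetical helper: build a POST body for /indexes/{index}/docs/search."""
    body: dict = {"search": search, "queryType": query_type}
    if filter:
        body["filter"] = filter
    if orderby:
        body["orderby"] = ", ".join(orderby)  # sent as one comma-separated string
    if select:
        body["select"] = ",".join(select)  # sent as one comma-separated string
    if top is not None:
        body["top"] = top
    if skip is not None:
        body["skip"] = skip
    if count:
        body["count"] = True
    if facets:
        body["facets"] = list(facets)  # sent as a JSON array of facet expressions
    return body


body = build_search_body(
    "Azure",
    filter="language eq 'en'",
    orderby=["wordCount desc"],
    select=["metadata_storage_name", "language"],
    top=5,
    count=True,
)
```

Only the parameters that were set end up in the body, so the service applies its defaults for the rest.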

Prerequisites

Completed Challenge 40 (index with enriched documents)
Python 3.9+ with azure-search-documents>=11.4.0
.NET 8 with Azure.Search.Documents
At least 10 documents indexed for meaningful results

Implementation

Task 1: Simple query syntax

Python SDK

from azure.core.credentials import AzureKeyCredential
from azure.search.documents import SearchClient

endpoint = f"https://{SEARCH_SERVICE}.search.windows.net"
credential = AzureKeyCredential(SEARCH_KEY)
search_client = SearchClient(endpoint=endpoint, index_name="documents-index", credential=credential)

# Simple search: matches documents containing "Azure" OR "cognitive"
# (default searchMode is "any"; pass search_mode="all" to require both terms)
results = search_client.search(
    search_text="Azure cognitive",
    include_total_count=True,
    top=5
)

print(f"Total matching documents: {results.get_count()}")
for result in results:
    print(f"  Score: {result['@search.score']:.4f} | {result['metadata_storage_name']}")

# Phrase search — exact phrase match
results = search_client.search(search_text='"Azure AI services"')
for result in results:
    print(f"  Phrase match: {result['metadata_storage_name']}")

# Boolean operators in simple syntax (+ required, - excluded, | OR)
results = search_client.search(search_text="+Azure -deprecated | cognitive")

C# SDK

using Azure; // AzureKeyCredential lives in the Azure namespace
using Azure.Search.Documents;
using Azure.Search.Documents.Models;

var searchClient = new SearchClient(
    new Uri($"https://{searchService}.search.windows.net"),
    "documents-index",
    new AzureKeyCredential(searchKey));

// Simple search
var options = new SearchOptions
{
    IncludeTotalCount = true,
    Size = 5
};

var results = await searchClient.SearchAsync<SearchDocument>("Azure cognitive", options);
Console.WriteLine($"Total: {results.Value.TotalCount}");
await foreach (var result in results.Value.GetResultsAsync())
{
    Console.WriteLine($"  Score: {result.Score:F4} | {result.Document["metadata_storage_name"]}");
}

// Phrase search
var phraseResults = await searchClient.SearchAsync<SearchDocument>("\"Azure AI services\"");

REST API

# Simple search
curl -s "https://${SEARCH_SERVICE}.search.windows.net/indexes/documents-index/docs?api-version=2024-07-01&search=Azure+cognitive&\$count=true&\$top=5" \
  -H "api-key: ${SEARCH_KEY}" | python -m json.tool

# Phrase search
curl -s "https://${SEARCH_SERVICE}.search.windows.net/indexes/documents-index/docs?api-version=2024-07-01&search=%22Azure+AI+services%22&\$count=true" \
  -H "api-key: ${SEARCH_KEY}" | python -m json.tool
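
When the search text comes from separate UI inputs (must-have terms, excluded terms, exact phrases), it can help to assemble the simple-syntax string in one place. The following is a minimal sketch under that assumption; `build_simple_query` is a hypothetical helper, not an SDK function.

```python
def build_simple_query(required=(), excluded=(), optional=(), phrases=()):
    """Hypothetical helper: compose a simple-syntax query string.

    required -> +term, excluded -> -term, optional -> bare term,
    phrases  -> "quoted phrase" (embedded double quotes are backslash-escaped).
    """
    parts = []
    parts += [f"+{term}" for term in required]
    parts += [f"-{term}" for term in excluded]
    parts += list(optional)
    for phrase in phrases:
        escaped = phrase.replace('"', '\\"')
        parts.append(f'"{escaped}"')
    return " ".join(parts)


query = build_simple_query(
    required=["Azure"],
    excluded=["deprecated"],
    phrases=["AI services"],
)
print(query)

# With the client from Task 1 (not executed here):
# results = search_client.search(search_text=query, search_mode="all")
```

Passing `search_mode="all"` alongside such a query makes every unprefixed term mandatory as well, which is usually what users expect from a `-term` exclusion.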

Task 2: Full Lucene syntax

Python SDK

from azure.search.documents.models import QueryType

# Fuzzy search — finds "cognitive" even if user types "cogntive" (edit distance 1)
results = search_client.search(
    search_text="cogntive~1",
    query_type=QueryType.FULL
)
for result in results:
    print(f"  Fuzzy match: {result['metadata_storage_name']}")

# Wildcard search — prefix matching
results = search_client.search(
    search_text="micro*",
    query_type=QueryType.FULL
)

# Proximity search — "Azure" and "services" within 3 words of each other
results = search_client.search(
    search_text='"Azure services"~3',
    query_type=QueryType.FULL
)

# Boosted terms: "AI" gets a boost factor of 4, so matches on it weigh more than matches on "cloud"
results = search_client.search(
    search_text="AI^4 cloud",
    query_type=QueryType.FULL
)

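# Field-scoped search: quote the phrase so both words are scoped to the keyphrases field
# (unquoted, only "machine" is scoped; "learning" would be searched in all fields)
results = search_client.search(
    search_text='keyphrases:"machine learning"',
    query_type=QueryType.FULL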
)

C# SDK

// Fuzzy search
var fuzzyOptions = new SearchOptions { QueryType = SearchQueryType.Full };
var fuzzyResults = await searchClient.SearchAsync<SearchDocument>("cogntive~1", fuzzyOptions);

// Wildcard search
var wildcardResults = await searchClient.SearchAsync<SearchDocument>("micro*", fuzzyOptions);

// Proximity search
var proximityResults = await searchClient.SearchAsync<SearchDocument>(
    "\"Azure services\"~3", fuzzyOptions);

// Boosted terms
var boostedResults = await searchClient.SearchAsync<SearchDocument>("AI^4 cloud", fuzzyOptions);

// Field-scoped search
var fieldResults = await searchClient.SearchAsync<SearchDocument>(
    "keyphrases:\"machine learning\"", fuzzyOptions);

REST API

# Fuzzy search (queryType=full enables Lucene syntax)
curl -s -X POST "https://${SEARCH_SERVICE}.search.windows.net/indexes/documents-index/docs/search?api-version=2024-07-01" \
  -H "Content-Type: application/json" \
  -H "api-key: ${SEARCH_KEY}" \
  -d '{
    "search": "cogntive~1",
    "queryType": "full",
    "count": true
  }'

# Boosted-term search
curl -s -X POST "https://${SEARCH_SERVICE}.search.windows.net/indexes/documents-index/docs/search?api-version=2024-07-01" \
  -H "Content-Type: application/json" \
  -H "api-key: ${SEARCH_KEY}" \
  -d '{
    "search": "AI^4 cloud",
    "queryType": "full",
    "count": true
  }'
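
Because full Lucene syntax gives characters such as `+`, `:`, and `~` a special meaning, user-typed text generally needs escaping before operators are attached to it. The sketch below shows one way to do that; the helper names (`escape_lucene`, `fuzzy_term`, `field_phrase`) are hypothetical and exist only for illustration.

```python
# Characters with special meaning in full Lucene syntax.
LUCENE_SPECIAL = set('+-&|!(){}[]^"~*?:\\/')


def escape_lucene(text: str) -> str:
    """Prefix every special character with a backslash."""
    return "".join("\\" + ch if ch in LUCENE_SPECIAL else ch for ch in text)


def fuzzy_term(word: str, distance: int = 2) -> str:
    """Build a fuzzy query for a single term, e.g. cogntive~1."""
    if not 0 <= distance <= 2:
        raise ValueError("fuzzy edit distance must be 0, 1, or 2")
    if " " in word:
        raise ValueError("fuzzy search applies to single terms, not phrases")
    return f"{escape_lucene(word)}~{distance}"


def field_phrase(field: str, phrase: str) -> str:
    """Scope a whole phrase to one field, e.g. keyphrases:"machine learning"."""
    escaped = phrase.replace("\\", "\\\\").replace('"', '\\"')
    return f'{field}:"{escaped}"'


print(escape_lucene("C++ (beta)"))
print(fuzzy_term("cogntive", 1))
print(field_phrase("keyphrases", "machine learning"))
```

Each returned string is meant to be passed as the search text together with `queryType=full`; with the simple parser the operators would not be interpreted as Lucene syntax.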

Task 3: OData filters

Python SDK

# Filter by language
results = search_client.search(
    search_text="*",
    filter="language eq 'en'",
    include_total_count=True
)
print(f"English documents: {results.get_count()}")

# Filter with collection — any keyphrase matches
results = search_client.search(
    search_text="*",
    filter="keyphrases/any(k: k eq 'machine learning')"
)

# Combine search + filter
results = search_client.search(
    search_text="Azure",
    filter="language eq 'en' and wordCount gt 100",
    order_by=["wordCount desc"],
    select=["metadata_storage_name", "language", "wordCount"]
)

for result in results:
    print(f"  {result['metadata_storage_name']} | Words: {result.get('wordCount', 'N/A')}")

# Comparison operators: eq, ne, gt, ge, lt, le
# Logical operators: and, or, not
# Collection operators: any(), all()
# Functions: search.in(), geo.distance(), geo.intersects()
results = search_client.search(
    search_text="*",
    filter="search.in(language, 'en,fr,de', ',')"
)

C# SDK

// Filter by language
var filterOptions = new SearchOptions
{
    Filter = "language eq 'en'",
    IncludeTotalCount = true
};
var filtered = await searchClient.SearchAsync<SearchDocument>("*", filterOptions);
Console.WriteLine($"English documents: {filtered.Value.TotalCount}");

// Collection filter
var collectionOptions = new SearchOptions
{
    Filter = "keyphrases/any(k: k eq 'machine learning')"
};
var collResults = await searchClient.SearchAsync<SearchDocument>("*", collectionOptions);

// Combined search + filter + sort + select
var combinedOptions = new SearchOptions
{
    Filter = "language eq 'en' and wordCount gt 100",
    IncludeTotalCount = true
};
combinedOptions.OrderBy.Add("wordCount desc");
combinedOptions.Select.Add("metadata_storage_name");
combinedOptions.Select.Add("language");
combinedOptions.Select.Add("wordCount");

var combined = await searchClient.SearchAsync<SearchDocument>("Azure", combinedOptions);

REST API

# Filter with search
curl -s -X POST "https://${SEARCH_SERVICE}.search.windows.net/indexes/documents-index/docs/search?api-version=2024-07-01" \
  -H "Content-Type: application/json" \
  -H "api-key: ${SEARCH_KEY}" \
  -d '{
    "search": "Azure",
    "filter": "language eq '\''en'\'' and wordCount gt 100",
    "orderby": "wordCount desc",
    "select": "metadata_storage_name,language,wordCount",
    "count": true
  }'

# Collection filter
curl -s -X POST "https://${SEARCH_SERVICE}.search.windows.net/indexes/documents-index/docs/search?api-version=2024-07-01" \
  -H "Content-Type: application/json" \
  -H "api-key: ${SEARCH_KEY}" \
  -d '{
    "search": "*",
    "filter": "keyphrases/any(k: k eq '\''machine learning'\'')",
    "count": true
  }'
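
Filter strings built from user input need their string literals quoted safely: in OData, a single quote inside a literal is escaped by doubling it. The following is a small sketch of helpers for the filter shapes used in this task. All function names are hypothetical, and the `author` field in the usage lines is an assumed example that is not part of the challenge index.

```python
def odata_str(value: str) -> str:
    """Quote a string literal for OData, doubling embedded single quotes."""
    return "'" + value.replace("'", "''") + "'"


def eq(field: str, value: str) -> str:
    """field eq 'value'"""
    return f"{field} eq {odata_str(value)}"


def any_eq(collection_field: str, value: str, var: str = "k") -> str:
    """collection/any(k: k eq 'value')"""
    return f"{collection_field}/any({var}: {var} eq {odata_str(value)})"


def search_in(field: str, values, delimiter: str = ",") -> str:
    """search.in(field, 'a,b,c', ',')"""
    if any(delimiter in v for v in values):
        raise ValueError("a value contains the delimiter; choose another delimiter")
    joined = delimiter.join(values)
    return f"search.in({field}, {odata_str(joined)}, {odata_str(delimiter)})"


def all_of(*clauses: str) -> str:
    """Combine clauses with 'and', parenthesising each one."""
    return " and ".join(f"({clause})" for clause in clauses)


print(eq("author", "O'Reilly"))  # 'author' is a hypothetical field
print(any_eq("keyphrases", "machine learning"))
print(all_of(search_in("language", ["en", "fr", "de"]), "wordCount gt 100"))
```

For long value lists, `search.in` tends to be preferable to a chain of `or` clauses, since it keeps the filter short.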

Task 4: Pagination and field selection

Python SDK

# Paginated results — page 1 (items 1-10)
page1 = search_client.search(
    search_text="*",
    top=10,
    skip=0,
    include_total_count=True,
    select=["metadata_storage_name", "language", "keyphrases"]
)
print(f"Total: {page1.get_count()}")
for doc in page1:
    print(f"  {doc['metadata_storage_name']}")

# Page 2 (items 11-20)
page2 = search_client.search(
    search_text="*",
    top=10,
    skip=10,
    select=["metadata_storage_name", "language", "keyphrases"]
)

# Sorting by multiple fields
results = search_client.search(
    search_text="*",
    order_by=["language asc", "metadata_storage_name asc"],
    top=20
)

C# SDK

// Paginated results
var pageOptions = new SearchOptions
{
    Size = 10,
    Skip = 0,
    IncludeTotalCount = true
};
pageOptions.Select.Add("metadata_storage_name");
pageOptions.Select.Add("language");
pageOptions.Select.Add("keyphrases");

var page1 = await searchClient.SearchAsync<SearchDocument>("*", pageOptions);
Console.WriteLine($"Total: {page1.Value.TotalCount}");

// Page 2
pageOptions.Skip = 10;
var page2 = await searchClient.SearchAsync<SearchDocument>("*", pageOptions);

// Multi-field sort
var sortOptions = new SearchOptions { Size = 20 };
sortOptions.OrderBy.Add("language asc");
sortOptions.OrderBy.Add("metadata_storage_name asc");
var sorted = await searchClient.SearchAsync<SearchDocument>("*", sortOptions);

REST API

# Paginated query
curl -s -X POST "https://${SEARCH_SERVICE}.search.windows.net/indexes/documents-index/docs/search?api-version=2024-07-01" \
  -H "Content-Type: application/json" \
  -H "api-key: ${SEARCH_KEY}" \
  -d '{
    "search": "*",
    "top": 10,
    "skip": 0,
    "count": true,
    "select": "metadata_storage_name,language,keyphrases",
    "orderby": "language asc, metadata_storage_name asc"
  }'
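
The page arithmetic behind `top` and `skip` can be sketched as follows. The helper names are hypothetical; the limits used (at most 1,000 results per page and a maximum skip of 100,000) are the documented service limits at the time of writing and are worth re-checking against the current documentation.

```python
MAX_TOP = 1000       # largest page size the service returns per request
MAX_SKIP = 100_000   # largest offset accepted for skip


def page_params(page: int, page_size: int = 10) -> dict:
    """Translate a 1-based page number into top/skip values."""
    if page < 1:
        raise ValueError("page numbers start at 1")
    if not 1 <= page_size <= MAX_TOP:
        raise ValueError(f"page_size must be between 1 and {MAX_TOP}")
    skip = (page - 1) * page_size
    if skip > MAX_SKIP:
        raise ValueError(f"skip {skip} exceeds the limit of {MAX_SKIP}")
    return {"top": page_size, "skip": skip}


def page_count(total: int, page_size: int = 10) -> int:
    """Number of pages needed for `total` results (ceiling division)."""
    return -(-total // page_size)


print(page_params(1))
print(page_params(2))
print(page_count(42, 10))

# With the client from Task 1 (not executed here):
# results = search_client.search(search_text="*", **page_params(2))
```

The total used for `page_count` would come from the count returned when the query requests it.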

Task 5: Faceted navigation

Python SDK

# Facets — aggregate values for building filter UI
results = search_client.search(
    search_text="*",
    facets=["language,count:10", "keyphrases,count:20"],
    include_total_count=True
)

print(f"Total results: {results.get_count()}")
print("\nLanguage facets:")
for facet in results.get_facets().get("language", []):
    print(f"  {facet['value']}: {facet['count']} documents")

print("\nTop keyphrases:")
for facet in results.get_facets().get("keyphrases", []):
    print(f"  {facet['value']}: {facet['count']} documents")

# Combine facets with a filter (drill-down)
results = search_client.search(
    search_text="*",
    filter="language eq 'en'",
    facets=["keyphrases,count:10"],
)
print("\nTop keyphrases (English only):")
for facet in results.get_facets().get("keyphrases", []):
    print(f"  {facet['value']}: {facet['count']}")

C# SDK

var facetOptions = new SearchOptions
{
    IncludeTotalCount = true
};
facetOptions.Facets.Add("language,count:10");
facetOptions.Facets.Add("keyphrases,count:20");

var facetResults = await searchClient.SearchAsync<SearchDocument>("*", facetOptions);
Console.WriteLine($"Total: {facetResults.Value.TotalCount}");

foreach (var facet in facetResults.Value.Facets["language"])
{
    Console.WriteLine($"  {facet.Value}: {facet.Count}");
}

// Drill-down with filter
var drillOptions = new SearchOptions { Filter = "language eq 'en'" };
drillOptions.Facets.Add("keyphrases,count:10");
var drillResults = await searchClient.SearchAsync<SearchDocument>("*", drillOptions);

REST API

# Faceted search
curl -s -X POST "https://${SEARCH_SERVICE}.search.windows.net/indexes/documents-index/docs/search?api-version=2024-07-01" \
  -H "Content-Type: application/json" \
  -H "api-key: ${SEARCH_KEY}" \
  -d '{
    "search": "*",
    "facets": ["language,count:10", "keyphrases,count:20"],
    "count": true
  }'

Expected Output

{
  "@odata.count": 42,
  "@search.facets": {
    "language": [
      {"value": "en", "count": 35},
      {"value": "fr", "count": 4},
      {"value": "de", "count": 3}
    ],
    "keyphrases": [
      {"value": "machine learning", "count": 12},
      {"value": "Azure AI", "count": 10},
      {"value": "cognitive services", "count": 8}
    ]
  },
  "value": [...]
}
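
To turn a response like this into a drill-down UI, each facet bucket can be mapped to a label and the filter that applies when a user clicks it. The sketch below works on a copy of the sample response above. `facet_links` is a hypothetical helper, and treating `keyphrases` as a `Collection(Edm.String)` field is an assumption about the index from Challenge 40.

```python
response = {
    "@odata.count": 42,
    "@search.facets": {
        "language": [
            {"value": "en", "count": 35},
            {"value": "fr", "count": 4},
            {"value": "de", "count": 3},
        ],
        "keyphrases": [
            {"value": "machine learning", "count": 12},
            {"value": "Azure AI", "count": 10},
            {"value": "cognitive services", "count": 8},
        ],
    },
}

# Assumption: these fields are collections, so they need an any() filter.
COLLECTION_FIELDS = {"keyphrases"}


def facet_links(resp: dict) -> list:
    """Return (label, filter) pairs, one per facet bucket."""
    links = []
    for field, buckets in resp.get("@search.facets", {}).items():
        for bucket in buckets:
            literal = "'" + str(bucket["value"]).replace("'", "''") + "'"
            if field in COLLECTION_FIELDS:
                flt = f"{field}/any(v: v eq {literal})"
            else:
                flt = f"{field} eq {literal}"
            links.append((f"{bucket['value']} ({bucket['count']})", flt))
    return links


links = facet_links(response)
for label, flt in links:
    print(f"{label:28} -> {flt}")
```

Re-running the query with the chosen filter and the same facet list gives the narrowed counts, which is the drill-down pattern shown in the Python and C# examples.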

Break & fix

#	Scenario	Symptom	Root Cause	Fix
1	Filter on non-filterable field	HTTP 400: "Field 'content' is not filterable"	The `content` field was defined with `filterable: false`	Filter on a field that is already filterable, or rebuild the index with `filterable: true` on that field (most attributes of an existing field cannot be changed in place)
2	Facet on non-facetable field	HTTP 400: "Field is not facetable"	Field lacks the `facetable` attribute in the index definition	Rebuild the index with `facetable: true` on that field, or add a new facetable field and re-index
3	Full Lucene syntax not working	Wildcards/fuzzy treated as literal text	Missing `queryType=full` — defaults to `simple`	Set `query_type=QueryType.FULL` (Python) or `QueryType = SearchQueryType.Full` (C#)
4	$orderby fails	"Cannot sort on field 'keyphrases'"	Collection fields (`Collection(Edm.String)`) cannot be sorted	Sort on a scalar field only; use scoring profiles for relevance tuning
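5	Pagination returns duplicates	Same documents appear on different pages	Index was modified between page requests, or the sort order has ties, so `$skip` offsets shift	Add a deterministic `$orderby` that ends with a unique sortable field; for deep paging, filter on the last sort value seen instead of increasing `$skip`, or accept eventual consistency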
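
Several of the failures above come down to a field lacking the attribute a query needs, so a pre-flight check against the index schema can catch them before a request is sent. The following is a minimal sketch; the `schema` dictionary is a hypothetical stand-in for the real index definition, and `check_query` is not an SDK function.

```python
# Hypothetical snapshot of field attributes; a real one would be read from the index definition.
schema = {
    "content": {"searchable": True, "filterable": False, "sortable": False, "facetable": False},
    "language": {"searchable": False, "filterable": True, "sortable": True, "facetable": True},
    "keyphrases": {"searchable": True, "filterable": True, "sortable": False, "facetable": True},
}


def check_query(schema, filter_fields=(), facet_fields=(), sort_fields=()):
    """Return a list of problems; an empty list means the fields support the query."""
    problems = []
    requirements = (
        ("filterable", filter_fields),
        ("facetable", facet_fields),
        ("sortable", sort_fields),
    )
    for attribute, fields in requirements:
        for name in fields:
            attrs = schema.get(name)
            if attrs is None:
                problems.append(f"{name}: not in index")
            elif not attrs.get(attribute, False):
                problems.append(f"{name}: not {attribute}")
    return problems


problems = check_query(
    schema,
    filter_fields=["content", "language"],
    facet_fields=["keyphrases"],
    sort_fields=["keyphrases"],
)
print(problems)
```

In this sample, the check flags the filter on `content` (scenario 1) and the sort on the `keyphrases` collection (scenario 4) without calling the service.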

Knowledge Check

1. You want users to search for 'programing' and still find documents containing 'programming'. Which query syntax supports this?

2. You need to filter documents where ANY keyphrase equals 'machine learning'. Which OData filter is correct?

3. What is the maximum value allowed for $skip in Azure AI Search?

4. A field is defined as 'searchable: true, filterable: false, facetable: true'. Which operation will FAIL?

5. You configure facets=['language,count:5']. What does the 'count:5' parameter control?

Cleanup

No additional resources created in this challenge (uses existing index from Challenge 40).

# If you want to clean up everything:
az group delete --name rg-ai102-search --yes --no-wait
