Troubleshooting Guide - Module 9: Advanced Querying¶
This guide helps you diagnose and resolve common issues when working with advanced querying techniques in Azure AI Search.
Common Query Syntax Issues¶
1. Full Lucene Syntax Errors¶
Error Message:
Common Lucene Syntax Issues:
Unescaped Special Characters¶
# ❌ Wrong: Special characters not escaped
title:(C++ programming)
# ✅ Correct: Escape special characters
title:(C\+\+ programming)
Incorrect Field Syntax¶
# ❌ Wrong: Missing colon or parentheses
title artificial intelligence
# ✅ Correct: Proper field syntax
title:(artificial intelligence)
Malformed Boolean Expressions¶
# ❌ Wrong: Invalid boolean syntax
title:(AI) content:(machine learning) AND
# ✅ Correct: Complete boolean expression
title:(AI) AND content:(machine learning)
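A malformed boolean expression can often be caught before the query is sent. The following is a minimal sketch of such a pre-flight check; the function name `find_boolean_problems` is hypothetical, and it only looks for unbalanced parentheses and dangling operators rather than validating the full Lucene grammar.

```python
import re


def find_boolean_problems(query: str) -> list[str]:
    """Return human-readable problems found in a full Lucene query string."""
    problems = []
    # Ignore escaped characters and quoted phrases when counting parentheses.
    stripped = re.sub(r"\\.", "", query)
    stripped = re.sub(r'"[^"]*"', "", stripped)

    depth = 0
    for ch in stripped:
        if ch == "(":
            depth += 1
        elif ch == ")":
            depth -= 1
            if depth < 0:
                problems.append("closing parenthesis without a matching opener")
                break
    if depth > 0:
        problems.append("unclosed parenthesis")

    tokens = stripped.split()
    if tokens and tokens[-1] in {"AND", "OR", "NOT"}:
        problems.append(f"query ends with dangling operator {tokens[-1]}")
    if tokens and tokens[0] in {"AND", "OR"}:
        problems.append(f"query starts with operator {tokens[0]}")
    return problems


print(find_boolean_problems("title:(AI) content:(machine learning) AND"))
```

An empty list means none of these particular problems were found; the service may still reject the query for other reasons.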
Debugging Tips:
1. Test query components individually
2. Use parentheses to group complex expressions
3. Escape special characters with a backslash: + - & | ! ( ) { } [ ] ^ " ~ * ? : \ /
4. Validate field names exist in your index
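When user input is embedded in a full Lucene query, escaping can be applied programmatically. This is a small sketch, assuming the special characters are the ones in the list above plus the forward slash; `escape_lucene` is a hypothetical helper name, not an SDK function.

```python
# Characters that have special meaning in full Lucene syntax (assumed set).
LUCENE_SPECIAL = r'+-&|!(){}[]^"~*?:\/'


def escape_lucene(text: str) -> str:
    """Prefix every special character with a backslash."""
    return "".join("\\" + ch if ch in LUCENE_SPECIAL else ch for ch in text)


print(escape_lucene("C++ programming"))
```

Apply the helper only to the user-supplied term, not to the whole query, otherwise intentional operators such as `:` and `^` are escaped as well.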
2. Boosting Syntax Problems¶
Error Message:
Common Boosting Issues:
Invalid Boost Values¶
# ❌ Wrong: Invalid boost syntax
title:(machine learning)^abc
# ✅ Correct: Numeric boost values
title:(machine learning)^2.5
Misplaced Boost Operators¶
# ❌ Wrong: Boost in wrong position
title^2:(machine learning)
# ✅ Correct: Boost after term/phrase
title:(machine learning)^2
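Both boosting mistakes above can be avoided by building the clause in code. The sketch below assumes boost factors must be positive numbers; `boosted_clause` is a hypothetical helper, and it does not escape special characters in the terms.

```python
from numbers import Real


def boosted_clause(field: str, terms: str, boost: float) -> str:
    """Build field:(terms)^boost with the boost placed after the group."""
    if isinstance(boost, bool) or not isinstance(boost, Real):
        raise ValueError(f"boost must be numeric, got {boost!r}")
    if boost <= 0:
        raise ValueError("boost must be positive")
    # The 'g' format drops a trailing '.0', so 2.0 is written as 2.
    return f"{field}:({terms})^{boost:g}"


print(boosted_clause("title", "machine learning", 2.5))
```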
3. Fuzzy Search Issues¶
Error Message:
Common Fuzzy Search Issues:
Invalid Edit Distance¶
Fuzzy Search on Short Terms¶
# ❌ Inefficient: Fuzzy on very short terms
AI~1
# ✅ Better: Use fuzzy on longer terms
artificial~1 intelligence~1
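One way to apply fuzzy matching selectively is to add the `~` operator only to terms above a minimum length. This is a sketch under two assumptions: the query consists of plain whitespace-separated terms, and the edit distance must be 0, 1, or 2. The name `add_fuzzy` and the default `min_length` of 4 are illustrative choices.

```python
def add_fuzzy(query: str, distance: int = 1, min_length: int = 4) -> str:
    """Append ~distance to every term that is at least min_length long."""
    if distance not in (0, 1, 2):
        raise ValueError("edit distance must be 0, 1, or 2")
    return " ".join(
        f"{term}~{distance}" if len(term) >= min_length else term
        for term in query.split()
    )


print(add_fuzzy("artificial intelligence AI"))
```

Short terms such as `AI` are left exact, because one edit on a two-letter term matches a very large number of unrelated words.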
Scoring Profile Issues¶
1. Scoring Profile Not Found¶
Error Message:
Cause: The scoring profile is not defined in the index schema.
Solution:
1. Verify the scoring profile exists in your index definition:
{
"scoringProfiles": [
{
"name": "boost-recent",
"text": {
"weights": {
"title": 2.0,
"content": 1.0
}
}
}
]
}
2. Update your index with the scoring profile if it is missing
3. Ensure the profile name in the query matches exactly (case-sensitive)
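The name check can be automated against an index definition that has been loaded as a dictionary. The sketch below assumes the definition has the JSON shape shown above; `check_scoring_profile` is a hypothetical helper that also reports a likely case mismatch.

```python
def check_scoring_profile(index_definition: dict, requested: str) -> str:
    """Report whether the requested scoring profile is defined in the index."""
    profiles = index_definition.get("scoringProfiles") or []
    names = [profile.get("name", "") for profile in profiles]
    if requested in names:
        return "ok"
    near = [name for name in names if name.lower() == requested.lower()]
    if near:
        return f"case mismatch: index defines '{near[0]}'"
    return f"missing: index defines {names}"


index_definition = {
    "scoringProfiles": [
        {"name": "boost-recent", "text": {"weights": {"title": 2.0, "content": 1.0}}}
    ]
}
print(check_scoring_profile(index_definition, "Boost-Recent"))
```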
2. Invalid Scoring Function Configuration¶
Error Message:
Common Function Issues:
Invalid Field References¶
// ❌ Wrong: Field doesn't exist or isn't configured properly
{
"type": "freshness",
"fieldName": "nonexistentField",
"boost": 1.5
}
// ✅ Correct: Valid field with proper type
{
"type": "freshness",
"fieldName": "publishedDate",
"boost": 1.5,
"freshness": {
"boostingDuration": "P30D"
}
}
Invalid Function Parameters¶
// ❌ Wrong: Invalid duration format
{
"freshness": {
"boostingDuration": "30 days"
}
}
// ✅ Correct: ISO 8601 duration format
{
"freshness": {
"boostingDuration": "P30D"
}
}
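A duration string can be checked before the index is updated. This sketch assumes the service accepts the day-and-time subset of ISO 8601 durations (days, hours, minutes, seconds, with no years or months); the pattern and the name `is_day_time_duration` are illustrative, not taken from the service.

```python
import re

# P followed by optional days and an optional T section with hours/minutes/seconds.
_DAY_TIME_DURATION = re.compile(
    r"P(?!$)(\d+D)?(T(?=\d)(\d+H)?(\d+M)?(\d+(\.\d+)?S)?)?"
)


def is_day_time_duration(value: str) -> bool:
    """Return True for values such as P30D, PT15M, or P1DT12H."""
    return _DAY_TIME_DURATION.fullmatch(value) is not None


for candidate in ["P30D", "30 days", "PT15M"]:
    print(candidate, is_day_time_duration(candidate))
```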
Suggestion and Autocomplete Issues¶
1. Suggester Not Found¶
Error Message:
Solution:
1. Verify the suggester is defined in the index schema:
{
"suggesters": [
{
"name": "content-suggester",
"searchMode": "analyzingInfixMatching",
"sourceFields": ["title", "description", "tags"]
}
]
}
2. Ensure source fields are searchable
3. Rebuild the index if the suggester was added after initial creation
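The first two checks can be run against an index definition loaded as a dictionary. This is a sketch assuming the definition contains `fields` and `suggesters` lists in the JSON shape shown above; `suggester_problems` is a hypothetical helper name.

```python
def suggester_problems(index_definition: dict, suggester_name: str) -> list[str]:
    """List configuration problems for one suggester."""
    suggesters = {s["name"]: s for s in index_definition.get("suggesters", [])}
    if suggester_name not in suggesters:
        return [f"suggester '{suggester_name}' is not defined"]

    fields = {f["name"]: f for f in index_definition.get("fields", [])}
    problems = []
    for source in suggesters[suggester_name].get("sourceFields", []):
        field = fields.get(source)
        if field is None:
            problems.append(f"source field '{source}' does not exist")
        elif not field.get("searchable", False):
            problems.append(f"source field '{source}' is not searchable")
    return problems


sample_index = {
    "fields": [
        {"name": "title", "type": "Edm.String", "searchable": True},
        {"name": "tags", "type": "Edm.String", "searchable": False},
    ],
    "suggesters": [
        {
            "name": "content-suggester",
            "searchMode": "analyzingInfixMatching",
            "sourceFields": ["title", "description", "tags"],
        }
    ],
}
print(suggester_problems(sample_index, "content-suggester"))
```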
2. Poor Suggestion Quality¶
Symptoms:
- Irrelevant suggestions
- Too few suggestions
- Suggestions don't match user input
Solutions:
Optimize Source Fields¶
// ✅ Good: Include relevant, high-quality fields
{
"sourceFields": ["title", "category", "tags"]
}
// ❌ Avoid: Including noisy or irrelevant fields
{
"sourceFields": ["title", "content", "metadata", "internalId"]
}
Implement Suggestion Filtering¶
// ✅ Good: Filter and rank suggestions
async function getQualitySuggestions(searchClient, partialText) {
const suggestions = await searchClient.suggest(partialText, 'content-suggester', {
top: 20 // Get more than needed
});
// Filter and rank suggestions
const filtered = [];
const seen = new Set();
// suggestions.results is a plain array; the SDK exposes the suggested text as suggestion.text
for (const suggestion of suggestions.results) {
const text = suggestion.text.toLowerCase();
// Avoid duplicates and very short suggestions
if (!seen.has(text) && text.length > 2) {
filtered.push(suggestion);
seen.add(text);
if (filtered.length >= 8) break;
}
}
return filtered;
}
Performance Issues¶
1. Slow Query Performance¶
Symptoms:
- Queries taking longer than expected
- Timeouts on complex queries
- High resource usage
Diagnostic Steps:
Analyze Query Complexity¶
// ✅ Good: Measure query performance
async function analyzeQueryPerformance(searchClient, query, options) {
const startTime = Date.now();
try {
// results.count is only populated when includeTotalCount is set
const results = await searchClient.search(query, { ...options, includeTotalCount: true });
const executionTime = Date.now() - startTime;
console.log(`Query: "${query}"`);
console.log(`Execution time: ${executionTime}ms`);
console.log(`Result count: ${results.count}`);
console.log(`Query type: ${options.queryType}`);
if (executionTime > 1000) {
console.warn('Slow query detected - consider optimization');
}
return results;
} catch (error) {
console.error(`Query failed after ${Date.now() - startTime}ms:`, error);
throw error;
}
}
Common Performance Issues¶
Inefficient Wildcard Patterns
Overly Complex Boolean Logic
# ❌ Slow: Too many OR clauses
term1 OR term2 OR term3 OR ... OR term50
# ✅ Faster: Simplify, or for exact-match values use search.in() in a $filter expression (OData filter syntax, not Lucene query syntax)
search.in(field, 'term1,term2,term3', ',')
Excessive Fuzzy Search
# ❌ Slow: High edit distance on multiple terms
machine~2 learning~2 artificial~2 intelligence~2
# ✅ Faster: Lower edit distance, selective application
machine~1 learning~1 artificial intelligence
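When a long OR list is really an exact-match test on a filterable field, the values can be moved into a filter. The sketch below builds a `search.in` expression for the filter parameter; `build_search_in` is a hypothetical helper, and the `|` delimiter is an arbitrary choice that must not occur in any value.

```python
def build_search_in(field: str, values: list[str], delimiter: str = "|") -> str:
    """Build an OData search.in(...) filter expression for exact-match values."""
    for value in values:
        if delimiter in value:
            raise ValueError(f"value {value!r} contains the delimiter {delimiter!r}")
    # Single quotes inside an OData string literal are escaped by doubling them.
    joined = delimiter.join(values).replace("'", "''")
    return f"search.in({field}, '{joined}', '{delimiter}')"


print(build_search_in("category", ["books", "music", "films"]))
```

The result belongs in the filter option of the request rather than in the search text, so it restricts results without contributing to relevance scoring.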
2. Memory and Resource Issues¶
Error Message:
Solutions:
Limit Result Sets¶
// ✅ Good: Reasonable result limits
const results = await searchClient.search(query, {
top: 50, // Don't request more than needed
select: ['id', 'title', 'description'], // Limit fields
queryType: 'full'
});
Optimize Field Selection¶
// ✅ Good: Select only necessary fields
{
select: ['id', 'title', 'summary'],
highlightFields: 'title,summary' // Don't highlight large fields
}
// ❌ Avoid: Returning large content fields
{
select: ['*'], // Returns all fields including large content
highlightFields: 'content' // Highlighting large fields
}
Relevance and Scoring Issues¶
1. Poor Search Relevance¶
Symptoms:
- Irrelevant results appearing first
- Expected results not appearing in top results
- Inconsistent ranking across similar queries
Diagnostic Approach:
Analyze Scoring Details¶
// ✅ Good: Use global scoring statistics so scores are comparable across shards
const results = await searchClient.search(query, {
queryType: 'full',
scoringStatistics: 'global',
top: 10
});
// Examine the relevance score of each result (exposed by the SDK as result.score)
for await (const result of results.results) {
console.log(`Document: ${result.document.title}`);
console.log(`Score: ${result.score}`);
}
Test Different Scoring Profiles¶
// ✅ Good: Compare scoring profiles
async function compareScoring(searchClient, query) {
const profiles = ['default', 'boost-recent', 'boost-popular'];
const results = {};
for (const profile of profiles) {
const searchResults = await searchClient.search(query, {
scoringProfile: profile === 'default' ? undefined : profile,
top: 5
});
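// results is an async iterator, so collect it with for await (Array.from would return an empty array)
results[profile] = [];
for await (const result of searchResults.results) {
results[profile].push(result);
}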
}
return results;
}
2. Scoring Profile Not Working¶
Symptoms:
- Scoring profile applied but no change in results
- Unexpected scoring behavior
Common Issues:
Field Weight Problems¶
// ❌ Problem: Weights on non-searchable fields
{
"text": {
"weights": {
"id": 2.0, // ID field typically not searchable
"title": 1.0
}
}
}
// ✅ Solution: Weights only on searchable fields
{
"text": {
"weights": {
"title": 2.0,
"content": 1.0,
"description": 1.5
}
}
}
Function Field Issues¶
// ❌ Problem: Function on wrong field type
{
"type": "freshness",
"fieldName": "title", // String field, not date
"boost": 1.5
}
// ✅ Solution: Function on appropriate field type
{
"type": "freshness",
"fieldName": "publishedDate", // DateTimeOffset field
"boost": 1.5,
"freshness": {
"boostingDuration": "P30D"
}
}
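Both problems above, weights on non-searchable fields and freshness functions on non-date fields, can be detected by comparing a scoring profile with the field list. This is a sketch assuming the JSON shapes shown in this guide; `profile_problems` is a hypothetical helper and covers only these two checks.

```python
def profile_problems(fields: list[dict], profile: dict) -> list[str]:
    """Compare a scoring profile with the index fields and list mismatches."""
    by_name = {field["name"]: field for field in fields}
    problems = []

    for name in profile.get("text", {}).get("weights", {}):
        field = by_name.get(name)
        if field is None:
            problems.append(f"weight on unknown field '{name}'")
        elif not field.get("searchable", False):
            problems.append(f"weight on non-searchable field '{name}'")

    for function in profile.get("functions", []):
        if function.get("type") != "freshness":
            continue
        target = function.get("fieldName")
        field = by_name.get(target)
        if field is None or field.get("type") != "Edm.DateTimeOffset":
            problems.append(f"freshness function needs an Edm.DateTimeOffset field, got '{target}'")
    return problems


sample_fields = [
    {"name": "id", "type": "Edm.String", "searchable": False},
    {"name": "title", "type": "Edm.String", "searchable": True},
    {"name": "publishedDate", "type": "Edm.DateTimeOffset"},
]
sample_profile = {
    "text": {"weights": {"id": 2.0, "title": 1.0}},
    "functions": [{"type": "freshness", "fieldName": "title", "boost": 1.5}],
}
print(profile_problems(sample_fields, sample_profile))
```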
Index Configuration Issues¶
1. Field Not Searchable for Advanced Queries¶
Error Message:
Solution:
1. Verify the field exists and is marked as searchable:
{
"name": "title",
"type": "Edm.String",
"searchable": true, // Required for field-specific queries
"analyzer": "en.microsoft"
}
2. Update the index schema if needed
3. Rebuild the index with the new schema
2. Analyzer Configuration Issues¶
Symptoms:
- Unexpected tokenization behavior
- Search not finding expected matches
- Language-specific search not working
Solutions:
Verify Analyzer Configuration¶
// ✅ Good: Appropriate analyzer for content type
{
"name": "title",
"type": "Edm.String",
"searchable": true,
"analyzer": "en.microsoft" // Language-specific analyzer
}
// ✅ Good: Keyword analyzer for exact matching
{
"name": "productCode",
"type": "Edm.String",
"searchable": true,
"analyzer": "keyword" // No tokenization
}
Test Analyzer Behavior¶
POST https://[service-name].search.windows.net/indexes/[index-name]/analyze?api-version=2024-07-01
Content-Type: application/json
api-key: [admin-key]
{
"text": "machine learning",
"analyzer": "en.microsoft"
}
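The same request can be prepared from Python with the standard library. The sketch below only builds the request object, mirroring the URL, headers, and body shown above; `build_analyze_request` is a hypothetical helper, and the service name, index name, and key are placeholders.

```python
import json
import urllib.request


def build_analyze_request(service: str, index: str, text: str, analyzer: str,
                          admin_key: str, api_version: str = "2024-07-01"):
    """Prepare (but do not send) a POST request to the analyze endpoint."""
    url = (
        f"https://{service}.search.windows.net/indexes/{index}"
        f"/analyze?api-version={api_version}"
    )
    body = json.dumps({"text": text, "analyzer": analyzer}).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        method="POST",
        headers={"Content-Type": "application/json", "api-key": admin_key},
    )


request = build_analyze_request(
    "my-service", "my-index", "machine learning", "en.microsoft", "placeholder-key"
)
print(request.get_method(), request.full_url)
```

Passing the request to `urllib.request.urlopen` sends it; comparing the returned tokens for two analyzers usually shows why a term does or does not match.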
Debugging Strategies¶
1. Systematic Query Testing¶
Step-by-Step Approach:
- Test Basic Components
- Add Complexity Gradually
- Validate Advanced Features
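The step-by-step approach can be scripted by running progressively more complex versions of a query and stopping at the first one that fails. This is a sketch: `first_failing_stage` is a hypothetical helper, and `run_query` stands for any callable that raises on a rejected query, such as a thin wrapper around your search client.

```python
from typing import Callable, Optional


def first_failing_stage(stages: list[tuple[str, str]],
                        run_query: Callable[[str], object]) -> Optional[str]:
    """Run each (label, query) in order and return the label of the first failure."""
    for label, query in stages:
        try:
            run_query(query)
        except Exception as error:
            print(f"FAILED at '{label}': {query!r} -> {error}")
            return label
        print(f"ok: {label}")
    return None


def fake_runner(query: str) -> None:
    # Stand-in for a real search call: rejects queries with a dangling AND.
    if query.rstrip().endswith("AND"):
        raise ValueError("dangling operator")


stages = [
    ("basic term", "machine learning"),
    ("fielded search", "title:(machine learning)"),
    ("boolean", "title:(AI) content:(machine learning) AND"),
    ("boosting", "title:(machine learning)^2"),
]
print(first_failing_stage(stages, fake_runner))
```

Because the stages are ordered from simple to complex, the first failing label points at the feature that introduced the problem.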
2. Query Validation Tools¶
REST API Testing¶
# Use curl to test queries directly
curl -X POST "https://[service].search.windows.net/indexes/[index]/docs/search?api-version=2024-07-01" \
-H "Content-Type: application/json" \
-H "api-key: [key]" \
-d '{
"search": "title:(machine learning)^2",
"queryType": "full",
"top": 5
}'
SDK Debugging¶
# Python example with detailed error handling
try:
    results = search_client.search(
        search_text="title:(machine learning)^2",
        query_type="full",
        top=10,
        include_total_count=True
    )
    print("Query executed successfully")
    print(f"Total results: {results.get_count()}")
    for result in results:
        print(f"Score: {result['@search.score']}")
        print(f"Title: {result['title']}")
except Exception as e:
    print(f"Query failed: {str(e)}")
    print("Query: title:(machine learning)^2")
    print("Query type: full")
    # Try simpler version
    try:
        simple_results = search_client.search(
            search_text="machine learning",
            query_type="simple",
            top=10
        )
        # Results are fetched lazily, so read one item to force the request
        next(iter(simple_results), None)
        print("Simple query worked - issue with Lucene syntax")
    except Exception as simple_error:
        print(f"Simple query also failed: {str(simple_error)}")
3. Performance Monitoring¶
Query Performance Tracking¶
class QueryPerformanceMonitor {
constructor() {
this.metrics = [];
}
async monitorQuery(searchClient, query, options) {
const startTime = Date.now();
const queryHash = this.hashQuery(query, options);
try {
const results = await searchClient.search(query, options);
const endTime = Date.now();
const metric = {
queryHash,
query: query.substring(0, 100), // Truncate for logging
executionTime: endTime - startTime,
resultCount: results.count,
queryType: options.queryType,
scoringProfile: options.scoringProfile,
success: true,
timestamp: new Date()
};
this.metrics.push(metric);
if (metric.executionTime > 2000) {
console.warn('Slow query detected:', metric);
}
return results;
} catch (error) {
const endTime = Date.now();
const metric = {
queryHash,
query: query.substring(0, 100),
executionTime: endTime - startTime,
error: error.message,
success: false,
timestamp: new Date()
};
this.metrics.push(metric);
console.error('Query failed:', metric);
throw error;
}
}
hashQuery(query, options) {
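// Simple non-cryptographic hash over the whole string; a truncated base64 prefix would cover only the first few characters of the query
const text = JSON.stringify({ query, options });
let hash = 0;
for (let i = 0; i < text.length; i++) {
hash = (hash * 31 + text.charCodeAt(i)) | 0;
}
return (hash >>> 0).toString(16);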
}
getPerformanceReport() {
const successful = this.metrics.filter(m => m.success);
const failed = this.metrics.filter(m => !m.success);
return {
totalQueries: this.metrics.length,
successfulQueries: successful.length,
failedQueries: failed.length,
// Guard against division by zero when no query has succeeded
averageExecutionTime: successful.length ? successful.reduce((sum, m) => sum + m.executionTime, 0) / successful.length : 0,
slowQueries: successful.filter(m => m.executionTime > 1000),
commonErrors: this.groupBy(failed, 'error')
};
}
groupBy(array, key) {
return array.reduce((groups, item) => {
const group = item[key];
groups[group] = groups[group] || [];
groups[group].push(item);
return groups;
}, {});
}
}
Getting Additional Help¶
Microsoft Resources¶
- Azure AI Search Documentation
- Lucene Query Syntax Reference
- Scoring Profiles Documentation
- Search Suggestions Guide
Community Support¶
Professional Support¶
- Azure Support Plans
- Microsoft Professional Services
- Azure AI Search Consulting Partners
Quick Reference Checklist¶
When troubleshooting advanced queries:
- [ ] Verify query syntax is valid for the specified query type
- [ ] Check that all referenced fields exist and are searchable
- [ ] Validate scoring profiles are defined in the index schema
- [ ] Test query components individually before combining
- [ ] Monitor query performance and resource usage
- [ ] Use appropriate edit distances for fuzzy search
- [ ] Escape special characters in Lucene queries
- [ ] Verify suggester configuration and source fields
- [ ] Check analyzer configuration for expected tokenization
- [ ] Document working solutions for future reference
Use this guide to identify and resolve common issues with advanced querying in Azure AI Search.