Troubleshooting Guide - Module 9: Advanced Querying

This guide helps you diagnose and resolve common issues when working with advanced querying techniques in Azure AI Search.

Common Query Syntax Issues

1. Full Lucene Syntax Errors

Error Message:

Invalid query syntax: Syntax error at position X in 'query expression'

Common Lucene Syntax Issues:

Unescaped Special Characters

# ❌ Wrong: Special characters not escaped
title:(C++ programming)

# ✅ Correct: Escape special characters
title:(C\+\+ programming)

Incorrect Field Syntax

# ❌ Wrong: Missing colon or parentheses
title artificial intelligence

# ✅ Correct: Proper field syntax
title:(artificial intelligence)

Malformed Boolean Expressions

# ❌ Wrong: Invalid boolean syntax
title:(AI) content:(machine learning) AND

# ✅ Correct: Complete boolean expression
title:(AI) AND content:(machine learning)

Debugging Tips:

1. Test query components individually
2. Use parentheses to group complex expressions
3. Escape special characters: + - & | ! ( ) { } [ ] ^ " ~ * ? : \ /
4. Validate field names exist in your index
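
The escaping tip can be automated for user-supplied terms. The following is a minimal sketch, not an official helper: `escape_lucene` is a hypothetical name, and the character set mirrors the list in the tips above and also covers the forward slash, which Lucene uses as a regex delimiter.

```python
import re

# Characters that full Lucene syntax treats as operators (assumed set;
# adjust it if your service reports a different list).
_SPECIAL = re.compile(r'([+\-&|!(){}\[\]^"~*?:\\/])')


def escape_lucene(text: str) -> str:
    """Prefix every special character in a user-supplied term with a backslash."""
    return _SPECIAL.sub(r"\\\1", text)


print(escape_lucene("C++ programming"))  # C\+\+ programming
```

Apply it only to the raw term text, then wrap the result in the field syntax yourself (for example `title:(...)`); escaping a whole query would also escape the operators you intended to use.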

2. Boosting Syntax Problems

Error Message:

Invalid boost value or syntax

Common Boosting Issues:

Invalid Boost Values

# ❌ Wrong: Invalid boost syntax
title:(machine learning)^abc

# ✅ Correct: Numeric boost values
title:(machine learning)^2.5

Misplaced Boost Operators

# ❌ Wrong: Boost in wrong position
title^2:(machine learning)

# ✅ Correct: Boost after term/phrase
title:(machine learning)^2
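
Building boosted clauses in code avoids both mistakes above. This is a small sketch under the assumption that boosts should be positive numbers; `boosted_clause` is a hypothetical helper, not part of any SDK.

```python
from numbers import Real


def boosted_clause(field: str, terms: str, boost: float) -> str:
    """Return field:(terms)^boost, rejecting non-numeric or non-positive boosts."""
    if isinstance(boost, bool) or not isinstance(boost, Real) or boost <= 0:
        raise ValueError(f"boost must be a positive number, got {boost!r}")
    # The boost always goes after the closing parenthesis, never after the field name.
    return f"{field}:({terms})^{boost:g}"


print(boosted_clause("title", "machine learning", 2.5))  # title:(machine learning)^2.5
```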

3. Fuzzy Search Issues

Error Message:

Invalid fuzzy search syntax or edit distance

Common Fuzzy Search Issues:

Invalid Edit Distance

# ❌ Wrong: Edit distance above the supported maximum of 2
machine~5

# ✅ Correct: Supported edit distance (0-2)
machine~1

Fuzzy Search on Short Terms

# ❌ Inefficient: Fuzzy on very short terms
AI~1

# ✅ Better: Use fuzzy on longer terms
artificial~1 intelligence~1
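
Both fuzzy rules can be applied mechanically. The sketch below assumes the input is a plain list of whitespace-separated terms with no operators; `fuzzify` and its `min_length` threshold are illustrative choices, not service requirements.

```python
def fuzzify(query: str, distance: int = 1, min_length: int = 4) -> str:
    """Append ~distance to each term that has at least min_length characters."""
    distance = max(0, min(distance, 2))  # clamp to the 0-2 range
    terms = []
    for term in query.split():
        if distance > 0 and len(term) >= min_length:
            terms.append(f"{term}~{distance}")
        else:
            terms.append(term)  # short terms stay exact
    return " ".join(terms)


print(fuzzify("AI artificial intelligence"))  # AI artificial~1 intelligence~1
```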

Scoring Profile Issues

1. Scoring Profile Not Found

Error Message:

Scoring profile 'profile-name' not found

Cause: The scoring profile is not defined in the index schema.

Solution:

1. Verify the scoring profile exists in your index definition:

{
  "scoringProfiles": [
    {
      "name": "boost-recent",
      "text": {
        "weights": {
          "title": 2.0,
          "content": 1.0
        }
      }
    }
  ]
}

2. Update your index with the scoring profile
3. Ensure the profile name matches exactly (case-sensitive)
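
The name check can be scripted against the index definition. This sketch assumes `index_definition` is the index JSON loaded into a Python dict (for example, from a GET on the index); `find_scoring_profile` is a hypothetical helper.

```python
def find_scoring_profile(index_definition: dict, name: str) -> dict:
    """Return the scoring profile with exactly this name, or raise ValueError."""
    profiles = index_definition.get("scoringProfiles") or []
    for profile in profiles:
        if profile.get("name") == name:  # exact, case-sensitive comparison
            return profile
    available = [p.get("name") for p in profiles]
    # A case-insensitive hit usually means the request used the wrong casing.
    near = [n for n in available if n and n.lower() == name.lower()]
    hint = f" (did you mean {near[0]!r}?)" if near else ""
    raise ValueError(f"Scoring profile {name!r} not found{hint}; available: {available}")
```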

2. Invalid Scoring Function Configuration

Error Message:

Invalid scoring function configuration

Common Function Issues:

Invalid Field References

// ❌ Wrong: Field doesn't exist or isn't configured properly
{
  "type": "freshness",
  "fieldName": "nonexistentField",
  "boost": 1.5
}

// ✅ Correct: Valid field with proper type
{
  "type": "freshness",
  "fieldName": "publishedDate",
  "boost": 1.5,
  "freshness": {
    "boostingDuration": "P30D"
  }
}

Invalid Function Parameters

// ❌ Wrong: Invalid duration format
{
  "freshness": {
    "boostingDuration": "30 days"
  }
}

// ✅ Correct: ISO 8601 duration format
{
  "freshness": {
    "boostingDuration": "P30D"
  }
}
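
A duration string can be checked before the index update is sent. The sketch below assumes `boostingDuration` accepts the day-time subset of ISO 8601 durations (days, hours, minutes, seconds, optionally negative); the regular expression is an approximation written for this guide, not the service's own validator.

```python
import re

# Day-time durations such as P30D, PT12H, or P1DT12H30M.
# Year and month designators are rejected in this sketch.
_DAY_TIME_DURATION = re.compile(
    r"^-?P(?=\d|T\d)(\d+D)?(T(?=\d)(\d+H)?(\d+M)?(\d+(\.\d+)?S)?)?$"
)


def is_valid_boosting_duration(value: str) -> bool:
    """Return True if value looks like a day-time ISO 8601 duration."""
    return bool(_DAY_TIME_DURATION.match(value))


print(is_valid_boosting_duration("P30D"))     # True
print(is_valid_boosting_duration("30 days"))  # False
```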

Suggestion and Autocomplete Issues

1. Suggester Not Found

Error Message:

Suggester 'suggester-name' not found

Solution:

1. Verify suggester is defined in index schema:

{
  "suggesters": [
    {
      "name": "content-suggester",
      "searchMode": "analyzingInfixMatching",
      "sourceFields": ["title", "description", "tags"]
    }
  ]
}

2. Ensure source fields are searchable
3. Rebuild index if suggester was added after initial creation
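
The first two checks can be run against the index definition in one pass. This sketch assumes the definition is a dict in which every field states its `searchable` attribute explicitly; `check_suggester_fields` is a hypothetical helper.

```python
from typing import List


def check_suggester_fields(index_definition: dict, suggester_name: str) -> List[str]:
    """Return a list of problems; an empty list means the suggester looks usable."""
    fields = {f["name"]: f for f in index_definition.get("fields") or []}
    suggester = next(
        (s for s in index_definition.get("suggesters") or []
         if s.get("name") == suggester_name),
        None,
    )
    if suggester is None:
        return [f"suggester {suggester_name!r} is not defined"]

    problems = []
    for name in suggester.get("sourceFields") or []:
        field = fields.get(name)
        if field is None:
            problems.append(f"source field {name!r} does not exist")
        elif not field.get("searchable", False):
            problems.append(f"source field {name!r} is not searchable")
    return problems
```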

2. Poor Suggestion Quality

Symptoms:

- Irrelevant suggestions
- Too few suggestions
- Suggestions don't match user input

Solutions:

Optimize Source Fields

// ✅ Good: Include relevant, high-quality fields
{
  "sourceFields": ["title", "category", "tags"]
}

// ❌ Avoid: Including noisy or irrelevant fields
{
  "sourceFields": ["title", "content", "metadata", "internalId"]
}

Implement Suggestion Filtering

// ✅ Good: Filter and rank suggestions
async function getQualitySuggestions(searchClient, partialText) {
    const suggestions = await searchClient.suggest(partialText, 'content-suggester', {
        top: 20  // Get more than needed
    });

    // Filter and rank suggestions
    const filtered = [];
    const seen = new Set();

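    // suggest() resolves to a plain array of results; each exposes the
    // suggested text as `text` and the source document as `document`
    for (const suggestion of suggestions.results) {
        const text = suggestion.text.toLowerCase();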

        // Avoid duplicates and very short suggestions
        if (!seen.has(text) && text.length > 2) {
            filtered.push(suggestion);
            seen.add(text);

            if (filtered.length >= 8) break;
        }
    }

    return filtered;
}

Performance Issues

1. Slow Query Performance

Symptoms:

- Queries taking longer than expected
- Timeouts on complex queries
- High resource usage

Diagnostic Steps:

Analyze Query Complexity

// ✅ Good: Measure query performance
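async function analyzeQueryPerformance(searchClient, query, options = {}) {
    const startTime = Date.now();

    try {
        // includeTotalCount is required for results.count to be populated
        const results = await searchClient.search(query, { ...options, includeTotalCount: true });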
        const executionTime = Date.now() - startTime;

        console.log(`Query: "${query}"`);
        console.log(`Execution time: ${executionTime}ms`);
        console.log(`Result count: ${results.count}`);
        console.log(`Query type: ${options.queryType}`);

        if (executionTime > 1000) {
            console.warn('Slow query detected - consider optimization');
        }

        return results;
    } catch (error) {
        console.error(`Query failed after ${Date.now() - startTime}ms:`, error);
        throw error;
    }
}

Common Performance Issues

Inefficient Wildcard Patterns

# ❌ Slow (and may be rejected): Leading wildcards; suffix or infix matching generally needs the regex form, e.g. /.*machine.*/
*machine*

# ✅ Faster: Trailing wildcards
machine*

Overly Complex Boolean Logic

# ❌ Slow: Too many OR clauses
term1 OR term2 OR term3 OR ... OR term50

# ✅ Faster: Simplify, or for exact-match values move the list into a filter expression with search.in()
search.in(field, 'term1,term2,term3', ',')

Excessive Fuzzy Search

# ❌ Slow: High edit distance on multiple terms
machine~2 learning~2 artificial~2 intelligence~2

# ✅ Faster: Lower edit distance, selective application
machine~1 learning~1 artificial intelligence
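
The three patterns above can be flagged before a query is sent. The following is a rough sketch: `lint_query`, its warning strings, and the default threshold of 10 OR clauses are assumptions chosen for illustration, and the tokenization is a plain whitespace split rather than a real Lucene parser.

```python
import re
from typing import List

# A term (word character) directly followed by ~2; phrase proximity such as
# "machine learning"~2 is not counted because a quote precedes the tilde.
_HIGH_FUZZY = re.compile(r"\w~2(?![\d.])")


def lint_query(query: str, max_or_clauses: int = 10) -> List[str]:
    """Return warnings for query patterns that tend to be slow."""
    warnings = []
    tokens = query.split()
    if any(tok.lstrip("(").startswith(("*", "?")) for tok in tokens):
        warnings.append("leading wildcard")
    if tokens.count("OR") > max_or_clauses:
        warnings.append("many OR clauses")
    if len(_HIGH_FUZZY.findall(query)) > 1:
        warnings.append("edit distance 2 on multiple terms")
    return warnings


print(lint_query("*machine*"))  # ['leading wildcard']
print(lint_query("machine*"))   # []
```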

2. Memory and Resource Issues

Error Message:

Request timeout or resource exhaustion

Solutions:

Limit Result Sets

// ✅ Good: Reasonable result limits
const results = await searchClient.search(query, {
    top: 50,  // Don't request more than needed
    select: ['id', 'title', 'description'],  // Limit fields
    queryType: 'full'
});

Optimize Field Selection

// ✅ Good: Select only necessary fields
{
    select: ['id', 'title', 'summary'],
    highlightFields: 'title,summary'  // Don't highlight large fields
}

// ❌ Avoid: Returning large content fields
{
    select: ['*'],  // Returns all fields including large content
    highlightFields: 'content'  // Highlighting large fields
}

Relevance and Scoring Issues

1. Poor Search Relevance

Symptoms:

- Irrelevant results appearing first
- Expected results not appearing in top results
- Inconsistent ranking across similar queries

Diagnostic Approach:

Analyze Scoring Details

// ✅ Good: Use global scoring statistics so scores are computed consistently across shards
const results = await searchClient.search(query, {
    queryType: 'full',
    scoringStatistics: 'global',
    top: 10
});

// Examine scores; the SDK exposes the relevance score as result.score.
// scoringStatistics changes how scores are computed and adds no per-document detail.
for await (const result of results.results) {
    console.log(`Document: ${result.document.title}`);
    console.log(`Score: ${result.score}`);
}

Test Different Scoring Profiles

// ✅ Good: Compare scoring profiles
async function compareScoring(searchClient, query) {
    const profiles = ['default', 'boost-recent', 'boost-popular'];
    const results = {};

    for (const profile of profiles) {
        const searchResults = await searchClient.search(query, {
            scoringProfile: profile === 'default' ? undefined : profile,
            top: 5
        });

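        // results is an async iterator, so collect it with for await
        results[profile] = [];
        for await (const result of searchResults.results) {
            results[profile].push(result);
        }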
    }

    return results;
}
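
Once each profile's top results are collected, comparing positions shows whether a profile changes the ranking at all. This sketch works on plain lists of document keys in ranked order; `rank_changes` is a hypothetical helper, and extracting the keys from your results is left to the caller.

```python
from typing import Dict, List, Optional, Tuple


def rank_changes(
    baseline: List[str], candidate: List[str]
) -> Dict[str, Tuple[Optional[int], Optional[int]]]:
    """Map each key whose position differs to (baseline_rank, candidate_rank).

    Ranks are 1-based; None means the key is absent from that list.
    """
    base_rank = {key: i + 1 for i, key in enumerate(baseline)}
    cand_rank = {key: i + 1 for i, key in enumerate(candidate)}
    keys = list(dict.fromkeys(baseline + candidate))  # keep first-seen order
    return {
        key: (base_rank.get(key), cand_rank.get(key))
        for key in keys
        if base_rank.get(key) != cand_rank.get(key)
    }


print(rank_changes(["a", "b", "c"], ["b", "a", "d"]))
# {'a': (1, 2), 'b': (2, 1), 'c': (3, None), 'd': (None, 3)}
```

An empty result means the candidate profile produced the same order as the baseline for that query, which is a useful signal when a profile appears to have no effect.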

2. Scoring Profile Not Working

Symptoms:

- Scoring profile applied but no change in results
- Unexpected scoring behavior

Common Issues:

Field Weight Problems

// ❌ Problem: Weights on non-searchable fields
{
  "text": {
    "weights": {
      "id": 2.0,  // ID field typically not searchable
      "title": 1.0
    }
  }
}

// ✅ Solution: Weights only on searchable fields
{
  "text": {
    "weights": {
      "title": 2.0,
      "content": 1.0,
      "description": 1.5
    }
  }
}

Function Field Issues

// ❌ Problem: Function on wrong field type
{
  "type": "freshness",
  "fieldName": "title",  // String field, not date
  "boost": 1.5
}

// ✅ Solution: Function on appropriate field type
{
  "type": "freshness",
  "fieldName": "publishedDate",  // DateTimeOffset field
  "boost": 1.5,
  "freshness": {
    "boostingDuration": "P30D"
  }
}
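
Both problems can be detected from the index definition before any query runs. The sketch assumes the definition is a dict with explicit `searchable` and `type` attributes on each field; `check_scoring_profile` is a hypothetical helper and covers only text weights and freshness functions.

```python
from typing import List


def check_scoring_profile(index_definition: dict, profile_name: str) -> List[str]:
    """Return problems with the fields a scoring profile references."""
    fields = {f["name"]: f for f in index_definition.get("fields") or []}
    profile = next(
        (p for p in index_definition.get("scoringProfiles") or []
         if p.get("name") == profile_name),
        None,
    )
    if profile is None:
        return [f"scoring profile {profile_name!r} is not defined"]

    problems = []
    # Text weights only make sense on searchable fields.
    for name in (profile.get("text") or {}).get("weights") or {}:
        field = fields.get(name)
        if field is None or not field.get("searchable", False):
            problems.append(f"weighted field {name!r} is missing or not searchable")
    # Freshness functions need a date field.
    for function in profile.get("functions") or []:
        if function.get("type") == "freshness":
            name = function.get("fieldName")
            field = fields.get(name)
            if field is None or field.get("type") != "Edm.DateTimeOffset":
                problems.append(f"freshness field {name!r} is missing or not Edm.DateTimeOffset")
    return problems
```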

Index Configuration Issues

1. Field Not Searchable for Advanced Queries

Error Message:

Field 'fieldname' is not searchable or does not exist

Solution:

1. Verify field exists and is marked as searchable:

{
  "name": "title",
  "type": "Edm.String",
  "searchable": true,  // Required for field-specific queries
  "analyzer": "en.microsoft"
}

2. Update index schema if needed
3. Rebuild index with new schema
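
The field check can be run against a query string directly. This is a heuristic sketch: `unsearchable_fields` is a hypothetical helper, the regular expression only recognizes simple `name:` prefixes, and a colon inside a quoted phrase can produce a false positive.

```python
import re
from typing import List

# A field prefix is an identifier followed by a colon, not preceded by a
# backslash (escaped colon) or another word character.
_FIELD_PREFIX = re.compile(r"(?<![\\\w])([A-Za-z_]\w*):")


def unsearchable_fields(query: str, index_definition: dict) -> List[str]:
    """Return fields referenced in the query that are missing or not searchable."""
    searchable = {
        f["name"] for f in index_definition.get("fields") or [] if f.get("searchable")
    }
    referenced = dict.fromkeys(_FIELD_PREFIX.findall(query))  # de-duplicate, keep order
    return [name for name in referenced if name not in searchable]
```

For example, with an index whose only searchable field is `title`, the query `title:(AI) AND body:(machine learning)` yields `['body']`.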

2. Analyzer Configuration Issues

Symptoms:

- Unexpected tokenization behavior
- Search not finding expected matches
- Language-specific search not working

Solutions:

Verify Analyzer Configuration

// ✅ Good: Appropriate analyzer for content type
{
  "name": "title",
  "type": "Edm.String",
  "searchable": true,
  "analyzer": "en.microsoft"  // Language-specific analyzer
}

// ✅ Good: Keyword analyzer for exact matching
{
  "name": "productCode",
  "type": "Edm.String",
  "searchable": true,
  "analyzer": "keyword"  // No tokenization
}

Test Analyzer Behavior

POST https://[service-name].search.windows.net/indexes/[index-name]/analyze?api-version=2024-07-01
Content-Type: application/json
api-key: [admin-key]

{
  "text": "machine learning",
  "analyzer": "en.microsoft"
}

Debugging Strategies

1. Systematic Query Testing

Step-by-Step Approach:

Test Basic Components

# Test individual terms
machine
learning

# Test simple combinations
machine learning
machine AND learning

Add Complexity Gradually

# Add field targeting
title:(machine learning)

# Add boosting
title:(machine learning)^2

# Add boolean logic
title:(machine learning)^2 OR content:(artificial intelligence)
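
The progression so far can be generated and run in order, which pinpoints the first step that fails. This is a sketch under stated assumptions: `incremental_queries` and `first_failing` are hypothetical helpers, and `run_query` stands for any callable you supply that executes a query and raises on error.

```python
from typing import Callable, List, Optional, Tuple


def incremental_queries(terms: List[str], field: str, boost: float = 2) -> List[str]:
    """Build queries from single terms up to a boosted, field-scoped clause."""
    phrase = " ".join(terms)
    steps = list(terms)                              # individual terms
    if len(terms) > 1:
        steps.append(phrase)                         # simple combination
        steps.append(" AND ".join(terms))            # explicit boolean
    steps.append(f"{field}:({phrase})")              # field targeting
    steps.append(f"{field}:({phrase})^{boost:g}")    # boosting
    return steps


def first_failing(
    queries: List[str], run_query: Callable[[str], object]
) -> Optional[Tuple[str, Exception]]:
    """Run queries in order and return the first one that raises, with its error."""
    for query in queries:
        try:
            run_query(query)
        except Exception as exc:
            return query, exc
    return None
```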

Validate Advanced Features

# Test fuzzy search
machine~1 learning~1

# Test wildcards
mach* learn*

# Test proximity
"machine learning"~5

2. Query Validation Tools

REST API Testing

# Use curl to test queries directly
curl -X POST "https://[service].search.windows.net/indexes/[index]/docs/search?api-version=2024-07-01" \
  -H "Content-Type: application/json" \
  -H "api-key: [key]" \
  -d '{
    "search": "title:(machine learning)^2",
    "queryType": "full",
    "top": 5
  }'

SDK Debugging

# Python example with detailed error handling
try:
    results = search_client.search(
        search_text="title:(machine learning)^2",
        query_type="full",
        top=10,
        include_total_count=True
    )

    print(f"Query executed successfully")
    print(f"Total results: {results.get_count()}")

    for result in results:
        print(f"Score: {result['@search.score']}")
        print(f"Title: {result['title']}")

except Exception as e:
    print(f"Query failed: {str(e)}")
    print(f"Query: title:(machine learning)^2")
    print(f"Query type: full")

    # Try a simpler version. The SDK sends the request lazily, so
    # consume one result to make sure the query actually runs.
    try:
        simple_results = search_client.search(
            search_text="machine learning",
            query_type="simple",
            top=10
        )
        next(iter(simple_results), None)
        print("Simple query worked - issue with Lucene syntax")
    except Exception as simple_error:
        print(f"Simple query also failed: {str(simple_error)}")

3. Performance Monitoring

Query Performance Tracking

class QueryPerformanceMonitor {
    constructor() {
        this.metrics = [];
    }

    async monitorQuery(searchClient, query, options) {
        const startTime = Date.now();
        const queryHash = this.hashQuery(query, options);

        try {
            // includeTotalCount is required for results.count to be populated
            const results = await searchClient.search(query, { ...options, includeTotalCount: true });
            const endTime = Date.now();

            const metric = {
                queryHash,
                query: query.substring(0, 100), // Truncate for logging
                executionTime: endTime - startTime,
                resultCount: results.count,
                queryType: options.queryType,
                scoringProfile: options.scoringProfile,
                success: true,
                timestamp: new Date()
            };

            this.metrics.push(metric);

            if (metric.executionTime > 2000) {
                console.warn('Slow query detected:', metric);
            }

            return results;

        } catch (error) {
            const endTime = Date.now();

            const metric = {
                queryHash,
                query: query.substring(0, 100),
                executionTime: endTime - startTime,
                error: error.message,
                success: false,
                timestamp: new Date()
            };

            this.metrics.push(metric);
            console.error('Query failed:', metric);

            throw error;
        }
    }

    hashQuery(query, options) {
        return btoa(JSON.stringify({ query, options })).substring(0, 16);
    }

    getPerformanceReport() {
        const successful = this.metrics.filter(m => m.success);
        const failed = this.metrics.filter(m => !m.success);

        return {
            totalQueries: this.metrics.length,
            successfulQueries: successful.length,
            failedQueries: failed.length,
            // Guard against division by zero when no query has succeeded yet
            averageExecutionTime: successful.length
                ? successful.reduce((sum, m) => sum + m.executionTime, 0) / successful.length
                : 0,
            slowQueries: successful.filter(m => m.executionTime > 1000),
            commonErrors: this.groupBy(failed, 'error')
        };
    }

    groupBy(array, key) {
        return array.reduce((groups, item) => {
            const group = item[key];
            groups[group] = groups[group] || [];
            groups[group].push(item);
            return groups;
        }, {});
    }
}

Getting Additional Help

Microsoft Resources

Community Support

Professional Support

Azure Support Plans
Microsoft Professional Services
Azure AI Search Consulting Partners

Quick Reference Checklist

When troubleshooting advanced queries:

[ ] Verify query syntax is valid for the specified query type
[ ] Check that all referenced fields exist and are searchable
[ ] Validate scoring profiles are defined in the index schema
[ ] Test query components individually before combining
[ ] Monitor query performance and resource usage
[ ] Use appropriate edit distances for fuzzy search
[ ] Escape special characters in Lucene queries
[ ] Verify suggester configuration and source fields
[ ] Check analyzer configuration for expected tokenization
[ ] Document working solutions for future reference

Use this guide to identify and resolve common issues with advanced querying in Azure AI Search, and to keep query performance and relevance on track.