Best Practices - Advanced Querying¶

Overview¶

This guide provides best practices for implementing advanced querying techniques in Azure AI Search. Following these guidelines will help you create high-performance, relevant, and maintainable search experiences.

Query Construction Best Practices¶

1. Choose the Right Query Type¶

Simple Query Syntax¶

// ✅ Good: Use simple syntax for basic searches
{
  "search": "machine learning",
  "queryType": "simple",
  "searchMode": "any"
}

When to use:

- Basic text search with the +, | and - operators (AND, OR, NOT)
- User-facing search boxes where syntax errors should be avoided
- Simple prefix searches with a trailing * (the ? wildcard and other wildcard positions require full Lucene syntax)

Full Lucene Query Syntax¶

// ✅ Good: Use full Lucene for advanced features
{
  "search": "title:(artificial intelligence) AND content:(machine learning)^2",
  "queryType": "full"
}

When to use:

- Field-specific searches
- Term and phrase boosting
- Fuzzy search and proximity search
- Complex boolean logic with parentheses
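One way to apply the two "when to use" lists is to default to simple syntax and switch to full Lucene only when the text actually uses a Lucene-only feature. The sketch below is a heuristic under that assumption; `chooseQueryType` and its pattern list are hypothetical names, not part of any SDK, and the patterns will misfire on some inputs (a pasted URL contains a colon, for example).

```javascript
// Hypothetical helper: pick "full" only when the text uses Lucene-only features.
const LUCENE_ONLY_PATTERNS = [
    /\b\w+:/,           // fielded search, e.g. title:search
    /\^\d+(\.\d+)?/,    // term or phrase boosting, e.g. learning^2
    /\w~\d?/,           // fuzzy search, e.g. machne~1
    /"~\d+/,            // proximity search, e.g. "machine learning"~5
    /\b(AND|OR|NOT)\b/  // keyword boolean operators
];

function chooseQueryType(searchText) {
    const needsFull = LUCENE_ONLY_PATTERNS.some(pattern => pattern.test(searchText));
    return needsFull ? 'full' : 'simple';
}

console.log(chooseQueryType('machine learning'));
// simple
console.log(chooseQueryType('title:(artificial intelligence) AND content:(machine learning)^2'));
// full
```

Defaulting to simple syntax keeps ordinary user input from being parsed as operators; the caller can still force `queryType` explicitly when it knows better.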

2. Effective Boosting Strategies¶

Term Boosting¶

// ✅ Good: Boost important terms appropriately
{
  "search": "artificial^2 intelligence machine^1.5 learning",
  "queryType": "full"
}

// ❌ Avoid: Excessive boosting that skews results
{
  "search": "artificial^10 intelligence^8 machine^5 learning",
  "queryType": "full"
}

Field Boosting¶

// ✅ Good: Prioritize title matches over content
{
  "search": "title:(machine learning)^3 OR content:(machine learning)",
  "queryType": "full"
}
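Field-boosted queries like the one above are easier to keep consistent when they are generated rather than hand-written. The following is a minimal sketch; `buildFieldBoostedQuery` is a hypothetical helper, and it assumes `terms` has already been escaped for Lucene syntax.

```javascript
// Hypothetical helper: builds a fielded Lucene query with one boost per field.
function buildFieldBoostedQuery(terms, fieldBoosts) {
    return Object.entries(fieldBoosts)
        .map(([field, boost]) => {
            const clause = `${field}:(${terms})`;
            // A boost of 1 is the default, so the ^ suffix is omitted
            return boost === 1 ? clause : `${clause}^${boost}`;
        })
        .join(' OR ');
}

const request = {
    search: buildFieldBoostedQuery('machine learning', { title: 3, content: 1 }),
    queryType: 'full'
};
console.log(request.search);
// title:(machine learning)^3 OR content:(machine learning)
```

Keeping the boosts in one object makes it straightforward to review them for the excessive values warned about under Term Boosting.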

Scoring Profile Usage¶

// ✅ Good: Use scoring profiles for consistent relevance
{
  "search": "machine learning",
  "scoringProfile": "boost-recent-popular"
}

3. Fuzzy Search Optimization¶

Appropriate Edit Distance¶

// ✅ Good: Use edit distance 1 for most cases
{
  "search": "machne~1 learning",
  "queryType": "full"
}

// ❌ Avoid: High edit distance; values above 2 are outside the supported range (0-2), and looser matching returns irrelevant results
{
  "search": "machne~3 learning",
  "queryType": "full"
}

Combine with Other Operators¶

// ✅ Good: Combine fuzzy search with exact matches
{
  "search": "(machine learning) OR (machne~1 learing~1)",
  "queryType": "full"
}
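The exact-or-fuzzy pattern above can be generated from raw user text. This sketch assumes a hypothetical `buildFuzzyFallbackQuery` helper and an arbitrary minimum term length of 4; both are illustrative choices, not service requirements.

```javascript
// Hypothetical helper: exact terms OR fuzzy variants of the longer terms.
function buildFuzzyFallbackQuery(text, { minLength = 4, distance = 1 } = {}) {
    const terms = text.trim().split(/\s+/);
    const fuzzyTerms = terms.map(term =>
        // Short terms stay exact: one edit on a 2-3 letter word matches too much
        term.length >= minLength ? `${term}~${distance}` : term
    );
    return `(${terms.join(' ')}) OR (${fuzzyTerms.join(' ')})`;
}

console.log(buildFuzzyFallbackQuery('machne learing'));
// (machne learing) OR (machne~1 learing~1)
```

The input is assumed to contain plain terms only; escape Lucene special characters first if it comes straight from a search box.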

4. Wildcard Search Guidelines¶

Efficient Wildcard Patterns¶

// ✅ Good: Trailing wildcards (prefix matching) are the most efficient
{
  "search": "tech*",
  "queryType": "full"
}

// ⚠️ Caution: Leading wildcards (suffix matching) can be slower
{
  "search": "*ology",
  "queryType": "full"
}

// ❌ Avoid: Wildcards on both sides of a short term
{
  "search": "*ai*",
  "queryType": "full"
}

Combine with Field Restrictions¶

// ✅ Good: Limit wildcard searches to specific fields
{
  "search": "title:tech* OR category:tech*",
  "queryType": "full"
}
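The wildcard guidelines in this section can be enforced before a query is sent. The guard below is a sketch only: `classifyWildcard` is a hypothetical name, and the threshold of three literal characters is an assumption to tune against your own index.

```javascript
// Hypothetical guard: classify a single wildcard term before sending it.
function classifyWildcard(term, minLiteralChars = 3) {
    const literalChars = term.replace(/[*?]/g, '').length;
    if (literalChars < minLiteralChars) {
        return 'reject';   // e.g. *ai* matches far too many terms
    }
    if (/^[*?]/.test(term)) {
        return 'caution';  // leading wildcard, e.g. *ology
    }
    return 'ok';           // trailing or inner wildcard, e.g. tech*
}

console.log(classifyWildcard('tech*'));  // ok
console.log(classifyWildcard('*ology')); // caution
console.log(classifyWildcard('*ai*'));   // reject
```

An application might run "caution" patterns only against a restricted set of fields and replace "reject" patterns with a plain term search.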

5. Proximity Search Best Practices¶

Reasonable Proximity Distance¶

// ✅ Good: Use appropriate proximity distance
{
  "search": "\"machine learning\"~5",
  "queryType": "full"
}

// ❌ Avoid: Excessive proximity distance
{
  "search": "\"machine learning\"~50",
  "queryType": "full"
}

Combine with Exact Phrases¶

// ✅ Good: Boost exact phrases over proximity matches
{
  "search": "(\"machine learning\")^2 OR (\"machine learning\"~3)",
  "queryType": "full"
}
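The same exact-phrase-plus-proximity query can be assembled from a phrase and two tuning values. This is a minimal sketch; `buildPhraseQuery`, `slop`, and `exactBoost` are hypothetical names chosen for illustration.

```javascript
// Hypothetical helper: exact phrase (boosted) OR the same phrase within `slop` words.
function buildPhraseQuery(phrase, { slop = 3, exactBoost = 2 } = {}) {
    // Escape embedded quotes and backslashes so the phrase stays one quoted unit
    const quoted = `"${phrase.replace(/(["\\])/g, '\\$1')}"`;
    return `(${quoted})^${exactBoost} OR (${quoted}~${slop})`;
}

console.log(buildPhraseQuery('machine learning'));
// ("machine learning")^2 OR ("machine learning"~3)
```

When the string is placed in a JSON request body, the serializer adds the backslash escaping for the quotes shown in the example above.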

Relevance Tuning Best Practices¶

1. Scoring Profile Design¶

Balanced Weight Distribution¶

// ✅ Good: Balanced field weights
{
  "scoringProfiles": [
    {
      "name": "balanced-relevance",
      "text": {
        "weights": {
          "title": 2.0,
          "description": 1.5,
          "content": 1.0,
          "tags": 1.2
        }
      }
    }
  ]
}

// ❌ Avoid: Extreme weight differences
{
  "text": {
    "weights": {
      "title": 10.0,
      "content": 0.1
    }
  }
}
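A quick way to catch extreme weight differences during review is to compare the largest and smallest weight in a profile. The sketch below assumes a hypothetical `checkWeightBalance` helper and an arbitrary maximum ratio of 5; the right limit depends on your content.

```javascript
// Hypothetical check: flag scoring-profile weights whose spread is extreme.
function checkWeightBalance(weights, maxRatio = 5) {
    const values = Object.values(weights);
    const ratio = Math.max(...values) / Math.min(...values);
    return { ratio, balanced: ratio <= maxRatio };
}

console.log(checkWeightBalance({ title: 2.0, description: 1.5, content: 1.0, tags: 1.2 }));
// { ratio: 2, balanced: true }
console.log(checkWeightBalance({ title: 10.0, content: 0.1 }).balanced);
// false
```

A check like this fits naturally into the same test run that validates index definitions before deployment.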

Effective Function Scoring¶

// ✅ Good: Reasonable freshness boosting
{
  "functions": [
    {
      "type": "freshness",
      "fieldName": "publishedDate",
      "boost": 1.5,
      "interpolation": "linear",
      "freshness": {
        "boostingDuration": "P30D"
      }
    }
  ]
}

2. Multi-Field Search Strategies¶

Prioritized Field Searching¶

// ✅ Good: Boost matches in the most important fields
{
  "search": "title:(artificial intelligence)^3 OR description:(artificial intelligence)^2 OR content:(artificial intelligence)",
  "queryType": "full"
}

Cross-Field Boosting¶

// ✅ Good: Boost documents where terms appear in multiple fields
{
  "search": "(title:machine AND description:learning)^2 OR (machine learning)",
  "queryType": "full"
}

Performance Optimization¶

1. Query Efficiency¶

Limit Search Scope¶

// ✅ Good: Use searchFields to limit scope
{
  "search": "machine learning",
  "searchFields": "title,description,content"
}

// ✅ Good: Combine with filters for better performance
{
  "search": "machine learning",
  "filter": "category eq 'Technology' and publishedDate ge 2023-01-01T00:00:00Z"
}
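Filters are often assembled from user-selected facets, so string values need OData escaping. The following is a sketch under the assumption that the index has the `category` and `publishedDate` fields used above; `odataString` and `buildFilter` are hypothetical helper names.

```javascript
// Hypothetical helpers: build an OData filter from user-selected facets.
function odataString(value) {
    // OData string literals escape a single quote by doubling it
    return `'${String(value).replace(/'/g, "''")}'`;
}

function buildFilter({ category, publishedSince } = {}) {
    const clauses = [];
    if (category) {
        clauses.push(`category eq ${odataString(category)}`);
    }
    if (publishedSince) {
        // Date-time literals are unquoted ISO 8601 timestamps
        clauses.push(`publishedDate ge ${publishedSince.toISOString()}`);
    }
    return clauses.join(' and ');
}

console.log(buildFilter({
    category: "Children's Books",
    publishedSince: new Date('2023-01-01T00:00:00Z')
}));
// category eq 'Children''s Books' and publishedDate ge 2023-01-01T00:00:00.000Z
```

Returning an empty string when no facet is selected lets the caller omit the `filter` property entirely.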

Optimize Result Sets¶

// ✅ Good: Request only needed fields
{
  "search": "machine learning",
  "select": "id,title,description,rating",
  "top": 20
}

2. Caching Strategies¶

Cache Common Queries¶

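// ✅ Good: Cache frequent search patterns
const queryCache = new Map();

async function cachedSearch(searchText, options = {}) {
    // Serialize the inputs so different option objects never share a key
    const cacheKey = JSON.stringify([searchText, options]);

    if (queryCache.has(cacheKey)) {
        return queryCache.get(cacheKey);
    }

    // results is an async iterator, so cache the collected documents instead
    const response = await searchClient.search(searchText, options);
    const documents = [];
    for await (const result of response.results) {
        documents.push(result);
    }

    queryCache.set(cacheKey, documents);
    return documents;
}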
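A plain `Map` never expires entries, so cached results can outlive index updates and the cache can grow without bound. One possible refinement is a small cache with a time-to-live and a size cap. The sketch below is an assumption-laden illustration: `TtlCache`, its default limits, and the injectable `now` clock are all hypothetical.

```javascript
// Hypothetical TTL cache; `now` is injectable so expiry can be tested.
class TtlCache {
    constructor({ ttlMs = 60000, maxEntries = 500, now = Date.now } = {}) {
        this.ttlMs = ttlMs;
        this.maxEntries = maxEntries;
        this.now = now;
        this.entries = new Map();
    }

    get(key) {
        const entry = this.entries.get(key);
        if (!entry) return undefined;
        if (this.now() >= entry.expiresAt) {
            this.entries.delete(key); // stale: the index may have changed since
            return undefined;
        }
        return entry.value;
    }

    set(key, value) {
        if (!this.entries.has(key) && this.entries.size >= this.maxEntries) {
            // Map preserves insertion order, so the first key is the oldest
            this.entries.delete(this.entries.keys().next().value);
        }
        this.entries.set(key, { value, expiresAt: this.now() + this.ttlMs });
    }
}

const cache = new TtlCache({ ttlMs: 30000, maxEntries: 100 });
cache.set('machine learning', ['doc-1', 'doc-2']);
console.log(cache.get('machine learning'));
// [ 'doc-1', 'doc-2' ]
```

A short TTL suits suggestion results, which are requested far more often than they change; full search results may need a shorter one if the index is updated frequently.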

Cache Suggestions¶

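// ✅ Good: Cache suggestion results
const suggestionCache = new Map();

async function cachedSuggestions(partialText) {
    const suggestionKey = `suggest_${partialText}`;

    if (suggestionCache.has(suggestionKey)) {
        return suggestionCache.get(suggestionKey);
    }

    // getSuggestions is the application's own wrapper around the suggest API
    const suggestions = await getSuggestions(partialText);
    suggestionCache.set(suggestionKey, suggestions);
    return suggestions;
}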

3. Index Design for Advanced Queries¶

Appropriate Analyzers¶

// ✅ Good: Use language-specific analyzers
{
  "name": "title",
  "type": "Edm.String",
  "searchable": true,
  "analyzer": "en.microsoft"
}

// ✅ Good: Use keyword analyzer for exact matching
{
  "name": "productCode",
  "type": "Edm.String",
  "searchable": true,
  "analyzer": "keyword"
}

Strategic Field Configuration¶

// ✅ Good: Configure fields based on usage patterns
{
  "name": "content",
  "type": "Edm.String",
  "searchable": true,
  "retrievable": false,  // Don't return large content in results
  "analyzer": "en.microsoft"
}

Search Experience Best Practices¶

1. Suggestion Implementation¶

Effective Suggester Configuration¶

// ✅ Good: Include relevant fields in suggester
{
  "suggesters": [
    {
      "name": "product-suggester",
      "searchMode": "analyzingInfixMatching",
      "sourceFields": ["title", "category", "brand", "tags"]
    }
  ]
}

Smart Suggestion Logic¶

// ✅ Good: Implement intelligent suggestion fallback
async function getSmartSuggestions(partialText) {
    // Try exact suggestions first
    let suggestions = await getSuggestions(partialText);

    if (suggestions.length < 3) {
        // Fall back to fuzzy suggestions
        const fuzzyQuery = `${partialText}~1`;
        const fuzzyResults = await searchClient.search(fuzzyQuery, {
            queryType: "full",
            top: 5,
            select: ["title"] // The JavaScript SDK expects an array of field names
        });

        // Add fuzzy results to suggestions
        for await (const result of fuzzyResults.results) {
            suggestions.push({
                text: result.document.title,
                queryPlusText: result.document.title
            });
        }
    }

    return suggestions;
}

2. Hit Highlighting¶

Effective Highlighting Configuration¶

// ✅ Good: Highlight relevant fields with appropriate tags
{
  "search": "machine learning",
  "highlight": "title,description,content",
  "highlightPreTag": "<mark>",
  "highlightPostTag": "</mark>",
  "top": 10
}

Smart Highlighting Display¶

// ✅ Good: Prioritize highlighted fields in display
function formatHighlights(result) {
    // SDK results expose highlights as a property; raw REST responses use '@search.highlights'
    const highlights = result.highlights;

    if (highlights) {
        // Prioritize title highlights
        if (highlights.title && highlights.title.length > 0) {
            return highlights.title[0];
        }

        // Fall back to description highlights
        if (highlights.description && highlights.description.length > 0) {
            return highlights.description[0];
        }

        // Finally use content highlights
        if (highlights.content && highlights.content.length > 0) {
            return highlights.content[0];
        }
    }

    // Return original text if no highlights
    return result.document.title || result.document.description;
}
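Highlight fragments contain the indexed text with the pre and post tags inserted, so rendering them as HTML can expose any markup stored in the field. One defensive approach, sketched here, is to escape the whole fragment and then restore only the highlight tags. `escapeHtml` and `safeHighlight` are hypothetical helpers, and the approach assumes the indexed text does not itself contain the literal tag strings.

```javascript
const HTML_ESCAPES = { '&': '&amp;', '<': '&lt;', '>': '&gt;', '"': '&quot;', "'": '&#39;' };

function escapeHtml(text) {
    return text.replace(/[&<>"']/g, ch => HTML_ESCAPES[ch]);
}

// Escape everything, then restore only the highlight tags the query asked for
function safeHighlight(fragment, preTag = '<mark>', postTag = '</mark>') {
    return escapeHtml(fragment)
        .split(escapeHtml(preTag)).join(preTag)
        .split(escapeHtml(postTag)).join(postTag);
}

console.log(safeHighlight('<mark>machine</mark> <b>learning</b> & AI'));
// <mark>machine</mark> &lt;b&gt;learning&lt;/b&gt; &amp; AI
```

Choosing uncommon pre and post tags in the query reduces the chance that stored content collides with them.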

3. Query Expansion and Enhancement¶

Automatic Query Enhancement¶

// ✅ Good: Implement intelligent query expansion (requires queryType "full")
function enhanceQuery(originalQuery) {
    const synonyms = {
        'AI': ['artificial intelligence', 'machine learning'],
        'ML': ['machine learning', 'artificial intelligence'],
        'tech': ['technology', 'technical']
    };

    // Group the original query so the appended OR clauses do not change its meaning
    let enhancedQuery = `(${originalQuery})`;

    // Add synonyms with lower boost
    Object.entries(synonyms).forEach(([term, syns]) => {
        // Match whole words only, so "AI" does not fire on "maintain"
        const wholeWord = new RegExp(`\\b${term}\\b`, 'i');
        if (wholeWord.test(originalQuery)) {
            // Quote each synonym so multi-word entries match as phrases
            const synonymQuery = syns.map(syn => `"${syn}"`).join(' OR ');
            enhancedQuery += ` OR (${synonymQuery})^0.5`;
        }
    });

    return enhancedQuery;
}

Error Handling and Resilience¶

1. Query Validation¶

Input Sanitization¶

// ✅ Good: Sanitize user input
function sanitizeQuery(userInput, queryType = 'simple') {
    // Remove potentially problematic characters
    let sanitized = userInput.replace(/[<>]/g, '');

    // Escape special Lucene characters (including /) if using full syntax
    if (queryType === 'full') {
        sanitized = sanitized.replace(/([+\-&|!(){}[\]^"~*?:\\\/])/g, '\\$1');
    }

    return sanitized;
}

Query Complexity Limits¶

// ✅ Good: Limit query complexity
function validateQueryComplexity(query) {
    const maxLength = 1000;
    const maxClauses = 50;

    if (query.length > maxLength) {
        throw new Error('Query too long');
    }

    // Count uppercase, whole-word operators only, so "format" or "android" are not miscounted
    const clauseCount = (query.match(/\b(?:AND|OR)\b/g) || []).length + 1;
    if (clauseCount > maxClauses) {
        throw new Error('Query too complex');
    }

    return true;
}

2. Graceful Degradation¶

Fallback Query Strategies¶

// ✅ Good: Implement query fallback
async function robustSearch(query, options) {
    try {
        // Try advanced query first
        return await searchClient.search(query, {
            ...options,
            queryType: 'full'
        });
    } catch (error) {
        // A 400 status usually means the service rejected the query text
        if (error.statusCode === 400 || /syntax/i.test(error.message)) {
            // Fall back to simple query
            console.warn('Advanced query failed, falling back to simple syntax');
            return await searchClient.search(query, {
                ...options,
                queryType: 'simple'
            });
        }
        throw error;
    }
}

Monitoring and Analytics¶

1. Query Performance Tracking¶

Performance Metrics Collection¶

// ✅ Good: Track query performance
async function monitoredSearch(query, options) {
    const startTime = Date.now();
    const queryHash = hashQuery(query, options);

    try {
        // includeTotalCount is required for results.count to be populated
        const results = await searchClient.search(query, { ...options, includeTotalCount: true });
        const duration = Date.now() - startTime;

        // Log performance metrics
        logMetrics({
            queryHash,
            duration,
            resultCount: results.count,
            query: query.substring(0, 100), // Truncated to limit log size; redact or hash if queries may contain personal data
            success: true
        });

        return results;
    } catch (error) {
        const duration = Date.now() - startTime;

        logMetrics({
            queryHash,
            duration,
            error: error.message,
            success: false
        });

        throw error;
    }
}
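Individual durations are most useful once they are aggregated; percentiles show tail latency that an average hides. The sketch below uses the nearest-rank method on an in-memory array. `percentile` is a hypothetical helper, and the sample durations are made up for illustration.

```javascript
// Nearest-rank percentile over recorded query durations (milliseconds).
function percentile(durations, p) {
    if (durations.length === 0) return undefined;
    const sorted = [...durations].sort((a, b) => a - b);
    const rank = Math.ceil((p / 100) * sorted.length);
    return sorted[Math.max(rank, 1) - 1];
}

// Illustrative values only, not measurements
const durations = [120, 95, 300, 110, 1500, 130, 105, 98, 140, 125];
console.log(percentile(durations, 50)); // 120
console.log(percentile(durations, 95)); // 1500
```

Here the median looks healthy while the 95th percentile exposes one very slow query, which is the kind of outlier worth tracing back through `queryHash`.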

2. Search Analytics¶

User Behavior Tracking¶

// ✅ Good: Track search patterns
function trackSearchBehavior(query, results, userActions, queryType = 'simple') {
    analytics.track('search_performed', {
        query: hashQuery(query), // Hash for privacy
        resultCount: results.count,
        hasResults: results.count > 0,
        queryType, // Passed in by the caller; the search response does not report it
        timestamp: new Date().toISOString()
    });

    // Track user interactions with results
    userActions.forEach(action => {
        analytics.track('search_result_interaction', {
            queryHash: hashQuery(query),
            action: action.type, // click, view, etc.
            resultPosition: action.position,
            resultId: action.resultId
        });
    });
}
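The monitoring and analytics snippets call a `hashQuery` function that is not defined in this guide. One possible implementation, sketched here with Node's built-in `crypto` module, normalizes the query text and sorts the option keys before hashing. The normalization rules and the 16-character truncation are assumptions, not requirements.

```javascript
const { createHash } = require('crypto');

// Hypothetical hashQuery: stable, non-reversible key for a query and its options.
function hashQuery(query, options = {}) {
    // Normalize so "Machine  Learning" and "machine learning" share a key
    const normalized = query.trim().toLowerCase().replace(/\s+/g, ' ');
    // Sort option keys so property order does not change the hash
    const sortedOptions = Object.keys(options).sort()
        .map(key => [key, options[key]]);
    return createHash('sha256')
        .update(JSON.stringify([normalized, sortedOptions]))
        .digest('hex')
        .slice(0, 16);
}

console.log(hashQuery('Machine  Learning') === hashQuery('machine learning'));
// true
```

An unkeyed hash of a short, common query can still be guessed by hashing candidate queries, so a keyed hash may be preferable where queries are sensitive.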

Testing Strategies¶

1. Query Testing Framework¶

Automated Query Testing¶

// ✅ Good: Implement comprehensive query testing
const queryTestSuite = [
    {
        name: 'basic_text_search',
        query: 'machine learning',
        expectedMinResults: 5,
        maxExecutionTime: 1000
    },
    {
        name: 'boosted_search',
        query: 'artificial^2 intelligence',
        queryType: 'full',
        expectedMinResults: 3,
        maxExecutionTime: 1500
    },
    {
        name: 'fuzzy_search',
        query: 'machne~1 learning',
        queryType: 'full',
        expectedMinResults: 1,
        maxExecutionTime: 2000
    }
];

async function runQueryTests() {
    const results = [];

    for (const test of queryTestSuite) {
        const startTime = Date.now();

        try {
            const searchResults = await searchClient.search(test.query, {
                queryType: test.queryType || 'simple',
                includeTotalCount: true, // Without this, searchResults.count is undefined
                top: 50
            });

            const executionTime = Date.now() - startTime;
            const resultCount = searchResults.count || 0;

            const passed = 
                resultCount >= test.expectedMinResults &&
                executionTime <= test.maxExecutionTime;

            results.push({
                name: test.name,
                passed,
                resultCount,
                executionTime,
                expectedMinResults: test.expectedMinResults,
                maxExecutionTime: test.maxExecutionTime
            });

        } catch (error) {
            results.push({
                name: test.name,
                passed: false,
                error: error.message
            });
        }
    }

    return results;
}

2. A/B Testing for Relevance¶

Relevance Testing Framework¶

// ✅ Good: Test different relevance configurations
async function testRelevanceConfigurations(query, testConfigs) {
    const results = {};

    for (const config of testConfigs) {
        const searchResults = await searchClient.search(query, {
            scoringProfile: config.scoringProfile,
            queryType: config.queryType,
            top: 10
        });

        // results is an async iterator, so collect it once before reuse
        const documents = [];
        for await (const result of searchResults.results) {
            documents.push(result);
        }

        results[config.name] = {
            results: documents,
            avgScore: calculateAverageScore(documents),
            topResultScore: getTopResultScore(documents)
        };
    }

    return results;
}
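The framework above relies on two helpers, `calculateAverageScore` and `getTopResultScore`, that are not shown. A minimal sketch follows, assuming the results have already been collected into an array and that each result exposes a numeric `score` property.

```javascript
// Hypothetical helpers; each result is assumed to expose a numeric `score`.
function calculateAverageScore(results) {
    if (results.length === 0) return 0;
    const total = results.reduce((sum, result) => sum + result.score, 0);
    return total / results.length;
}

function getTopResultScore(results) {
    // Take the maximum explicitly rather than trusting the result order
    return results.reduce((top, result) => Math.max(top, result.score), 0);
}

const sample = [{ score: 4.0 }, { score: 2.5 }, { score: 1.0 }];
console.log(calculateAverageScore(sample)); // 2.5
console.log(getTopResultScore(sample));     // 4
```

Raw scores from different scoring profiles are not on a common scale, so these numbers are best used to spot changes within one configuration; comparing which documents rank highest is usually more informative across configurations.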

Module Documentation¶

Prerequisites - Required setup and knowledge
Main Documentation - Complete module overview
Practice & Implementation - Hands-on exercises
Troubleshooting - Common issues and solutions
Code Samples - Working examples in multiple languages

When You Need Help¶

Query Syntax Issues: Check the Troubleshooting Guide
Performance Problems: Review Performance Analysis Examples
Complex Scenarios: Explore Advanced Query Examples

By following these best practices, you can build advanced search experiences that are efficient, relevant, and maintainable.