Practice & Implementation - Advanced Querying¶

Overview¶

This guide provides hands-on exercises and practical implementation scenarios for mastering advanced querying techniques in Azure AI Search. Work through these exercises to build expertise in complex queries, boosting, fuzzy search, and relevance tuning.

Prerequisites¶

Before starting these exercises, ensure you have:

- Completed the Prerequisites setup
- A working Azure AI Search service with rich sample data
- Understanding of basic search concepts from beginner modules
- Familiarity with at least one programming language (Python, C#, JavaScript)

Exercise 1: Advanced Query Syntax Mastery¶

Objective¶

Master the full Lucene query syntax for complex search scenarios.

Scenario¶

You're building an advanced search interface for a technical documentation platform. Users need sophisticated query capabilities including field-specific searches, term boosting, and boolean logic.

Tasks¶

Task 1.1: Field-Specific Searches¶

Learn to target specific fields with different importance levels.

Basic Field Targeting

{
  "search": "title:(machine learning)",
  "queryType": "full"
}

Multi-Field with Boosting

{
  "search": "title:(artificial intelligence)^3 OR description:(artificial intelligence)^2 OR content:(artificial intelligence)",
  "queryType": "full"
}

Implementation Exercise:

// Create a function that builds field-specific queries
function buildFieldSpecificQuery(searchTerm, fieldWeights) {
    const clauses = Object.entries(fieldWeights).map(([field, weight]) => {
        return `${field}:(${searchTerm})${weight > 1 ? '^' + weight : ''}`;
    });

    return clauses.join(' OR ');
}

// Usage
const query = buildFieldSpecificQuery('machine learning', {
    title: 3,
    description: 2,
    content: 1
});
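The builder above interpolates the search term directly into the query string, so a term containing Lucene operator characters (for example `C++` or `a:b`) would change the meaning of the query. The following Python sketch shows one way to escape user input first. The names `escape_lucene` and `build_field_query` are hypothetical, and the character list reflects my reading of the full Lucene syntax rules, so verify it against the service documentation before relying on it.

```python
import re

# Characters treated as operators by the full Lucene syntax (assumed list);
# each one is backslash-escaped so it is searched literally.
_LUCENE_SPECIAL = re.compile(r'([+\-&|!(){}\[\]^"~*?:\\/])')


def escape_lucene(term: str) -> str:
    """Backslash-escape Lucene operator characters in user-supplied text."""
    return _LUCENE_SPECIAL.sub(r"\\\1", term)


def build_field_query(term: str, field_weights: dict) -> str:
    """Python counterpart of buildFieldSpecificQuery that escapes the term first."""
    safe = escape_lucene(term)
    clauses = []
    for field, weight in field_weights.items():
        # Only append a boost suffix when the weight is above the default of 1
        boost = f"^{weight}" if weight > 1 else ""
        clauses.append(f"{field}:({safe}){boost}")
    return " OR ".join(clauses)
```

Escaping is only appropriate for text that should be matched literally. If users are allowed to type operators such as `*` or `~` themselves, escape selectively instead.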

Task 1.2: Complex Boolean Logic¶

Build sophisticated boolean queries with proper precedence.

Nested Boolean Expressions

{
  "search": "(title:(artificial intelligence) OR title:(machine learning)) AND (category:Technology OR category:Science)",
  "queryType": "full"
}

Exclusion with NOT

{
  "search": "(machine learning) AND NOT (category:Beginner)",
  "queryType": "full"
}
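Hand-writing nested expressions makes it easy to drop a parenthesis and change precedence. As a minimal sketch, the hypothetical helpers below (`any_of`, `all_of`, `exclude`) compose clauses and always parenthesize each group, so precedence is explicit rather than implied.

```python
def any_of(*clauses: str) -> str:
    """Join clauses with OR and wrap the group in parentheses."""
    return "(" + " OR ".join(clauses) + ")"


def all_of(*clauses: str) -> str:
    """Join clauses with AND and wrap the group in parentheses."""
    return "(" + " AND ".join(clauses) + ")"


def exclude(base: str, unwanted: str) -> str:
    """Keep documents matching base but not the unwanted clause."""
    return f"{base} AND NOT {unwanted}"


# Rebuilds the nested boolean example from this task
nested = all_of(
    any_of("title:(artificial intelligence)", "title:(machine learning)"),
    any_of("category:Technology", "category:Science"),
)
```

The resulting string is passed as the `search` value with `queryType` set to `full`, exactly like the hand-written examples.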

Task 1.3: Term and Phrase Boosting¶

Apply strategic boosting to improve relevance.

Term Boosting

{
  "search": "artificial^2 intelligence machine^1.5 learning",
  "queryType": "full"
}

Phrase Boosting

{
  "search": "(\"machine learning\")^3 OR (artificial intelligence)^2",
  "queryType": "full"
}

Expected Outcomes¶

Master full Lucene query syntax
Understand field targeting and boosting strategies
Build complex boolean expressions effectively

Exercise 2: Fuzzy Search and Approximate Matching¶

Objective¶

Implement fuzzy search capabilities to handle typos and approximate matches.

Scenario¶

Users frequently make typos when searching. Implement intelligent fuzzy search that provides relevant results even with spelling errors.

Tasks¶

Task 2.1: Basic Fuzzy Search¶

Implement fuzzy search with appropriate edit distances.

Single Term Fuzzy

{
  "search": "machne~1",
  "queryType": "full"
}

Multi-Term Fuzzy

{
  "search": "machne~1 learing~1",
  "queryType": "full"
}

Implementation:

def build_fuzzy_query(search_terms, edit_distance=1):
    """Build fuzzy query for handling typos"""
    fuzzy_terms = []

    for term in search_terms.split():
        if len(term) > 3:  # Only apply fuzzy to longer terms
            fuzzy_terms.append(f"{term}~{edit_distance}")
        else:
            fuzzy_terms.append(term)

    return " ".join(fuzzy_terms)

# Usage
fuzzy_query = build_fuzzy_query("machne learing", edit_distance=1)
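A fixed edit distance can be too strict for long words and too loose for short ones. The sketch below scales the distance with term length. The function name and the length thresholds (3 and 7) are illustrative assumptions, not values recommended by the service, and should be tuned against real queries.

```python
def adaptive_fuzzy_query(search_terms: str) -> str:
    """Pick an edit distance per term based on its length (assumed thresholds)."""
    parts = []
    for term in search_terms.split():
        if len(term) <= 3:
            # Very short terms: fuzzy matching tends to produce noise
            parts.append(term)
        elif len(term) <= 7:
            parts.append(f"{term}~1")
        else:
            # Longer terms can tolerate two edits
            parts.append(f"{term}~2")
    return " ".join(parts)
```

Larger edit distances widen the set of candidate terms, so they can cost both precision and query time.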

Task 2.2: Hybrid Exact and Fuzzy Search¶

Combine exact matches with fuzzy fallbacks for optimal results.

Boosted Exact with Fuzzy Fallback

{
  "search": "(\"machine learning\")^3 OR (machne~1 learing~1)",
  "queryType": "full"
}

Implementation:

async function hybridFuzzySearch(searchClient, query) {
    // Try exact phrase search first; request the total count so that
    // exactResults.count is populated
    const exactResults = await searchClient.search(`"${query}"`, {
        queryType: 'full',
        includeTotalCount: true,
        top: 10
    });

    if (exactResults.count >= 5) {
        return exactResults;
    }

    // Fall back to fuzzy search
    const fuzzyQuery = query.split(' ')
        .map(term => term.length > 3 ? `${term}~1` : term)
        .join(' ');

    return await searchClient.search(fuzzyQuery, {
        queryType: 'full',
        top: 10
    });
}

Expected Outcomes¶

Implement effective fuzzy search strategies
Balance exact matches with approximate matching
Handle user input errors gracefully

Exercise 3: Wildcard and Pattern Matching¶

Objective¶

Master wildcard searches for pattern matching and partial term searches.

Scenario¶

Implement search functionality that supports partial matches, prefixes, and pattern-based searches for technical terms and product codes.

Tasks¶

Task 3.1: Prefix and Suffix Wildcards¶

Implement efficient wildcard patterns.

Prefix Wildcards (Most Efficient)

{
  "search": "tech*",
  "queryType": "full"
}

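Suffix Matching (Use Carefully)

In Azure AI Search a term generally cannot begin with * or ?, so suffix matching is written as a regular expression between forward slashes.

{
  "search": "/.*ology/",
  "queryType": "full"
}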

Implementation:

def build_wildcard_query(pattern, field=None):
    """Build wildcard query with field targeting"""
    if field:
        return f"{field}:({pattern})"
    return pattern

# Examples
prefix_query = build_wildcard_query("tech*", "title")
# Suffix matching uses the regular-expression form rather than a leading *
suffix_query = build_wildcard_query("/.*ology/", "category")

Task 3.2: Single Character Wildcards¶

Use ? for single character matching.

Single Character Matching

{
  "search": "colo?r",
  "queryType": "full"
}

Combined Patterns

{
  "search": "tech* AND (colo?r OR behavio?r)",
  "queryType": "full"
}
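Because prefix, suffix, and infix patterns have different costs, an application may want to inspect a pattern before sending it. The following is a small sketch of such a classifier. The function name and the returned labels are hypothetical, and it only looks at a single term without handling escaped wildcards.

```python
def classify_wildcard(pattern: str) -> str:
    """Return 'none', 'prefix', 'suffix', or 'infix' for a single-term pattern."""
    wildcards = "*?"
    if not any(ch in wildcards for ch in pattern):
        return "none"
    if pattern[0] in wildcards:
        # The known fragment comes after the wildcard
        return "suffix"
    if pattern[-1] in wildcards and not any(ch in wildcards for ch in pattern[:-1]):
        # The only wildcard is the final character
        return "prefix"
    return "infix"
```

A caller could, for instance, allow `prefix` patterns freely and require a minimum fragment length for `suffix` and `infix` patterns.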

Expected Outcomes¶

Understand wildcard search performance implications
Implement efficient pattern matching
Combine wildcards with other query features

Exercise 4: Proximity Search and Phrase Matching¶

Objective¶

Implement proximity search to find terms within specified distances.

Scenario¶

Build search functionality that finds related terms within reasonable proximity, useful for finding concepts that are discussed together but not as exact phrases.

Tasks¶

Task 4.1: Basic Proximity Search¶

Find terms within specified word distances.

Proximity Search

{
  "search": "\"machine learning\"~5",
  "queryType": "full"
}

Variable Proximity

{
  "search": "\"artificial intelligence\"~3 OR \"machine learning\"~5",
  "queryType": "full"
}

Task 4.2: Advanced Proximity Patterns¶

Combine proximity with other search features.

Proximity with Boosting

{
  "search": "(\"machine learning\")^3 OR (\"machine learning\"~5)^1.5",
  "queryType": "full"
}

Implementation:

function buildProximityQuery(terms, exactBoost = 3, proximityDistance = 5, proximityBoost = 1.5) {
    const exactPhrase = `"${terms}"`;
    const proximityPhrase = `"${terms}"~${proximityDistance}`;

    return `(${exactPhrase})^${exactBoost} OR (${proximityPhrase})^${proximityBoost}`;
}

// Usage
const query = buildProximityQuery("machine learning", 3, 5, 1.5);

Expected Outcomes¶

Master proximity search techniques
Balance exact phrases with proximity matches
Optimize proximity distances for different content types

Exercise 5: Scoring Profiles and Relevance Tuning¶

Objective¶

Implement and optimize custom scoring profiles for improved relevance.

Scenario¶

Create scoring profiles that boost recent content, popular items, and high-quality documents to improve search relevance for different user scenarios.

Tasks¶

Task 5.1: Field Weight Scoring¶

Configure field weights to prioritize important content areas.

Scoring Profile Configuration

{
  "scoringProfiles": [
    {
      "name": "content-priority",
      "text": {
        "weights": {
          "title": 3.0,
          "description": 2.0,
          "content": 1.0,
          "tags": 1.5
        }
      }
    }
  ]
}

Usage in Queries

{
  "search": "machine learning",
  "scoringProfile": "content-priority"
}

Task 5.2: Function-Based Scoring¶

Implement freshness, magnitude, and distance functions.

Freshness Scoring

{
  "functions": [
    {
      "type": "freshness",
      "fieldName": "publishedDate",
      "boost": 2.0,
      "interpolation": "linear",
      "freshness": {
        "boostingDuration": "P30D"
      }
    }
  ]
}

Magnitude Scoring

{
  "functions": [
    {
      "type": "magnitude",
      "fieldName": "viewCount",
      "boost": 1.5,
      "interpolation": "logarithmic",
      "magnitude": {
        "boostingRangeStart": 100,
        "boostingRangeEnd": 10000,
        "constantBoostBeyondRange": false
      }
    }
  ]
}

Task 5.3: Combined Scoring Strategies¶

Create comprehensive scoring profiles that combine multiple factors.

Implementation:

def create_comprehensive_scoring_profile():
    return {
        "name": "comprehensive-relevance",
        "text": {
            "weights": {
                "title": 2.5,
                "description": 1.8,
                "content": 1.0,
                "tags": 1.3,
                "author": 0.8
            }
        },
        "functions": [
            {
                "type": "freshness",
                "fieldName": "publishedDate",
                "boost": 1.8,
                "interpolation": "linear",
                "freshness": {
                    "boostingDuration": "P60D"
                }
            },
            {
                "type": "magnitude",
                "fieldName": "rating",
                "boost": 1.5,
                "interpolation": "linear",
                "magnitude": {
                    "boostingRangeStart": 3.0,
                    "boostingRangeEnd": 5.0,
                    "constantBoostBeyondRange": true
                }
            }
        ],
        "functionAggregation": "sum"
    }
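A typo in a scoring profile is usually only discovered when the index update is rejected. As a rough sketch, the checker below catches a few common mistakes locally. The function name is hypothetical, and the sets of allowed values are assumptions based on my understanding of the index definition schema, so confirm them against the REST API reference.

```python
# Assumed allowed values; verify against the service's index schema reference.
ALLOWED_FUNCTION_TYPES = {"freshness", "magnitude", "distance", "tag"}
ALLOWED_INTERPOLATIONS = {"constant", "linear", "quadratic", "logarithmic"}
ALLOWED_AGGREGATIONS = {"sum", "average", "minimum", "maximum", "firstMatching"}


def validate_scoring_profile(profile: dict) -> list:
    """Return a list of human-readable problems; an empty list means none found."""
    problems = []
    if not profile.get("name"):
        problems.append("profile is missing a name")

    weights = profile.get("text", {}).get("weights", {})
    for field, weight in weights.items():
        if not isinstance(weight, (int, float)) or weight <= 0:
            problems.append(f"weight for '{field}' must be a positive number")

    for index, fn in enumerate(profile.get("functions", [])):
        label = f"functions[{index}]"
        fn_type = fn.get("type")
        if fn_type not in ALLOWED_FUNCTION_TYPES:
            problems.append(f"{label}: unknown type {fn_type!r}")
        elif fn_type not in fn:
            # Each function carries a settings object named after its type
            problems.append(f"{label}: missing '{fn_type}' settings object")
        if not fn.get("fieldName"):
            problems.append(f"{label}: missing fieldName")
        if fn.get("interpolation", "linear") not in ALLOWED_INTERPOLATIONS:
            problems.append(f"{label}: unknown interpolation")

    if profile.get("functionAggregation", "sum") not in ALLOWED_AGGREGATIONS:
        problems.append("unknown functionAggregation")
    return problems
```

Passing the dictionary returned by `create_comprehensive_scoring_profile()` should yield an empty list, since that profile uses only the values checked here.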

Expected Outcomes¶

Design effective scoring profiles
Understand different scoring function types
Balance multiple relevance factors

Exercise 6: Search Suggestions and Autocomplete¶

Objective¶

Implement intelligent search suggestions and autocomplete functionality.

Scenario¶

Build a responsive search experience with real-time suggestions that help users discover content and correct their queries.

Tasks¶

Task 6.1: Basic Suggestions¶

Implement suggester-based autocomplete.

Suggester Configuration

{
  "suggesters": [
    {
      "name": "content-suggester",
      "searchMode": "analyzingInfixMatching",
      "sourceFields": ["title", "description", "tags", "category"]
    }
  ]
}

Suggestion Queries

{
  "suggesterName": "content-suggester",
  "search": "mach",
  "top": 8
}

Task 6.2: Intelligent Suggestion Logic¶

Build smart suggestion systems with fallbacks.

Implementation:

async function getIntelligentSuggestions(searchClient, partialText, maxSuggestions = 8) {
    const suggestions = [];

    try {
        // Get direct suggestions
        const directSuggestions = await searchClient.suggest(partialText, 'content-suggester', {
            top: maxSuggestions
        });

        // suggest() resolves to an object whose results property is a plain array
        for (const suggestion of directSuggestions.results) {
            suggestions.push({
                text: suggestion.text,
                type: 'direct',
                document: suggestion.document
            });
        }

        // If we don't have enough suggestions, try fuzzy search
        if (suggestions.length < maxSuggestions / 2) {
            const fuzzyQuery = `${partialText}~1`;
            const fuzzyResults = await searchClient.search(fuzzyQuery, {
                queryType: 'full',
                top: maxSuggestions - suggestions.length,
                select: ['title', 'category']  // select takes an array of field names
            });

            for await (const result of fuzzyResults.results) {
                suggestions.push({
                    text: result.document.title,
                    type: 'fuzzy',
                    document: result.document
                });
            }
        }

        return suggestions;

    } catch (error) {
        console.error('Suggestion error:', error);
        return [];
    }
}
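When direct and fuzzy suggestions are combined, the same title can appear twice, differing only in case or whitespace. The sketch below merges two lists of suggestion texts, preferring direct matches and dropping duplicates. The function name and the dictionary shape are hypothetical and simply mirror the `text` and `type` fields used above.

```python
def merge_suggestions(direct, fuzzy, limit=8):
    """Merge two suggestion lists, keeping order and dropping duplicate texts."""
    merged = []
    seen = set()
    for source, items in (("direct", direct), ("fuzzy", fuzzy)):
        for text in items:
            # Compare case-insensitively and ignore surrounding whitespace
            key = text.strip().casefold()
            if not key or key in seen:
                continue
            seen.add(key)
            merged.append({"text": text, "type": source})
    # Direct matches come first, so truncation drops fuzzy items first
    return merged[:limit]
```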

Expected Outcomes¶

Implement effective suggestion systems
Create intelligent fallback mechanisms
Build responsive autocomplete experiences

Exercise 7: Performance Optimization¶

Objective¶

Optimize advanced queries for production performance.

Scenario¶

Your advanced search features are working but need optimization for high-traffic production use.

Tasks¶

Task 7.1: Query Performance Analysis¶

Measure and analyze query performance.

Performance Monitoring

async function monitorQueryPerformance(searchClient, queries) {
    const results = [];

    for (const query of queries) {
        const startTime = Date.now();

        try {
            const searchResults = await searchClient.search(query.search, {
                queryType: query.queryType || 'simple',
                top: query.top || 20,
                scoringProfile: query.scoringProfile,
                includeTotalCount: true  // required for searchResults.count to be populated
            });

            const endTime = Date.now();
            const resultCount = searchResults.count || 0;

            results.push({
                query: query.search,
                executionTime: endTime - startTime,
                resultCount,
                queryType: query.queryType,
                scoringProfile: query.scoringProfile
            });

        } catch (error) {
            results.push({
                query: query.search,
                error: error.message,
                executionTime: Date.now() - startTime
            });
        }
    }

    return results;
}
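Raw per-query timings are hard to compare across runs, so it helps to reduce them to a few summary figures. This sketch assumes the results have been exported as a list of dictionaries with the same `executionTime` and `error` keys the monitoring function produces. The function name and the nearest-rank percentile method are my own choices.

```python
import math
import statistics


def summarize_timings(results):
    """Summarize execution times (ms) of successful queries; count failures."""
    timings = sorted(r["executionTime"] for r in results if "error" not in r)
    failed = len(results) - len(timings)
    if not timings:
        return {"count": 0, "failed": failed}
    # Nearest-rank 95th percentile
    p95_index = max(0, math.ceil(0.95 * len(timings)) - 1)
    return {
        "count": len(timings),
        "failed": failed,
        "mean_ms": statistics.mean(timings),
        "median_ms": statistics.median(timings),
        "p95_ms": timings[p95_index],
        "max_ms": timings[-1],
    }
```

Client-side timings include network latency, so compare them only between runs made from the same environment.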

Task 7.2: Caching Strategies¶

Implement intelligent caching for common queries.

Query Result Caching

class QueryCache {
    constructor(maxSize = 1000, ttlMinutes = 30) {
        this.cache = new Map();
        this.maxSize = maxSize;
        this.ttl = ttlMinutes * 60 * 1000;
    }

    generateKey(query, options) {
        return JSON.stringify({ query, options });
    }

    get(query, options) {
        const key = this.generateKey(query, options);
        const cached = this.cache.get(key);

        if (cached && Date.now() - cached.timestamp < this.ttl) {
            return cached.results;
        }

        if (cached) {
            this.cache.delete(key);
        }

        return null;
    }

    set(query, options, results) {
        const key = this.generateKey(query, options);

        if (this.cache.size >= this.maxSize) {
            const firstKey = this.cache.keys().next().value;
            this.cache.delete(firstKey);
        }

        this.cache.set(key, {
            results,
            timestamp: Date.now()
        });
    }
}
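The class above evicts the oldest inserted entry when full. For comparison, here is a Python sketch of the same idea that evicts the least recently used entry instead. The class name `TtlLruCache` and the injectable `clock` parameter are assumptions added for illustration and testing.

```python
import json
import time
from collections import OrderedDict


class TtlLruCache:
    """Query-result cache with a time-to-live and least-recently-used eviction."""

    def __init__(self, max_size=1000, ttl_seconds=1800, clock=time.monotonic):
        self._entries = OrderedDict()
        self._max_size = max_size
        self._ttl = ttl_seconds
        self._clock = clock  # injectable so expiry can be tested without sleeping

    @staticmethod
    def _key(query, options):
        # sort_keys makes the key independent of option ordering
        return json.dumps({"query": query, "options": options}, sort_keys=True)

    def get(self, query, options=None):
        key = self._key(query, options)
        entry = self._entries.get(key)
        if entry is None:
            return None
        stored_at, results = entry
        if self._clock() - stored_at >= self._ttl:
            del self._entries[key]  # expired
            return None
        self._entries.move_to_end(key)  # mark as most recently used
        return results

    def set(self, query, options, results):
        key = self._key(query, options)
        if key in self._entries:
            del self._entries[key]
        elif len(self._entries) >= self._max_size:
            self._entries.popitem(last=False)  # evict least recently used
        self._entries[key] = (self._clock(), results)
```

Sorting the keys when serializing means `{"top": 5, "skip": 0}` and `{"skip": 0, "top": 5}` share one cache entry, which a plain `JSON.stringify` key would not guarantee.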

Expected Outcomes¶

Understand query performance characteristics
Implement effective caching strategies
Optimize queries for production use

Completion Checklist¶

After completing these exercises, you should be able to:

[ ] Build complex queries using full Lucene syntax
[ ] Implement effective fuzzy search with appropriate edit distances
[ ] Use wildcard patterns efficiently for partial matching
[ ] Apply proximity search for finding related terms
[ ] Design and implement custom scoring profiles
[ ] Create intelligent suggestion and autocomplete systems
[ ] Optimize advanced queries for performance
[ ] Monitor and analyze query performance metrics
[ ] Implement caching strategies for common queries
[ ] Handle edge cases and error scenarios gracefully

Next Steps¶

Apply to Your Project: Implement these advanced querying techniques in your search application
Experiment with Combinations: Try combining different advanced features for unique search experiences
Performance Testing: Conduct thorough performance testing with realistic data volumes
User Testing: Validate that advanced features improve user search experience
Move to Next Module: Progress to Module 10 (Analyzers and Custom Scoring) for deeper relevance control

Additional Resources¶

Module Documentation¶

Prerequisites - Required setup and knowledge
Main Documentation - Complete module overview
Best Practices - Guidelines for effective implementation
Troubleshooting - Common issues and solutions
Code Samples - Working examples in multiple languages

When You Need Help¶

Query Syntax Issues: Check the Troubleshooting Guide
Performance Problems: Review Performance Optimization Examples
Complex Scenarios: Explore Advanced Query Examples

Remember: Advanced querying is about finding the right balance between search power and performance. Start simple and add complexity as needed based on your specific use cases and user requirements.