Module 10: Analyzers & Scoring - Code Samples

This directory contains comprehensive code samples demonstrating text analysis and scoring techniques in Azure AI Search.

Overview

These code samples cover:

  • Built-in Analyzer Usage: Working with language-specific and specialized analyzers
  • Custom Analyzer Creation: Building analyzers with tokenizers, filters, and character filters
  • Scoring Profile Implementation: Creating and testing custom scoring algorithms
  • Performance Testing: Measuring and optimizing analyzer and scoring performance
  • Advanced Techniques: N-gram analyzers, phonetic matching, and multi-language support

Sample Categories

1. Analyzer Configuration and Testing

  • 01_builtin_analyzers: Compare and test built-in analyzers
  • 02_custom_analyzers: Create and configure custom text analysis pipelines
  • 03_analyzer_testing: Comprehensive testing and validation frameworks
  • 04_ngram_autocomplete: N-gram analyzers for autocomplete functionality

2. Scoring Profile Implementation

  • 05_basic_scoring: Field weights and basic scoring profiles
  • 06_advanced_scoring: Complex scoring with multiple functions
  • 07_location_scoring: Geographic distance-based scoring
  • 08_performance_optimization: Scoring profile performance tuning

Language Support

Code samples are provided in multiple programming languages:

  • Python: Using azure-search-documents SDK
  • JavaScript/Node.js: Using @azure/search-documents SDK
  • C#: Using Azure.Search.Documents NuGet package
  • REST API: Direct HTTP requests with curl and HTTP files

Prerequisites

Before running these samples:

  1. Azure AI Search Service: Standard tier or higher for custom analyzers
  2. Admin API Key: Required for creating indexes and analyzers
  3. Development Environment: Appropriate SDK installed for your language
  4. Sample Data: Test documents for analyzer and scoring validation

Quick Start

Python Setup

pip install azure-search-documents azure-identity

JavaScript Setup

npm install @azure/search-documents @azure/identity

C# Setup

dotnet add package Azure.Search.Documents

Configuration

Create a configuration file with your Azure AI Search service details:

Python (config.py)

SEARCH_SERVICE_NAME = "your-search-service"
SEARCH_ADMIN_KEY = "your-admin-key"
SEARCH_QUERY_KEY = "your-query-key"
SEARCH_INDEX_NAME = "analyzer-test-index"

JavaScript (config.js)

module.exports = {
    searchServiceName: "your-search-service",
    adminKey: "your-admin-key",
    queryKey: "your-query-key",
    indexName: "analyzer-test-index"
};

C# (appsettings.json)

{
  "SearchServiceName": "your-search-service",
  "SearchAdminKey": "your-admin-key",
  "SearchQueryKey": "your-query-key",
  "SearchIndexName": "analyzer-test-index"
}
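Whichever language you use, the configuration values combine into a service endpoint and credentials the same way. A minimal Python sketch (not part of the numbered samples) showing the standard public-cloud endpoint pattern and the REST authentication header; SDK clients take the same endpoint plus an `AzureKeyCredential` instead of raw headers:

```python
# Example values matching config.py above.
SEARCH_SERVICE_NAME = "your-search-service"
SEARCH_ADMIN_KEY = "your-admin-key"

# Public-cloud Azure AI Search endpoints follow this pattern.
endpoint = f"https://{SEARCH_SERVICE_NAME}.search.windows.net"

# REST requests authenticate with the api-key header; query keys can be
# substituted for read-only operations.
headers = {
    "Content-Type": "application/json",
    "api-key": SEARCH_ADMIN_KEY,
}
```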

Sample Structure

Each code sample includes:

  • Main Implementation: Core functionality demonstration
  • Configuration: Index schema and analyzer definitions
  • Test Data: Sample documents for testing
  • Validation: Methods to verify expected behavior
  • Documentation: Inline comments and explanations

Running the Samples

Individual Samples

Each sample can be run independently:

# Python
python 01_builtin_analyzers.py

# JavaScript
node 01_builtin_analyzers.js

# C#
dotnet run --project 01_BuiltinAnalyzers

Complete Test Suite

Run all samples in sequence:

# Python
python run_all_samples.py

# JavaScript
npm run test-all

# C#
dotnet test

Sample Descriptions

01_builtin_analyzers

  • Compare standard, English, keyword, and simple analyzers
  • Test tokenization differences with various text inputs
  • Demonstrate language-specific analyzer behavior
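To build intuition before calling the service, the tokenization differences can be approximated locally. The functions below mimic, but do not exactly reproduce, the behavior of three built-in analyzers:

```python
import re

def simple_analyzer(text):
    """'simple': lowercase, then split on anything that isn't a letter."""
    return [t for t in re.split(r"[^a-z]+", text.lower()) if t]

def keyword_analyzer(text):
    """'keyword': the entire input becomes a single token."""
    return [text]

def standard_like(text):
    """Rough stand-in for 'standard.lucene': lowercase, split on
    whitespace/punctuation, but keep digits."""
    return [t for t in re.split(r"[^a-z0-9]+", text.lower()) if t]

text = "Wi-Fi ready, room #42"
# simple_analyzer drops the digits entirely; standard_like keeps "42";
# keyword_analyzer preserves the whole string as one token.
```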

02_custom_analyzers

  • Create custom analyzers with character filters, tokenizers, and token filters
  • Implement domain-specific text processing
  • Test HTML stripping, synonym mapping, and stop word removal
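A custom analyzer is declared in the index schema and then referenced by name from a field. The sketch below shows the REST schema shape as a Python dict; the analyzer, index, and field names are examples, while the `@odata.type` string and the `html_strip`, `standard_v2`, `lowercase`, and `stopwords` identifiers are predefined REST names:

```python
# Strip HTML, tokenize with standard_v2, then lowercase and remove
# English stop words.
custom_analyzer = {
    "@odata.type": "#Microsoft.Azure.Search.CustomAnalyzer",
    "name": "html_clean",
    "charFilters": ["html_strip"],
    "tokenizer": "standard_v2",
    "tokenFilters": ["lowercase", "stopwords"],
}

index_definition = {
    "name": "analyzer-test-index",
    "analyzers": [custom_analyzer],
    "fields": [
        {"name": "id", "type": "Edm.String", "key": True},
        {"name": "body", "type": "Edm.String", "searchable": True,
         "analyzer": "html_clean"},  # must match a defined analyzer name
    ],
}
```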

03_analyzer_testing

  • Comprehensive analyzer testing framework
  • Automated validation of tokenization results
  • Performance benchmarking and comparison tools

04_ngram_autocomplete

  • Edge n-gram tokenizer for autocomplete functionality
  • Separate index and search analyzers
  • Autocomplete query implementation and testing
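The core idea, sketched locally as an approximation of the edge n-gram token filter: the index analyzer expands each term into its prefixes, while the search analyzer leaves the query untouched, so partial queries match without wildcards.

```python
def edge_ngrams(term, min_gram=2, max_gram=7):
    """Local approximation of an edge n-gram filter (front edge only)."""
    return [term[:n] for n in range(min_gram, min(max_gram, len(term)) + 1)]

# Index-time analysis stores every prefix of "search"...
indexed = set(edge_ngrams("search"))

# ...so a query analyzed without n-grams, e.g. the literal term "sea",
# still matches one of the stored tokens.
```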

05_basic_scoring

  • Field weight configuration and testing
  • Simple scoring profile implementation
  • Result ranking comparison with and without scoring
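A basic weighted profile looks like this in the REST index schema (sketched as a Python dict; the profile and field names are examples). A match in `title` contributes roughly five times what a `tags` match does:

```python
scoring_profile = {
    "name": "title_boost",
    "text": {
        "weights": {"title": 5.0, "description": 2.0, "tags": 1.0}
    },
}

# Queries opt in by name: append &scoringProfile=title_boost to a REST
# query, or pass the equivalent scoring profile parameter in the SDKs.
```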

06_advanced_scoring

  • Multiple scoring functions (freshness, magnitude, distance)
  • Function aggregation strategies
  • Complex business logic implementation
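Combining functions follows the same schema; here is a sketch with a freshness and a magnitude function aggregated by sum. The field names are examples; the `type`, `interpolation`, and `functionAggregation` values are the documented enumerations:

```python
advanced_profile = {
    "name": "business_boost",
    "functions": [
        {
            "type": "freshness",
            "fieldName": "lastUpdated",    # an Edm.DateTimeOffset field
            "boost": 2,
            "interpolation": "quadratic",
            "freshness": {"boostingDuration": "P30D"},  # boost the last 30 days
        },
        {
            "type": "magnitude",
            "fieldName": "rating",         # a numeric field
            "boost": 3,
            "interpolation": "linear",
            "magnitude": {"boostingRangeStart": 0, "boostingRangeEnd": 5},
        },
    ],
    # Other options: average, minimum, maximum, firstMatching.
    "functionAggregation": "sum",
}
```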

07_location_scoring

  • Geographic distance-based scoring
  • Location parameter handling
  • Restaurant/business finder implementation
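A distance function references a geography field and a named parameter that the query supplies at runtime. Sketch of the schema shape (profile, field, and parameter names are examples):

```python
distance_profile = {
    "name": "near_me",
    "functions": [{
        "type": "distance",
        "fieldName": "location",          # an Edm.GeographyPoint field
        "boost": 2,
        "interpolation": "linear",
        "distance": {
            "referencePointParameter": "userLocation",
            "boostingDistance": 10,       # kilometers from the reference point
        },
    }],
}

# At query time the reference point is passed as lon,lat, e.g.
#   scoringProfile=near_me&scoringParameter=userLocation--122.33,47.61
# (a dash separates name from value; the second dash is the negative longitude).
```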

08_performance_optimization

  • Performance measurement and monitoring
  • Analyzer and scoring profile optimization
  • A/B testing framework for configuration comparison

Best Practices Demonstrated

Analyzer Design

  • Start with built-in analyzers before creating custom ones
  • Use appropriate analyzers for different field types
  • Test thoroughly with representative data
  • Monitor performance impact

Scoring Profile Design

  • Balance field weights appropriately
  • Use scoring functions judiciously
  • Test with real user queries
  • Monitor relevance metrics

Performance Optimization

  • Measure baseline performance
  • Use separate index/search analyzers when beneficial
  • Apply complex analysis selectively
  • Implement caching strategies

Troubleshooting

Common Issues

  1. Analyzer Not Found: Ensure the analyzer is defined in the index schema before any field references it
  2. Invalid Tokens: Use Analyze API to debug tokenization
  3. Poor Performance: Simplify analyzers or use selective application
  4. Scoring Not Applied: Verify scoring profile parameter in queries
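For the tokenization issues above, the Analyze API is the primary debugging tool. A stdlib-only sketch of the request (service name, index, key, and API version are placeholders for your own values):

```python
import json
import urllib.request

SERVICE = "your-search-service"
INDEX = "analyzer-test-index"
API_VERSION = "2024-07-01"   # use the api-version your service targets

url = (f"https://{SERVICE}.search.windows.net/indexes/{INDEX}"
       f"/analyze?api-version={API_VERSION}")
body = json.dumps({"text": "Dr. Smith's <b>office</b>",
                   "analyzer": "en.microsoft"}).encode("utf-8")
request = urllib.request.Request(
    url, data=body, method="POST",
    headers={"Content-Type": "application/json",
             "api-key": "your-admin-key"})

# urllib.request.urlopen(request) returns a JSON body whose "tokens" array
# lists each token with its character offsets, which is usually the fastest
# way to see why a term does or does not match.
```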

Debugging Tools

  • Analyze API: Test text processing step by step
  • Performance Monitoring: Measure indexing and query performance
  • Validation Scripts: Automated testing of expected behavior
  • Logging: Detailed operation logging for troubleshooting

Contributing

When adding new samples:

  1. Follow the established naming convention
  2. Include comprehensive documentation
  3. Add validation and error handling
  4. Test with multiple data scenarios
  5. Update this README with sample descriptions

These code samples provide practical, hands-on experience with text analysis and scoring in Azure AI Search, demonstrating real-world implementation patterns and best practices.