Best Practices - Advanced Querying¶
Overview¶
This guide provides best practices for implementing advanced querying techniques in Azure AI Search. Following these guidelines will help you create high-performance, relevant, and maintainable search experiences.
Query Construction Best Practices¶
1. Choose the Right Query Type¶
Simple Query Syntax¶
// ✅ Good: Use simple syntax for basic searches
{
"search": "machine learning",
"queryType": "simple",
"searchMode": "any"
}
When to use:
- Basic text search with the + (AND), | (OR), and - (NOT) operators
- User-facing search boxes where syntax errors should be avoided
- Prefix searches with a trailing * (the ? single-character wildcard requires full Lucene syntax)
Full Lucene Query Syntax¶
// ✅ Good: Use full Lucene for advanced features
{
"search": "title:(artificial intelligence) AND content:(machine learning)^2",
"queryType": "full"
}
When to use:
- Field-specific searches
- Term and phrase boosting
- Fuzzy search and proximity search
- Complex boolean logic with parentheses
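The choice between the two syntaxes can be sketched as a small heuristic. This is a minimal sketch, not part of any SDK: chooseQueryType, buildRequest, and the pattern list are hypothetical names, and the patterns only approximate Lucene features rather than parse them. It is meant for power-user input; a general search box is usually safer on simple syntax throughout.

```javascript
// Hypothetical helper: pick a queryType from the features a query string uses.
// The patterns are heuristics, not a full Lucene parser.
const LUCENE_ONLY_PATTERNS = [
  /\b\w+:/,             // field-scoped term, e.g. title:azure
  /\^\d/,               // boost, e.g. learning^2
  /\w~\d?/,             // fuzzy term, e.g. machne~1
  /"~\d+/,              // proximity, e.g. "machine learning"~5
  /\b(?:AND|OR|NOT)\b/  // uppercase boolean operators
];

function chooseQueryType(searchText) {
  const needsFull = LUCENE_ONLY_PATTERNS.some((pattern) => pattern.test(searchText));
  return needsFull ? "full" : "simple";
}

function buildRequest(searchText) {
  return { search: searchText, queryType: chooseQueryType(searchText), searchMode: "any" };
}
```

Anything containing a colon (a URL or a time such as 10:30) is routed to full syntax by this sketch, so input sent down that path still needs validation or escaping.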
2. Effective Boosting Strategies¶
Term Boosting¶
// ✅ Good: Boost important terms appropriately
{
"search": "artificial^2 intelligence machine^1.5 learning",
"queryType": "full"
}
// ❌ Avoid: Excessive boosting that skews results
{
"search": "artificial^10 intelligence^8 machine^5 learning",
"queryType": "full"
}
Field Boosting¶
// ✅ Good: Prioritize title matches over content
{
"search": "title:(machine learning)^3 OR content:(machine learning)",
"queryType": "full"
}
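Field-boosted queries like the one above are easy to get wrong when assembled by hand. The following is a minimal sketch of a builder; buildFieldBoostedQuery is a hypothetical name, and it assumes the terms have already been escaped for Lucene syntax.

```javascript
// Hypothetical helper: build a field-scoped, boosted Lucene query.
// fieldBoosts maps field name -> boost factor; a boost of 1 is omitted.
function buildFieldBoostedQuery(terms, fieldBoosts) {
  return Object.entries(fieldBoosts)
    .map(([field, boost]) => {
      const clause = `${field}:(${terms})`;
      return boost === 1 ? clause : `${clause}^${boost}`;
    })
    .join(" OR ");
}

const boostedRequest = {
  search: buildFieldBoostedQuery("machine learning", { title: 3, content: 1 }),
  queryType: "full"
};
```

Keeping the boosts in one object makes it easier to review and adjust them in a single place.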
Scoring Profile Usage¶
// ✅ Good: Use scoring profiles for consistent relevance
{
"search": "machine learning",
"scoringProfile": "boost-recent-popular"
}
3. Fuzzy Search Optimization¶
Appropriate Edit Distance¶
// ✅ Good: Use edit distance 1 for most cases
{
"search": "machne~1 learning",
"queryType": "full"
}
// ❌ Avoid: Edit distance above 2, the maximum that fuzzy search accepts
{
"search": "machne~3 learning",
"queryType": "full"
}
Combine with Other Operators¶
// ✅ Good: Combine fuzzy search with exact matches
{
"search": "(machine learning) OR (machne~1 learing~1)",
"queryType": "full"
}
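One way to apply these fuzzy search guidelines consistently is to add the operator in code. This is a minimal sketch under stated assumptions: addFuzziness is a hypothetical helper, the input is plain whitespace-separated terms with no Lucene operators, and the minimum term length of 4 is an arbitrary choice to keep very short terms exact.

```javascript
// Hypothetical helper: append a fuzzy operator to each sufficiently long term.
// Assumes plain whitespace-separated terms with no Lucene operators in them.
function addFuzziness(searchText, distance = 1, minTermLength = 4) {
  // Clamp to the 0-2 range that fuzzy search accepts.
  const d = Math.min(2, Math.max(0, Math.trunc(distance)));
  return searchText
    .split(/\s+/)
    .filter(Boolean)
    .map((term) => (d > 0 && term.length >= minTermLength ? `${term}~${d}` : term))
    .join(" ");
}
```

The result can be combined with the exact query using OR, as in the example above, so that exact matches still rank first.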
4. Wildcard Search Guidelines¶
Efficient Wildcard Patterns¶
// ✅ Good: Prefix matching (trailing wildcard) is the most efficient
{
"search": "tech*",
"queryType": "full"
}
// ⚠️ Caution: Suffix matching (leading wildcard) is slower
{
"search": "*ology",
"queryType": "full"
}
// ❌ Avoid: Leading wildcards on short terms
{
"search": "*ai*",
"queryType": "full"
}
Combine with Field Restrictions¶
// ✅ Good: Limit wildcard searches to specific fields
{
"search": "title:tech* OR category:tech*",
"queryType": "full"
}
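The wildcard guidelines can be enforced before a query is sent. The following is a minimal sketch; buildPrefixQuery is a hypothetical helper, the minimum prefix length of 3 is an assumed project convention, and the prefix is assumed to contain no other Lucene special characters.

```javascript
// Hypothetical helper: build a field-scoped prefix query and reject costly patterns.
function buildPrefixQuery(prefix, fields, minPrefixLength = 3) {
  if (/^[*?]/.test(prefix)) {
    throw new Error("Leading wildcards are not allowed");
  }
  const stem = prefix.replace(/\*+$/, ""); // drop any trailing * the caller added
  if (stem.length < minPrefixLength) {
    throw new Error(`Prefix must be at least ${minPrefixLength} characters`);
  }
  return fields.map((field) => `${field}:${stem}*`).join(" OR ");
}
```

For example, buildPrefixQuery("tech", ["title", "category"]) produces the field-restricted query shown above.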
5. Proximity Search Best Practices¶
Reasonable Proximity Distance¶
// ✅ Good: Use appropriate proximity distance
{
"search": "\"machine learning\"~5",
"queryType": "full"
}
// ❌ Avoid: Excessive proximity distance
{
"search": "\"machine learning\"~50",
"queryType": "full"
}
Combine with Exact Phrases¶
// ✅ Good: Boost exact phrases over proximity matches
{
"search": "(\"machine learning\")^2 OR (\"machine learning\"~3)",
"queryType": "full"
}
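The exact-phrase-plus-proximity pattern can be generated rather than typed. This is a minimal sketch: buildPhraseQuery is a hypothetical helper, and the default slop, boost, and the cap of 10 are assumptions chosen for illustration, not service limits.

```javascript
// Hypothetical helper: boost the exact phrase and add a bounded proximity clause.
function buildPhraseQuery(phrase, { slop = 3, exactBoost = 2, maxSlop = 10 } = {}) {
  // Escape quotes and backslashes so the phrase cannot break out of its quotes.
  const quoted = `"${phrase.replace(/(["\\])/g, "\\$1")}"`;
  const safeSlop = Math.min(maxSlop, Math.max(0, Math.trunc(slop)));
  const exact = `(${quoted})^${exactBoost}`;
  return safeSlop === 0 ? exact : `${exact} OR (${quoted}~${safeSlop})`;
}
```

Capping the distance in one place keeps an accidental value such as 50 from reaching the service.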
Relevance Tuning Best Practices¶
1. Scoring Profile Design¶
Balanced Weight Distribution¶
// ✅ Good: Balanced field weights
{
"scoringProfiles": [
{
"name": "balanced-relevance",
"text": {
"weights": {
"title": 2.0,
"description": 1.5,
"content": 1.0,
"tags": 1.2
}
}
}
]
}
// ❌ Avoid: Extreme weight differences
{
"text": {
"weights": {
"title": 10.0,
"content": 0.1
}
}
}
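A simple lint check can flag extreme weight differences before an index definition is deployed. This is a minimal sketch; checkWeightBalance is a hypothetical helper and the maximum ratio of 5 is an assumed team convention, not a limit imposed by the service.

```javascript
// Hypothetical lint check for the "weights" object of a scoring profile.
// maxRatio is an assumed project convention, not a service limit.
function checkWeightBalance(weights, maxRatio = 5) {
  const values = Object.values(weights);
  if (values.length === 0 || values.some((w) => !(w > 0))) {
    return { ok: false, reason: "weights must be positive numbers" };
  }
  const ratio = Math.max(...values) / Math.min(...values);
  return ratio <= maxRatio
    ? { ok: true, ratio }
    : { ok: false, ratio, reason: `max/min weight ratio ${ratio} exceeds ${maxRatio}` };
}
```

Under this threshold the balanced profile above passes, while the 10.0 versus 0.1 example is rejected.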
Effective Function Scoring¶
// ✅ Good: Reasonable freshness boosting
{
"functions": [
{
"type": "freshness",
"fieldName": "publishedDate",
"boost": 1.5,
"interpolation": "linear",
"freshness": {
"boostingDuration": "P30D"
}
}
]
}
2. Multi-Field Search Strategies¶
Prioritized Field Searching¶
// ✅ Good: Search most important fields first
{
"search": "title:(artificial intelligence)^3 OR description:(artificial intelligence)^2 OR content:(artificial intelligence)",
"queryType": "full"
}
Cross-Field Boosting¶
// ✅ Good: Boost documents where terms appear in multiple fields
{
"search": "(title:machine AND description:learning)^2 OR (machine learning)",
"queryType": "full"
}
Performance Optimization¶
1. Query Efficiency¶
Limit Search Scope¶
// ✅ Good: Use searchFields to limit scope
{
"search": "machine learning",
"searchFields": "title,description,content"
}
// ✅ Good: Combine with filters for better performance
{
"search": "machine learning",
"filter": "category eq 'Technology' and publishedDate ge 2023-01-01T00:00:00Z"
}
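When filter values come from user input, building the filter string in code avoids quoting mistakes. The following is a minimal sketch: odataString and buildFilter are hypothetical helpers, and the category and publishedDate field names are taken from the example above.

```javascript
// Hypothetical helpers for composing an OData filter string.
function odataString(value) {
  // OData string literals escape a single quote by doubling it.
  return `'${String(value).replace(/'/g, "''")}'`;
}

function buildFilter({ category, publishedSince } = {}) {
  const clauses = [];
  if (category) clauses.push(`category eq ${odataString(category)}`);
  if (publishedSince) clauses.push(`publishedDate ge ${publishedSince.toISOString()}`);
  return clauses.length > 0 ? clauses.join(" and ") : undefined;
}
```

Returning undefined when no clause applies lets the caller omit the filter option entirely.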
Optimize Result Sets¶
// ✅ Good: Request only needed fields
{
"search": "machine learning",
"select": "id,title,description,rating",
"top": 20
}
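Result-set limits are often paired with paging. This is a minimal sketch of translating a page number into top and skip values; buildPageOptions is a hypothetical helper, and the page-size cap of 50 is an assumption made for illustration.

```javascript
// Hypothetical helper: translate a 1-based page number into top/skip options.
function buildPageOptions(page, pageSize = 20, select = ["id", "title", "description", "rating"]) {
  const safePage = Math.max(1, Math.trunc(page) || 1);
  const safeSize = Math.min(50, Math.max(1, Math.trunc(pageSize) || 20));
  return { select, top: safeSize, skip: (safePage - 1) * safeSize };
}
```

Clamping the inputs means a malformed page parameter falls back to the first page instead of producing an invalid request.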
2. Caching Strategies¶
Cache Common Queries¶
// ✅ Good: Cache frequent search patterns
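const queryCache = new Map();
async function cachedSearch(searchText, options = {}) {
const cacheKey = `${searchText}_${options.filter}_${options.orderBy}`;
if (queryCache.has(cacheKey)) {
return queryCache.get(cacheKey);
}
const response = await searchClient.search(searchText, options);
// response.results is a one-shot async iterable, so collect it before caching
const results = [];
for await (const result of response.results) results.push(result);
queryCache.set(cacheKey, results);
return results;
}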
Cache Suggestions¶
// ✅ Good: Cache suggestion results
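const suggestionCache = new Map();
async function cachedSuggestions(partialText) {
const suggestionKey = `suggest_${partialText}`;
if (suggestionCache.has(suggestionKey)) {
return suggestionCache.get(suggestionKey);
}
const suggestions = await getSuggestions(partialText);
suggestionCache.set(suggestionKey, suggestions);
return suggestions;
}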
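A plain Map grows without bound and never expires stale entries. The following is a minimal sketch of a bounded cache with time-based expiry that could stand in for the Map; TtlCache and its option names are hypothetical, and the defaults are arbitrary.

```javascript
// Hypothetical bounded cache with time-based expiry.
class TtlCache {
  constructor({ ttlMs = 60000, maxEntries = 500, now = Date.now } = {}) {
    this.ttlMs = ttlMs;
    this.maxEntries = maxEntries;
    this.now = now; // injectable clock, which makes expiry testable
    this.entries = new Map();
  }

  get(key) {
    const entry = this.entries.get(key);
    if (!entry) return undefined;
    if (this.now() >= entry.expiresAt) {
      this.entries.delete(key); // stale: drop and report a miss
      return undefined;
    }
    return entry.value;
  }

  set(key, value) {
    this.entries.delete(key); // re-inserting moves the key to the newest position
    if (this.entries.size >= this.maxEntries) {
      // Map iterates in insertion order, so the first key is the oldest.
      this.entries.delete(this.entries.keys().next().value);
    }
    this.entries.set(key, { value, expiresAt: this.now() + this.ttlMs });
  }
}
```

Eviction here is by insertion age rather than by last access, which keeps the sketch short; a short TTL matters most for indexes that are updated frequently.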
3. Index Design for Advanced Queries¶
Appropriate Analyzers¶
// ✅ Good: Use language-specific analyzers
{
"name": "title",
"type": "Edm.String",
"searchable": true,
"analyzer": "en.microsoft"
}
// ✅ Good: Use keyword analyzer for exact matching
{
"name": "productCode",
"type": "Edm.String",
"searchable": true,
"analyzer": "keyword"
}
Strategic Field Configuration¶
// ✅ Good: Configure fields based on usage patterns
{
"name": "content",
"type": "Edm.String",
"searchable": true,
"retrievable": false, // Don't return large content in results
"analyzer": "en.microsoft"
}
Search Experience Best Practices¶
1. Suggestion Implementation¶
Effective Suggester Configuration¶
// ✅ Good: Include relevant fields in suggester
{
"suggesters": [
{
"name": "product-suggester",
"searchMode": "analyzingInfixMatching",
"sourceFields": ["title", "category", "brand", "tags"]
}
]
}
Smart Suggestion Logic¶
// ✅ Good: Implement intelligent suggestion fallback
async function getSmartSuggestions(partialText) {
// Try exact suggestions first
let suggestions = await getSuggestions(partialText);
if (suggestions.length < 3) {
// Fall back to fuzzy suggestions
const fuzzyQuery = `${partialText}~1`;
const fuzzyResults = await searchClient.search(fuzzyQuery, {
queryType: "full",
top: 5,
select: "title"
});
// Add fuzzy results to suggestions
for await (const result of fuzzyResults.results) {
suggestions.push({
text: result.document.title,
queryPlusText: result.document.title
});
}
}
return suggestions;
}
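Combining exact and fuzzy suggestions can produce duplicates, since a fuzzy search may return titles that were already suggested. This is a minimal sketch of a merge step; mergeSuggestions is a hypothetical helper that assumes each suggestion has a text property, as in the function above.

```javascript
// Hypothetical helper: merge suggestion lists, dropping case-insensitive duplicates.
function mergeSuggestions(primary, fallback, limit = 5) {
  const seen = new Set();
  const merged = [];
  for (const suggestion of [...primary, ...fallback]) {
    const key = suggestion.text.trim().toLowerCase();
    if (key === "" || seen.has(key)) continue;
    seen.add(key);
    merged.push(suggestion);
    if (merged.length === limit) break;
  }
  return merged;
}
```

Primary suggestions come first in the input, so they keep their rank ahead of fuzzy fallbacks.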
2. Hit Highlighting¶
Effective Highlighting Configuration¶
// ✅ Good: Highlight relevant fields with appropriate tags
{
"search": "machine learning",
"highlight": "title,description,content",
"highlightPreTag": "<mark>",
"highlightPostTag": "</mark>",
"top": 10
}
Smart Highlighting Display¶
// ✅ Good: Prioritize highlighted fields in display
function formatHighlights(result) {
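// SDK results expose highlights directly; raw REST responses use '@search.highlights'
const highlights = result.highlights || result['@search.highlights'];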
if (highlights) {
// Prioritize title highlights
if (highlights.title && highlights.title.length > 0) {
return highlights.title[0];
}
// Fall back to description highlights
if (highlights.description && highlights.description.length > 0) {
return highlights.description[0];
}
// Finally use content highlights
if (highlights.content && highlights.content.length > 0) {
return highlights.content[0];
}
}
// Return original text if no highlights
return result.document.title || result.document.description;
}
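Highlight fragments are usually inserted into the page as HTML, so any markup already present in the indexed text would be rendered too. The following is a minimal sketch of one way to handle that, assuming the indexed text may contain markup: escape the whole fragment, then restore only the configured highlight tags. escapeHtml and renderHighlight are hypothetical helpers.

```javascript
const HTML_ESCAPES = { "&": "&amp;", "<": "&lt;", ">": "&gt;", '"': "&quot;", "'": "&#39;" };

function escapeHtml(text) {
  return text.replace(/[&<>"']/g, (ch) => HTML_ESCAPES[ch]);
}

// Escape everything, then restore only the highlight tags the query asked for.
function renderHighlight(fragment, preTag = "<mark>", postTag = "</mark>") {
  return escapeHtml(fragment)
    .split(escapeHtml(preTag)).join(preTag)
    .split(escapeHtml(postTag)).join(postTag);
}
```

The pre and post tags passed here should match highlightPreTag and highlightPostTag in the query.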
3. Query Expansion and Enhancement¶
Automatic Query Enhancement¶
// ✅ Good: Implement intelligent query expansion
function enhanceQuery(originalQuery) {
const synonyms = {
'AI': ['artificial intelligence', 'machine learning'],
'ML': ['machine learning', 'artificial intelligence'],
'tech': ['technology', 'technical']
};
let enhancedQuery = originalQuery;
// Add synonyms with lower boost
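Object.entries(synonyms).forEach(([term, syns]) => {
// Match whole words only, so 'AI' does not fire on 'maintain'
if (new RegExp(`\\b${term}\\b`, 'i').test(originalQuery)) {
// Quote each synonym so multi-word synonyms match as phrases
const synonymQuery = syns.map(syn => `"${syn}"`).join(' OR ');
enhancedQuery += ` OR (${synonymQuery})^0.5`;
}
});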
return enhancedQuery;
}
Error Handling and Resilience¶
1. Query Validation¶
Input Sanitization¶
// ✅ Good: Sanitize user input
function sanitizeQuery(userInput, queryType = 'simple') {
// Remove potentially problematic characters
let sanitized = userInput.replace(/[<>]/g, '');
// Escape special Lucene characters (including /) if using full syntax
if (queryType === 'full') {
sanitized = sanitized.replace(/([+\-&|!(){}[\]^"~*?:\\\/])/g, '\\$1');
}
return sanitized;
}
Query Complexity Limits¶
// ✅ Good: Limit query complexity
function validateQueryComplexity(query) {
const maxLength = 1000;
const maxClauses = 50;
if (query.length > maxLength) {
throw new Error('Query too long');
}
// Count uppercase operators as whole words, so 'brand' or 'word' are not counted
const clauseCount = (query.match(/\b(?:AND|OR|NOT)\b/g) || []).length + 1;
if (clauseCount > maxClauses) {
throw new Error('Query too complex');
}
return true;
}
2. Graceful Degradation¶
Fallback Query Strategies¶
// ✅ Good: Implement query fallback
async function robustSearch(query, options) {
try {
// Try advanced query first
return await searchClient.search(query, {
...options,
queryType: 'full'
});
} catch (error) {
if (error.message.includes('syntax')) {
// Fall back to simple query
console.warn('Advanced query failed, falling back to simple syntax');
return await searchClient.search(query, {
...options,
queryType: 'simple'
});
}
throw error;
}
}
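Retrying with simple syntax sends the same string, so any Lucene operators in it are still present in the fallback query. This is a minimal sketch of stripping them first; toSimpleQuery is a hypothetical helper and its patterns are heuristics rather than a parser.

```javascript
// Hypothetical helper: reduce a Lucene query to plain terms for a simple-syntax retry.
function toSimpleQuery(luceneQuery) {
  return luceneQuery
    .replace(/\b\w+:/g, " ")             // field scopes such as title:
    .replace(/\^\d+(\.\d+)?/g, " ")      // boosts such as ^2 or ^1.5
    .replace(/~\d*/g, " ")               // fuzzy and proximity markers
    .replace(/\b(?:AND|OR|NOT)\b/g, " ") // uppercase boolean operators
    .replace(/[()"*?]/g, " ")            // grouping, quotes, wildcards
    .replace(/\s+/g, " ")
    .trim();
}
```

In the fallback branch above, the retry could then pass toSimpleQuery(query) instead of query.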
Monitoring and Analytics¶
1. Query Performance Tracking¶
Performance Metrics Collection¶
// ✅ Good: Track query performance
async function monitoredSearch(query, options) {
const startTime = Date.now();
const queryHash = hashQuery(query, options);
try {
const results = await searchClient.search(query, options);
const duration = Date.now() - startTime;
// Log performance metrics
logMetrics({
queryHash,
duration,
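resultCount: results.count, // undefined unless options.includeTotalCount is true
query: query.substring(0, 100), // Truncated to bound log size; redact if queries may hold personal data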
success: true
});
return results;
} catch (error) {
const duration = Date.now() - startTime;
logMetrics({
queryHash,
duration,
error: error.message,
success: false
});
throw error;
}
}
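The hashQuery helper used above is not defined in this guide. The following is a minimal sketch of one possible implementation, assuming a short grouping key is all that is needed: an FNV-1a style 32-bit hash over the UTF-16 code units of the normalized query and serialized options. It is not cryptographic, and short queries can be guessed from any unsalted hash, so a keyed cryptographic hash would be a better fit where privacy is the goal.

```javascript
// Hypothetical hashQuery: FNV-1a style 32-bit hash over UTF-16 code units.
function hashQuery(query, options = {}) {
  // Normalize so that trivial differences in case or spacing share one hash.
  const normalized = query.trim().toLowerCase().replace(/\s+/g, " ");
  const input = `${normalized}|${JSON.stringify(options)}`;
  let hash = 0x811c9dc5; // FNV offset basis
  for (let i = 0; i < input.length; i++) {
    hash ^= input.charCodeAt(i);
    hash = Math.imul(hash, 0x01000193) >>> 0; // multiply by the FNV prime, keep 32 bits
  }
  return hash.toString(16).padStart(8, "0");
}
```

Because JSON.stringify depends on property order, options should be built in a consistent order for equal requests to hash equally.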
2. Search Analytics¶
User Behavior Tracking¶
// ✅ Good: Track search patterns
function trackSearchBehavior(query, results, userActions, queryType = 'simple') {
analytics.track('search_performed', {
query: hashQuery(query), // Hash for privacy
resultCount: results.count,
hasResults: results.count > 0,
queryType, // Passed in by the caller; search results do not carry the query type
timestamp: new Date().toISOString()
});
// Track user interactions with results
userActions.forEach(action => {
analytics.track('search_result_interaction', {
queryHash: hashQuery(query),
action: action.type, // click, view, etc.
resultPosition: action.position,
resultId: action.resultId
});
});
}
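Tracked events only become useful once they are aggregated. This is a minimal sketch of three common summaries, assuming the events have been collected into arrays with the same property names as the tracking calls above; summarizeInteractions is a hypothetical helper.

```javascript
// Hypothetical aggregation over the events tracked above.
// searches: search_performed payloads; interactions: search_result_interaction payloads.
function summarizeInteractions(searches, interactions) {
  const searched = new Set(searches.map((e) => e.query));
  const clicks = interactions.filter((e) => e.action === "click");
  const clicked = new Set(clicks.map((e) => e.queryHash));
  const zeroResults = searches.filter((e) => !e.hasResults).length;
  const positionSum = clicks.reduce((sum, e) => sum + e.resultPosition, 0);
  return {
    zeroResultRate: searches.length ? zeroResults / searches.length : 0,
    // Share of distinct queries that led to at least one click.
    clickThroughRate: searched.size ? clicked.size / searched.size : 0,
    avgClickPosition: clicks.length ? positionSum / clicks.length : null
  };
}
```

A rising zero-result rate or average click position is a hint that relevance tuning or synonym coverage needs attention.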
Testing Strategies¶
1. Query Testing Framework¶
Automated Query Testing¶
// ✅ Good: Implement comprehensive query testing
const queryTestSuite = [
{
name: 'basic_text_search',
query: 'machine learning',
expectedMinResults: 5,
maxExecutionTime: 1000
},
{
name: 'boosted_search',
query: 'artificial^2 intelligence',
queryType: 'full',
expectedMinResults: 3,
maxExecutionTime: 1500
},
{
name: 'fuzzy_search',
query: 'machne~1 learning',
queryType: 'full',
expectedMinResults: 1,
maxExecutionTime: 2000
}
];
async function runQueryTests() {
const results = [];
for (const test of queryTestSuite) {
const startTime = Date.now();
try {
const searchResults = await searchClient.search(test.query, {
queryType: test.queryType || 'simple',
includeTotalCount: true, // Without this, count is undefined and every test fails
top: 50
});
const executionTime = Date.now() - startTime;
const resultCount = searchResults.count || 0;
const passed =
resultCount >= test.expectedMinResults &&
executionTime <= test.maxExecutionTime;
results.push({
name: test.name,
passed,
resultCount,
executionTime,
expectedMinResults: test.expectedMinResults,
maxExecutionTime: test.maxExecutionTime
});
} catch (error) {
results.push({
name: test.name,
passed: false,
error: error.message
});
}
}
return results;
}
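The array returned by runQueryTests is easier to act on with a short summary. This is a minimal sketch that could feed a CI log; summarizeTestResults is a hypothetical helper that relies only on the fields pushed into the results array above.

```javascript
// Hypothetical reporter for the output of runQueryTests.
function summarizeTestResults(results) {
  const failed = results.filter((r) => !r.passed);
  const lines = failed.map((r) =>
    r.error
      ? `${r.name}: error: ${r.error}`
      : `${r.name}: ${r.resultCount} results (min ${r.expectedMinResults}), ` +
        `${r.executionTime} ms (max ${r.maxExecutionTime})`
  );
  return { total: results.length, passed: results.length - failed.length, failed: failed.length, lines };
}
```

A caller might print the lines and exit with a non-zero status when failed is greater than zero.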
2. A/B Testing for Relevance¶
Relevance Testing Framework¶
// ✅ Good: Test different relevance configurations
async function testRelevanceConfigurations(query, testConfigs) {
const results = {};
for (const config of testConfigs) {
const searchResults = await searchClient.search(query, {
scoringProfile: config.scoringProfile,
queryType: config.queryType,
top: 10
});
// results is an async iterable, so collect it into an array before reuse
const docs = [];
for await (const result of searchResults.results) {
docs.push(result);
}
results[config.name] = {
results: docs,
avgScore: calculateAverageScore(docs),
topResultScore: getTopResultScore(docs)
};
}
return results;
}
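The calculateAverageScore and getTopResultScore helpers are referenced above but not defined in this guide. The following is a minimal sketch of what they might look like, assuming the results have been collected into an array and each result carries a numeric score property.

```javascript
// Hypothetical implementations of the helpers referenced above.
// Each expects an array of results carrying a numeric `score`.
function calculateAverageScore(results) {
  if (results.length === 0) return 0;
  return results.reduce((sum, r) => sum + r.score, 0) / results.length;
}

function getTopResultScore(results) {
  return results.length > 0 ? Math.max(...results.map((r) => r.score)) : 0;
}
```

Raw scores are only comparable within the same query and index, so these numbers are best used to compare configurations on identical queries.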
Related Resources¶
Module Documentation¶
- Prerequisites - Required setup and knowledge
- Main Documentation - Complete module overview
- Practice & Implementation - Hands-on exercises
- Troubleshooting - Common issues and solutions
- Code Samples - Working examples in multiple languages
When You Need Help¶
- Query Syntax Issues: Check the Troubleshooting Guide
- Performance Problems: Review Performance Analysis Examples
- Complex Scenarios: Explore Advanced Query Examples
Following these practices helps keep advanced search experiences efficient, relevant, and maintainable.