Query Operations Troubleshooting - Module 4: Simple Queries
Common Query Issues
Issue: No results returned for valid queries
Symptoms:
- Query executes successfully but returns zero results
- Expected documents are not found
- Similar queries work in other contexts
Common Causes:
- Field not marked as searchable in index schema
- Incorrect field names in searchFields parameter
- Analyzer mismatch between indexing and querying
- Case sensitivity issues
- Special characters not properly escaped
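Solutions:
1. Verify Field Configuration: Check that the fields you are searching are marked as "searchable": true in the index schema.
2. Test with Wildcard Search: Run a match-all query ("search": "*") to confirm that the index contains documents and that the query path works.
3. Verify Field Names: Confirm that every name in the searchFields parameter exactly matches a field name in the index schema.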
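
The three checks above can be sketched in a few lines. This is a minimal illustration, not an official utility: `checkSearchFields` is a hypothetical helper, and `schema` is a hand-written stand-in for the index definition JSON (an object with a `fields` array) that you would normally fetch from the service. The field names are assumptions.

```javascript
// Hypothetical helper: compares the names in a searchFields string
// against an index definition and reports problems.
function checkSearchFields(indexSchema, searchFields) {
  const byName = new Map(indexSchema.fields.map((f) => [f.name, f]));
  const requested = searchFields
    .split(',')
    .map((name) => name.trim())
    .filter(Boolean);
  return {
    // Names that do not appear in the schema (exact string comparison)
    unknown: requested.filter((name) => !byName.has(name)),
    // Names that exist but are not marked "searchable": true
    notSearchable: requested.filter(
      (name) => byName.has(name) && byName.get(name).searchable !== true
    ),
  };
}

// Hand-written stand-in for an index definition (assumed field names)
const schema = {
  fields: [
    { name: 'id', type: 'Edm.String', key: true, searchable: false },
    { name: 'title', type: 'Edm.String', searchable: true },
    { name: 'description', type: 'Edm.String', searchable: true },
  ],
};

const report = checkSearchFields(schema, 'title, Description, id');
console.log(report);

// Match-all request body for the wildcard test
const matchAllQuery = { search: '*', top: 5, count: true };
console.log(JSON.stringify(matchAllQuery));
```

If the match-all query returns documents but the real query does not, the problem is more likely in field configuration or analysis than in the index contents.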
Issue: Query syntax errors
Symptoms:
- HTTP 400 Bad Request errors
- "Invalid query syntax" error messages
- Queries fail to execute
Common Causes:
- Malformed boolean expressions
- Unescaped special characters
- Invalid field references
- Incorrect query type specification
Solutions:
1. Validate Boolean Syntax:
// Invalid
{
  "search": "hotel AND (luxury OR", // Missing closing parenthesis
  "queryType": "full"
}
// Valid
{
  "search": "hotel AND (luxury OR premium)",
  "queryType": "full"
}
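2. Escape Special Characters: In full Lucene syntax, prefix each reserved character (+ - & | ! ( ) { } [ ] ^ " ~ * ? : \ /) with a backslash when it should be matched literally.
3. Test with Simple Syntax First: Run the same terms with "queryType": "simple" to confirm they match before adding full Lucene operators.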
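
As a rough sketch of the steps above, the helpers below check parenthesis balance and escape reserved characters before a query is sent. Both `hasBalancedParens` and `escapeLucene` are hypothetical names introduced for illustration. The balance check is deliberately simple and ignores parentheses that are escaped or inside quoted phrases.

```javascript
// Returns true when every "(" has a matching ")" in the right order.
// Simplification: escaped or quoted parentheses are counted too.
function hasBalancedParens(expression) {
  let depth = 0;
  for (const ch of expression) {
    if (ch === '(') {
      depth++;
    } else if (ch === ')') {
      depth--;
      if (depth < 0) return false;
    }
  }
  return depth === 0;
}

// Prefixes each reserved full-Lucene character with a backslash so that
// user-supplied text is matched literally.
function escapeLucene(term) {
  return term.replace(/[+\-&|!(){}[\]^"~*?:\\\/]/g, '\\$&');
}

console.log(hasBalancedParens('hotel AND (luxury OR'));          // false
console.log(hasBalancedParens('hotel AND (luxury OR premium)')); // true

const userInput = 'AC/DC (live) 1+1';
const fullQuery = { search: escapeLucene(userInput), queryType: 'full' };
console.log(fullQuery.search); // AC\/DC \(live\) 1\+1

// Baseline for step 3: the same kind of request with simple syntax
const simpleQuery = { search: 'luxury hotel', queryType: 'simple' };
console.log(JSON.stringify(simpleQuery));
```

Escape only raw user terms. Escaping a whole expression would also neutralize the operators and wildcards you intend to use.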
Issue: Poor search relevance
Symptoms:
- Irrelevant results appearing first
- Expected results ranked too low
- Inconsistent ranking across similar queries
Common Causes:
- Inappropriate search mode selection
- Missing or incorrect field boosting
- Analyzer configuration issues
- Poor query construction
Solutions:
1. Adjust Search Mode:
// For broader matching
{
  "search": "luxury hotel",
  "searchMode": "any" // Matches documents with "luxury" OR "hotel"
}
// For more precise matching
{
  "search": "luxury hotel",
  "searchMode": "all" // Matches documents with "luxury" AND "hotel"
}
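2. Implement Field Boosting: Weight matches in important fields, such as a title, more heavily than matches in long body fields.
3. Use Scoring Profiles: Define a scoring profile in the index and reference it with the scoringProfile query parameter.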
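
The boosting steps above might look like the following request and index fragments, written here as JavaScript objects. The field names (`title`, `description`), the boost values, and the profile name `boostTitle` are all assumptions chosen for illustration. Treat this as a sketch of the shape rather than a tuned configuration.

```javascript
// Query-time boosting with full Lucene syntax: the ^3 suffix raises the
// weight of the term "luxury" when it matches in the title field.
const boostedQuery = {
  search: 'title:luxury^3 description:luxury',
  queryType: 'full',
  searchMode: 'any',
};

// Index-side alternative: a scoring profile that weights whole fields.
// This fragment belongs in the index definition, not in the query.
const scoringProfiles = [
  {
    name: 'boostTitle', // hypothetical profile name
    text: { weights: { title: 3, description: 1 } },
  },
];

// A query opts in to the profile by name.
const profiledQuery = {
  search: 'luxury hotel',
  scoringProfile: 'boostTitle',
};

console.log(JSON.stringify({ boostedQuery, scoringProfiles, profiledQuery }, null, 2));
```

Query-time boosts are convenient for experiments. A scoring profile keeps the weighting in one place once the values have settled.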
Performance Issues
Issue: Slow query response times
Symptoms:
- Queries take longer than expected to execute
- Timeout errors on complex queries
- Poor user experience due to delays
Common Causes:
- Complex query expressions
- Large result sets without pagination
- Inefficient field selection
- Resource constraints on search service
Solutions:
1. Optimize Query Complexity:
// Avoid overly complex boolean expressions
// Instead of:
{
  "search": "(title:(luxury OR premium OR deluxe) AND description:(hotel OR resort OR spa)) OR (tags:(5-star OR luxury) AND category:(accommodation OR lodging))",
  "queryType": "full"
}
// Use simpler approach:
{
  "search": "luxury hotel",
  "searchFields": "title,description,tags",
  "searchMode": "any"
}
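2. Implement Pagination: Request results one page at a time with the top and skip parameters instead of retrieving large result sets.
3. Optimize Field Selection: Use the select parameter to return only the fields the application needs.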
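
A small sketch of the pagination and field-selection steps above. `buildPagedQuery` is a hypothetical helper, and the `id,title` field list and page size of 20 are assumptions. Adjust both to your index.

```javascript
// Builds a request body for one page of results.
// page is 1-based; skip is derived from it.
function buildPagedQuery(search, page, pageSize = 20) {
  if (!Number.isInteger(page) || page < 1) {
    throw new RangeError('page must be an integer >= 1');
  }
  return {
    search,
    top: pageSize,               // number of results in this page
    skip: (page - 1) * pageSize, // results to skip before this page
    select: 'id,title',          // return only the fields the UI needs (assumed names)
    count: true,                 // ask for the total so the UI can show page counts
  };
}

const thirdPage = buildPagedQuery('luxury hotel', 3);
console.log(JSON.stringify(thirdPage));
```

Keeping `select` narrow reduces payload size, which tends to matter most when documents contain large text fields.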
Issue: High resource utilization
Symptoms:
- Search service showing high CPU or memory usage
- Throttling or rate limiting errors
- Degraded performance across all queries
Common Causes:
- Too many concurrent queries
- Inefficient query patterns
- Large index size relative to service tier
- Wildcard queries with leading wildcards
Solutions:
1. Optimize Wildcard Usage:
// Avoid leading wildcards (slower; in full Lucene syntax, suffix matching generally requires a regular expression such as /.*otel/)
{
  "search": "*otel",
  "queryType": "full"
}
// Prefer trailing wildcards (faster)
{
  "search": "hot*",
  "queryType": "full"
}
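2. Implement Query Caching: Cache the results of frequently repeated queries in the application so that identical requests do not reach the search service.
3. Monitor and Scale Service:
   - Monitor search service metrics
   - Consider upgrading service tier
   - Implement connection pooling
   - Use appropriate replica and partition configuration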
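
One way to approach the caching step above is a small in-memory cache with a time-to-live, keyed by the query body. `QueryCache` is a hypothetical class written for this sketch. It has no size limit and no eviction beyond expiry, so a production version would need both. The clock is injectable only to make the expiry behavior easy to demonstrate.

```javascript
class QueryCache {
  constructor(ttlMs = 60000, now = () => Date.now()) {
    this.ttlMs = ttlMs;
    this.now = now;
    this.entries = new Map();
  }

  // Sort top-level keys so that equivalent query objects share one cache key.
  static keyFor(query) {
    return JSON.stringify(Object.keys(query).sort().map((k) => [k, query[k]]));
  }

  get(query) {
    const key = QueryCache.keyFor(query);
    const entry = this.entries.get(key);
    if (!entry) return undefined;
    if (this.now() - entry.storedAt > this.ttlMs) {
      this.entries.delete(key); // expired
      return undefined;
    }
    return entry.results;
  }

  set(query, results) {
    this.entries.set(QueryCache.keyFor(query), {
      results,
      storedAt: this.now(),
    });
  }
}

// Demonstration with a fake clock
let fakeTime = 0;
const cache = new QueryCache(1000, () => fakeTime);
cache.set({ search: 'hot*', queryType: 'full' }, ['doc1']);

const hit = cache.get({ queryType: 'full', search: 'hot*' }); // same query, different key order
fakeTime = 2000;
const miss = cache.get({ search: 'hot*', queryType: 'full' }); // entry has expired

console.log(hit, miss);
```

In application code, check `cache.get(query)` before calling the search service and call `cache.set(query, results)` after a successful response. A short TTL limits how stale cached results can become after the index changes.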
Field-Specific Issues
Issue: Multi-field search not working as expected
Symptoms:
- Results missing from specific fields
- Inconsistent behavior across fields
- Field-specific queries failing
Common Causes:
- Fields not marked as searchable
- Different analyzers on different fields
- Field name typos in searchFields parameter
- Complex field structures not handled properly
Solutions:
1. Verify Field Configuration: Check each field's searchable attribute and analyzer configuration.
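2. Test Individual Fields: Run the same search against one field at a time by setting searchFields to a single field name, then compare the result counts.
3. Handle Complex Fields: Reference sub-fields of complex types by their full path, for example address/city.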
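
The per-field test above can be generated mechanically. In this sketch, `perFieldQueries` is a hypothetical helper and the field names, including the `address/city` sub-field path, are assumptions about the index.

```javascript
// Builds one request body per field so each field can be tested in isolation.
function perFieldQueries(search, fields) {
  return fields.map((field) => ({
    search,
    searchFields: field, // a single field per request
    count: true,         // the total makes per-field comparison easy
    top: 0,              // only the count is needed for this comparison
  }));
}

const fieldQueries = perFieldQueries('luxury', [
  'title',
  'description',
  'address/city', // sub-field of a complex type, referenced by path
]);

for (const q of fieldQueries) {
  console.log(JSON.stringify(q));
}
```

A field that returns zero matches while the others return results is the first place to check the searchable attribute and analyzer.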
Issue: Analyzer-related search problems
Symptoms:
- Expected matches not found
- Language-specific searches failing
- Inconsistent tokenization behavior
Common Causes:
- Wrong analyzer for content language
- Mismatch between index-time and query-time analyzers
- Custom analyzer configuration issues
- Case sensitivity problems
Solutions:
1. Test Analyzer Behavior:
POST https://[service-name].search.windows.net/indexes/[index-name]/analyze?api-version=2024-07-01
{
  "text": "Luxury Hotel in Seattle",
  "analyzer": "en.lucene"
}
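2. Verify Analyzer Configuration: Check the analyzer, searchAnalyzer, and indexAnalyzer properties of each field in the index definition.
3. Test with Different Analyzers: Run the same text through several analyzers with the Analyze API and compare the resulting tokens.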
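
To compare analyzers as described above, it can help to generate one Analyze request body per analyzer and reduce each response to a plain token list. `buildAnalyzeRequests` and `tokensOf` are hypothetical helpers. The `stubResponse` object is hand-written only to show the expected response shape (a `tokens` array) and is not real service output.

```javascript
// One Analyze API request body per analyzer name.
function buildAnalyzeRequests(text, analyzers) {
  return analyzers.map((analyzer) => ({ text, analyzer }));
}

// Reduces an Analyze API response to its token strings.
function tokensOf(analyzeResponse) {
  return analyzeResponse.tokens.map((t) => t.token);
}

const analyzeRequests = buildAnalyzeRequests('Luxury Hotel in Seattle', [
  'standard.lucene',
  'en.lucene',
  'keyword',
]);
console.log(JSON.stringify(analyzeRequests, null, 2));

// Hand-written stub showing the response shape only
const stubResponse = {
  tokens: [{ token: 'luxury', startOffset: 0, endOffset: 6, position: 0 }],
};
console.log(tokensOf(stubResponse));
```

POST each body to the analyze endpoint shown in step 1, then compare the token lists side by side. Differences in casing, stemming, or stop-word removal usually explain why a query term fails to match indexed content.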
Boolean Logic Issues
Issue: Boolean queries not working as expected
Symptoms:
- AND/OR logic not behaving correctly
- Unexpected results with boolean combinations
- Precedence issues in complex expressions
Common Causes:
- Incorrect operator precedence
- Missing parentheses for grouping
- Wrong query type (simple vs full)
- Escaped operators in simple syntax
Solutions:
1. Use Proper Query Type:
// The AND, OR, and NOT keywords require full Lucene syntax
// (simple syntax uses the +, |, and - operators instead)
{
  "search": "hotel AND (luxury OR premium)",
  "queryType": "full" // Required for the AND/OR/NOT keywords
}
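2. Add Explicit Grouping: Wrap each sub-expression in parentheses rather than relying on operator precedence.
3. Test Boolean Logic Step by Step:
// Test individual parts first.
// Assumes searchClient.search(body) is a thin wrapper that accepts a request body
// and resolves to an array of result documents.
async function testBooleanLogic() {
  const tests = [
    "hotel",
    "luxury",
    "hotel AND luxury",
    "hotel OR resort",
    "(hotel OR resort) AND luxury"
  ];
  for (const query of tests) {
    const results = await searchClient.search({ search: query, queryType: "full" });
    console.log(`Query "${query}": ${results.length} results`);
  }
}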
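
For the explicit-grouping step above, building expressions from small helpers keeps every sub-expression parenthesized, so the result does not depend on operator precedence. `anyOf` and `allOf` are hypothetical helper names. They assume the terms passed in are already escaped.

```javascript
// Joins terms with OR and wraps the group in parentheses.
const anyOf = (...terms) => `(${terms.join(' OR ')})`;

// Joins sub-expressions with AND and wraps the group in parentheses.
const allOf = (...parts) => `(${parts.join(' AND ')})`;

const groupedSearch = allOf(anyOf('hotel', 'resort'), 'luxury');
console.log(groupedSearch); // ((hotel OR resort) AND luxury)

const groupedQuery = { search: groupedSearch, queryType: 'full' };
console.log(JSON.stringify(groupedQuery));
```

The extra outer parentheses are redundant but harmless, and they make it safe to nest the result inside a larger expression.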
Diagnostic Techniques
Query Analysis
async function analyzeQuery(query) {
  console.log(`Analyzing query: ${JSON.stringify(query)}`);
  try {
    const startTime = Date.now();
    const results = await searchClient.search(query);
    const duration = Date.now() - startTime;
    console.log(`Results: ${results.length} documents in ${duration}ms`);
    // Log first few results for analysis
    const firstResults = results.slice(0, 3);
    firstResults.forEach((result, index) => {
      console.log(`Result ${index + 1}:`, {
        id: result.id,
        score: result['@search.score'],
        highlights: result['@search.highlights']
      });
    });
    return results;
  } catch (error) {
    console.error(`Query failed:`, error.message);
    throw error;
  }
}
Performance Profiling
class QueryProfiler {
  constructor() {
    this.metrics = [];
  }
  async profileQuery(query, description = '') {
    const startTime = performance.now();
    // Node.js only. The heap delta is a rough indicator because garbage collection may run mid-query.
    const startMemory = process.memoryUsage().heapUsed;
    try {
      const results = await searchClient.search(query);
      const endTime = performance.now();
      const endMemory = process.memoryUsage().heapUsed;
      const metrics = {
        description,
        query: JSON.stringify(query),
        duration: endTime - startTime,
        memoryDelta: endMemory - startMemory,
        resultCount: results.length,
        success: true
      };
      this.metrics.push(metrics);
      return results;
    } catch (error) {
      const endTime = performance.now();
      const metrics = {
        description,
        query: JSON.stringify(query),
        duration: endTime - startTime,
        error: error.message,
        success: false
      };
      this.metrics.push(metrics);
      throw error;
    }
  }
  getReport() {
    const total = this.metrics.length;
    // Guard against an empty history: reduce() without an initial value throws on an
    // empty array, and the rate and average would be NaN.
    if (total === 0) {
      return { totalQueries: 0, successRate: null, averageDuration: null, slowestQuery: null, metrics: [] };
    }
    return {
      totalQueries: total,
      successRate: this.metrics.filter(m => m.success).length / total,
      averageDuration: this.metrics.reduce((sum, m) => sum + m.duration, 0) / total,
      slowestQuery: this.metrics.reduce((max, m) => m.duration > max.duration ? m : max),
      metrics: this.metrics
    };
  }
}
Error Pattern Analysis
function analyzeSearchErrors(errors) {
  const errorPatterns = {};
  errors.forEach(error => {
    const pattern = error.message.replace(/['"]\w+['"]/g, '"FIELD"')
      .replace(/\d+/g, 'NUMBER');
    if (!errorPatterns[pattern]) {
      errorPatterns[pattern] = {
        count: 0,
        examples: []
      };
    }
    errorPatterns[pattern].count++;
    if (errorPatterns[pattern].examples.length < 3) {
      errorPatterns[pattern].examples.push(error.message);
    }
  });
  return errorPatterns;
}
Prevention Strategies
Input Validation
function validateSearchQuery(query) {
  const errors = [];
  // Check for empty query
  if (!query.search || query.search.trim().length === 0) {
    errors.push("Search query cannot be empty");
  }
  // Check query length
  if (query.search && query.search.length > 1000) {
    errors.push("Search query too long (max 1000 characters)");
  }
  // Validate searchFields
  if (query.searchFields) {
    // Example list; replace with the searchable fields of your own index
    const validFields = ['title', 'description', 'tags', 'content'];
    const requestedFields = query.searchFields.split(',');
    const invalidFields = requestedFields.filter(f => !validFields.includes(f.trim()));
    if (invalidFields.length > 0) {
      errors.push(`Invalid search fields: ${invalidFields.join(', ')}`);
    }
  }
  // Validate top parameter (compare against undefined so that top: 0 is not silently skipped)
  if (query.top !== undefined && (!Number.isInteger(query.top) || query.top < 1 || query.top > 1000)) {
    errors.push("Top parameter must be between 1 and 1000");
  }
  return errors;
}
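Query Sanitization
// Takes the search text and the query type as separate arguments.
function sanitizeQuery(searchText, queryType = 'simple') {
  // Remove potentially harmful characters
  let sanitized = searchText.replace(/[<>]/g, '');
  // Escape special Lucene characters if using full syntax.
  // This also escapes operators and wildcards, so apply it to raw user terms only.
  if (queryType === 'full') {
    sanitized = sanitized.replace(/[+\-&|!(){}[\]^"~*?:\\\/]/g, '\\$&');
  }
  return sanitized;
}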
Monitoring and Alerting
class QueryMonitor {
  constructor() {
    this.errorThreshold = 0.05; // 5% error rate threshold
    this.slowQueryThreshold = 2000; // 2 second threshold
    this.recentQueries = [];
    this.maxHistorySize = 1000;
  }
  recordQuery(query, duration, success, error = null) {
    const record = {
      timestamp: Date.now(),
      query: JSON.stringify(query),
      duration,
      success,
      error
    };
    this.recentQueries.push(record);
    // Keep only recent queries
    if (this.recentQueries.length > this.maxHistorySize) {
      this.recentQueries.shift();
    }
    // Check for alerts
    this.checkAlerts();
  }
  checkAlerts() {
    const recent = this.recentQueries.slice(-100); // Last 100 queries
    // Check error rate
    const errorRate = recent.filter(q => !q.success).length / recent.length;
    if (errorRate > this.errorThreshold) {
      this.alert(`High error rate: ${(errorRate * 100).toFixed(1)}%`);
    }
    // Check for slow queries
    const slowQueries = recent.filter(q => q.duration > this.slowQueryThreshold);
    if (slowQueries.length > 0) {
      this.alert(`${slowQueries.length} slow queries detected`);
    }
  }
  alert(message) {
    console.warn(`QUERY ALERT: ${message}`);
    // Implement your alerting mechanism here
  }
}
By following these troubleshooting guidelines and implementing proper monitoring, you can identify and resolve query issues quickly, ensuring reliable and performant search functionality in your Azure AI Search implementation.