← Back to documentation

database-performance

Database Query Performance Analysis Guide

Overview

This document describes the slow query logging and analysis infrastructure for BuyWhere API.

Slow Query Logging

PostgreSQL slow query logging is enabled via migration 038_slow_query_logging_and_indexes:

  • log_min_duration_statement = '100ms' - Logs statements taking >100ms
  • pg_stat_statements extension enabled for query performance statistics

Analyzing Slow Queries

View Slow Query Logs

# Connect to PostgreSQL
psql $DATABASE_URL

# View recent slow queries from PostgreSQL logs
# (Logs go to stderr / container logs by default)

# Use pg_stat_statements to find expensive queries
SELECT query, calls, mean_time, total_time, rows
FROM pg_stat_statements
ORDER BY mean_time DESC
LIMIT 20;

Run the Slow Query Benchmark

cd /home/paperclip/buywhere-api
python slow_query_benchmark.py

This runs a suite of test queries and generates /home/paperclip/buywhere-api/slow_query_report.json with results.

Indexes Added

Migration 038_slow_query_logging_and_indexes adds these indexes:

Index NameColumnsUse Case
idx_products_active_in_stock(is_active, in_stock)Filter by active and availability
idx_products_active_source_in_stock(is_active, source, in_stock)Filter by source with availability
idx_products_active_source_price_updated(is_active, source, price, updated_at)Filtered search with price ordering
idx_products_active_brand_price(is_active, brand, price)Brand-filtered search with price
idx_products_active_category_price(is_active, category, price)Category-filtered search with price
idx_products_source_updated(source, updated_at)Source-specific recent products

Common Query Patterns

Search Query Pattern

# Common search pattern in app/routers/search.py
select(Product).where(
    Product.is_active == True,
    search_vector @@ websearch_to_tsquery('english', :q),
    Product.category.ilike(f"%{category}%"),
    Product.price >= min_price,
    Product.price <= max_price,
    Product.source == platform,
).order_by(Product.updated_at.desc())

Optimizations:

  • is_active first for early filtering
  • Composite indexes support common filter combinations
  • GIN trigram index on category and brand for ILIKE queries

Facet/Analytics Query Pattern

The facets endpoint (/search/facets) runs multiple aggregate queries:

  • Category counts
  • Brand counts
  • Platform counts
  • Price range buckets
  • Rating range buckets

Optimization opportunity (IMPLEMENTED): Price bucket queries were refactored from N-query loop to single width_bucket() aggregate query. Rating ranges similarly optimized from 5-query loop to single CASE aggregate query. See app/routers/search.py lines 801-871.

Identifying Missing Indexes

-- Find queries that do sequential scans on large tables
SELECT query, rows, loop_count
FROM pg_stat_statements
WHERE query LIKE '%products%'
ORDER BY rows DESC
LIMIT 20;

Performance Monitoring

The query_log_latency_24h view (migration 037) provides API endpoint latency:

SELECT * FROM query_log_latency_24h ORDER BY avg_latency_ms DESC;

This helps correlate slow API endpoints with database query performance.