database-performance

Database Query Performance Analysis Guide

Overview

This document describes the slow query logging and analysis infrastructure for BuyWhere API.

Slow Query Logging

PostgreSQL slow query logging is enabled via migration 038_slow_query_logging_and_indexes:

log_min_duration_statement = '100ms' - Logs statements taking >100ms
pg_stat_statements extension enabled for query performance statistics

Analyzing Slow Queries

View Slow Query Logs

# Connect to PostgreSQL
psql $DATABASE_URL

# View recent slow queries from PostgreSQL logs
# (Logs go to stderr / container logs by default)

# Use pg_stat_statements to find expensive queries
SELECT query, calls, mean_time, total_time, rows
FROM pg_stat_statements
ORDER BY mean_time DESC
LIMIT 20;

Run the Slow Query Benchmark

cd /home/paperclip/buywhere-api
python slow_query_benchmark.py

This runs a suite of test queries and generates /home/paperclip/buywhere-api/slow_query_report.json with results.

Indexes Added

Migration 038_slow_query_logging_and_indexes adds these indexes:

Index Name	Columns	Use Case
`idx_products_active_in_stock`	`(is_active, in_stock)`	Filter by active and availability
`idx_products_active_source_in_stock`	`(is_active, source, in_stock)`	Filter by source with availability
`idx_products_active_source_price_updated`	`(is_active, source, price, updated_at)`	Filtered search with price ordering
`idx_products_active_brand_price`	`(is_active, brand, price)`	Brand-filtered search with price
`idx_products_active_category_price`	`(is_active, category, price)`	Category-filtered search with price
`idx_products_source_updated`	`(source, updated_at)`	Source-specific recent products

Common Query Patterns

Search Query Pattern

# Common search pattern in app/routers/search.py
select(Product).where(
    Product.is_active == True,
    search_vector @@ websearch_to_tsquery('english', :q),
    Product.category.ilike(f"%{category}%"),
    Product.price >= min_price,
    Product.price <= max_price,
    Product.source == platform,
).order_by(Product.updated_at.desc())

Optimizations:

is_active first for early filtering
Composite indexes support common filter combinations
GIN trigram index on category and brand for ILIKE queries

Facet/Analytics Query Pattern

The facets endpoint (/search/facets) runs multiple aggregate queries:

Category counts
Brand counts
Platform counts
Price range buckets
Rating range buckets

Optimization opportunity (IMPLEMENTED): Price bucket queries were refactored from N-query loop to single width_bucket() aggregate query. Rating ranges similarly optimized from 5-query loop to single CASE aggregate query. See app/routers/search.py lines 801-871.

Identifying Missing Indexes

-- Find queries that do sequential scans on large tables
SELECT query, rows, loop_count
FROM pg_stat_statements
WHERE query LIKE '%products%'
ORDER BY rows DESC
LIMIT 20;

Performance Monitoring

The query_log_latency_24h view (migration 037) provides API endpoint latency:

SELECT * FROM query_log_latency_24h ORDER BY avg_latency_ms DESC;

This helps correlate slow API endpoints with database query performance.