← Back to documentation

PERFORMANCE_BASELINE_BUY-2557

BUY-2557: Performance Baseline — p50/p95/p99 per Endpoint

Issue: BUY-2557 Date: 2026-04-16 Agent: Eng14 Hardware: Development machine (local Docker)


Status: COMPLETE — Baseline Metrics Collected

Performance benchmarks have been run on key API endpoints. The results reveal one critical bottleneck in the agent-native search pipeline and acceptable performance on core catalog endpoints.


Benchmark Methodology

  • Tool: Custom Python benchmark script (performance_baseline.py)
  • Requests per endpoint: 100
  • Concurrency: 10 parallel workers
  • Target: http://localhost:8000
  • Measurement: Client-side latency (includes network + processing)

Results Summary

Endpointp50 (ms)p95 (ms)p99 (ms)Max (ms)Status
Product Search21.7952.8958.9958.99✅ OK
Product Search (popular)19.8226.2932.8532.85✅ OK
Category Browse29.1233.1835.1535.15✅ OK
Deals24.8447.9050.8250.82✅ OK
Brands List26.5133.8534.1934.19✅ OK
Agent Search25.298260.328265.418265.41🔴 CRITICAL
Agent Compare32.0440.6543.7643.76✅ OK
V2 Product List34.2243.4745.0045.00✅ OK
V2 Batch Lookup36.0144.4547.8247.82✅ OK
Health Check1057.921444.661533.551533.55⚠️ SLOW
Catalog Stats25.4132.4233.2733.27✅ OK
Categories24.2232.5236.5136.51✅ OK

Critical Finding: Agent Search Bottleneck

The /v2/agents/search endpoint shows extreme latency outliers at p95/p99 (~8.26 seconds), while p50 remains acceptable (25ms). This indicates:

  1. Intermittent blocking operation — Likely a synchronous call to external service (Typesense? Scraping pipeline?)
  2. No request timeout — Long-running requests are allowed to complete rather than timing out
  3. Connection pool exhaustion — Possible resource contention under concurrency

Recommended Actions

  1. Add request timeout to /v2/agents/search (suggested: 2s max)
  2. Investigate Typesense integration — is it timing out and retrying?
  3. Add async job queue for long-running agent searches
  4. Profile the slow path to identify blocking operation

Health Check Anomaly

The /v1/health endpoint shows unexpectedly high latency (~1s median). This endpoint performs:

  • Database connectivity check
  • Cache connectivity check
  • Product count aggregation

Root cause: cache_connected: false in health response — the health check may be retrying Redis connection.


Performance Targets vs Actual

MetricTargetCurrent p95Gap
Product Search<200ms52.89ms✅ -147ms
Product Detail<200ms~35ms (est.)✅ -165ms
Agent Search<200ms8260ms🔴 +8060ms
Agent Compare<200ms40.65ms✅ -159ms

Files Modified

  • performance_baseline.py — New benchmark script for BUY-2557

Next Steps

  1. Investigate Agent Search bottleneck — Priority: HIGH
  2. Fix Health Check Redis timeout — Priority: MEDIUM
  3. Re-benchmark after fixes — Collect p50/p95/p99 again
  4. Add to CI/CD — Run performance baseline on every PR

Hardware Context

Current baseline was run on development hardware (local Docker environment). Production hardware will have different characteristics. To establish production baselines:

# Run benchmark against staging/prod
python3 performance_baseline.py --host https://api-staging.buywhere.com