BUY-2557: Performance Baseline — p50/p95/p99 per Endpoint
Issue: BUY-2557
Date: 2026-04-16
Agent: Eng14
Hardware: Development machine (local Docker)
Status: COMPLETE — Baseline Metrics Collected
Benchmarks were run against 12 API endpoints. The results show one critical bottleneck in the agent search endpoint (p95 of about 8.26 s), a slow health check (p50 of about 1.06 s), and acceptable latency on the remaining catalog endpoints.
Benchmark Methodology
- Tool: Custom Python benchmark script (performance_baseline.py)
- Requests per endpoint: 100
- Concurrency: 10 parallel workers
- Target: http://localhost:8000
- Measurement: Client-side latency (includes network + processing)
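
The methodology above can be sketched roughly as follows. This is an illustration only, not the contents of performance_baseline.py: the function names (`percentile`, `time_request`, `run_benchmark`, `summarize`) and the nearest-rank percentile rule are assumptions made for the example. The Results Summary table reports p99 equal to Max on every row, which suggests the real script uses a different rank convention; either way, with 100 samples the p99 rests on one or two observations.

```python
import math
import time
import urllib.request
from concurrent.futures import ThreadPoolExecutor


def percentile(samples, q):
    """Nearest-rank percentile for an integer q in 1..100 (assumed convention)."""
    if not samples:
        raise ValueError("no samples")
    ordered = sorted(samples)
    rank = max(1, math.ceil(q * len(ordered) / 100))  # 1-based rank
    return ordered[rank - 1]


def summarize(latencies_ms):
    """Collapse raw latencies into the columns used in the results table."""
    return {
        "p50": percentile(latencies_ms, 50),
        "p95": percentile(latencies_ms, 95),
        "p99": percentile(latencies_ms, 99),
        "max": max(latencies_ms),
    }


def time_request(url, timeout_s=30.0):
    """Return client-side latency in milliseconds for one GET request."""
    start = time.perf_counter()
    with urllib.request.urlopen(url, timeout=timeout_s) as response:
        response.read()  # include body transfer in the measurement
    return (time.perf_counter() - start) * 1000.0


def run_benchmark(url, requests=100, workers=10):
    """Issue `requests` GETs across `workers` threads and summarize latency."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        latencies = list(pool.map(time_request, [url] * requests))
    return summarize(latencies)
```

Calling `run_benchmark("http://localhost:8000/v1/health")` would exercise the health endpoint with the request count and concurrency listed above. Any failed request raises and aborts the run, which a real script would likely want to handle.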
Results Summary
| Endpoint | p50 (ms) | p95 (ms) | p99 (ms) | Max (ms) | Status |
|---|---|---|---|---|---|
| Product Search | 21.79 | 52.89 | 58.99 | 58.99 | ✅ OK |
| Product Search (popular) | 19.82 | 26.29 | 32.85 | 32.85 | ✅ OK |
| Category Browse | 29.12 | 33.18 | 35.15 | 35.15 | ✅ OK |
| Deals | 24.84 | 47.90 | 50.82 | 50.82 | ✅ OK |
| Brands List | 26.51 | 33.85 | 34.19 | 34.19 | ✅ OK |
| Agent Search | 25.29 | 8260.32 | 8265.41 | 8265.41 | 🔴 CRITICAL |
| Agent Compare | 32.04 | 40.65 | 43.76 | 43.76 | ✅ OK |
| V2 Product List | 34.22 | 43.47 | 45.00 | 45.00 | ✅ OK |
| V2 Batch Lookup | 36.01 | 44.45 | 47.82 | 47.82 | ✅ OK |
| Health Check | 1057.92 | 1444.66 | 1533.55 | 1533.55 | ⚠️ SLOW |
| Catalog Stats | 25.41 | 32.42 | 33.27 | 33.27 | ✅ OK |
| Categories | 24.22 | 32.52 | 36.51 | 36.51 | ✅ OK |
Critical Finding: Agent Search Bottleneck
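The /v2/agents/search endpoint shows extreme latency outliers at p95/p99 (~8.26 seconds), while p50 remains acceptable (~25ms). Because p95 is already at ~8.26 s, roughly 5 or more of the 100 requests hit the slow path, and those slow requests cluster tightly (p95 and Max differ by about 5 ms), which suggests a fixed wait rather than gradual degradation. The cause is not yet confirmed. Candidate explanations: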
- Intermittent blocking operation — Likely a synchronous call to external service (Typesense? Scraping pipeline?)
- No request timeout — Long-running requests are allowed to complete rather than timing out
- Connection pool exhaustion — Possible resource contention under concurrency
Recommended Actions
- Add request timeout to /v2/agents/search (suggested: 2s max)
- Investigate Typesense integration: is it timing out and retrying?
- Add async job queue for long-running agent searches
- Profile the slow path to identify blocking operation
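
The first recommendation could look something like the sketch below. It assumes the search handler is asynchronous, which this report does not confirm; `search_with_timeout`, `SearchTimeout`, and the two stand-in backends are hypothetical names, not part of the existing codebase.

```python
import asyncio


class SearchTimeout(Exception):
    """Raised when the backend does not answer within the time budget."""


async def search_with_timeout(backend, query, timeout_s=2.0):
    """Await backend(query), cancelling it if it exceeds timeout_s seconds."""
    try:
        return await asyncio.wait_for(backend(query), timeout=timeout_s)
    except asyncio.TimeoutError:
        # wait_for has already cancelled the backend call at this point.
        raise SearchTimeout(f"search exceeded {timeout_s}s") from None


async def fast_backend(query):
    # Stand-in for the normal path (hypothetical).
    return [query]


async def slow_backend(query):
    # Stand-in for the unidentified slow dependency (hypothetical).
    await asyncio.sleep(0.5)
    return [query]
```

`asyncio.wait_for` can only interrupt code that yields to the event loop. If the slow path turns out to be a synchronous call, the timeout would need to be set on that client instead, or the call moved to a worker thread.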
Health Check Anomaly
The /v1/health endpoint shows unexpectedly high latency (~1s median). This endpoint performs:
- Database connectivity check
- Cache connectivity check
- Product count aggregation
Suspected root cause (not yet confirmed): the health response reports cache_connected: false, so the health check may be waiting on or retrying the Redis connection.
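
If the cache probe is indeed the slow step, one option is to give each dependency check a hard time budget and run the checks concurrently. The sketch below illustrates that idea under the assumption of an asyncio-based service; `probe`, `health`, and the 250 ms budget are hypothetical and not taken from the existing health endpoint.

```python
import asyncio
import time


async def probe(name, check, timeout_s=0.25):
    """Run one dependency check with a hard time budget and no retries."""
    start = time.perf_counter()
    try:
        await asyncio.wait_for(check(), timeout=timeout_s)
        ok = True
    except Exception:  # a timeout or a connection failure both count as down
        ok = False
    return name, ok, (time.perf_counter() - start) * 1000.0


async def health(checks, timeout_s=0.25):
    """Run all checks concurrently so the slowest probe bounds total latency."""
    results = await asyncio.gather(
        *(probe(name, check, timeout_s) for name, check in checks.items())
    )
    return {name: {"ok": ok, "ms": round(ms, 1)} for name, ok, ms in results}
```

With this shape, an unreachable cache costs at most `timeout_s` per health request instead of the full client retry cycle, and the per-probe timings show which dependency is slow.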
Performance Targets vs Actual
| Endpoint | Target | Current p95 | Gap |
|---|---|---|---|
| Product Search | <200ms | 52.89ms | ✅ -147ms |
| Product Detail | <200ms | ~35ms (est.) | ✅ -165ms |
| Agent Search | <200ms | 8260ms | 🔴 +8060ms |
| Agent Compare | <200ms | 40.65ms | ✅ -159ms |
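
The Gap column is simply the measured p95 minus the target. A small sketch of that arithmetic follows, using the three measured rows from the table (Product Detail is omitted because its value is an estimate). The names `gap_report` and `failing` are hypothetical helpers, not part of performance_baseline.py.

```python
TARGET_P95_MS = 200.0  # target from the table above

# Measured p95 values copied from the table above.
measured_p95_ms = {
    "Product Search": 52.89,
    "Agent Search": 8260.32,
    "Agent Compare": 40.65,
}


def gap_report(measured, target_ms=TARGET_P95_MS):
    """Return {endpoint: (gap_ms, within_target)}; a negative gap is headroom."""
    return {
        name: (round(p95 - target_ms, 2), p95 < target_ms)
        for name, p95 in measured.items()
    }


def failing(measured, target_ms=TARGET_P95_MS):
    """List endpoints whose p95 misses the target."""
    report = gap_report(measured, target_ms)
    return sorted(name for name, (_, ok) in report.items() if not ok)
```

A check like `failing(measured_p95_ms)` could also back a CI gate that exits non-zero when the returned list is not empty.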
Files Modified
performance_baseline.py: new benchmark script for BUY-2557
Next Steps
- Investigate Agent Search bottleneck — Priority: HIGH
- Fix Health Check Redis timeout — Priority: MEDIUM
- Re-benchmark after fixes — Collect p50/p95/p99 again
- Add to CI/CD — Run performance baseline on every PR
Hardware Context
Current baseline was run on development hardware (local Docker environment). Production hardware will have different characteristics. To establish production baselines:
# Run benchmark against staging/prod
python3 performance_baseline.py --host https://api-staging.buywhere.com