PERFORMANCE_BASELINE_BUY-2557

BUY-2557: Performance Baseline — p50/p95/p99 per Endpoint

Issue: BUY-2557 Date: 2026-04-16 Agent: Eng14 Hardware: Development machine (local Docker)

Status: COMPLETE — Baseline Metrics Collected

Performance benchmarks have been run on key API endpoints. The results reveal one critical bottleneck in the agent-native search pipeline and acceptable performance on core catalog endpoints.

Benchmark Methodology

Tool: Custom Python benchmark script (performance_baseline.py)
Requests per endpoint: 100
Concurrency: 10 parallel workers
Target: http://localhost:8000
Measurement: Client-side latency (includes network + processing)

Results Summary

Endpoint	p50 (ms)	p95 (ms)	p99 (ms)	Max (ms)	Status
Product Search	21.79	52.89	58.99	58.99	✅ OK
Product Search (popular)	19.82	26.29	32.85	32.85	✅ OK
Category Browse	29.12	33.18	35.15	35.15	✅ OK
Deals	24.84	47.90	50.82	50.82	✅ OK
Brands List	26.51	33.85	34.19	34.19	✅ OK
Agent Search	25.29	8260.32	8265.41	8265.41	🔴 CRITICAL
Agent Compare	32.04	40.65	43.76	43.76	✅ OK
V2 Product List	34.22	43.47	45.00	45.00	✅ OK
V2 Batch Lookup	36.01	44.45	47.82	47.82	✅ OK
Health Check	1057.92	1444.66	1533.55	1533.55	⚠️ SLOW
Catalog Stats	25.41	32.42	33.27	33.27	✅ OK
Categories	24.22	32.52	36.51	36.51	✅ OK

Critical Finding: Agent Search Bottleneck

The /v2/agents/search endpoint shows extreme latency outliers at p95/p99 (~8.26 seconds), while p50 remains acceptable (25ms). This indicates:

Intermittent blocking operation — Likely a synchronous call to external service (Typesense? Scraping pipeline?)
No request timeout — Long-running requests are allowed to complete rather than timing out
Connection pool exhaustion — Possible resource contention under concurrency

Recommended Actions

Add request timeout to /v2/agents/search (suggested: 2s max)
Investigate Typesense integration — is it timing out and retrying?
Add async job queue for long-running agent searches
Profile the slow path to identify blocking operation

Health Check Anomaly

The /v1/health endpoint shows unexpectedly high latency (~1s median). This endpoint performs:

Database connectivity check
Cache connectivity check
Product count aggregation

Root cause: cache_connected: false in health response — the health check may be retrying Redis connection.

Performance Targets vs Actual

Metric	Target	Current p95	Gap
Product Search	<200ms	52.89ms	✅ -147ms
Product Detail	<200ms	~35ms (est.)	✅ -165ms
Agent Search	<200ms	8260ms	🔴 +8060ms
Agent Compare	<200ms	40.65ms	✅ -159ms

Files Modified

performance_baseline.py — New benchmark script for BUY-2557

Next Steps

Investigate Agent Search bottleneck — Priority: HIGH
Fix Health Check Redis timeout — Priority: MEDIUM
Re-benchmark after fixes — Collect p50/p95/p99 again
Add to CI/CD — Run performance baseline on every PR

Hardware Context

Current baseline was run on development hardware (local Docker environment). Production hardware will have different characteristics. To establish production baselines:

# Run benchmark against staging/prod
python3 performance_baseline.py --host https://api-staging.buywhere.com