BuyWhere US Monitoring

Document Version: 2.0
Last Updated: 2026-04-18
Owner: Ops Team

Overview

This document covers the uptime monitoring configuration for the BuyWhere US launch (April 23, 2026). Monitoring is configured to alert within 30-60 seconds of an endpoint going down.

Monitored Endpoints:

https://us.buywhere.com — US homepage
https://us.buywhere.com/api/health — US API health endpoint
https://us.buywhere.com/api/search?q=test — US API search endpoint

Alert Channels: Email to ops@buywhere.ai via Prometheus/Alertmanager webhook

Primary Monitoring: Prometheus Blackbox Exporter

US endpoint monitoring is handled by the Prometheus Blackbox Exporter with a 15-second scrape interval and a 30-second alert threshold, which gives a detection time of roughly 45 seconds.
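
The ~45 second figure can be read as a worst case: an outage that begins just after a probe waits up to one scrape interval before Prometheus records a failure, and the alert then stays pending for the 30-second threshold. The sketch below works through that arithmetic. It assumes the rule evaluation interval and any Alertmanager grouping delay (neither is stated in this document) are negligible, and the function name is illustrative only.

```shell
# Hypothetical helper: prints "<best> <worst>" detection time in seconds,
# given the scrape interval and the alert threshold.
detection_window() {
  local scrape_interval="$1" alert_for="$2"
  # Best case: the outage starts right before a probe, so only the threshold applies.
  # Worst case: the outage starts right after a probe, adding one full scrape interval.
  echo "$alert_for $((scrape_interval + alert_for))"
}

detection_window 15 30   # prints "30 45"
```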

Configuration

Prometheus Jobs (prometheus.yml):

blackbox-us-website — monitors us.buywhere.com
blackbox-us-api-health — monitors us.buywhere.com/api/health
blackbox-us-api-search — monitors us.buywhere.com/api/search?q=test
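
For orientation, one of these jobs might look roughly like the fragment below, which follows the usual Blackbox Exporter relabeling pattern. This is a sketch and not the deployed prometheus.yml: the exporter address localhost:9115 and the http_2xx module are taken from the Verification commands in this document, and the scratch file is only a place to review the fragment.

```shell
# Write a sketch of one blackbox scrape job to a scratch file for review.
# Assumption: the real prometheus.yml may use different targets or relabeling.
job_file="$(mktemp)"
cat > "$job_file" <<'EOF'
scrape_configs:
  - job_name: blackbox-us-website
    metrics_path: /probe
    scrape_interval: 15s
    params:
      module: [http_2xx]
    static_configs:
      - targets:
          - https://us.buywhere.com
    relabel_configs:
      # Pass the listed target to the exporter as ?target=...
      - source_labels: [__address__]
        target_label: __param_target
      # Keep the probed URL as the instance label
      - source_labels: [__param_target]
        target_label: instance
      # Scrape the exporter itself rather than the target
      - target_label: __address__
        replacement: localhost:9115
EOF

# Validate the fragment only if promtool happens to be installed.
if command -v promtool >/dev/null 2>&1; then
  promtool check config "$job_file"
fi
```

The other two jobs would differ only in `job_name` and the target URL.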

Alert Rules (prometheus_alerts.yml):

USWebsiteDown — triggers when us.buywhere.com unreachable for 30s
USAPIHealthDown — triggers when /api/health unreachable for 30s
USAPISearchDown — triggers when /api/search unreachable for 30s
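
A rule of this kind is typically written against the `probe_success` metric that the Blackbox Exporter emits. The sketch below shows one plausible form of USWebsiteDown. The group name, the `severity: critical` label, and the annotation text are assumptions, since the actual prometheus_alerts.yml is not reproduced here.

```shell
# Write a sketch of one alert rule to a scratch file for review.
# Assumption: group name, severity label, and summary text are illustrative.
rules_file="$(mktemp)"
cat > "$rules_file" <<'EOF'
groups:
  - name: us-uptime-sketch
    rules:
      - alert: USWebsiteDown
        # probe_success is 0 whenever the blackbox probe fails
        expr: probe_success{job="blackbox-us-website"} == 0
        # Stay pending for 30s before firing
        for: 30s
        labels:
          severity: critical
        annotations:
          summary: "us.buywhere.com unreachable for 30s"
EOF

# Validate the rule file only if promtool happens to be installed.
if command -v promtool >/dev/null 2>&1; then
  promtool check rules "$rules_file"
fi
```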

Alert Routing (alertmanager.yml):

Critical alerts → critical-alerts-pagerduty receiver → PagerDuty + webhook
Critical alerts → critical-alerts receiver → webhook
Warning alerts → warning-alerts receiver → webhook
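
One way to confirm that routing behaves as listed is to post a synthetic alert to Alertmanager's v2 API and watch which receiver picks it up. The helper below is a sketch under stated assumptions: Alertmanager is assumed to listen on localhost:9093 (its default port, not stated in this document), routes are assumed to match on a `severity` label, and the function names and the `synthetic` label are hypothetical.

```shell
# Build a JSON payload for Alertmanager's POST /api/v2/alerts endpoint.
test_alert_payload() {
  local alertname="$1" severity="$2"
  printf '[{"labels":{"alertname":"%s","severity":"%s","synthetic":"true"},"annotations":{"summary":"Routing test, please ignore"}}]' \
    "$alertname" "$severity"
}

# Post the synthetic alert. ALERTMANAGER_URL overrides the assumed default address.
send_test_alert() {
  test_alert_payload "$1" "$2" |
    curl -fsS -X POST -H 'Content-Type: application/json' \
      --data-binary @- "${ALERTMANAGER_URL:-http://localhost:9093}/api/v2/alerts"
}

# Example (not run here): send_test_alert USWebsiteDown warning
```

Sending a test with severity `critical` would page the on-call through PagerDuty if the routes match, so a `warning` test is the safer first check.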

Verification

# Check blackbox exporter targets
# (the URL is quoted so the shell does not treat & as a background operator)
curl 'http://localhost:9115/probe?target=https://us.buywhere.com&module=http_2xx'

# Check Prometheus targets
curl -s 'http://localhost:9090/api/v1/targets' | jq '.data.activeTargets[] | select(.labels.job | startswith("blackbox-us"))'
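
The manual checks above can be wrapped into a small loop that probes all three endpoints through the exporter and reports pass or fail. This is a sketch: the function names are hypothetical, and it relies on the exporter's text output containing a `probe_success 1` line for a passing probe.

```shell
# Succeeds if blackbox exporter output on stdin reports a passing probe.
probe_succeeded() {
  grep -q '^probe_success 1$'
}

# Probe one URL through the exporter. -G with --data-urlencode keeps query
# strings such as ?q=test intact inside the target parameter.
check_probe() {
  curl -fsS -G 'http://localhost:9115/probe' \
    --data-urlencode "target=$1" \
    --data-urlencode 'module=http_2xx' | probe_succeeded
}

# Probe every monitored endpoint and print one status line per URL.
check_all_us_probes() {
  local url
  for url in \
    'https://us.buywhere.com' \
    'https://us.buywhere.com/api/health' \
    'https://us.buywhere.com/api/search?q=test'; do
    if check_probe "$url"; then echo "OK   $url"; else echo "FAIL $url"; fi
  done
}

# Example (not run here): check_all_us_probes
```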

External Fallback: Better Stack (uptime.rs)

Better Stack provides independent external monitoring as a backup. The free tier supports 1-minute check intervals.

Step 1: Create Better Stack Account

Go to https://betterstack.com/uptime
Sign up with email: ops@buywhere.ai
Verify email and access dashboard at https://betterstack.com/dashboard

Step 2: Add Monitors

| Friendly Name | URL | Check Interval | Locations |
|---|---|---|---|
| `us.buywhere.com - Homepage` | `https://us.buywhere.com` | 1 minute | US East (Virginia), US West (California) |
| `us.buywhere.com - API Health` | `https://us.buywhere.com/api/health` | 1 minute | US East (Virginia), US West (California) |
| `us.buywhere.com - API Search` | `https://us.buywhere.com/api/search?q=test` | 1 minute | US East (Virginia), US West (California) |

Monitor Settings for Each:

HTTP method: GET
Expected status code: 200
Timeout: 10 seconds
Confirmation period: 0 minutes (alert on first failure)

Step 3: Configure Alert Contacts

Go to Settings → Alert Contacts
Add email contact: ops@buywhere.ai
Alert on: Down events only

Step 4: Create Status Page (Optional)

Go to Status Pages → New Status Page
Name: BuyWhere US Status
Include all US monitors

Alternative: UptimeRobot (5-Minute Intervals)

UptimeRobot's free tier has a 5-minute minimum check interval. Use it if Better Stack is unavailable.

Dashboard: https://uptimerobot.com/dashboard
Account: ops@buywhere.ai

Self-Hosted Fallback (Cron + Script)

If external services are unavailable, deploy scripts/us_uptime_check.sh on any server with cron access.
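
The contents of scripts/us_uptime_check.sh are not shown in this document. As a rough idea of what such a script could contain, the sketch below checks the three monitored URLs with the same 10-second timeout and expected status 200 used for the external monitors. All function names are hypothetical, and reporting failures on stderr is an assumption about how alerts would surface.

```shell
# Hypothetical sketch of scripts/us_uptime_check.sh; the real script may differ.

# Fetch a URL and print only the HTTP status code ("000" if the request fails).
http_status() {
  curl -s -o /dev/null -w '%{http_code}' --max-time 10 "$1"
}

# Succeeds only for the expected status code 200.
is_healthy_status() {
  [ "$1" = "200" ]
}

# Check every endpoint; report failures on stderr and return non-zero if any failed.
run_checks() {
  local url code failed=0
  for url in \
    'https://us.buywhere.com' \
    'https://us.buywhere.com/api/health' \
    'https://us.buywhere.com/api/search?q=test'; do
    code="$(http_status "$url")"
    if ! is_healthy_status "$code"; then
      echo "$(date -u +%FT%TZ) DOWN $url (HTTP $code)" >&2
      failed=1
    fi
  done
  return "$failed"
}
```

Ending the script with a call to `run_checks` would let cron mail any failure output, provided `MAILTO` is set in the crontab. That delivery path is an assumption and is not specified here.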

Cron Setup

# Append the job to root's crontab (via crontab, not by writing to the spool file directly)
(sudo crontab -l 2>/dev/null; echo "* * * * * /app/scripts/us_uptime_check.sh") | sudo crontab -

Alert Flow

1. Downtime is detected by the Prometheus Blackbox Exporter (15s scrape, 30s alert threshold).
2. The alert fires and Prometheus sends it to Alertmanager.
3. Alertmanager routes it to receivers:
   - Critical alerts → PagerDuty (pages on-call) + webhook to http://api:8000/webhooks/alerts
   - Warning alerts → webhook to http://api:8000/webhooks/alerts
4. The API endpoint processes the alert and sends an email to ops@buywhere.ai.

Total detection time: ~45 seconds (within the 1-minute requirement)