BuyWhere US Monitoring

Document Version: 2.0
Last Updated: 2026-04-18
Owner: Ops Team


Overview

This document covers the uptime monitoring configuration for the BuyWhere US launch (April 23, 2026). Monitoring is configured to raise an alert within 30-60 seconds of an endpoint going down.

Monitored Endpoints:

  • https://us.buywhere.com — US homepage
  • https://us.buywhere.com/api/health — US API health endpoint
  • https://us.buywhere.com/api/search?q=test — US API search endpoint

Alert Channels: Email to ops@buywhere.ai, sent by the API when it receives the Alertmanager webhook. Critical alerts also page the on-call engineer through PagerDuty.


Primary Monitoring: Prometheus Blackbox Exporter

US endpoint monitoring is handled by the Prometheus blackbox exporter with a 15-second scrape interval and a 30-second alert threshold, giving a detection time of roughly 45 seconds.

Configuration

Prometheus Jobs (prometheus.yml):

  • blackbox-us-website — monitors us.buywhere.com
  • blackbox-us-api-health — monitors us.buywhere.com/api/health
  • blackbox-us-api-search — monitors us.buywhere.com/api/search?q=test

Alert Rules (prometheus_alerts.yml):

  • USWebsiteDown — triggers when us.buywhere.com unreachable for 30s
  • USAPIHealthDown — triggers when /api/health unreachable for 30s
  • USAPISearchDown — triggers when /api/search unreachable for 30s

Alert Routing (alertmanager.yml):

  • Critical alerts → critical-alerts-pagerduty receiver → PagerDuty + webhook
  • Critical alerts → critical-alerts receiver → webhook
  • Warning alerts → warning-alerts receiver → webhook

Verification

# Check blackbox exporter targets
# (quote the URL so the shell does not treat & as a background operator)
curl -s 'http://localhost:9115/probe?target=https://us.buywhere.com&module=http_2xx'

# Check Prometheus targets
curl -s http://localhost:9090/api/v1/targets | jq '.data.activeTargets[] | select(.labels.job | startswith("blackbox-us"))'
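
To turn the probe check above into a pass/fail result, the exporter's text output can be filtered for the probe_success metric. This is a minimal sketch, not part of the deployed tooling: the function name probe_succeeded is hypothetical, and port 9115 and the http_2xx module are taken from the command above.

```shell
#!/bin/sh
# probe_succeeded: read blackbox exporter probe output on stdin and
# succeed (exit 0) only if it contains the line "probe_success 1".
probe_succeeded() {
  awk '$1 == "probe_success" { found = 1; ok = ($2 == 1) }
       END { exit !(found && ok) }'
}

# Usage (requires a running exporter):
#   curl -s 'http://localhost:9115/probe?target=https://us.buywhere.com&module=http_2xx' \
#     | probe_succeeded && echo UP || echo DOWN
```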

External Fallback: Better Stack (uptime.rs)

Better Stack provides independent external monitoring as a backup. The free tier supports 1-minute check intervals.

Step 1: Create Better Stack Account

  1. Go to https://betterstack.com/uptime
  2. Sign up with email: ops@buywhere.ai
  3. Verify email and access dashboard at https://betterstack.com/dashboard

Step 2: Add Monitors

Friendly Name                | URL                                       | Check Interval | Locations
us.buywhere.com - Homepage   | https://us.buywhere.com                   | 1 minute       | US East (Virginia), US West (California)
us.buywhere.com - API Health | https://us.buywhere.com/api/health        | 1 minute       | US East (Virginia), US West (California)
us.buywhere.com - API Search | https://us.buywhere.com/api/search?q=test | 1 minute       | US East (Virginia), US West (California)

Monitor Settings for Each:

  • HTTP method: GET
  • Expected status code: 200
  • Timeout: 10 seconds
  • Resurrection window: 0 minutes (alert on first failure)

Step 3: Configure Alert Contacts

  1. Go to Settings → Alert Contacts
  2. Add email contact: ops@buywhere.ai
  3. Alert on: Down events only

Step 4: Create Status Page (Optional)

  1. Go to Status Pages → New Status Page
  2. Name: BuyWhere US Status
  3. Include all US monitors

Alternative: UptimeRobot (5-Minute Intervals)

The UptimeRobot free tier supports a minimum check interval of 5 minutes. Use it if Better Stack is unavailable.


Self-Hosted Fallback (Cron + Script)

If external services are unavailable, deploy scripts/us_uptime_check.sh on any server with cron access.
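
For orientation, a check script of this kind might look like the sketch below. It is not the actual scripts/us_uptime_check.sh: the function names are hypothetical, and only the URLs, the expected 200 status, and the 10-second timeout are taken from this document.

```shell
#!/bin/sh
# Hypothetical sketch of a cron-driven uptime check.
TIMEOUT=10   # seconds, matching the monitor timeout used elsewhere in this document

# http_status URL: print the HTTP status code; curl prints 000 if the request fails.
http_status() {
  curl -s -o /dev/null -m "$TIMEOUT" -w '%{http_code}' "$1" || true
}

# run_checks URL...: probe each URL, log failures to stderr,
# and return non-zero if any URL did not answer 200.
run_checks() {
  failed=0
  for url in "$@"; do
    code=$(http_status "$url")
    if [ "$code" != "200" ]; then
      echo "$(date -u +%FT%TZ) DOWN $url (status $code)" >&2
      failed=1
    fi
  done
  return "$failed"
}

# Usage:
#   run_checks https://us.buywhere.com \
#              https://us.buywhere.com/api/health \
#              'https://us.buywhere.com/api/search?q=test'
```

Because failures are written to stderr, cron would mail them to the address in MAILTO (for example ops@buywhere.ai), assuming the host has a working mail transfer agent.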

Cron Setup

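# Append the check to root's crontab (runs every minute).
# Use crontab(1) rather than writing to the spool file, so cron reloads the entry.
(sudo crontab -l 2>/dev/null; echo "* * * * * /app/scripts/us_uptime_check.sh") | sudo crontab -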

Alert Flow

  1. Downtime detected by Prometheus blackbox (15s scrape, 30s alert threshold)
  2. Alert fires → Prometheus alerts route to Alertmanager
  3. Alertmanager routes to receivers:
    • Critical alerts → PagerDuty (pages on-call) + webhook to http://api:8000/webhooks/alerts
    • Warning alerts → webhook to http://api:8000/webhooks/alerts
  4. API endpoint processes alert → sends email to ops@buywhere.ai
  5. Total detection time: ~45 seconds (within 1-minute requirement)
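
The ~45 second figure above is the 15-second scrape interval plus the 30-second alert threshold. The rule evaluation interval and any Alertmanager group_wait are not stated in this document and would add to that total, so the arithmetic can be sketched as follows. The function name detection_bounds is hypothetical, the bounds are approximate, and probe timeouts are ignored.

```shell
#!/bin/sh
# detection_bounds SCRAPE FOR [EVAL]
# Print approximate best and worst case seconds from outage start to alert firing.
#   SCRAPE = scrape interval, FOR = alert threshold,
#   EVAL   = rule evaluation interval (assumed 0 if omitted)
detection_bounds() {
  scrape=$1
  hold=$2
  eval_s=${3:-0}
  # Best case: the outage starts just before a scrape, so only the threshold elapses.
  # Worst case: a full scrape interval and a full evaluation interval pass first.
  echo "best=$hold worst=$((scrape + eval_s + hold))"
}

# Usage:
#   detection_bounds 15 30      # figures from this document
#   detection_bounds 15 30 15   # with an assumed 15s evaluation interval
```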
