[UNIT 4]
Name: buywhere-logging-schema
Description: Centralized logging schema for BuyWhere microservices
Version: 1.0.0

Structured Log Format Specification

All microservices MUST emit logs in JSON format with the following standard fields:

Required Fields

Field      Type     Description
timestamp  ISO8601  UTC timestamp with milliseconds
level      string   Log level: DEBUG, INFO, WARN, ERROR, CRITICAL
service    string   Service identifier (e.g., "api", "scraper-scheduler")
message    string   Human-readable log message

Optional Fields

Field        Type    Description
request_id   string  UUID for request tracing
user_id      string  Authenticated user identifier
duration_ms  float   Operation duration in milliseconds
error        object  Error details when level=ERROR
metadata     object  Additional contextual data
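
As a rough illustration of the two tables above, the following sketch checks one JSON log line against the standard fields. The function name validate_record, the constant names, and the sample values are hypothetical and not part of this specification; treating duration_ms as any JSON number is also an assumption.

```python
import json
from datetime import datetime

# Field names and types taken from the Required and Optional tables.
REQUIRED_FIELDS = {"timestamp": str, "level": str, "service": str, "message": str}
OPTIONAL_FIELDS = {
    "request_id": str,
    "user_id": str,
    "duration_ms": (int, float),  # assumption: any JSON number is accepted
    "error": dict,
    "metadata": dict,
}
LEVELS = {"DEBUG", "INFO", "WARN", "ERROR", "CRITICAL"}


def validate_record(line: str) -> list:
    """Return a list of schema violations for one JSON log line (empty if valid)."""
    try:
        record = json.loads(line)
    except json.JSONDecodeError as exc:
        return [f"not valid JSON: {exc.msg}"]
    if not isinstance(record, dict):
        return ["top-level value must be a JSON object"]

    problems = []
    for name, expected in REQUIRED_FIELDS.items():
        if name not in record:
            problems.append(f"missing required field: {name}")
        elif not isinstance(record[name], expected):
            problems.append(f"wrong type for {name}")
    for name, expected in OPTIONAL_FIELDS.items():
        if name in record and not isinstance(record[name], expected):
            problems.append(f"wrong type for {name}")

    level = record.get("level")
    if isinstance(level, str) and level not in LEVELS:
        problems.append(f"unknown level: {level}")

    timestamp = record.get("timestamp")
    if isinstance(timestamp, str):
        try:
            # fromisoformat() on older Pythons rejects a trailing "Z".
            datetime.fromisoformat(timestamp.replace("Z", "+00:00"))
        except ValueError:
            problems.append("timestamp is not ISO 8601")
    return problems


# Hypothetical sample record using only fields named in the tables.
example = json.dumps({
    "timestamp": "2024-05-01T12:34:56.789Z",
    "level": "INFO",
    "service": "api",
    "message": "request completed",
    "duration_ms": 12.5,
})
```

A check like this could run in CI against sample output from each service; it does not verify that the timestamp is in UTC or carries millisecond precision.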

Service-Specific Fields

API Service

  • endpoint: API endpoint path
  • method: HTTP method
  • status_code: Response status code
  • client_ip: Client IP address
  • user_agent: Client user agent

Scraper Services

  • scraper_name: Name of the scraper
  • items_scraped: Number of items scraped
  • items_total: Total items to scrape
  • url: Target URL

Database Services

  • query: SQL query (sanitized)
  • duration_ms: Query execution time
  • rows_affected: Number of rows

Background Jobs

  • job_name: Name of the job
  • job_id: Unique job identifier
  • attempt: Job attempt number
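
The service-specific lists above can be kept in one lookup table so that stray field names are caught early. This is only a sketch: the category keys ("api", "scraper", "database", "job") and the function unexpected_fields are hypothetical, and the specification does not say whether extra top-level fields are forbidden, so treating the schema as closed is an assumption.

```python
# Standard fields from the Required and Optional tables.
STANDARD_FIELDS = {
    "timestamp", "level", "service", "message",
    "request_id", "user_id", "duration_ms", "error", "metadata",
}

# Hypothetical category keys; the field names come from the lists above.
SERVICE_FIELDS = {
    "api": {"endpoint", "method", "status_code", "client_ip", "user_agent"},
    "scraper": {"scraper_name", "items_scraped", "items_total", "url"},
    "database": {"query", "duration_ms", "rows_affected"},
    "job": {"job_name", "job_id", "attempt"},
}


def unexpected_fields(category: str, record: dict) -> set:
    """Return top-level keys that are neither standard nor listed for the category."""
    allowed = STANDARD_FIELDS | SERVICE_FIELDS[category]
    return set(record) - allowed


# Hypothetical API record: every key is standard or API-specific.
api_record = {
    "timestamp": "2024-05-01T12:34:56.789Z",
    "level": "INFO",
    "service": "api",
    "message": "request completed",
    "endpoint": "/v1/products",
    "method": "GET",
    "status_code": 200,
}
```

Fields that fit none of the lists can go under metadata, which the Optional Fields table reserves for additional contextual data.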

Log Levels

  • DEBUG: Detailed debugging information
  • INFO: General operational information
  • WARN: Warning conditions (recoverable issues)
  • ERROR: Error conditions (action required)
  • CRITICAL: System is unusable
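
Python's standard logging module names its warning level WARNING, while this schema uses WARN, so services that log through the standard library need a mapping between the two. A minimal sketch, assuming hypothetical names SPEC_TO_STDLIB and spec_level:

```python
import logging

# Schema level name -> stdlib numeric level.
SPEC_TO_STDLIB = {
    "DEBUG": logging.DEBUG,
    "INFO": logging.INFO,
    "WARN": logging.WARNING,
    "ERROR": logging.ERROR,
    "CRITICAL": logging.CRITICAL,
}


def spec_level(levelno: int) -> str:
    """Map a stdlib numeric level to the highest schema level it reaches."""
    for name in ("CRITICAL", "ERROR", "WARN", "INFO", "DEBUG"):
        if levelno >= SPEC_TO_STDLIB[name]:
            return name
    # Anything below DEBUG (e.g. custom trace levels) is reported as DEBUG.
    return "DEBUG"
```

Rounding custom numeric levels down to the nearest schema level is a design choice made for this sketch, not something the specification prescribes.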

Implementation

Python (structlog)

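import logging
import sys

import structlog

# structlog hands the rendered JSON string to the standard library logger,
# so the stdlib handler must print the message unchanged.
logging.basicConfig(format="%(message)s", stream=sys.stdout, level=logging.INFO)

structlog.configure(
    processors=[
        structlog.stdlib.filter_by_level,  # drop events below the stdlib logger's level
        structlog.stdlib.add_logger_name,  # adds "logger"
        structlog.stdlib.add_log_level,  # adds "level" (lowercase, e.g. "info")
        structlog.stdlib.PositionalArgumentsFormatter(),
        structlog.processors.TimeStamper(fmt="iso", utc=True),  # adds a UTC "timestamp"
        structlog.processors.StackInfoRenderer(),
        structlog.processors.format_exc_info,  # renders exc_info into "exception"
        structlog.processors.UnicodeDecoder(),
        structlog.processors.JSONRenderer(),  # must stay last
    ],
    context_class=dict,
    logger_factory=structlog.stdlib.LoggerFactory(),
    wrapper_class=structlog.stdlib.BoundLogger,
    cache_logger_on_first_use=True,
)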
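
Out of the box, structlog stores the log message under the key "event" and add_log_level writes lowercase level names such as "warning", so the configuration above does not by itself produce the message, level, and service fields required by this schema. The sketch below shows three custom processors that could be placed just before JSONRenderer to close that gap. The processor names are hypothetical; they rely only on structlog's processor calling convention (logger, method_name, event_dict) and therefore need no structlog import.

```python
from typing import Any, Callable, MutableMapping

EventDict = MutableMapping[str, Any]

# stdlib/structlog level names that differ from the schema's names.
_LEVEL_ALIASES = {"WARNING": "WARN", "EXCEPTION": "ERROR"}


def add_service(service: str) -> Callable[[Any, str, EventDict], EventDict]:
    """Build a processor that stamps every event with the service identifier."""
    def processor(logger: Any, method_name: str, event_dict: EventDict) -> EventDict:
        event_dict.setdefault("service", service)
        return event_dict
    return processor


def normalize_level(logger: Any, method_name: str, event_dict: EventDict) -> EventDict:
    """Uppercase the level and map it onto the schema's level names."""
    level = str(event_dict.get("level", method_name)).upper()
    event_dict["level"] = _LEVEL_ALIASES.get(level, level)
    return event_dict


def rename_event_to_message(logger: Any, method_name: str, event_dict: EventDict) -> EventDict:
    """Move structlog's default "event" key to the schema's "message" key."""
    if "event" in event_dict:
        event_dict["message"] = event_dict.pop("event")
    return event_dict
```

In the processor list these would appear as add_service("api"), normalize_level, rename_event_to_message, in that order, after add_log_level and before JSONRenderer. The value "api" is only an example service identifier.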

Fluent Bit Parsing

The Fluent Bit configuration includes parsers for:

  • Docker container logs (json)
  • Kubernetes pod logs (json)
  • Syslog format
  • Logfmt format

Index Naming Convention

Loki/Elasticsearch indices should follow:

  • logs-{service}-YYYY.MM.DD for daily indices
  • logs-{service}-YYYY.MM for monthly retention
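
The two patterns above can be generated from a service identifier and a date. A minimal sketch, assuming hypothetical helper names daily_index and monthly_index and that the date is taken in UTC to match the timestamp field:

```python
from datetime import date


def daily_index(service: str, day: date) -> str:
    """Build a name of the form logs-{service}-YYYY.MM.DD."""
    return f"logs-{service}-{day:%Y.%m.%d}"


def monthly_index(service: str, day: date) -> str:
    """Build a name of the form logs-{service}-YYYY.MM."""
    return f"logs-{service}-{day:%Y.%m}"


print(daily_index("api", date(2024, 5, 1)))    # logs-api-2024.05.01
print(monthly_index("api", date(2024, 5, 1)))  # logs-api-2024.05
```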

Retention Policy

  • Hot storage (SSD): 7 days
  • Warm storage (HDD): 30 days
  • Cold storage (object): 90 days
  • Archive: 1 year (compressed)
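
The tiers above can be read as age thresholds. The sketch below maps the age of a log record to a tier under several assumptions that this specification does not state: ages are counted from ingestion, the limits are cumulative rather than per-tier durations, boundaries are inclusive, "1 year" means 365 days, and anything older is past retention. The names TIERS and storage_tier are hypothetical.

```python
from datetime import timedelta
from typing import Optional

# (tier name, maximum age) in ascending order, from the Retention Policy list.
TIERS = [
    ("hot", timedelta(days=7)),
    ("warm", timedelta(days=30)),
    ("cold", timedelta(days=90)),
    ("archive", timedelta(days=365)),  # assumption: 1 year = 365 days
]


def storage_tier(age: timedelta) -> Optional[str]:
    """Return the tier for a record of the given age, or None if past retention."""
    for name, limit in TIERS:
        if age <= limit:
            return name
    return None
```

If the intended reading is instead that each tier holds data for its own duration (7 days hot, then a further 30 days warm, and so on), the limits in TIERS would need to be summed.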