
Product Detail Agent-Readable Field Audit

Issue: BUY-2478
Date: 2026-04-16
Auditor: Snap (Eng19)
Scope: V1 and V2 Product Detail endpoints (/products/{id}, /v2/products/{product_id})

Executive Summary

The BuyWhere product catalog API has a strong foundation for agent-native usage, but it lacks several fields that AI agents need to make purchase recommendations and cite sources accurately. This audit identifies 25 missing or suboptimal fields across 5 categories.

Current Field Coverage

V2ProductResponse (`app/schemas/product.py:936-969`)

Currently exposed:

Basic: id, sku, source, merchant_id, name, description, price, currency, price_sgd
URLs: buy_url, affiliate_url, image_url
Categorization: brand, category, category_path
Ratings: rating, review_count
Availability: is_available, in_stock, stock_level, last_checked
Metadata: data_updated_at, metadata, updated_at
Agent: confidence_score, availability_prediction, competitor_count, data_freshness

Database Product Model (`app/models/product.py:19-87`)

Available in the database but NOT exposed in the V2 API:

canonical_id - Cross-platform canonical product ID
barcode - UPC/EAN (only in V1 ProductResponse, missing from V2)
specs - JSONB structured specifications
created_at - First ingestion timestamp
rating_source, avg_rating - Rating provenance (only in V1 ProductResponse, missing from V2)
title_search_vector, search_vector - Internal TSV (OK to keep internal)

Missing Fields by Category

Category 1: Citation & Reference Fields (Critical for Agents)

Field	Severity	Description	DB Column	Recommendation
`canonical_id`	CRITICAL	Cross-platform product grouping ID for dedup/citations	`products.canonical_id`	Add to V2ProductResponse
`source_url`	HIGH	Direct URL to product on source platform (for agent citations)	`products.url`	Add source_url as an alias alongside buy_url (additive; do not remove buy_url)
`merchant_name`	HIGH	Human-readable store name for agent responses	`merchants.name` (join)	Add merchant join to detail endpoint
`barcode`	MEDIUM	UPC/EAN for product identification	`products.barcode`	Add to V2ProductResponse (present in V1, missing from V2)
`product_type`	LOW	Product type classification	`metadata_->>'product_type'`	Surface from metadata
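
As a rough illustration of the citation fields in the table above, the following sketch builds the additive part of a detail response from a product row. The `ProductRow` dataclass, the `citation_fields` helper, and the in-memory `merchant_names` lookup (standing in for the `merchants.name` join) are all hypothetical names, not the project's actual code.

```python
from dataclasses import dataclass
from typing import Dict, Optional


@dataclass
class ProductRow:
    """Hypothetical stand-in for the product columns named in the table."""
    id: int
    name: str
    url: str                      # products.url
    canonical_id: Optional[str]   # products.canonical_id
    barcode: Optional[str]        # products.barcode
    merchant_id: str


def citation_fields(row: ProductRow, merchant_names: Dict[str, str]) -> dict:
    """Return only the new citation fields; existing fields are untouched."""
    return {
        "canonical_id": row.canonical_id,
        "source_url": row.url,
        # Simulates the merchants.name join; None when the merchant is unknown.
        "merchant_name": merchant_names.get(row.merchant_id),
        "barcode": row.barcode,
    }


row = ProductRow(1, "USB-C Cable", "https://example.com/p/1", "canon-123", None, "m-42")
fields = citation_fields(row, {"m-42": "Example Store"})
```

Returning `None` for an unknown merchant or a missing barcode keeps the response shape stable, so an agent can tell "not available" from "field absent".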

Category 2: Data Provenance & Trust (Critical for Citation Quality)

Field	Severity	Description	DB Column	Recommendation
`ingested_at`	HIGH	When product was first scraped/ingested	`products.created_at`	Add as `ingested_at` alias
`scraper_version`	MEDIUM	Which scraper version created this record	`metadata_->>'scraper_version'`	Surface from metadata
`data_confidence`	HIGH	Data quality score (0-1)	Computed	Add computed field
`source_credibility`	MEDIUM	Platform reliability rating	Computed per source	Add enum: verified, trusted, standard
`last_verified`	MEDIUM	When price/availability was last confirmed	`products.last_checked`	Add as alias of last_checked (keep last_checked)
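
A minimal sketch of how the provenance fields above might be derived, assuming timezone-aware timestamps. The `SOURCE_RATINGS` mapping and the `provenance_fields` helper are hypothetical; the audit only states that credibility is computed per source, not how.

```python
from datetime import datetime, timezone
from enum import Enum
from typing import Optional


class SourceCredibility(str, Enum):
    """Enum values taken from the recommendation column."""
    VERIFIED = "verified"
    TRUSTED = "trusted"
    STANDARD = "standard"


# Hypothetical per-source ratings; real values would be a policy decision.
SOURCE_RATINGS = {"example_marketplace": SourceCredibility.VERIFIED}


def provenance_fields(source: str, created_at: datetime,
                      last_checked: Optional[datetime]) -> dict:
    """Map DB columns to the proposed provenance field names."""
    rating = SOURCE_RATINGS.get(source, SourceCredibility.STANDARD)
    return {
        "ingested_at": created_at.isoformat(),        # products.created_at
        "last_verified": last_checked.isoformat() if last_checked else None,
        "source_credibility": rating.value,           # unknown sources: standard
    }


created = datetime(2026, 1, 5, tzinfo=timezone.utc)
prov = provenance_fields("example_marketplace", created, None)
```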

Category 3: Structured Specifications (High Value for Agents)

Field	Severity	Description	DB Column	Recommendation
`specs`	HIGH	Structured product specifications	`products.specs` (JSONB)	Add to V2ProductResponse (currently only in metadata_)
`specifications`	HIGH	Normalized/flattened specs for agents	Computed	Add computed field
`key_features`	MEDIUM	Top 3-5 product highlights	`metadata_->>'highlights'`	Surface from metadata
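
One possible shape for the computed `specifications` field is a flat mapping with dotted keys derived from the nested `specs` JSONB. The sketch below assumes `specs` is a plain nested dict; `flatten_specs` is a hypothetical helper and the dotted-key convention is an assumption, not something the audit prescribes.

```python
def flatten_specs(specs: dict, prefix: str = "") -> dict:
    """Flatten nested spec dicts into dotted keys, e.g. display.size_in."""
    flat = {}
    for key, value in specs.items():
        path = f"{prefix}.{key}" if prefix else str(key)
        if isinstance(value, dict):
            # Recurse into nested objects; empty objects produce no keys.
            flat.update(flatten_specs(value, path))
        else:
            # Scalars and lists are kept as leaf values.
            flat[path] = value
    return flat


nested = {"display": {"size_in": 6.1, "type": "OLED"}, "ram_gb": 8}
specifications = flatten_specs(nested)
```

A flat mapping lets an agent compare one attribute across products without knowing each source's nesting.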

Category 4: Pricing Transparency (Critical for Purchase Decisions)

Field	Severity	Description	DB Column	Recommendation
`original_price`	HIGH	Price before discount	`metadata_->>'original_price'`	Surface explicitly
`discount_pct`	HIGH	Discount percentage	Computed	Add computed field
`shipping_cost`	HIGH	Shipping cost	`metadata_->>'shipping'`	Surface explicitly
`shipping_estimate`	MEDIUM	Delivery time estimate	`metadata_->>'shipping_days'`	Surface explicitly
`free_shipping`	MEDIUM	Boolean free shipping flag	`metadata_->>'free_shipping'`	Add computed boolean
`tax_info`	LOW	Tax applicability	`metadata_->>'tax'`	Add if available
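
The computed `discount_pct` could be derived from `price` and the `original_price` metadata value roughly as follows. This is a sketch under stated assumptions: the metadata value arrives as text (as `->>` returns), one decimal place is enough, and "no discount" is reported as `None`. The helper name and rounding choice are hypothetical.

```python
from decimal import Decimal, InvalidOperation, ROUND_HALF_UP
from typing import Optional


def discount_pct(price: Decimal, original_price_text: Optional[str]) -> Optional[Decimal]:
    """Return the discount as a percentage, or None when it cannot be computed."""
    if original_price_text is None:
        return None
    try:
        original = Decimal(original_price_text)
    except InvalidOperation:
        return None  # unparseable metadata such as "n/a"
    # Reject NaN/Infinity, non-positive originals, and non-discounted prices.
    if not original.is_finite() or original <= 0 or price >= original:
        return None
    pct = (original - price) / original * 100
    return pct.quantize(Decimal("0.1"), rounding=ROUND_HALF_UP)


example = discount_pct(Decimal("75.00"), "100.00")
```

Using `Decimal` rather than `float` avoids binary rounding artifacts in a value an agent may quote to a user.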

Category 5: Agent Decision Support

Field	Severity	Description	Recommendation
`price_trend`	HIGH	30-day price trend (up/down/stable)	Currently computed in V1 only - add to V2
`trust_signals`	HIGH	Platform/method verification status	Add enum: verified_source, community_vetted
`return_policy`	MEDIUM	Return window and conditions	Surface from metadata if available
`promotions`	MEDIUM	Active deals/coupons	Add as array of active promotions
`warranty_info`	LOW	Warranty duration and coverage	Surface from metadata if available
`authenticity`	LOW	Authenticity guarantee	Add boolean flag
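
The audit does not describe how V1 computes `price_trend`, so the following is only an illustrative sketch: it compares the oldest and newest price in the window and treats changes within a threshold as stable. The `price_trend` function signature and the 2% threshold are assumptions.

```python
from typing import Sequence


def price_trend(prices: Sequence[float], threshold: float = 0.02) -> str:
    """Classify a price window (ordered oldest to newest) as up, down, or stable."""
    if len(prices) < 2 or prices[0] <= 0:
        return "stable"  # not enough usable history to call a direction
    change = (prices[-1] - prices[0]) / prices[0]
    if change > threshold:
        return "up"
    if change < -threshold:
        return "down"
    return "stable"


trend = price_trend([100.0, 98.0, 90.0])
```

An endpoint-to-endpoint comparison ignores spikes inside the window; a production version might prefer a median or regression over the 30 days.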

Gap Analysis: V1 vs V2

Field	V1 (ProductResponse)	V2 (V2ProductResponse)	Gap
`barcode`	✅ Yes	❌ No	Regressed in V2
`avg_rating`	✅ Yes	❌ No	Missing in V2
`rating_source`	✅ Yes	❌ No	Missing in V2
`price_trend`	✅ Computed	❌ Not computed	Not implemented in V2
`specs`	In metadata	In metadata	Not surfaced as top-level

Recommended Priority Fixes

Phase 1: Critical (Do First)

Add canonical_id to V2ProductResponse
Fix barcode regression in V2
Add price_trend computation to V2 endpoint
Expose created_at as ingested_at
Add data_confidence computed field

Phase 2: High Value

Add merchant_name via join
Add source_url as explicit citation URL
Surface specs as top-level field
Add original_price, discount_pct, shipping_cost
Add trust_signals enum

Phase 3: Enhancement

Add promotions array
Add return_policy
Add rating_distribution
Add alternative_product_ids

Implementation Notes

Files to Modify:

app/schemas/product.py - Add new fields to V2ProductResponse
app/routers/v2.py:_map_v2_product() - Include new fields
app/routers/products.py - Confirm barcode exposure in V1 and fix if needed (the known regression is in V2)

Migration Risk:

Adding fields is low risk (additive changes)
Changing field names is high risk (breaking change)
Adding computed fields is low risk
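
To illustrate why aliasing is additive while renaming is breaking, the sketch below copies existing values under the new names and leaves the old keys in place. The `add_aliases` helper and the payload shape are hypothetical; the alias pairs come from the recommendations above.

```python
# New field name -> existing field it mirrors (pairs taken from the audit tables).
ALIASES = {"source_url": "buy_url", "last_verified": "last_checked"}


def add_aliases(payload: dict) -> dict:
    """Return a copy of payload with alias keys added; never removes old keys."""
    out = dict(payload)
    for new_key, old_key in ALIASES.items():
        if old_key in payload and new_key not in payload:
            out[new_key] = payload[old_key]
    return out


old = {"id": 1, "buy_url": "https://example.com/p/1", "last_checked": "2026-04-15T00:00:00Z"}
new = add_aliases(old)
```

Existing clients that read `buy_url` or `last_checked` keep working, which is what makes this variant low risk.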

Appendix: Field Severity Ratings

Rating	Definition	Action
CRITICAL	Breaks agent citation or purchase recommendation capability	Fix in Phase 1
HIGH	Significantly degrades agent decision quality	Fix in Phase 1-2
MEDIUM	Reduces answer completeness	Fix in Phase 2-3
LOW	Nice-to-have for specific use cases	Fix in Phase 3

Document generated as part of BUY-2478 audit task