The Gap Between Product Data and Agent-Ready Data
There's a significant gap between "we have product data" and "AI agents can actually use this data effectively." Most product APIs were designed for human consumption: rich, detailed, beautifully formatted responses that look great in a browser but cost agents too many tokens and add parsing complexity.
For example, a typical e-commerce API response for a pair of wireless headphones might include:
- 15 fields of marketing copy
- 3 image URLs (hero, thumbnail, alt views)
- HTML-formatted descriptions
- Nested arrays of specifications
- Related product suggestions
- User-generated reviews with avatars
For a human, this is great. For an AI agent paying per token, this is expensive noise.
What Agents Actually Need
AI agents working with product data typically need to:
- Find products matching a query
- Compare prices across sources
- Get structured details for decision-making
- Generate buy links for purchases
That's it. The rest is nice-to-have, not must-have.
Token Efficiency
Agents pay per token. A 10-result search response from a traditional e-commerce API might consume 600-1000 tokens in metadata alone. An agent-native API optimizes for this:
// Traditional API (estimated 800 tokens)
{
  "product": {
    "id": "PRD-12345",
    "name": "Sony WH-1000XM5 Wireless Noise Cancelling Headphones - Midnight Black",
    "tagline": "Industry-leading noise cancellation with premium sound quality",
    "description": "<p>The Sony WH-1000XM5 represents the pinnacle...",
    "images": {
      "primary": "https://cdn.example.com/hero.jpg",
      "thumbnail": "https://cdn.example.com/thumb.jpg",
      "gallery": ["https://cdn.example.com/img1.jpg", "..."]
    },
    "specifications": [...],
    "reviews": {...},
    ...
  }
}
// Agent-native API (estimated 150 tokens)
{
  "id": "bw_prod_8823",
  "name": "Sony WH-1000XM5 Wireless Headphones",
  "price_sgd": 398.00,
  "source": "shopee_sg",
  "buy_url": "https://shopee.sg/sony-wh-1000xm5",
  "in_stock": true,
  "rating": 4.8
}
The agent gets the fields it needs to act on, at roughly one fifth of the token cost by the estimates above.
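To get a feel for the difference on your own payloads, a rough size comparison can be sketched as below. This is a minimal sketch, not a real tokenizer: it assumes about four characters per token, which is only a common rule of thumb, and the names estimate_tokens, rich_response, and lean_response are made up for illustration. The sample payloads reuse the fields shown above, with the elided parts left out.

```python
import json


def estimate_tokens(payload: dict) -> int:
    """Very rough token estimate: assumes ~4 characters per token (heuristic only)."""
    text = json.dumps(payload, separators=(",", ":"))
    return max(1, len(text) // 4)


# Abbreviated version of the traditional response shown above.
rich_response = {
    "product": {
        "id": "PRD-12345",
        "name": "Sony WH-1000XM5 Wireless Noise Cancelling Headphones - Midnight Black",
        "tagline": "Industry-leading noise cancellation with premium sound quality",
        "description": "<p>The Sony WH-1000XM5 represents the pinnacle...",
        "images": {
            "primary": "https://cdn.example.com/hero.jpg",
            "thumbnail": "https://cdn.example.com/thumb.jpg",
            "gallery": ["https://cdn.example.com/img1.jpg"],
        },
    }
}

# The agent-native response shown above.
lean_response = {
    "id": "bw_prod_8823",
    "name": "Sony WH-1000XM5 Wireless Headphones",
    "price_sgd": 398.00,
    "source": "shopee_sg",
    "buy_url": "https://shopee.sg/sony-wh-1000xm5",
    "in_stock": True,
    "rating": 4.8,
}

if __name__ == "__main__":
    print("rich:", estimate_tokens(rich_response))
    print("lean:", estimate_tokens(lean_response))
```

For real budgeting, swap the heuristic for the tokenizer of the model your agent actually uses; the counts will differ, but the comparison works the same way.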
Core Principles of Agent-Native Product Data
1. Fixed Schema, Always
Agents need to know what fields exist. A response that sometimes includes price and sometimes includes price_sgd (or price_cents, or min_price) creates fragile parsing logic.
# Agents need consistent field names
def parse_product(response):
    return {
        "id": response["id"],
        "name": response["name"],
        "price_sgd": response["price_sgd"],  # Always SGD
        "buy_url": response["buy_url"],
        "in_stock": response["in_stock"]  # Always boolean
    }
No surprises. Same schema every time.
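On the consuming side, a fixed schema is cheap to verify. The following is a minimal sketch of such a check, assuming the five fields used in the parser above; REQUIRED_FIELDS and validate_product are hypothetical names, not part of any published client.

```python
from typing import List

# Field name -> accepted Python type(s), based on the fields used in this article.
REQUIRED_FIELDS = {
    "id": str,
    "name": str,
    "price_sgd": (int, float),
    "buy_url": str,
    "in_stock": bool,
}


def validate_product(record: dict) -> List[str]:
    """Return a list of schema problems; an empty list means the record is valid."""
    problems = []
    for field, expected in REQUIRED_FIELDS.items():
        if field not in record:
            problems.append(f"missing field: {field}")
            continue
        value = record[field]
        # bool is a subclass of int in Python, so reject it for non-boolean fields.
        if expected is not bool and isinstance(value, bool):
            problems.append(f"wrong type for {field}: bool")
        elif not isinstance(value, expected):
            problems.append(f"wrong type for {field}: {type(value).__name__}")
    return problems
```

An agent can call this once per record and skip or flag anything that returns a non-empty list, instead of scattering defensive checks through its logic.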
2. Commerce Signals, Not Just Data
Product data for agents needs to include commerce-ready signals:
{
  "confidence_score": 0.94,
  "availability_prediction": "likely_in_stock",
  "price_trend": "stable",
  "affiliate_url": "https://affiliate.buywhere.ai/track/8823"
}
- confidence_score tells the agent how reliable this data is
- availability_prediction goes beyond current stock to forecast availability
- price_trend helps agents advise on timing
- affiliate_url enables commission-bearing purchases
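One way an agent might act on these signals is sketched below. This is an illustration under stated assumptions: the function purchase_advice, the 0.8 confidence threshold, and the "falling" value for price_trend are all hypothetical; only "stable" and "likely_in_stock" appear in the example above.

```python
def purchase_advice(signals: dict, min_confidence: float = 0.8) -> str:
    """Map commerce signals to one of: 'verify', 'wait', 'buy_now'."""
    # Low confidence: the agent should re-check the listing before acting.
    if signals.get("confidence_score", 0.0) < min_confidence:
        return "verify"
    # "falling" is an assumed price_trend value, not one shown in the article.
    if signals.get("price_trend") == "falling":
        return "wait"
    return "buy_now"


if __name__ == "__main__":
    example = {
        "confidence_score": 0.94,
        "availability_prediction": "likely_in_stock",
        "price_trend": "stable",
    }
    print(purchase_advice(example))
```

The point is less the specific rules than the shape: because the signals arrive as plain fields, the decision logic stays a few lines long.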
3. Normalized Across Sources
When you search for "iPhone 15 Pro" across Shopee, Lazada, and Amazon.sg, you should get the same product matched across platforms—not different iPhone listings that happen to have similar names.
{
  "matched_products": [
    {
      "id": "bw_prod_12345",
      "sources": ["shopee_sg", "lazada_sg", "amazon_sg"],
      "price_range_sgd": { "min": 1249, "max": 1399 },
      "lowest_price": {
        "source": "shopee_sg",
        "price_sgd": 1249.00
      }
    }
  ]
}
The agent doesn't need to know how to match products—that's the catalog's job.
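With matching done upstream, the agent-side code reduces to reading fields. A minimal sketch, assuming the response shape shown above (best_offers and the max_saving_sgd key are names invented for this example):

```python
from typing import List


def best_offers(response: dict) -> List[dict]:
    """Summarize each matched product as its cheapest offer plus the price spread."""
    offers = []
    for match in response.get("matched_products", []):
        lowest = match["lowest_price"]
        price_range = match["price_range_sgd"]
        offers.append({
            "id": match["id"],
            "source": lowest["source"],
            "price_sgd": lowest["price_sgd"],
            # How much the cheapest source saves versus the most expensive one.
            "max_saving_sgd": price_range["max"] - lowest["price_sgd"],
        })
    return offers
```

There is no fuzzy name matching or per-marketplace logic here, which is the practical benefit of normalizing in the catalog.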
4. Error Recovery Signals
When something goes wrong, agents need actionable errors:
{
  "error": {
    "code": "RATE_LIMITED",
    "message": "Too many requests",
    "retry_after_seconds": 60,
    "suggestion": "Reduce request frequency or upgrade to Pro tier"
  }
}
Not just "error occurred" but "here's what happened and what to do next."
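A structured error like this lets the agent recover without guessing. The sketch below shows one possible retry loop, assuming the error body shape above; call_with_retry and its parameters are hypothetical, and fetch stands in for whatever function performs the request and returns the decoded JSON body.

```python
import time
from typing import Callable


def call_with_retry(
    fetch: Callable[[], dict],
    max_attempts: int = 3,
    sleep: Callable[[float], None] = time.sleep,
) -> dict:
    """Call fetch(), waiting retry_after_seconds and retrying on RATE_LIMITED errors."""
    for attempt in range(1, max_attempts + 1):
        body = fetch()
        error = body.get("error")
        if error is None:
            return body
        # Only rate limiting is treated as retryable in this sketch.
        if error.get("code") != "RATE_LIMITED" or attempt == max_attempts:
            raise RuntimeError(
                f"{error.get('code', 'UNKNOWN')}: {error.get('message', '')}"
            )
        sleep(error.get("retry_after_seconds", 1))
    raise ValueError("max_attempts must be at least 1")
```

Passing sleep as a parameter keeps the function easy to test, since a test can substitute a recorder for the real delay.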
Implementation: Building an Agent-Native Response Handler
import requests
from dataclasses import dataclass
from typing import Optional, List


class APIError(Exception):
    """Base class for errors reported by the API."""


class AuthError(APIError):
    """The API key was rejected."""


class RateLimitError(APIError):
    """Too many requests; retry_after holds the suggested wait in seconds."""

    def __init__(self, retry_after: int):
        super().__init__(f"Rate limited, retry after {retry_after}s")
        self.retry_after = retry_after


@dataclass
class Product:
    id: str
    name: str
    price_sgd: float
    source: str
    buy_url: str
    in_stock: bool
    confidence_score: float
    affiliate_url: Optional[str] = None

    @classmethod
    def from_api_response(cls, data: dict) -> "Product":
        return cls(
            id=data["id"],
            name=data["name"],
            price_sgd=data["price_sgd"],
            source=data["source"],
            buy_url=data["buy_url"],
            in_stock=data["in_stock"],
            # Optional signals fall back to neutral defaults
            confidence_score=data.get("confidence_score", 0.5),
            affiliate_url=data.get("affiliate_url")
        )


class BuyWhereClient:
    def __init__(self, api_key: str):
        self.api_key = api_key
        self.base_url = "https://api.buywhere.ai/v2"

    def search(self, query: str, limit: int = 10) -> List[Product]:
        response = requests.get(
            f"{self.base_url}/products",
            headers={"Authorization": f"Bearer {self.api_key}"},
            params={"q": query, "region": "sg", "limit": limit},
            timeout=10  # Avoid hanging the agent on a stalled connection
        )
        if response.status_code != 200:
            self._handle_error(response)  # Always raises
        items = response.json().get("items", [])
        return [Product.from_api_response(item) for item in items]

    def find_cheapest(self, query: str) -> Optional[Product]:
        products = self.search(query, limit=20)
        in_stock = [p for p in products if p.in_stock]
        if not in_stock:
            return None
        return min(in_stock, key=lambda p: p.price_sgd)

    def _handle_error(self, response):
        try:
            error = response.json().get("error", {})
        except ValueError:
            # Body was not JSON (for example, a proxy error page)
            error = {}
        code = error.get("code", "UNKNOWN")
        if code == "RATE_LIMITED":
            raise RateLimitError(error.get("retry_after_seconds", 60))
        elif code == "INVALID_API_KEY":
            raise AuthError("Check your API key")
        else:
            raise APIError(f"{code}: {error.get('message', 'Unknown error')}")


# Usage
client = BuyWhereClient("bw_live_xxxxx")
cheapest = client.find_cheapest("sony wh-1000xm5")
if cheapest:
    print(f"Best price: {cheapest.name} at S${cheapest.price_sgd}")
    print(f"Buy link: {cheapest.buy_url}")
The Trade-offs
Agent-native design isn't free. It involves explicit choices:
What you lose:
- Rich marketing copy and imagery
- Deep product specifications
- User reviews and ratings breakdown
- Related products and recommendations
What you gain:
- Predictable parsing
- Lower token costs
- Faster response times
- Reliable agent behavior
For shopping agents focused on price comparison and purchase routing, the trade-off makes sense. For content generation or marketing applications, you'd want richer data.
Schema.org Compatibility
Agent-native doesn't mean non-standard. BuyWhere responses follow Schema.org Product vocabulary:
{
  "@context": "https://schema.org",
  "@type": "Product",
  "name": "Sony WH-1000XM5 Wireless Headphones",
  "brand": {"@type": "Brand", "name": "Sony"},
  "offers": {
    "@type": "Offer",
    "priceCurrency": "SGD",
    "price": "398.00",
    "availability": "https://schema.org/InStock"
  }
}
This means agents can use standard Schema.org parsing logic and still get agent-native efficiency.
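As an illustration of what that parsing can look like, here is a minimal sketch that flattens a Schema.org Product document like the one above into a compact record. It assumes "offers" is a single Offer object, as in the example (Schema.org also allows a list of offers); flatten_schema_org_product and its output keys are names chosen for this example.

```python
def flatten_schema_org_product(doc: dict) -> dict:
    """Reduce a Schema.org Product (with a single Offer) to a flat record."""
    offer = doc.get("offers", {})
    price = offer.get("price")
    return {
        "name": doc["name"],
        "brand": doc.get("brand", {}).get("name"),
        "currency": offer.get("priceCurrency"),
        # Schema.org prices are often strings, so convert to a number here.
        "price": float(price) if price is not None else None,
        "in_stock": offer.get("availability") == "https://schema.org/InStock",
    }


if __name__ == "__main__":
    doc = {
        "@context": "https://schema.org",
        "@type": "Product",
        "name": "Sony WH-1000XM5 Wireless Headphones",
        "brand": {"@type": "Brand", "name": "Sony"},
        "offers": {
            "@type": "Offer",
            "priceCurrency": "SGD",
            "price": "398.00",
            "availability": "https://schema.org/InStock",
        },
    }
    print(flatten_schema_org_product(doc))
```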
Getting Started
If you're building an AI agent that works with products, the data layer matters as much as the model layer. Clean, consistent, token-efficient product data enables reliable agent behavior.
Start with a clean API, not a rich one. You can always add complexity later if needed.
Get an API key at api.buywhere.ai and focus on what your agent should do with the data—not how to parse it.