Score and rank Amazon search results using a statistically valid Bayesian system that accounts for average rating, review count, and star distribution.
Use this skill when the user asks to score, rank, or compare Amazon products from a search results page open in their browser.
- Browser must be open with an Amazon search results page
- Chrome DevTools MCP server must be connected
The scoring system uses a Dirichlet-Multinomial Bayesian model, combining the two ideas behind Reddit's "best" ranking (a lower confidence bound) and IMDb's Top 250 (a Bayesian prior).
A single formula naturally handles three requirements:
- Average rating: higher average = higher score
- Number of reviews: more reviews = lower standard error = score stays close to the true mean
- Star distribution: tight/consistent distributions have lower variance = higher score; bimodal distributions (lots of 5s and 1s) get penalized
A Dirichlet prior of alpha=2 per star level adds 10 "phantom reviews" (2 at each star). With few real reviews, the prior dominates and pulls the score toward 3.0. With hundreds of reviews, the prior is negligible. This is why 2x 5-star reviews score much lower than 100x 5-star reviews.
For each product:
1. counts[s] = (histogram_pct[s] / 100) * total_reviews + alpha (for s = 1..5)
2. N_adj = sum(counts)
3. mean = sum(s * counts[s] / N_adj) (posterior mean)
4. variance = sum((s - mean)^2 * counts[s] / N_adj) (posterior variance)
5. SE = sqrt(variance / N_adj) (standard error of the mean)
6. SCORE = mean - 1.65 * SE (90% lower confidence bound)
Parameters:
- alpha = 2 (prior pseudo-counts per star level)
- z = 1.65 (90% one-tailed confidence z-score)
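The six steps above can be sketched as a standalone function (`scoreProduct` is a hypothetical helper name; the histogram is given as per-star percentages, as on Amazon product pages):

```javascript
// Minimal sketch of steps 1-6. `histPct` maps star level (1-5) to its
// percentage of total reviews; alpha and z match the parameters above.
function scoreProduct(histPct, totalReviews, alpha = 2, z = 1.65) {
  const counts = {};
  for (let s = 1; s <= 5; s++) counts[s] = (histPct[s] / 100) * totalReviews + alpha;
  const nAdj = Object.values(counts).reduce((a, b) => a + b, 0);
  let mean = 0;
  for (let s = 1; s <= 5; s++) mean += s * (counts[s] / nAdj);
  let variance = 0;
  for (let s = 1; s <= 5; s++) variance += (s - mean) ** 2 * (counts[s] / nAdj);
  const se = Math.sqrt(variance / nAdj);
  return mean - z * se; // 90% lower confidence bound
}

// The prior in action: identical all-5-star histograms, differing only in count.
const few = scoreProduct({1: 0, 2: 0, 3: 0, 4: 0, 5: 100}, 2);   // prior dominates
const many = scoreProduct({1: 0, 2: 0, 3: 0, 4: 0, 5: 100}, 100); // prior negligible
```

With 2 reviews the 10 phantom reviews pull the score below 3; with 100 reviews it stays near 5, which is exactly the behaviour the prior is there to produce.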
After collecting histograms, run a suspicion analysis using 7 statistical signals. Each signal adds to a cumulative suspicion score; products are then classified as HIGH (>=40), MEDIUM (>=20), or LOW (>0) risk.
- Missing middle — organic dissatisfaction spreads across 2, 3, and 4 stars. If those combined are under 10%, it suggests the middle has been artificially hollowed out. Weight: (10 - middle%) * 3
- Unnatural 5-star concentration — 95%+ five-star is almost never organic. 85%+ with under 100 reviews is also suspicious. Weight: 40 (>=95%) or 20 (>=85%, <100 reviews)
- Zero 1-star with 50+ reviews — statistically improbable; even great products attract the odd unhappy buyer. Weight: 15
- No 2-star or 3-star at all (20+ reviews) — real unhappy customers leave a range of negative ratings, not just 1-star. Weight: 20
- Low distribution entropy — Shannon entropy measures how spread out ratings are. Fake reviews cluster tightly (low entropy <0.8; organic typically >1.2; max is 2.32). Weight: 25 (<0.8) or 10 (<1.0 with 30+ reviews)
- 5-star rate far above category average — compute the weighted average 5-star % across all products in the search. Products 15+ points above that are unusual. Weight: deviation - 10
- 1-star cliff — high 1-star (>=8%) but almost no 2-star (<=1%) with 30+ reviews suggests competitor attack reviews rather than organic dissatisfaction. Weight: 15
- HIGH risk: likely manipulated — present these separately, do not recommend
- MEDIUM risk: worth scrutinising — flag in the ranking table but don't exclude
- LOW risk: minor anomaly — note but don't penalise
- CLEAN: no flags triggered
Note: low entropy can also indicate a genuinely excellent or genuinely terrible product. MEDIUM flags on high-volume products (500+ reviews) are less concerning than on low-volume ones. Always recommend the user manually check review text for high-scoring products that flag MEDIUM+.
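The entropy signal can be checked in isolation. This sketch (with illustrative histograms, not real data) computes Shannon entropy over the star distribution the same way the analysis script does:

```javascript
// Shannon entropy of a star-rating histogram given as percentages.
// Maximum is log2(5) ~= 2.32 (perfectly uniform); tight clusters fall below 0.8.
function histEntropy(histPct) {
  let entropy = 0;
  for (let s = 1; s <= 5; s++) {
    const x = histPct[s] / 100;
    if (x > 0) entropy -= x * Math.log2(x);
  }
  return entropy;
}

// Illustrative distributions (hypothetical, not real products):
const clustered = histEntropy({1: 0, 2: 0, 3: 1, 4: 2, 5: 97});  // ~0.22, would flag
const organic = histEntropy({1: 5, 2: 5, 3: 10, 4: 25, 5: 55});  // ~1.74, typical
```

Note how the 97%-five-star histogram sits far below the 0.8 threshold while a plausible organic spread clears 1.2 comfortably.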
Run this JavaScript via evaluate_script on the Amazon search results page:
() => {
const allDivs = document.querySelectorAll('div[data-asin]');
const seen = new Set();
const results = [];
allDivs.forEach((div) => {
const asin = div.getAttribute('data-asin');
if (!asin || asin === '' || seen.has(asin)) return;
seen.add(asin);
const titleEl = div.querySelector('h2');
const ratingEl = div.querySelector('span.a-icon-alt');
const reviewSpan = div.querySelector('span.a-size-mini.puis-normal-weight-text');
let reviewCount = null;
if (reviewSpan) {
const match = reviewSpan.textContent.trim().match(/\(?([\d,]+)\)?/);
if (match) reviewCount = parseInt(match[1].replace(/,/g, ''));
}
const priceEl = div.querySelector('.a-price .a-offscreen');
if (titleEl && ratingEl) {
const ratingText = ratingEl.textContent.trim();
const ratingMatch = ratingText.match(/([\d.]+)/);
results.push({
asin,
title: titleEl.textContent.trim().substring(0, 100),
rating: ratingMatch ? parseFloat(ratingMatch[1]) : null,
reviewCount,
price: priceEl ? priceEl.textContent.trim() : null
});
}
});
return results;
}

For each product ASIN, fetch its product page and extract the histogram. Batch the ASINs in groups of ~20 per call to avoid overwhelming the browser:
async () => {
const asins = [/* array of ASINs from step 1 */];
const results = {};
for (const asin of asins) {
try {
const resp = await fetch(`/dp/${asin}/`);
const html = await resp.text();
const matches = [...html.matchAll(/aria-label="(\d+) percent of reviews have (\d) star/g)];
if (matches.length >= 5) {
const hist = {};
matches.forEach(m => { hist[parseInt(m[2])] = parseInt(m[1]); });
results[asin] = hist;
}
} catch(e) { }
}
return results;
}

Run after Step 2. Computes the category-average distribution as a baseline, then checks each product against the 7 signals. Pass the same merged product array.
() => {
const products = [/* merged data from steps 1 and 2 */];
// Compute category-average 5-star rate as baseline
let totalReviews = 0;
const avgHist = {1:0,2:0,3:0,4:0,5:0};
products.forEach(p => {
totalReviews += p.reviews;
for (let s = 1; s <= 5; s++) avgHist[s] += p.hist[s] * p.reviews;
});
for (let s = 1; s <= 5; s++) avgHist[s] /= totalReviews;
const avgTotal = Object.values(avgHist).reduce((a,b) => a+b, 0);
for (let s = 1; s <= 5; s++) avgHist[s] = Math.round(avgHist[s] / avgTotal * 100);
return products.map(p => {
const signals = [];
let sus = 0;
const middle = p.hist[2] + p.hist[3] + p.hist[4];
if (middle < 10) { signals.push(`Missing middle: ${middle}% in 2-4 stars`); sus += (10 - middle) * 3; }
if (p.hist[5] >= 95) { signals.push(`${p.hist[5]}% five-star`); sus += 40; }
else if (p.hist[5] >= 85 && p.reviews < 100) { signals.push(`${p.hist[5]}% five-star with ${p.reviews} reviews`); sus += 20; }
if (p.hist[1] === 0 && p.reviews >= 50) { signals.push(`Zero 1-star across ${p.reviews} reviews`); sus += 15; }
if (p.hist[2] === 0 && p.hist[3] === 0 && p.reviews >= 20) { signals.push(`No 2 or 3-star reviews`); sus += 20; }
let entropy = 0;
for (let s = 1; s <= 5; s++) { const x = p.hist[s]/100; if (x > 0) entropy -= x * Math.log2(x); }
if (entropy < 0.8) { signals.push(`Very low entropy: ${entropy.toFixed(2)}`); sus += 25; }
else if (entropy < 1.0 && p.reviews >= 30) { signals.push(`Low entropy: ${entropy.toFixed(2)}`); sus += 10; }
const dev = p.hist[5] - avgHist[5];
if (dev > 15) { signals.push(`5-star rate ${dev}pp above category avg`); sus += dev - 10; }
if (p.hist[1] >= 8 && p.hist[2] <= 1 && p.reviews >= 30) { signals.push(`1-star cliff: ${p.hist[1]}% vs ${p.hist[2]}% two-star`); sus += 15; }
return {
asin: p.asin, title: p.title, reviews: p.reviews,
entropy: parseFloat(entropy.toFixed(2)),
suspicionScore: sus, signals,
risk: sus >= 40 ? 'HIGH' : sus >= 20 ? 'MEDIUM' : sus > 0 ? 'LOW' : 'CLEAN'
};
}).filter(p => p.suspicionScore > 0).sort((a, b) => b.suspicionScore - a.suspicionScore);
}

Bayesian scoring: run on the same merged product array, applying the formula above with alpha = 2 and z = 1.65.

() => {
const products = [/* merged data from steps 1 and 2 */];
const ALPHA = 2;
const Z = 1.65;
const scored = products.map(p => {
const n = p.reviews || 1;
const counts = {};
for (let s = 1; s <= 5; s++) {
counts[s] = (p.hist[s] / 100) * n + ALPHA;
}
const totalAdj = Object.values(counts).reduce((a, b) => a + b, 0);
let mean = 0;
for (let s = 1; s <= 5; s++) mean += s * (counts[s] / totalAdj);
let variance = 0;
for (let s = 1; s <= 5; s++) variance += (s - mean) ** 2 * (counts[s] / totalAdj);
const se = Math.sqrt(variance / totalAdj);
const score = mean - Z * se;
const satisfaction = ((counts[4] + counts[5]) / totalAdj) * 100;
return {
title: p.title, price: p.price, reviews: n,
avgRating: p.rating,
posteriorMean: Math.round(mean * 100) / 100,
score: Math.round(score * 1000) / 1000,
satisfaction: Math.round(satisfaction),
defectRate: Math.round((counts[1] / totalAdj) * 100)
};
});
scored.sort((a, b) => b.score - a.score);
return scored;
}

To build product links, collect each top-ranked ASIN's href from the search results page:

() => {
const asins = [/* top ASIN list */];
const results = {};
document.querySelectorAll('div[data-asin]').forEach(card => {
const asin = card.getAttribute('data-asin');
if (asins.includes(asin) && !results[asin]) {
for (const a of card.querySelectorAll('a')) {
const href = a.getAttribute('href') || '';
if (href.includes('/dp/')) { results[asin] = href; break; }
}
}
});
return results;
}

Construct full URLs: https://www.amazon.co.uk + the returned href path (strip query params for cleanliness).
Present results as a ranked table with columns: Rank, Score, Adjusted Avg, Reviews, Satisfaction%, Risk, Price, Product Name.
Always include:
- Top N overall (by score) — exclude HIGH risk products from recommendations
- Top N value picks (high score + low price)
- Flagged products — separate table listing HIGH and MEDIUM risk products with their signals
- Brief explanation of why high-rated low-review products rank lower
- Note if any top-ranked products have MEDIUM risk flags, and recommend the user check the review text manually
- The `aria-label` pattern for histogram extraction works on `.co.uk` and `.com` — the format is "X percent of reviews have Y star"
- Products with < 5 matched histogram entries may have a different page layout; skip or flag them
- Deduplicate by ASIN before scoring (Amazon shows the same product in multiple slots)
- The review count on the search page sometimes differs from the product page; the search page count is sufficient for scoring
- This approach works for any Amazon product category