How we score mattresses and beds

Methodology last updated: 12 May 2026

BedBoy is a comparison engine. Every product on the site carries a BedBoy Score: a composite built from keyword analysis of verified-purchase reviews, construction-type benchmarks, retailer policy data, and aggregated customer rating data. This page explains exactly how each input is measured, where the data comes from, and where the model has limits.

What the BedBoy Score is

The headline BedBoy Score visible at the top of every product page is the unweighted mean of every scored category for that product, on a five-point scale, rounded to one decimal place. A 4.2 means the product averages 4.2 out of 5 across the categories we can score. The per-category breakdown on the same page shows exactly where the points came from — and crucially, where the evidence behind each figure was drawn from.
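As an illustration only, the composite described above can be sketched in a few lines of Python. This is a minimal sketch, not the production code: the function name `composite_score`, the use of `None` to represent a "not scored" category, and the example category values are all assumptions made for the example. Python's built-in `round` is used for the one-decimal rounding; how exact ties are rounded is not stated in the methodology.

```python
from typing import Mapping, Optional


def composite_score(category_scores: Mapping[str, Optional[float]]) -> Optional[float]:
    """Unweighted mean of the scored categories, rounded to one decimal place.

    A value of None stands for a category marked "not scored" (an assumed
    representation); such categories are left out of the mean.
    """
    scored = [s for s in category_scores.values() if s is not None]
    if not scored:
        return None  # nothing to average
    return round(sum(scored) / len(scored), 1)


# Illustrative values only, not taken from any real product page.
example = {
    "comfort": 4.5,
    "edge_support": 4.0,
    "motion_isolation": None,  # not scored, so excluded from the mean
    "heat_retention": 3.5,
    "value": 4.0,
    "retailer_guarantee": 4.7,
    "customer_sentiment": 4.5,
}

print(composite_score(example))
```

With these example values the six scored categories sum to 25.2, so the headline figure comes out at 4.2. The unscored category does not drag the mean down, because it is dropped rather than counted as zero.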

The score is an algorithm applied consistently across hundreds of products so the comparison stays fair. It is not the opinion of a single tester and it does not depend on which reviewer happened to leave feedback on a given Tuesday. Every BedBoy Score page includes a per-category evidence panel that shows the signal counts and derivation method behind every figure.

The scoring categories

Categories are segmented by product type. Edge support and motion isolation are mattress concepts; assembly experience and storage are bed-frame concepts; washability is a bedding concept. A product is only scored against categories that actually apply to it. Two categories — retailer guarantee and customer sentiment — are universal and appear on every product.

Mattresses

  1. Comfort. Derived from keyword signals in verified-purchase reviews. The model counts positive mentions (comfortable, supportive, good night’s sleep, and related phrases) against negative mentions (uncomfortable, too firm, too soft, and related phrases). The net signal as a percentage of the review sample determines the score against a published five-point rubric.
  2. Edge support. Where reviewer commentary on edge support is sufficient, the score follows the same keyword-signal method. Where it is sparse, the score falls back to a construction-type benchmark: pocket spring and hybrid builds score higher; open-coil and memory-foam builds score lower. The evidence panel on the product page states which method was used.
  3. Motion isolation. Same dual-source method as edge support. Reviewer commentary from couples and combination sleepers drives the score where available; construction type provides the fallback signal where it does not.
  4. Heat retention. Derived from keyword signals in verified-purchase reviews. Positive mentions (cool, breathable, temperature-neutral) versus negative mentions (hot, sweaty, night sweats) drive the score. Memory-foam-dominant products naturally attract more negative heat signals; airflow-engineered hybrids attract fewer. Gel-infused and cool-tech variants receive an uplift in the construction-type fallback.
  5. Value. Derived from keyword signals in verified-purchase reviews. Positive mentions (worth the money, good value, would buy again) against negative mentions (overpriced, not worth it, poor quality for the price) determine the score. Price data from the retailer listing informs the background context but the reviewer signal is the primary input.
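The keyword-signal method used by the categories above can be sketched as follows. This is a hedged illustration, not the actual model: the phrase lists contain only the comfort examples quoted above, the rubric thresholds and the minimum signal count are hypothetical placeholders (the published rubric values are not reproduced on this page), and counting each review at most once per polarity is an assumption.

```python
import re
from typing import Optional, Sequence, Tuple

# Example phrases from the comfort description; the full curated list is not shown here.
POSITIVE = ("comfortable", "supportive", "good night's sleep")
NEGATIVE = ("uncomfortable", "too firm", "too soft")

# HYPOTHETICAL values for illustration only: (minimum net signal %, score).
HYPOTHETICAL_RUBRIC = ((40.0, 5.0), (15.0, 4.0), (-15.0, 3.0), (-40.0, 2.0))
HYPOTHETICAL_MIN_SIGNALS = 5


def _mentions(text: str, phrases: Sequence[str]) -> bool:
    # Word boundaries stop "comfortable" from matching inside "uncomfortable".
    return any(re.search(r"\b" + re.escape(p) + r"\b", text) for p in phrases)


def count_signals(reviews: Sequence[str]) -> Tuple[int, int]:
    """Count reviews containing at least one positive / negative phrase."""
    # Lower-case the text and normalise curly apostrophes before matching.
    cleaned = [r.lower().replace("\u2019", "'") for r in reviews]
    pos = sum(_mentions(r, POSITIVE) for r in cleaned)
    neg = sum(_mentions(r, NEGATIVE) for r in cleaned)
    return pos, neg


def net_signal_pct(reviews: Sequence[str]) -> float:
    """Net signal (positive minus negative) as a percentage of the review sample."""
    pos, neg = count_signals(reviews)
    return 100.0 * (pos - neg) / len(reviews)


def keyword_score(reviews: Sequence[str]) -> Optional[float]:
    """Map the net signal to a score, or None ("not scored") if evidence is thin."""
    pos, neg = count_signals(reviews)
    if not reviews or pos + neg < HYPOTHETICAL_MIN_SIGNALS:
        return None
    pct = net_signal_pct(reviews)
    for floor, score in HYPOTHETICAL_RUBRIC:
        if pct >= floor:
            return score
    return 1.0


sample = (
    ["Very comfortable and supportive"] * 7
    + ["Too firm for me"]
    + ["Arrived on time"] * 2
)
print(net_signal_pct(sample), keyword_score(sample))
```

In the sample, seven of ten reviews carry a positive phrase and one carries a negative phrase, giving a net signal of 60%. Under the placeholder rubric that maps to 5.0; with the real published thresholds the result could differ.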

Bed frames

  1. Build quality. Sturdy construction; absence of creaks and wobble in real-world use. Keyword signals from owner reviews.
  2. Assembly experience. Whether the flat-pack went together easily, clarity of instructions, missing-parts incidents.
  3. Storage. For ottoman and divan-base frames: lift mechanism, hydraulic struts, usable storage volume. Marked “not scored” on frames with no storage.
  4. Style and finish. Appearance, fabric or upholstery quality, perceived value of the finished frame in the room.
  5. Mattress fit. Whether the buyer’s mattress sits cleanly inside the frame without gaps or sliding.

Headboards

  1. Build quality. Sturdy construction; rigid mounting; absence of wobble against the wall.
  2. Mounting and fit. Ease of attaching to a divan or fixing to the wall; correctness of supplied fittings; sizing for the bed.
  3. Cushioning. Padding depth and comfort for sitting up against the headboard.
  4. Style and finish. Appearance, fabric quality, perceived premium feel.

Bedding (duvets, toppers, throws, sets)

  1. Warmth. Whether the product delivers on its stated temperature promise — TOG ratings on duvets, season suitability on throws.
  2. Filling quality. Loft, plumpness, evenness of fill; resistance to clumping or flattening over time.
  3. Washability. Holds shape and colour through machine washing; shrinkage or fading issues.
  4. Materials. Fabric softness, breathability, perceived quality of cotton, microfibre, or down.

Kids beds

  1. Safety. Absence of pinch points, secure rails, no hazardous gaps or sharp edges.
  2. Sturdiness. Whether the frame stands up to jumping, climbing, and the general wear of children using the bed.
  3. Age fit. Size and design suited to the age range it was bought for.
  4. Style. Appeal to the child; visual fit with their bedroom.

Universal categories (every product type)

  1. Retailer guarantee. Derived from the published trial period, manufacturer warranty term, and return policy of the retailer that lists the product. The score is the average of three equally weighted sub-scores: trial length (5/5 at 100+ nights, 4/5 at 60-99, 3/5 at 30-59, 2/5 at 14-29, 1/5 below 14); warranty term (5/5 at 10+ years, 4/5 at 5-9, 3/5 at 2-4, 2/5 at 1 year, 1/5 below 1 year); and return cover (free comfort exchange scores highest, free returns next, paid collection mid-band, no returns lowest). The evidence panel lists the exact policy terms used.
  2. Customer sentiment. The raw mean star rating across every active verified-purchase review for the product, on the retailer’s native 1-to-5 scale, rounded to one decimal place. Requires at least twenty reviews; below that threshold the category is marked not scored and excluded from the composite. This category is deliberately separate from any keyword-driven category so the headline signal of “what buyers actually rated this product overall” sits alongside the more granular keyword-based categories.
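The retailer guarantee arithmetic above can be sketched like this. The trial and warranty bands follow the thresholds listed in the category description. The numeric values for return cover are hypothetical, since only the ordering of the four tiers is stated, and treating a warranty between one and two years as 2/5 is also an assumption. The function and dictionary names are invented for the example, and the result is left unrounded because no rounding rule is given for this category.

```python
def trial_score(nights: int) -> int:
    """Trial-length sub-score using the stated bands."""
    if nights >= 100:
        return 5
    if nights >= 60:
        return 4
    if nights >= 30:
        return 3
    if nights >= 14:
        return 2
    return 1


def warranty_score(years: float) -> int:
    """Warranty-term sub-score using the stated bands."""
    if years >= 10:
        return 5
    if years >= 5:
        return 4
    if years >= 2:
        return 3
    if years >= 1:  # assumption: anything from 1 up to 2 years counts as "1 year"
        return 2
    return 1


# HYPOTHETICAL numbers: the methodology states only the ordering of these tiers.
HYPOTHETICAL_RETURN_SCORES = {
    "free_comfort_exchange": 5,
    "free_returns": 4,
    "paid_collection": 3,
    "no_returns": 1,
}


def retailer_guarantee(nights: int, years: float, return_cover: str) -> float:
    """Average of the three equally weighted sub-scores (unrounded)."""
    subs = (
        trial_score(nights),
        warranty_score(years),
        HYPOTHETICAL_RETURN_SCORES[return_cover],
    )
    return sum(subs) / len(subs)


print(retailer_guarantee(100, 10, "free_comfort_exchange"))
```

A 100-night trial, a 10-year warranty, and a free comfort exchange each earn the top sub-score, so the example prints 5.0.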

The five-point scoring scale

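Every category is scored on a five-point scale. Rubric-scored categories use whole and half points only; customer sentiment, which is a mean star rating, is reported to one decimal place.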

  • 5 / 5 Excellent. Strongly net-positive reviewer commentary or best-in-class construction benchmark.
  • 4 / 5 Good. Above-average net signal. Most buyers report a positive experience in this category.
  • 3 / 5 Average. Meets the mainstream UK standard for the category. Net signal is roughly neutral.
  • 2 / 5 Below average. Net-negative reviewer commentary or a construction type with below-average benchmark performance.
  • 1 / 5 Poor. Strongly net-negative signal. A category the evidence says to be cautious about.

Where the review sample is too small to produce a reliable signal, the category is marked “not scored” and excluded from the composite mean. The evidence panel on the product page states the exact signal count and the threshold required to generate a score.

Where the data comes from

Every score traces back to one or more of the following sources. We make no claims that exceed the evidence the model has.

  • Verified customer reviews — keyword analysis. Aggregated review text from the retailer’s own product listing. The model scans each review for a curated list of category-specific keywords. Positive and negative keyword counts across the full review sample determine the net signal percentage that maps to the five-point rubric. Drives comfort, heat retention, value, and (where reviewer commentary is sufficient) edge support and motion isolation.
  • Verified customer reviews — rating aggregate. The raw 1-to-5 star rating on every active review for the product, averaged across the full sample and rounded to one decimal place. Drives the customer sentiment category. The minimum sample is twenty reviews; below that the category is marked not scored.
  • Manufacturer specification. Construction type (pocket spring, hybrid, memory foam, latex, open-coil), declared materials, firmness label, and dimension data. Used as a fallback signal for edge support and motion isolation where keyword evidence is limited.
  • Retailer policy data. Published trial nights, manufacturer warranty term, and return policy on file for the retailer that lists the product. Drives the retailer guarantee category. Where no published policy is on file for the retailer, the category is marked not scored.
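For edge support and motion isolation, the choice between the keyword signal and the manufacturer-specification fallback can be sketched as a simple precedence rule. This is an illustrative sketch only: the benchmark numbers are hypothetical (the methodology says only that pocket spring and hybrid builds score higher and open-coil and memory-foam builds score lower), and the function name and return shape are assumptions.

```python
from typing import Optional, Tuple

# HYPOTHETICAL benchmark values; only the higher/lower ordering comes from the methodology.
HYPOTHETICAL_EDGE_BENCHMARK = {
    "pocket spring": 4.0,
    "hybrid": 4.0,
    "open-coil": 2.5,
    "memory foam": 2.5,
}


def edge_support_score(
    keyword_score: Optional[float], construction: Optional[str]
) -> Tuple[Optional[float], str]:
    """Return (score, method) so the method can be shown alongside the figure."""
    # Prefer the reviewer keyword signal whenever it produced a score.
    if keyword_score is not None:
        return keyword_score, "keyword signal"
    # Otherwise fall back to the construction-type benchmark, if one is known.
    benchmark = (
        HYPOTHETICAL_EDGE_BENCHMARK.get(construction) if construction else None
    )
    if benchmark is not None:
        return benchmark, "construction-type benchmark"
    # Neither source is available.
    return None, "not scored"


print(edge_support_score(None, "hybrid"))
```

Returning the method name next to the score mirrors the evidence panel, which states which of the two methods produced each figure.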

Conflict-of-interest disclosure

BedBoy is funded entirely by affiliate commission. When you click through one of our outbound links and buy a product, the retailer pays us a small percentage of the sale. Neither the existence of an affiliate relationship nor the level of the commission affects the BedBoy Score.

Brands cannot pay to appear on BedBoy. Brands cannot pay to be removed. Brands cannot vet or revise our scoring before it goes live. Where a page is paid placement (a sponsored post or advertorial), it is clearly labelled “Sponsored” at the top and bottom and is kept out of the scored programme entirely.

For the full breakdown of how affiliate links work and the networks we use, see our affiliate disclaimer.

Refresh cadence

The keyword signal extraction that drives the BedBoy Score runs automatically on a nightly cycle across the full active catalogue. As new verified-purchase reviews arrive on the retailer sites we import, the review pool grows and the signal counts update. A product that accumulates enough new reviews overnight to cross a scoring threshold will have an updated score the following morning.

Manufacturer specification data (construction type, materials, firmness label) is updated when we become aware of a change. Specification changes are less frequent than changes in review volume, but they directly affect the construction-type fallback signals for edge support and motion isolation.

If a product is discontinued or withdrawn from the market we mark the page accordingly and stop refreshing the score.

Corrections

Factual errors are corrected, and a dated correction note is placed at the top of the page. If a score moves up or down following a refresh, the page records both the new score and the date of the change. We don’t silently rewrite history.

If you spot something that looks wrong, email [email protected] with a link to the page.

Frequently asked questions

What is the BedBoy Score?
The BedBoy Score is a composite rating on a five-point scale, calculated as the unweighted mean of every scored category for that product. For mattresses, seven categories are scored: comfort, edge support, motion isolation, heat retention, value, retailer guarantee, and customer sentiment. The first five are derived from keyword signal analysis of verified-purchase reviews, with construction-type benchmarks as a fallback for categories where reviewer commentary is limited. Retailer guarantee is derived from published retailer policy (trial length, warranty term, return cover). Customer sentiment is the raw mean star rating across the verified-purchase review sample. The per-category breakdown on every product page shows the exact signal counts and derivation method behind each figure.
Does BedBoy physically test every mattress on the site?
No. BedBoy is a comparison engine. Every score on the site is derived from verified-purchase review signals, manufacturer specification data, retailer policy data, and aggregated customer rating data rather than hands-on testing. The model processes hundreds of reviews per product to extract keyword signals for each scoring category. The evidence panel visible on every product page shows exactly how each score was derived so you can judge the quality of the evidence for yourself.
How often is each BedBoy Score refreshed?
The keyword signal extraction, retailer guarantee pass, and customer sentiment pass all run nightly across the full catalogue. As new reviews arrive, the signal counts and the average star rating update and scores can move. A product that crosses a scoring threshold overnight because of new review volume will have an updated score by the following morning. Construction-type specification data and retailer policy data are updated whenever we become aware of a change.
Where does the review data come from?
Verified-purchase reviews from the retailer’s own product listing. For the five keyword-driven categories the model scans each review for a curated list of category-specific positive and negative keywords; the net signal as a percentage of the review sample maps to a five-point score against a published rubric. For customer sentiment the raw 1-to-5 star rating on each active review is averaged across the full sample. Where the review sample is too small to produce a reliable signal the category is marked “not scored” and excluded from the composite.
How is the retailer guarantee category scored?
Retailer guarantee is the average of three equally weighted sub-scores. Trial length: 5/5 at 100+ nights, 4/5 at 60-99, 3/5 at 30-59, 2/5 at 14-29, 1/5 below 14. Warranty term: 5/5 at 10+ years, 4/5 at 5-9, 3/5 at 2-4, 2/5 at 1 year, 1/5 below 1 year. Return cover: free comfort exchange scores highest, free returns next, paid collection mid-band, no returns lowest. The evidence panel on every product page lists the exact trial length, warranty term, and return policy used in the calculation. Where no published policy is on file for the retailer, the category is marked not scored.
How is the customer sentiment category scored?
Customer sentiment is the raw mean star rating across every active verified-purchase review on the retailer’s product listing, on a 1-to-5 scale, rounded to one decimal place. The threshold for scoring is a minimum of twenty reviews; below that the category is marked not scored and excluded from the composite. This category sits alongside the keyword-driven comfort score so the headline signal of “what buyers actually rated this product” is visible separately from the more granular per-attribute commentary.
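
As a rough illustration of the customer sentiment rule, here is a minimal Python sketch. The function name and the use of `None` for "not scored" are assumptions made for the example; the twenty-review minimum and the one-decimal rounding come from the answer above. Python's built-in `round` is used, and how exact ties are rounded is not specified in the methodology.

```python
from typing import Optional, Sequence

MIN_REVIEWS = 20  # minimum sample stated in the methodology


def customer_sentiment(star_ratings: Sequence[int]) -> Optional[float]:
    """Mean star rating to one decimal place, or None when under the minimum."""
    if len(star_ratings) < MIN_REVIEWS:
        return None  # marked "not scored" and excluded from the composite
    return round(sum(star_ratings) / len(star_ratings), 1)


# Illustrative ratings only: sixteen 5-star and four 3-star reviews.
print(customer_sentiment([5] * 16 + [3] * 4))
print(customer_sentiment([5] * 19))
```

The first call has exactly twenty ratings averaging 4.6, so it is scored. The second has nineteen ratings and returns `None`, however high those ratings are.
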
Does affiliate commission affect the BedBoy Score?
No. BedBoy is funded by affiliate commission on outbound retailer links, but neither the existence of an affiliate relationship nor the level of the commission affects the score. Brands cannot pay to appear on BedBoy, cannot pay to be removed, and cannot vet or revise the score before it goes live. Sponsored content is labelled and kept out of the scored programme entirely.
Why does a product show “not scored” for some categories?
A category is marked “not scored” when the underlying evidence is too thin to produce a reliable figure. For comfort, value, and heat retention that usually means the review sample is below the minimum signal count. For edge support and motion isolation it means both the keyword signal and the construction-type fallback are unavailable. For retailer guarantee it means we have no published policy on file for the retailer. For customer sentiment it means fewer than twenty reviews. We exclude these categories from the composite rather than assign a default score that would mislead the comparison.