How we score mattresses and beds
Methodology v1.3, last updated 14 May 2026. The changelog is at the end of this page.
BedBoy is a UK affiliate comparison engine, not a test lab. Every product on the site carries a BedBoy Score: a composite built from keyword analysis of customer reviews on the retailer’s own listing, construction-type benchmarks, retailer policy data, and aggregated star-rating data. A small subset of pages also carry a tier badge (Home tested or Store tested) where an editor has handled or slept on the product; most pages carry no badge and are scored from review and specification data alone, and the page states which method was used. This page explains how each input is measured, where the data comes from, and where the model has limits.
What the BedBoy Score is
The headline BedBoy Score at the top of every product page is the unweighted mean of every scored category for that product, on a five-point scale, rounded to one decimal place. A 4.2 means the product averages 4.2 out of 5 across the categories we can score. The per-category breakdown on the same page shows where the points came from and which evidence sits behind each figure.
The score is an algorithm applied consistently across hundreds of products so the comparison stays fair. It is not the opinion of a single tester and it does not depend on which reviewer happened to leave feedback on a given Tuesday. Every BedBoy Score page includes a per-category evidence panel that shows the signal counts and derivation method behind every figure.
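As a rough illustration, the composite can be sketched in a few lines of Python. The function name, the category keys, and the use of None for "not scored" are assumptions made for this example, and half-up rounding is assumed because the methodology only specifies "one decimal place".

```python
from decimal import Decimal, ROUND_HALF_UP
from typing import Optional


def bedboy_score(category_scores: dict[str, Optional[float]]) -> Optional[float]:
    """Unweighted mean of the scored categories, rounded to one decimal place.

    None stands for a category marked "not scored"; it is left out of
    the mean rather than counted as zero.
    """
    scored = [Decimal(str(s)) for s in category_scores.values() if s is not None]
    if not scored:
        return None  # nothing to average
    mean = sum(scored) / len(scored)
    # Rounding mode is an assumption (half-up).
    return float(mean.quantize(Decimal("0.1"), rounding=ROUND_HALF_UP))


# Illustrative figures only, not taken from a real product page.
example = {
    "comfort": 4.5,
    "edge_support": 4.0,
    "motion_isolation": 3.5,
    "heat_retention": None,  # not scored: too few signals
    "value": 4.0,
    "retailer_guarantee": 5.0,
    "customer_sentiment": 4.3,
}
print(bedboy_score(example))  # 4.2
```

Six categories are scored here, so the mean is 25.3 / 6, which rounds to 4.2. The unscored category changes the divisor, not the total.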
The scoring categories
Categories are segmented by product type. Edge support and motion isolation are mattress concepts; assembly experience and storage are bed-frame concepts; washability is a bedding concept. A product is only scored against categories that actually apply to it. Two categories (retailer guarantee and customer sentiment) are universal and appear on every product.
Mattresses
- Comfort. Derived from keyword signals across the imported review pool. The model counts positive mentions (comfortable, supportive, good night’s sleep, and related phrases) against negative mentions (uncomfortable, too firm, too soft, and related phrases). The net signal as a percentage of the review sample determines the score against a published five-point rubric.
- Edge support. Where reviewer commentary on edge support is sufficient, the score follows the same keyword-signal method. Where it is sparse, the score falls back to a construction-type benchmark: pocket spring and hybrid builds score higher; open-coil and memory-foam builds score lower. The evidence panel on the product page states which method was used.
- Motion isolation. Same dual-source method as edge support. Reviewer commentary from couples and combination sleepers drives the score where available; construction type provides the fallback signal where it does not.
- Heat retention. Same keyword-signal method applied to temperature commentary in the review pool. Positive mentions (cool, breathable, temperature-neutral) versus negative mentions (hot, sweaty, night sweats) drive the score. Memory-foam-dominant products naturally attract more negative heat signals; airflow-engineered hybrids attract fewer. Gel-infused and cool-tech variants receive an uplift in the construction-type fallback.
- Value. Driven by reviewer commentary on price-against-experience. Positive mentions (worth the money, good value, would buy again) against negative mentions (overpriced, not worth it, poor quality for the price) determine the score. Price data from the retailer listing informs the background context but the reviewer signal is the primary input.
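The keyword-signal method used for comfort can be sketched as follows. This is a minimal illustration, not the production model: the keyword lists contain only the phrases quoted above, the rubric cut-offs in RUBRIC are hypothetical placeholders, and the sketch assumes the net signal is net mentions divided by the number of reviews.

```python
import re

POSITIVE = ["comfortable", "supportive", "good night's sleep"]
NEGATIVE = ["uncomfortable", "too firm", "too soft"]

# Hypothetical cut-offs as (minimum net signal %, score); placeholders only.
RUBRIC = [(40.0, 5.0), (20.0, 4.0), (-5.0, 3.0), (-20.0, 2.0)]


def count_mentions(text: str, phrases: list[str]) -> int:
    """Count whole-word matches, so 'comfortable' is not found inside 'uncomfortable'."""
    text = text.lower().replace("’", "'")
    return sum(len(re.findall(rf"\b{re.escape(p)}\b", text)) for p in phrases)


def net_signal_pct(reviews: list[str]) -> float:
    """Net positive mentions as a percentage of the review sample."""
    if not reviews:
        raise ValueError("no reviews: the category would be marked not scored")
    pos = sum(count_mentions(r, POSITIVE) for r in reviews)
    neg = sum(count_mentions(r, NEGATIVE) for r in reviews)
    return 100.0 * (pos - neg) / len(reviews)


def rubric_score(net_pct: float) -> float:
    """Map a net signal percentage onto the five-point scale."""
    for floor, score in RUBRIC:
        if net_pct >= floor:
            return score
    return 1.0


reviews = [
    "Really comfortable and supportive.",
    "Best good night's sleep in years.",
    "Too firm for me, quite uncomfortable.",
    "Arrived on time.",
]
net = net_signal_pct(reviews)
print(net, rubric_score(net))  # 25.0 4.0
```

The sample has three positive and two negative mentions across four reviews, giving a net signal of 25%. Whole-word matching matters because a naive substring search would count every "uncomfortable" as a positive hit.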
Bed frames
- Build quality. Sturdy construction; absence of creaks and wobble in real-world use. Keyword signals from owner reviews.
- Assembly experience. Whether the flat-pack went together easily, clarity of instructions, missing-parts incidents.
- Storage. For ottoman and divan-base frames: lift mechanism, hydraulic struts, usable storage volume. Marked “not scored” on frames with no storage.
- Style and finish. Appearance, fabric or upholstery quality, perceived value of the finished frame in the room.
- Mattress fit. Whether the buyer’s mattress sits cleanly inside the frame without gaps or sliding.
Headboards
- Build quality. Sturdy construction; rigid mounting; absence of wobble against the wall.
- Mounting and fit. Ease of attaching to a divan or fixing to the wall; correctness of supplied fittings; sizing for the bed.
- Cushioning. Padding depth and comfort for sitting up against the headboard.
- Style and finish. Appearance, fabric quality, perceived finish of the headboard in the room.
Bedding (duvets, toppers, throws, sets)
- Warmth. Whether the product delivers on its stated temperature promise: TOG ratings on duvets, season suitability on throws.
- Filling quality. Loft, plumpness, evenness of fill; resistance to clumping or flattening over time.
- Washability. Holds shape and colour through machine washing; shrinkage or fading issues.
- Materials. Fabric softness, breathability, perceived quality of cotton, microfibre, or down.
Kids beds
- Safety. Absence of pinch points, secure rails, no hazardous gaps or sharp edges.
- Sturdiness. Whether the frame stands up to jumping, climbing, and the general wear of children using the bed.
- Age fit. Size and design suited to the age range it was bought for.
- Style. Appeal to the child; visual fit with their bedroom.
Universal categories (every product type)
- Retailer guarantee. Derived from the published trial period, manufacturer warranty term, and return policy of the retailer that lists the product. The score is the average of three equally weighted sub-scores: trial length (5/5 at 100+ nights, 4/5 at 60-99, 3/5 at 30-59, 2/5 at 14-29, 1/5 below 14); warranty term (5/5 at 10+ years, 4/5 at 5-9, 3/5 at 2-4, 2/5 at 1 year, 1/5 below 1 year); and return cover (free comfort exchange scores highest, free returns next, paid collection mid-band, no returns lowest). The evidence panel lists the exact policy terms used.
- Customer sentiment. The raw mean star rating across every active owner review for the product, on the retailer’s native 1-to-5 scale, rounded to one decimal place. Requires at least twenty reviews; below that threshold the category is marked not scored and excluded from the composite. This category is deliberately separate from any keyword-driven category so the headline signal of “what buyers actually rated this product overall” sits alongside the more granular keyword-based categories.
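The two universal categories are simple enough to sketch directly. The trial and warranty bands below follow the thresholds listed above. The numeric values in RETURN_COVER_SCORE are assumptions, since only the ordering of return cover is published, and the function names and half-up rounding are likewise illustrative.

```python
from decimal import Decimal, ROUND_HALF_UP
from typing import Optional

# Assumed values: the methodology gives only the ordering of return cover.
RETURN_COVER_SCORE = {
    "free_comfort_exchange": 5,
    "free_returns": 4,
    "paid_collection": 3,
    "no_returns": 1,
}
MIN_REVIEWS = 20  # customer sentiment threshold stated in the methodology


def trial_score(nights: int) -> int:
    for floor, score in ((100, 5), (60, 4), (30, 3), (14, 2)):
        if nights >= floor:
            return score
    return 1


def warranty_score(years: float) -> int:
    for floor, score in ((10, 5), (5, 4), (2, 3), (1, 2)):
        if years >= floor:
            return score
    return 1


def retailer_guarantee(nights: int, years: float, cover: str) -> float:
    """Average of the three equally weighted sub-scores (returned unrounded)."""
    subs = (trial_score(nights), warranty_score(years), RETURN_COVER_SCORE[cover])
    return sum(subs) / 3


def customer_sentiment(stars: list[int]) -> Optional[float]:
    """Mean star rating to one decimal place, or None below the minimum sample."""
    if len(stars) < MIN_REVIEWS:
        return None  # marked not scored and excluded from the composite
    mean = Decimal(sum(stars)) / Decimal(len(stars))
    return float(mean.quantize(Decimal("0.1"), rounding=ROUND_HALF_UP))


print(retailer_guarantee(30, 2, "paid_collection"))  # 3.0
print(customer_sentiment([5] * 15 + [3] * 5))        # 4.5
```

A 30-night trial, a 2-year warranty, and paid collection each land in the middle band, so the guarantee score is 3.0. Nineteen five-star reviews would still return None for customer sentiment.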
The 0 to 5 scoring scale
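Every category is scored on a five-point scale. Rubric-scored categories use whole and half points only; customer sentiment, being a mean star rating, is reported to one decimal place.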
- 5 / 5 Excellent. Strongly net-positive reviewer commentary or best-in-class construction benchmark.
- 4 / 5 Good. Above-average net signal. Most buyers report a positive experience in this category.
- 3 / 5 Average. Meets the mainstream UK standard for the category. Net signal is roughly neutral.
- 2 / 5 Below average. Net-negative reviewer commentary or a construction type with below-average benchmark performance.
- 1 / 5 Poor. Strongly net-negative signal. A category the evidence says to be cautious about.
Where the review sample is too small to produce a reliable signal, the category is marked “not scored” and excluded from the composite mean. The evidence panel on the product page states the exact signal count and the threshold required to generate a score.
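The "not scored" gate and the half-point scale can be sketched together. This is an illustration under stated assumptions: the threshold of 10 signals in the usage lines is hypothetical, and CategoryResult, snap_to_half, and score_category are names invented for this example.

```python
import math
from dataclasses import dataclass
from typing import Optional


@dataclass
class CategoryResult:
    score: Optional[float]  # None means "not scored"
    signal_count: int       # shown in the evidence panel
    threshold: int          # signals required to generate a score


def snap_to_half(raw: float) -> float:
    """Round to the nearest half point; exact ties round up (an assumption)."""
    return math.floor(raw * 2 + 0.5) / 2


def score_category(raw: float, signal_count: int, threshold: int) -> CategoryResult:
    if signal_count < threshold:
        # Too little evidence: keep the counts for the evidence panel, no score.
        return CategoryResult(None, signal_count, threshold)
    return CategoryResult(snap_to_half(raw), signal_count, threshold)


print(score_category(3.74, signal_count=12, threshold=10).score)  # 3.5
print(score_category(4.80, signal_count=4, threshold=10).score)   # None
```

Carrying the signal count and threshold alongside the score means the evidence panel can show why a category was left out, not just that it was.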
Where the data comes from
Every score traces back to one or more of the following sources. We make no claims that exceed the evidence the model has.
- Customer review text (keyword analysis). Aggregated review text from the retailer’s own product listing. The model scans each review for a curated list of category-specific keywords. Positive and negative keyword counts across the full sample determine the net signal percentage that maps to the five-point rubric. Drives comfort, heat retention, value, and (where reviewer commentary is sufficient) edge support and motion isolation.
- Customer star ratings (aggregate). The raw 1-to-5 star rating on every active owner review for the product, averaged across the full sample and rounded to one decimal place. Drives the customer sentiment category. The minimum sample is twenty reviews; below that the category is marked not scored.
- Manufacturer specification. Construction type (pocket spring, hybrid, memory foam, latex, open-coil), declared materials, firmness label, and dimension data. Used as a fallback signal for edge support and motion isolation where keyword evidence is limited.
- Retailer policy data. Published trial nights, manufacturer warranty term, and return policy on file for the retailer that lists the product. Drives the retailer guarantee category. Where no published policy is on file for the retailer, the category is marked not scored.
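The dual-source logic for edge support can be sketched like this. The benchmark values and MIN_MENTIONS are hypothetical: the methodology states only that pocket spring and hybrid builds score higher than open-coil and memory-foam builds, so the numbers here just preserve that ordering.

```python
from typing import Optional

# Hypothetical benchmark values that respect the published ordering.
EDGE_SUPPORT_BENCHMARK = {
    "pocket_spring": 4.0,
    "hybrid": 4.0,
    "open_coil": 2.0,
    "memory_foam": 2.0,
}
MIN_MENTIONS = 10  # hypothetical sufficiency threshold for reviewer commentary


def edge_support(
    keyword_score: Optional[float], mentions: int, construction: str
) -> tuple[Optional[float], str]:
    """Return (score, method), preferring reviewer commentary where sufficient."""
    if keyword_score is not None and mentions >= MIN_MENTIONS:
        return keyword_score, "keyword signal"
    benchmark = EDGE_SUPPORT_BENCHMARK.get(construction)
    if benchmark is not None:
        return benchmark, "construction-type benchmark"
    return None, "not scored"


print(edge_support(4.5, 25, "memory_foam"))  # (4.5, 'keyword signal')
print(edge_support(4.5, 3, "memory_foam"))   # (2.0, 'construction-type benchmark')
```

Returning the method alongside the score mirrors the evidence panel, which states which of the two sources produced the figure.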
Tier badges (Home tested, Store tested, no badge)
A small minority of pages carry a tier badge that signals first-party editor exposure to the product. The badge sits next to the BedBoy Score and is independent of the score itself. A Home tested product and a no-badge product on the same page template are scored against the same rubric using the same evidence.
- Home tested. A named editor has bought the product, slept on it for at least 30 nights at home, and recorded structured notes against the seven categories. Their notes feed an editor commentary block on the product page but do not override the algorithmic score.
- Store tested. A named editor has handled the product in a UK retail showroom (lay-tested the mattress, opened the frame drawer, inspected the headboard fabric) and recorded structured showroom notes. Shorter exposure than Home tested, so the editor commentary block is briefer and explicitly labelled as showroom-only.
- No badge. The default state for the majority of products on the site. The page is scored from imported customer review signals, manufacturer specification, retailer policy data, and aggregated star-rating data only. No editor has slept on or handled the product. The score uses the same rubric as a badged product.
No badge equals no editorial overlay, not a lower score. The badge is a transparency mechanism, not a quality mechanism: it tells you how much first-party context sits behind the page, separately from what the data says about the product.
Conflict-of-interest disclosure
BedBoy is funded entirely by affiliate commission. When you click through one of our outbound links and buy a product, the retailer pays us a small percentage of the sale. Neither the existence of an affiliate relationship nor the level of the commission affects the BedBoy Score.
Brands cannot pay to appear on BedBoy. Brands cannot pay to be removed. Brands cannot vet or revise our scoring before it goes live. Where a page is paid placement (a sponsored post or advertorial), it is labelled “Sponsored” at the top and bottom and is kept out of the scored programme entirely.
For the full breakdown of how affiliate links work and the networks we use, see our affiliate disclaimer.
Refresh cadence
The keyword signal extraction that drives the BedBoy Score runs automatically on a nightly cycle across the full active catalogue. As fresh customer reviews land on the retailer sites we import from, the review pool grows and the signal counts update. A product that accumulates enough new reviews overnight to cross a scoring threshold will have an updated score the following morning.
Manufacturer specification data (construction type, materials, firmness label) is updated when we become aware of a change. Specification changes are less frequent than changes in review volume, but they have a direct effect on the construction-type fallback signals for edge support and motion isolation.
If a product is discontinued or withdrawn from the market we mark the page accordingly and stop refreshing the score.
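A nightly refresh along these lines could be sketched as follows. Everything here is hypothetical scaffolding: the Product fields, the nightly_refresh signature, and the toy rescore function stand in for whatever the real pipeline uses.

```python
from dataclasses import dataclass, field
from datetime import date
from typing import Callable, Optional


@dataclass
class Product:
    name: str
    reviews: list[str] = field(default_factory=list)
    score: Optional[float] = None
    discontinued: bool = False
    history: list[tuple[date, Optional[float]]] = field(default_factory=list)


def nightly_refresh(
    products: list[Product],
    new_reviews: dict[str, list[str]],
    rescore: Callable[[list[str]], Optional[float]],
    today: date,
) -> None:
    for p in products:
        if p.discontinued:
            continue  # discontinued pages are no longer refreshed
        p.reviews.extend(new_reviews.get(p.name, []))
        updated = rescore(p.reviews)
        if updated != p.score:
            p.score = updated
            p.history.append((today, updated))  # dated record of the change


# Toy rescore: a score appears once the pool reaches three reviews.
toy_rescore = lambda pool: 4.0 if len(pool) >= 3 else None

active = Product("Mattress A", reviews=["r1", "r2"])
retired = Product("Mattress B", reviews=["r1"], score=3.5, discontinued=True)
nightly_refresh(
    [active, retired],
    {"Mattress A": ["r3"], "Mattress B": ["r2"]},
    toy_rescore,
    date(2026, 5, 14),
)
print(active.score, retired.score)  # 4.0 3.5
```

The active product crosses the toy threshold overnight and gains a score with a dated history entry, while the discontinued product keeps its last score and ignores the new review.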
Corrections
Factual errors get corrected at the top of the page, with the date of the correction. If a score moves up or down following a refresh, the page records both the new score and the date of the change. We don’t silently rewrite history.
If you spot something that looks wrong, email [email protected] with a link to the page.
Frequently asked questions
What is the BedBoy Score?
Does BedBoy physically test every mattress on the site?
How often is each BedBoy Score refreshed?
Where does the review data come from?
How is the retailer guarantee category scored?
How is the customer sentiment category scored?
Does affiliate commission affect the BedBoy Score?
Why does a product show "not scored" for some categories?
Methodology changelog
- v1.3, 14 May 2026. Added explicit Tier badge taxonomy (Home tested / Store tested / no badge) and softened “Hands-On” phrasing to reflect that most pages on the site are scored from review and specification data only. Added Speakable schema across the rubric and scale blocks for AI-assistant readability. Added this changelog.
- v1.2, 12 May 2026. Customer sentiment minimum-sample threshold raised from ten reviews to twenty. Retailer guarantee scoring split into three equally-weighted sub-scores (trial length, warranty term, return cover) with published thresholds.
- v1.1, 22 April 2026. Heat-retention category added to the mattress rubric. Construction-type fallback formalised for edge support and motion isolation where keyword evidence is limited.
- v1.0, 2 April 2026. Initial published methodology: seven scoring categories, 0-to-5 five-point scale, nightly refresh cadence, conflict-of-interest disclosure.