
The Truth About PageSpeed Scores — What You Need to Know Before Measuring

You run PageSpeed Insights, hit refresh, and get a different score. Yesterday it was 88, today it's 85. You haven't changed a single line of code. This is completely normal. Before chasing higher PageSpeed scores, you need to understand how scoring works, why it fluctuates, and where the point of diminishing returns begins. Only then can you distinguish meaningful optimization from pointless obsession.

Same Site, Different Scores — Why ±3-5 Point Fluctuation Is Normal

Measure the same site three times consecutively and you'll see results like this:

| Run | Mobile | Desktop | LCP |
|-----|--------|---------|------|
| 1st | 88 | 97 | 3.4s |
| 2nd | 87 | 98 | 3.4s |
| 3rd | 87 | 98 | 3.4s |

Not a single line of code changed, yet the Mobile score bounces between 87 and 88. This is inherent to how Lighthouse works.

Why scores fluctuate:

Network variability: Network conditions between the test server and measurement server differ with every run.

Server response time: Even a ±50ms change in TTFB (Time to First Byte) shifts the score.

CDN cache state: The first measurement may hit a cold cache while subsequent ones benefit from a warm cache.

JavaScript execution time: The same code runs with microsecond-level differences depending on execution environment conditions.

Don't read into a 1-point difference. The gap between 87 and 88 falls within the measurement error margin. To judge meaningful change, look for at least a 5-point difference, or compare averages of 3-5 measurements taken at the same time of day.
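The averaging workflow can be scripted against the PageSpeed Insights API. A minimal sketch, assuming an API key exported as `PSI_API_KEY`, `jq` installed, and `https://example.com` as a placeholder URL:

```shell
#!/bin/sh
# Sketch: average the Mobile Performance score over three PSI runs.
# Assumptions: PSI_API_KEY is exported and jq is installed;
# https://example.com is a placeholder URL.

avg() {
  # Average whitespace-separated numbers from stdin, one decimal place.
  tr ' ' '\n' | awk 'NF { s += $1; n++ } END { if (n) printf "%.1f\n", s / n }'
}

psi_score() {
  # Fetch one Mobile Performance score (0-100) for the URL in $1.
  curl -s "https://www.googleapis.com/pagespeedonline/v5/runPagespeed?url=$1&strategy=mobile&key=$PSI_API_KEY" \
    | jq -r '.lighthouseResult.categories.performance.score * 100 | round'
}

if [ -n "${PSI_API_KEY:-}" ]; then
  scores=""
  for run in 1 2 3; do
    scores="$scores $(psi_score "https://example.com")"
  done
  echo "Average of 3 runs: $(echo "$scores" | avg)"
fi
```

Running the three measurements back to back keeps network and cache conditions as comparable as possible.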

Mobile vs Desktop — Why the 10+ Point Gap Exists

"Desktop scores 97, but Mobile is only 88 — same site, why?" This is one of the most common questions. The answer lies in how Lighthouse simulates each environment.

| Setting | Mobile | Desktop |
|---------|--------|---------|
| Simulated device | Moto G Power | Standard desktop |
| CPU throttling | 4x slowdown | No throttling |
| Network throttling | 4G simulation | No throttling |
| Viewport | 412 x 823 | 1350 x 940 |

Mobile testing applies 4x CPU slowdown and simulated 4G network constraints, which is significantly slower than a real modern smartphone. The Moto G Power is a budget device released in early 2020.

Why such a slow device? Google's intent is to ensure "acceptable experiences even on the slowest user devices." Budget Android phones represent a significant global user base. However, in markets with high flagship device adoption (like South Korea with iPhones and Galaxy flagships), real user experience is substantially better than what the Mobile score suggests.

Lighthouse Scoring Weights — Where Your Optimization Effort Matters Most

The PageSpeed Performance score is a weighted average of five metrics. Understanding each metric's contribution tells you exactly where to focus optimization effort.

| Metric | Weight | What It Measures | Good Threshold |
|--------|--------|------------------|----------------|
| TBT (Total Blocking Time) | 30% | Main thread blocking duration | < 200ms |
| LCP (Largest Contentful Paint) | 25% | Largest content element render time | < 2.5s |
| CLS (Cumulative Layout Shift) | 25% | Visual stability / layout shifts | < 0.1 |
| FCP (First Contentful Paint) | 10% | First content render time | < 1.8s |
| SI (Speed Index) | 10% | Visual completeness speed | < 3.4s |

Key takeaways from the weights:

TBT dominates at 30%. JavaScript execution time has the single largest impact on the score. However, a framework's runtime JS is non-negotiable (the Next.js runtime here is ~14KB), so room for improvement is limited for framework users.

LCP + CLS = 50%. Nail these two and you've secured half the total score. Image optimization and layout stability are the highest-ROI optimizations.

FCP + SI = 20%. With relatively low weight, improvements to these metrics produce smaller score gains.
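Lighthouse first maps each raw metric to a 0-100 subscore via a log-normal curve, then takes the weighted average of those subscores. A sketch of that final weighting step, using hypothetical subscores (the per-metric scoring curves are not reproduced here):

```shell
#!/bin/sh
# Sketch of Lighthouse 12's final weighting step. Each metric is first
# mapped to a 0-100 subscore by a log-normal curve (not shown here);
# the Performance score is the weighted average of those subscores.
# The subscores below are hypothetical example values.

weighted_score() {
  # Args: TBT LCP CLS FCP SI subscores, each 0-100.
  awk -v tbt="$1" -v lcp="$2" -v cls="$3" -v fcp="$4" -v si="$5" 'BEGIN {
    printf "%.1f\n", 0.30 * tbt + 0.25 * lcp + 0.25 * cls + 0.10 * fcp + 0.10 * si
  }'
}

weighted_score 80 90 100 95 85   # prints 89.5
```

The arithmetic makes the takeaways concrete: a 10-point TBT subscore gain moves the total by 3 points, while the same gain on SI moves it by only 1.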

The 90 vs 100 Divide — Diminishing Returns in PageSpeed Optimization

PageSpeed optimization follows a steep diminishing returns curve. The higher your score climbs, the more effort each additional point demands.

| Score Range | Difficulty | Typical Work Required | User-Perceived Impact |
|-------------|------------|-----------------------|-----------------------|
| 0–50 | Easy | Image compression, basic config | Very high |
| 50–80 | Moderate | CLS fixes, font loading, CSS optimization | High |
| 80–90 | Hard | LCP tuning, JS optimization, preload strategy | Moderate |
| 90–100 | Very hard | Micro-optimizations, architectural changes | Negligible |

Going from 38 to 88 required just four optimization phases: images, CLS, fonts, and LCP. But pushing from 88 to 95 requires touching fundamentally hard-to-control areas — JavaScript runtime size, third-party scripts, and server response time.

100 is not a realistic target. Google's own sites (google.com, youtube.com) don't hit Mobile 100. Achieving a perfect score requires a nearly JavaScript-free static HTML page. 90+ is "excellent" and 80+ is "good enough" for any production web application.

What Matters More Than the Score — Lab Data vs Field Data

The PageSpeed score is Lab Data — a synthetic simulation. Actual user experience is measured through Field Data, which comes from the Chrome User Experience Report (CrUX).

| Attribute | Lab Data | Field Data (CrUX) |
|-----------|----------|-------------------|
| Data source | Lighthouse simulation | Real Chrome users |
| Time frame | Instant (right now) | Rolling 28-day aggregate |
| Devices | Fixed (Moto G) | Diverse real user devices |
| SEO impact | Indirect | Core Web Vitals ranking signal |

What affects Google search rankings is not the Lab Data score but Field Data Core Web Vitals. The "Discover what your real users are experiencing" section at the top of PageSpeed results shows this Field Data.

No Field Data available? Low-traffic sites display "The Chrome User Experience Report does not have sufficient real-world speed data for this page." In this case, use Lab Data as a reference but don't obsess over the score. Once enough traffic accumulates and Field Data appears, shift your optimization focus to Core Web Vitals metrics.
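You can also query CrUX directly through its API. A hedged sketch, assuming a Google Cloud API key exported as `CRUX_API_KEY`, `jq` installed, and `example.com` as a placeholder; the API typically answers 404 when CrUX has no data for the URL, which corresponds to the low-traffic case above:

```shell
#!/bin/sh
# Sketch: query the CrUX API for a URL's field-data LCP p75.
# Assumptions: CRUX_API_KEY is exported (a Google Cloud API key, like
# the PSI one) and jq is installed; example.com is a placeholder.
# A 404 response typically means CrUX has no data for the URL.

parse_lcp_p75() {
  # Pull the 75th-percentile LCP out of a CrUX API response on stdin.
  jq -r '.record.metrics.largest_contentful_paint.percentiles.p75'
}

if [ -n "${CRUX_API_KEY:-}" ]; then
  curl -s "https://chromeuserexperience.googleapis.com/v1/records:queryRecord?key=$CRUX_API_KEY" \
    -H 'Content-Type: application/json' \
    -d '{"url": "https://example.com/", "formFactor": "PHONE"}' \
    | parse_lcp_p75
fi
```

The p75 value is what Core Web Vitals assessments are judged against, so it is the number worth tracking once Field Data exists.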

Measurement Automation — API-Driven Performance Tracking

Manually visiting pagespeed.web.dev every time is tedious and inconsistent. The PageSpeed Insights API enables automated measurement and recording.

# PageSpeed Insights API call (free)
# Requires an API key from Google Cloud Console

curl "https://www.googleapis.com/pagespeedonline/v5/runPagespeed\
?url=https://example.com\
&strategy=mobile\
&key=YOUR_API_KEY"

# Extract key metrics from the response
# .lighthouseResult.categories.performance.score  → Score
# .lighthouseResult.audits.largest-contentful-paint → LCP
# .lighthouseResult.audits.cumulative-layout-shift  → CLS
# .lighthouseResult.audits.total-blocking-time      → TBT
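With `jq` installed, the paths above can be pulled out of a saved response. A sketch, assuming the JSON was written to a placeholder file `psi.json`; note that hyphenated audit names need jq's bracket syntax:

```shell
#!/bin/sh
# Extract the key metrics from a saved PSI response using jq.
# Assumption: the API response was saved to psi.json (placeholder name).

extract_metrics() {
  jq -r '
    "Score: \(.lighthouseResult.categories.performance.score * 100 | round)",
    "LCP:   \(.lighthouseResult.audits["largest-contentful-paint"].displayValue)",
    "CLS:   \(.lighthouseResult.audits["cumulative-layout-shift"].displayValue)",
    "TBT:   \(.lighthouseResult.audits["total-blocking-time"].displayValue)"
  ' "$1"
}

if [ -f psi.json ]; then
  extract_metrics psi.json
fi
```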

Add this to a cron job or CI/CD pipeline for automatic post-deployment performance checks. You can even set up alerts when scores drop below a defined threshold.
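A threshold gate for CI might look like the following sketch, again assuming `PSI_API_KEY` and `jq`; the URL and the threshold of 80 are placeholders:

```shell
#!/bin/sh
# Sketch of a CI gate: fail the pipeline when the Mobile Performance
# score drops below a threshold. Assumptions: PSI_API_KEY is exported
# and jq is installed; the URL and the threshold of 80 are placeholders.

check_threshold() {
  # Args: score threshold. Returns 1 when score < threshold.
  if [ "$1" -lt "$2" ]; then
    echo "FAIL: score $1 is below threshold $2"
    return 1
  fi
  echo "OK: score $1 meets threshold $2"
}

if [ -n "${PSI_API_KEY:-}" ]; then
  score=$(curl -s "https://www.googleapis.com/pagespeedonline/v5/runPagespeed?url=https://example.com&strategy=mobile&key=$PSI_API_KEY" \
    | jq -r '.lighthouseResult.categories.performance.score * 100 | round')
  check_threshold "$score" 80 || exit 1
fi
```

Given the ±3-5 point noise discussed earlier, set the threshold a few points below your typical score so normal fluctuation does not break builds.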

Measurement best practices:

Measure at consistent times. Server load varies throughout the day — early morning and afternoon scores can differ.

Average 3-5 runs. A single measurement is unreliable. Averages provide far more trustworthy data.

Compare before and after changes. Relative differences matter more than absolute scores.

Optimize for Mobile first. If Mobile scores are good, Desktop will naturally follow.

Summary

PageSpeed scores fluctuate ±3-5 points naturally — don't react to 1-point differences. Meaningful changes require at least a 5-point gap or averaged measurements.

Mobile simulates a 2020 budget phone on 4G — the Moto G Power with 4x CPU throttling. Real users on modern devices experience significantly better performance than Mobile scores suggest.

TBT (30%), LCP (25%), and CLS (25%) control 80% of the score. Focus optimization effort on JavaScript execution time, largest content paint, and layout stability for maximum impact.

After 90, diminishing returns hit hard. Going from 90 to 100 requires disproportionate effort with negligible user-perceived improvement. Even Google's own sites don't achieve Mobile 100.

Field Data (CrUX) matters more than Lab Data for SEO. Google search rankings use Core Web Vitals from real user data, not Lighthouse simulation scores. Prioritize Field Data metrics once your site has sufficient traffic.

Automate measurement with the PageSpeed Insights API. Use 3-5 run averages, measure at consistent times, and focus on before/after comparisons rather than absolute numbers.

These figures are based on Lighthouse 12. Weights and measurement methods may change with future versions. Measured data represents specific site results and may differ from general benchmarks.