This website consistently scores near-perfect on Lighthouse desktop: performance in the high 90s, a perfect 100 on Best Practices. Those numbers did not arrive by accident, and they are not the result of optimizing a blank HTML page. They reflect a real production site with a custom font, a Google Analytics integration, a streaming AI assistant, dozens of images served in multiple formats, and server-side rendering powering every page.

Every optimization described in this article is live on this site right now. This is not a checklist of things you could do. It is the full account of exactly what we built, why each decision was made, and how they work together as a system.

What Lighthouse Actually Measures (and Why Near-Perfect Is Hard to Earn)

Lighthouse is not one test. It is a set of independently scored dimensions: Performance, Accessibility, Best Practices, and SEO. Each produces a 0-100 score, and each draws from its own set of signals. Doing well in one does not help you in another, and regressions in any of them appear independently.

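The Performance score itself is a weighted blend of lab metrics: First Contentful Paint, Speed Index, Largest Contentful Paint, Total Blocking Time, and Cumulative Layout Shift. This article tracks the five signals below. Three of them (LCP, CLS, and INP) are Core Web Vitals, the set of signals Google uses to evaluate real-world page experience. INP and TTFB are not scored by Lighthouse directly: Total Blocking Time is the lab proxy for INP, and TTFB shows up through its effect on FCP and LCP.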

LCP (Largest Contentful Paint). How long until the largest visible element renders. For most pages, that is the hero image or the primary heading. Target: under 2.5 seconds.
CLS (Cumulative Layout Shift). The cumulative score for unexpected visual movement as the page loads. An image that loads and pushes content down, a font that swaps and reflows text. Target: under 0.1.
INP (Interaction to Next Paint). The responsiveness of the page to user input. How long until the browser visually acknowledges a click, tap, or keypress. Target: under 200 milliseconds.
FCP (First Contentful Paint). How long until the browser renders any content at all. A proxy for how fast the server responds and the browser starts parsing. Target: under 1.8 seconds.
TTFB (Time to First Byte). The time from request to the first byte of the HTML response. Captures server latency, network distance, and CDN effectiveness.
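
The targets above can be written down as a small lookup. This is a minimal sketch, not code from the site: the names GOOD_THRESHOLDS, isGood, and failingMetrics are illustrative, and TTFB is left out because no target is stated for it above.

```typescript
// Upper bounds for a "good" result, taken from the targets listed above.
// LCP, INP and FCP are in milliseconds; CLS is a unitless score.
type MetricName = "LCP" | "CLS" | "INP" | "FCP";

const GOOD_THRESHOLDS: Record<MetricName, number> = {
  LCP: 2500,
  CLS: 0.1,
  INP: 200,
  FCP: 1800,
};

// True when the measured value is under the target for that metric.
function isGood(metric: MetricName, value: number): boolean {
  return value < GOOD_THRESHOLDS[metric];
}

// Lists the metrics in a measurement that miss their target.
function failingMetrics(
  measured: Partial<Record<MetricName, number>>,
): MetricName[] {
  return (Object.keys(measured) as MetricName[]).filter((name) => {
    const value = measured[name];
    return value !== undefined && !isGood(name, value);
  });
}
```

A check like this could run in CI against a lab report, so a regression in any one metric is flagged on its own rather than hidden inside an aggregate score.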

The 100 Best Practices score comes from a separate set of audits covering security and hygiene: HTTPS on all resources, no deprecated browser APIs, no mixed content, images served at correct aspect ratios, no console errors in production. These are audited independently of performance and need their own attention.

Getting all of these into the green simultaneously on a real site, with third-party scripts, dynamic images, lazy-loaded components, and a streaming AI widget, requires every layer of the stack to be built correctly. A single careless image, an unmanaged third-party script, or a missing cache header is enough to pull a metric out of range.

Core Web Vitals: Score Map

LCP (Largest Contentful Paint). Target: under 2.5 s.
Priority image loading (fetchPriority="high")
WOFF2 font preload in document head
SSR: content in first HTML byte

CLS (Cumulative Layout Shift). Target: under 0.1. Good (zero shift).
Explicit width/height on every image
picture element holds slot before format resolves
Inline theme script prevents color-scheme flash

INP (Interaction to Next Paint). Target: under 200 ms.
Heavy components lazy-loaded off critical path
Analytics and toasts idle-mounted
Manual chunk splitting prevents main-thread jank

FCP (First Contentful Paint). Target: under 1.8 s.
SSR eliminates JS-hydration gap
Async gtag.js never blocks paint
Brotli-compressed assets reduce transfer size

The diagram above maps each metric to the specific engineering decisions that most directly affect it. The sections that follow walk through each of those decisions in detail.

LCP: Making the Largest Element Arrive Fast

LCP is the metric most directly tied to perceived load speed. When a visitor opens a page and something meaningful appears quickly, the experience feels fast. When it does not, nothing else about the site matters. Every decision in this section is oriented around getting that first large element onto the screen as fast as possible.

Server-side rendering eliminates the hydration gap

The site runs on TanStack Start with Nitro, which serves fully rendered HTML for every page. The browser does not receive an empty shell and wait for JavaScript to hydrate the content before anything becomes visible. The content is in the HTML response, immediately parseable and paintable. This directly improves both FCP and LCP because the browser can start rendering content the moment the first bytes arrive.

Priority images skip the queue

Every hero image in the codebase accepts a priority prop. When set, the image renders with fetchPriority="high" and the fade-in animation is skipped entirely. This tells the browser to fetch the image immediately, ahead of other resources, and to display it as soon as it arrives rather than holding it in a CSS transition. The image that will become the LCP element is never waiting behind lower-priority network requests.
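
As a rough sketch of how a priority prop can translate into image attributes: the function and field names below are hypothetical, not the component's real API. The fetchPriority and fade behavior follow the description above; pairing priority with loading="eager" is an assumption.

```typescript
// Attributes an image component might derive from a `priority` prop.
interface ImageLoadingAttrs {
  fetchPriority: "high" | "auto";
  loading: "eager" | "lazy";
  fadeIn: boolean; // whether the CSS fade-in transition is applied
}

function loadingAttrsFor(priority: boolean): ImageLoadingAttrs {
  if (priority) {
    // Likely LCP element: fetch first, show as soon as bytes arrive.
    return { fetchPriority: "high", loading: "eager", fadeIn: false };
  }
  // Everything else keeps default priority and may fade in.
  return { fetchPriority: "auto", loading: "lazy", fadeIn: true };
}
```

Keeping the decision in one function means a hero image cannot accidentally get high priority and a fade-in at the same time.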

The font is preloaded in the document head

The site uses a single variable font in WOFF2 format. A rel="preload" as="font" crossOrigin="anonymous" link is injected into the document <head> via TanStack Router's head() API. This starts the font download in parallel with the HTML parse, before the browser has encountered any CSS that references the font. Without this preload, the browser discovers the font reference only when it processes the stylesheet, which can be hundreds of milliseconds later. With it, the font is typically ready before it is needed.
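
The preload described above boils down to one link descriptor. The sketch below builds it as plain data; the HeadLink type, the helper name, and the font path in the comment are assumptions for illustration, and the type="font/woff2" hint is an addition that is not mentioned above.

```typescript
// Shape of a <link> entry as it might be returned from a route's head().
interface HeadLink {
  rel: string;
  href: string;
  as?: string;
  type?: string;
  crossOrigin?: "anonymous" | "use-credentials";
}

// Builds the preload descriptor for a WOFF2 font file.
// crossOrigin is required for font preloads, even for same-origin files,
// otherwise the browser fetches the font a second time.
function fontPreloadLink(href: string): HeadLink {
  return {
    rel: "preload",
    as: "font",
    type: "font/woff2",
    href,
    crossOrigin: "anonymous",
  };
}

// In a route definition this could be used roughly like:
//   head: () => ({ links: [fontPreloadLink("/fonts/site-variable.woff2")] })
// where the font path is a placeholder.
```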

Third-party scripts never block the first paint

Google Analytics loads via an async script tag, so the browser continues parsing and rendering without waiting for it. A preconnect hint warms the connection to googletagmanager.com in the background, and dns-prefetch hints resolve the hostnames for Firestore and the xAI API used by the Ask widget. These hints fire as the document loads, so the DNS lookup (and, for the preconnect, the TCP and TLS handshake) is already complete by the time the scripts actually need those connections.

Non-visual components like Analytics, the toast system, and the app version monitor are wrapped in IdleMount, a component that uses requestIdleCallback to defer rendering until after the browser has completed its first paint and is no longer busy with critical work. These components add nothing to the critical rendering path because they are not in the DOM while the browser is producing its first paint.

CLS: Zero Layout Shift by Design

Cumulative Layout Shift is one of the most counterintuitive metrics because it is invisible until something goes wrong. A well-built page has a CLS of exactly zero: nothing moves, nothing jumps, every element occupies the correct space from the moment the page starts painting. Achieving CLS of zero is not a tuning exercise. It is an architectural discipline.

Explicit dimensions on every image

Every <img> element in the codebase carries explicit width and height attributes. This allows the browser to calculate and reserve the correct amount of space for the image before the image data has arrived. Without these attributes, the browser renders a zero-height placeholder, then expands it when the image loads, pushing all surrounding content downward. That movement is directly measured as CLS.

The picture element as a format-switch guard

Images are served using a custom PictureFadeImage component built on the HTML <picture> element. The component offers a WebP <source> with a PNG or JPEG fallback. The browser selects the correct format at parse time and commits to the reserved dimensions before any bytes are downloaded. There is no JavaScript involved in format selection, no runtime decision that could delay the layout reservation, and no reflow when the format resolves.
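
To make the markup concrete, here is a sketch that renders the kind of picture element described above as a string. It is not the PictureFadeImage source: renderPicture, webpSibling, and the option names are illustrative, and it assumes the WebP file sits next to the original with the same base name.

```typescript
interface PictureOptions {
  src: string; // PNG or JPEG fallback path
  alt: string;
  width: number; // intrinsic width in pixels
  height: number; // intrinsic height in pixels
  priority?: boolean;
}

// Maps /img/photo.png or /img/photo.jpg to /img/photo.webp.
function webpSibling(src: string): string {
  return src.replace(/\.(png|jpe?g)$/i, ".webp");
}

// Escapes a value for use inside a double-quoted HTML attribute.
function escapeAttr(value: string): string {
  return value
    .replace(/&/g, "&amp;")
    .replace(/"/g, "&quot;")
    .replace(/</g, "&lt;");
}

// The browser picks the <source> it supports; width and height on the
// <img> let it reserve the slot before any image bytes arrive.
function renderPicture(options: PictureOptions): string {
  const { src, alt, width, height, priority = false } = options;
  const loading = priority ? "eager" : "lazy";
  const priorityAttr = priority ? ' fetchpriority="high"' : "";
  return [
    "<picture>",
    `  <source srcset="${escapeAttr(webpSibling(src))}" type="image/webp">`,
    `  <img src="${escapeAttr(src)}" alt="${escapeAttr(alt)}" ` +
      `width="${width}" height="${height}" loading="${loading}"${priorityAttr}>`,
    "</picture>",
  ].join("\n");
}
```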

Skeleton placeholders for async-loaded content

Article diagrams are lazy-loaded React components wrapped in Suspense. While the diagram component is downloading, a DiagramSkeleton placeholder occupies its space in the layout. The skeleton uses the same minimum height as the loaded diagram, so when the component mounts there is no jump. The space is already reserved.

No flash of unstyled content

The site supports light and dark mode. Without careful handling, the theme can change between the server-rendered HTML and the hydrated React tree, causing a visible color-scheme jump. A small inline script runs before the rest of the document parses. It reads the user's stored preference, sets data-js-reveal on the <html> element, and establishes the correct theme class immediately. By the time the browser renders anything, the theme is already correct. There is no layout shift and no color flash.
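
The logic of such an inline script can be sketched as two small functions. This is an illustration under assumptions: the class names "light" and "dark", the storage key in the usage comment, and the fallback to the system preference are not confirmed above. Only the data-js-reveal attribute and the idea of setting the theme class early come from the description.

```typescript
type Theme = "light" | "dark";

// Stored preference wins; otherwise fall back to the system setting.
function resolveTheme(stored: string | null, systemPrefersDark: boolean): Theme {
  if (stored === "light" || stored === "dark") return stored;
  return systemPrefersDark ? "dark" : "light";
}

// Minimal view of the <html> element, so the logic runs without a DOM.
interface RootLike {
  classList: { add(name: string): void; remove(name: string): void };
  setAttribute(name: string, value: string): void;
}

// Applies the theme class and marks the document as script-initialized.
function applyTheme(root: RootLike, theme: Theme): void {
  root.classList.remove("light");
  root.classList.remove("dark");
  root.classList.add(theme);
  root.setAttribute("data-js-reveal", "");
}

// In the browser, the inline script would call something like:
//   applyTheme(
//     document.documentElement,
//     resolveTheme(
//       localStorage.getItem("theme"), // hypothetical storage key
//       matchMedia("(prefers-color-scheme: dark)").matches,
//     ),
//   );
```

Because the script runs before first paint, it has to stay tiny and synchronous. Anything heavier belongs in the hydrated app.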

CLS Prevention: Source vs. Fix

Image layout shift. Images expand on load, pushing content down.
Structural fix: explicit width/height reserves the slot before bytes arrive; picture element commits to dimensions at parse time.

Async content shift. Lazy components mount and push surrounding content.
Placeholder fix: Suspense skeletons hold the exact space before the component loads; no movement when it mounts.

Style / theme shift. Late-loading styles or theme changes cause color-scheme reflow.
Inline script fix: inline theme init script sets the correct class before hydration; Tailwind JIT emits no runtime styles.

The diagram above shows the three categories of CLS sources and the specific countermeasure for each. Every one of these techniques is structural: the protection is built into the component and layout system, not applied after the fact.

JavaScript Strategy: Ship Less, Load the Rest Later

JavaScript is the most expensive resource a browser processes. It blocks the main thread during parse, compilation, and execution. Every kilobyte of JavaScript shipped on the critical path is a direct cost paid on every page load by every visitor. The fastest JavaScript is the JavaScript that does not run during the initial load.

Manual chunk splitting in Vite

The Vite build configuration defines manual chunk boundaries for the dependencies that would otherwise be merged into a single large file. React and react-dom live in their own chunk. TanStack Form, lucide-react icons, and the Sonner toast library each get their own chunk. The split is deliberate: route-level code splitting only works when shared dependencies do not pull everything into the initial bundle. Separating React into its own chunk also prevents a race condition where Radix UI or TanStack Router would otherwise trigger React to load multiple times from different chunk boundaries.
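
A chunk map like this is usually expressed as a function passed to Rollup's output.manualChunks option. The sketch below is an approximation, not the site's config: the chunk names and the exact package paths matched are assumptions.

```typescript
// Maps a module id (an absolute, forward-slash path) to a chunk name.
// Returning undefined leaves the decision to the bundler's defaults.
function chunkFor(id: string): string | undefined {
  if (!id.includes("/node_modules/")) return undefined;

  // Match the package directory exactly so "lucide-react" or
  // "@tanstack/react-form" are not mistaken for React itself.
  if (/\/node_modules\/(react|react-dom)\//.test(id)) return "react";
  if (/\/node_modules\/@tanstack\/[^/]*form[^/]*\//.test(id)) return "tanstack-form";
  if (id.includes("/node_modules/lucide-react/")) return "icons";
  if (id.includes("/node_modules/sonner/")) return "sonner";
  return undefined;
}

// In vite.config.ts this would be wired up roughly as:
//   build: { rollupOptions: { output: { manualChunks: chunkFor } } }
```

Matching on the full package directory rather than a substring is the design point here. A loose check such as id.includes("react") would pull half of node_modules into the React chunk.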

lazy() and Suspense for non-critical components

Several of the heaviest components in the app are never needed for the initial paint. They are loaded lazily using React's built-in lazy() function and wrapped in Suspense boundaries:

LazyAskAssistant. The entire AI chat widget, including its TanStack AI dependencies, is excluded from the initial bundle.
LazyToaster. The Sonner toast notification system is not needed until a notification fires.
LazyAppVersionMonitor. The component that polls for new deployments and prompts users to refresh runs entirely in the background.
LazyAnalytics. The GA4 Analytics component and all related event tracking code is excluded from the server-side bundle entirely and loads only in production.
22+ ArticleDiagram components. Every diagram in the article system is a separate lazy-loaded module. A visitor reading an article with no diagrams downloads zero diagram code.

Idle mounting for background work

The IdleMount component wraps the entire group of non-visual background components (Analytics, the toast system, the version monitor). It uses requestIdleCallback with a setTimeout fallback for browsers that do not support it. These components do not mount until the browser reports that the main thread has spare capacity. They stay off the critical path, have no effect on FCP or LCP, and never compete with the user for paint time.
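
The scheduling core of a component like IdleMount can be sketched without React. The function below is an illustration, not the component's source: the scheduler is passed in as a parameter so the logic can be exercised outside a browser, and the 2000 ms timeout and 1 ms fallback delay are assumed values.

```typescript
// The two browser functions the scheduler relies on.
interface IdleScheduler {
  requestIdleCallback?: (cb: () => void, opts?: { timeout: number }) => unknown;
  setTimeout: (cb: () => void, ms: number) => unknown;
}

// Runs `task` when the main thread is idle, or shortly afterwards via a
// timer where requestIdleCallback is unavailable. Returns which path ran.
function runWhenIdle(
  task: () => void,
  env: IdleScheduler,
  timeoutMs: number = 2000,
): "idle" | "timeout" {
  if (typeof env.requestIdleCallback === "function") {
    // `timeout` caps the wait so the task still runs on a busy page.
    env.requestIdleCallback(task, { timeout: timeoutMs });
    return "idle";
  }
  env.setTimeout(task, 1);
  return "timeout";
}

// In the browser this would be called as runWhenIdle(mount, window),
// typically from an effect that flips a "ready" flag to render children.
```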

Intersection Observer for below-fold features

The Ask AI section mounts its LazyAskChat component only when the section enters the viewport, detected by an Intersection Observer with a 280px root margin. A visitor who never scrolls to the Ask section never downloads the AI chat code at all. The observer disconnects after the first intersection, so it does not remain active on the page after the component has loaded.
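
A one-shot observer along these lines might look like the sketch below. It is not the site's implementation: the observer constructor is passed in so the logic can run without a DOM, and the names observeOnce, EntryLike, and ObserverLike are illustrative. The 280px root margin and the disconnect after the first intersection follow the description above.

```typescript
interface EntryLike {
  isIntersecting: boolean;
}

interface ObserverLike {
  observe(target: unknown): void;
  disconnect(): void;
}

type ObserverCtor = new (
  callback: (entries: EntryLike[]) => void,
  options: { rootMargin: string },
) => ObserverLike;

// Calls `onVisible` the first time `target` comes within `rootMargin`
// of the viewport, then disconnects so the observer does not linger.
function observeOnce(
  target: unknown,
  onVisible: () => void,
  Observer: ObserverCtor,
  rootMargin: string = "280px",
): ObserverLike {
  let fired = false;
  const observer: ObserverLike = new Observer(
    (entries) => {
      if (fired || !entries.some((entry) => entry.isIntersecting)) return;
      fired = true;
      observer.disconnect();
      onVisible();
    },
    { rootMargin },
  );
  observer.observe(target);
  return observer;
}

// In the browser: observeOnce(sectionElement, loadChat, IntersectionObserver)
```

The root margin starts the download slightly before the section scrolls into view, so the chunk has a head start on the visitor.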

JavaScript Loading Strategy

Tier 1: Render-Critical

Always present. Never deferred. Determines FCP and LCP.

SSR-rendered HTML
Font WOFF2 preload
Inline theme init script
Critical CSS chunk

Tier 2: On-Demand

Loads when the user needs it: on scroll, on interaction, or on route change.

LazyAskAssistant (on intersection)
22+ ArticleDiagram components (on render)
Route-level code chunks (on navigation)

Tier 3: Idle

Loads only when the browser reports spare main-thread capacity via requestIdleCallback.

LazyAnalytics (production only)
LazyToaster (notification system)
LazyAppVersionMonitor (background polling)

The diagram above shows how JavaScript is categorized into three loading tiers. Each tier has a clear rule: render-critical code is always present, on-demand code loads when needed, and idle code loads only when the browser has nothing more important to do.

The Asset Pipeline: Images, WebP, and Build-Time Optimization

Images are the most common source of performance regressions on content sites. They are large, they require format negotiation, and they are often loaded at the wrong time. Every image on this site goes through a deliberate pipeline before it reaches the browser.

Build-time WebP generation

A prebuild script processes every PNG and JPEG in the public/ directory and generates a WebP sibling alongside each one. Quality levels are calibrated per source format: 92 for PNG-sourced WebP, 86 for JPEG-sourced WebP, and 93 for logo PNGs where crispness matters more than compression. The script skips OG images, which must stay as PNG for social platform compatibility. This runs automatically as part of npm run build and uses a hash-based cache so only changed files are reprocessed.
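
The quality rules above can be captured in a single function. This is a sketch of the selection logic only, not the prebuild script. It assumes logos live under /logos/ and OG images under /og-image/, and it returns null for files that should not be converted.

```typescript
// Returns the WebP quality for a file in public/, or null to skip it.
function webpQualityFor(publicPath: string): number | null {
  const path = publicPath.toLowerCase();

  // OG images stay PNG for social platform compatibility.
  if (path.startsWith("/og-image/")) return null;

  if (path.endsWith(".png")) {
    // Logos get a slightly higher quality to keep edges crisp.
    return path.startsWith("/logos/") ? 93 : 92;
  }
  if (path.endsWith(".jpg") || path.endsWith(".jpeg")) return 86;

  // Anything else (SVG, existing WebP, and so on) is left untouched.
  return null;
}
```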

Serving images with the picture element

Every content image uses the PictureFadeImage component, which renders a <picture> element with a WebP <source> and a PNG or JPEG <img> fallback. The browser selects the best format at parse time with no JavaScript. WebP images are typically 25 to 35 percent smaller than their PNG equivalents at equivalent visual quality, which translates directly into faster downloads and a lower LCP time.

Preloading above-fold images

The homepage opens with a scrolling marquee of client logos. The first group of logos is visible immediately, making them LCP candidates. The root layout injects a rel="preload" as="image" type="image/webp" link for each of those first logos. The browser starts fetching them while still parsing the document head, before it has even encountered the marquee component in the body. On high-resolution displays, retina variants are preloaded instead.

Selective lazy loading

Above-fold logos use loading="eager" to prevent the browser from deferring them. Every other image uses loading="lazy", which prevents the browser from fetching images that are not yet in the viewport. On a long news article or case study page, this can mean dozens of images are never fetched until the visitor reaches them. No wasted bandwidth, no unnecessary competition for the network during initial load.

Asset hashing and immutable cache headers

Every JavaScript and CSS file produced by Vite has a content hash in its filename, for example /assets/index.a4f2c81b.js. The server sets Cache-Control: public, max-age=31536000, immutable on all assets under /assets/. Browsers cache these files for one full year and never revalidate them. Because the hash changes any time the file changes, cache invalidation is automatic and correct.

Brotli and Gzip compression

The Nitro build configuration enables dual compression: compressPublicAssets: { gzip: true, brotli: true }. Static assets are pre-compressed at build time and stored as .gz and .br variants alongside the originals. The server selects the best format based on the request's Accept-Encoding header. Brotli typically achieves 15 to 25 percent better compression than Gzip on JavaScript files.
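
Nitro handles the negotiation itself. The sketch below only illustrates the idea of choosing a pre-compressed variant from the Accept-Encoding header. It is simplified: it prefers Brotli whenever the client accepts it, ignores relative q-value ordering, and does not handle the * wildcard.

```typescript
type PrecompressedEncoding = "br" | "gzip";

// Picks the pre-compressed variant to serve, or null for the original file.
function pickEncoding(
  acceptEncoding: string | undefined,
): PrecompressedEncoding | null {
  const accepted = new Set<string>();
  for (const part of (acceptEncoding ?? "").split(",")) {
    const [name, ...params] = part
      .split(";")
      .map((piece) => piece.trim().toLowerCase());
    const qParam = params.find((param) => param.startsWith("q="));
    const weight = qParam === undefined ? 1 : Number(qParam.slice(2));
    // q=0 means "explicitly not acceptable".
    if (name && weight > 0) accepted.add(name);
  }
  if (accepted.has("br")) return "br";
  if (accepted.has("gzip")) return "gzip";
  return null;
}
```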

Caching and Headers: Staying Fast on Every Return Visit

A fast first load is necessary but not sufficient. Caching determines how fast every return visit is, and the wrong cache policy on any resource type is a direct performance regression. The caching strategy on this site is implemented in a Nitro middleware layer and uses four distinct policies based on resource type.

Four tiers of cache policy

Hashed static assets: immutable. JavaScript, CSS, fonts, and other files under /assets/ carry content-hash filenames. Cache-Control is set to public, max-age=31536000, immutable. The browser can reuse them for up to a year without sending a revalidation request.
Public media: one week. Images, logos, case study photos, news thumbnails, and OG images under directories like /logos/ and /og-image/ are cached for max-age=604800. These change less frequently than page content but are not immutable.
HTML pages: no-cache. Dynamic routes use no-cache, which tells the browser to always revalidate before using a cached copy. The content is current on every visit, but the browser can still use a conditional GET and receive a 304 Not Modified if nothing has changed.
API routes: no-store. The /api/ask streaming endpoint and other API routes use no-store, which prevents caching entirely. Responses are always live.
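
The four policies reduce to a path-based lookup. The sketch below is an approximation of that mapping rather than the actual middleware: the function name and the prefix list are illustrative (only /logos/ and /og-image/ are named above), and the public directive on media responses is an assumption.

```typescript
const ONE_YEAR_SECONDS = 31536000;
const ONE_WEEK_SECONDS = 604800;

// Directories treated as public media. Only the first two are named in
// the text; a real list would include the other media directories.
const MEDIA_PREFIXES = ["/logos/", "/og-image/"];

// Returns the Cache-Control value for a request path.
function cacheControlFor(pathname: string): string {
  if (pathname.startsWith("/assets/")) {
    return `public, max-age=${ONE_YEAR_SECONDS}, immutable`;
  }
  if (pathname.startsWith("/api/")) {
    return "no-store";
  }
  if (MEDIA_PREFIXES.some((prefix) => pathname.startsWith(prefix))) {
    return `public, max-age=${ONE_WEEK_SECONDS}`;
  }
  // HTML pages and everything else: always revalidate.
  return "no-cache";
}
```

Defaulting to no-cache is the safe fallback. A path that matches no rule is revalidated on every visit rather than cached by accident.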

The cache correction plugin

Azure App Service, the hosting platform this site runs on, can stamp an immutable directive onto responses before Nitro middleware has a chance to set the correct headers. A custom Nitro response hook runs on every outgoing response and removes any incorrect immutable flag from HTML routes. Without this plugin, a visitor's browser could keep serving a cached page without revalidating, and content updates would be invisible until the cached copy expired. This is the kind of infrastructure edge case that only surfaces in production on a specific hosting platform, and it is exactly the kind of thing that a thorough cache strategy must account for.
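
The correction itself is a small piece of header surgery. The sketch below shows only that logic. How it is registered as a Nitro hook is not shown, and the function names are illustrative.

```typescript
// Removes the `immutable` directive from a Cache-Control value,
// leaving every other directive in place.
function stripImmutable(cacheControl: string): string {
  return cacheControl
    .split(",")
    .map((directive) => directive.trim())
    .filter(
      (directive) =>
        directive !== "" && directive.toLowerCase() !== "immutable",
    )
    .join(", ");
}

// Applies the fix only to HTML responses; hashed assets keep `immutable`.
function correctHtmlCacheControl(
  contentType: string | undefined,
  cacheControl: string | undefined,
): string | undefined {
  const isHtml = (contentType ?? "").toLowerCase().startsWith("text/html");
  if (!isHtml || cacheControl === undefined) return cacheControl;
  return stripImmutable(cacheControl);
}
```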

Security headers and the 100 Best Practices score

Security headers are set by a dedicated Nitro middleware layer that runs on every request: Content Security Policy, HTTP Strict Transport Security (HSTS), X-Frame-Options, X-Content-Type-Options, and Referrer-Policy. These headers protect visitors and also contribute directly to the Lighthouse 100 Best Practices score. The CSP is scoped to allow only known script sources and connection targets, including googletagmanager.com for analytics and api.x.ai for the Ask widget.
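
For illustration, a header set of this shape could be built as follows. Every value here is an assumption chosen to show the structure. The site's real policy is not reproduced above, beyond the fact that it allows googletagmanager.com and api.x.ai.

```typescript
// Builds an example set of security headers. All directive values are
// placeholders, not the site's production policy.
function securityHeaders(): Record<string, string> {
  const csp = [
    "default-src 'self'",
    "script-src 'self' https://www.googletagmanager.com",
    "connect-src 'self' https://www.googletagmanager.com https://api.x.ai",
    "frame-ancestors 'none'",
  ].join("; ");

  return {
    "Content-Security-Policy": csp,
    "Strict-Transport-Security": "max-age=31536000; includeSubDomains",
    "X-Frame-Options": "DENY",
    "X-Content-Type-Options": "nosniff",
    "Referrer-Policy": "strict-origin-when-cross-origin",
  };
}
```

A strict script-src like this one would also need a nonce or hash to allow an inline theme script, which is one reason a CSP has to be designed together with the page rather than bolted on.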

The live PageSpeed cache

The engineering status page on this site calls the Google PageSpeed Insights API and displays the live Lighthouse scores. To avoid adding latency to that page request, the result is cached in Firestore with a configurable TTL (defaulting to six hours). If the PageSpeed API returns an error, the server falls back to the last successfully cached result and marks the response as stale. Visitors always see a score, and no visitor ever waits on an upstream API call.
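
The TTL and stale-fallback decision can be sketched as a pure function. This is a simplification under stated assumptions: the real calls to Firestore and the PageSpeed API are asynchronous and are replaced here by a synchronous fetchFresh callback, the type and function names are illustrative, and the refresh happens inline rather than out of band.

```typescript
interface Scores {
  performance: number;
  bestPractices: number;
}

interface CachedScores {
  scores: Scores;
  fetchedAt: number; // epoch milliseconds
}

interface ScoreResult {
  scores: Scores;
  stale: boolean;
  refreshed: CachedScores | null; // new cache entry to persist, if any
}

const SIX_HOURS_MS = 6 * 60 * 60 * 1000;

function resolveScores(
  cached: CachedScores | null,
  now: number,
  fetchFresh: () => Scores,
  ttlMs: number = SIX_HOURS_MS,
): ScoreResult | null {
  // Within the TTL: serve the cache and skip the upstream call.
  if (cached !== null && now - cached.fetchedAt < ttlMs) {
    return { scores: cached.scores, stale: false, refreshed: null };
  }
  try {
    const scores = fetchFresh();
    return { scores, stale: false, refreshed: { scores, fetchedAt: now } };
  } catch {
    // Upstream failed: fall back to the last good result, marked stale.
    if (cached === null) return null;
    return { scores: cached.scores, stale: true, refreshed: null };
  }
}
```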

Performance Is a Product Decision

A near-perfect Lighthouse score is not the result of running a performance audit once and applying the recommendations. It is the outcome of building every layer of the stack with performance in mind from the start: the build pipeline, the serving infrastructure, the JavaScript loading strategy, and the image delivery system.

The score holds because the decisions are structural. Asset hashing means cache headers are permanent and safe. Chunk splitting means each route loads a minimal bundle. Explicit image dimensions mean CLS is zero by construction, not by luck. Idle-mounted components mean non-visual work never competes with the paint the user is waiting for.

We build these same practices into every client project. If you are building or rebuilding a web application and want performance engineered in from the start rather than chased after launch, this is the kind of work we do.

Tagged: CLS, Core Web Vitals, FCP, Image Optimization, INP, LCP, Lighthouse, Nitro, Performance, React, SSR, Tailwind, TanStack Start, TypeScript, Vite, WebP

Built for Speed: The Engineering Behind Our Near-Perfect Lighthouse Performance Score