Technology March 11, 2026

How CDNs Work

A 6-minute read

Netflix streams hundreds of millions of hours of video every day. Here's how content delivery networks make that possible, and why your website needs one.

When you stream a Netflix movie, the video doesn’t come from a single server in California. It comes from a server physically close to you: maybe in the same city. The same is true when you load a webpage, download an app, or play an online game. This magic is performed by a content delivery network, and without it, the modern internet would be impossibly slow.

The short answer

A CDN (Content Delivery Network) is a geographically distributed network of servers that stores copies of content (images, videos, scripts, webpages) and delivers it to users from the server closest to them. CDNs sit on top of the internet’s routing infrastructure and exploit physical proximity: by serving content from a nearby location rather than a single central server, they reduce latency, absorb traffic spikes, and make the web dramatically faster.

The full picture

The problem CDNs solve

Imagine a single web server in New York serving 10 million users worldwide. Every user, whether they’re in Tokyo, London, or São Paulo, must connect to that one server over thousands of miles of cables. The physics of light through fiber optic cable means a user in Japan might wait 200-300 milliseconds just for the data to travel one way.
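That propagation delay is easy to estimate: light in fiber travels at roughly two-thirds of its vacuum speed, about 200,000 km per second. A back-of-the-envelope sketch (the distances are rough assumptions for illustration):

```python
# Rough one-way propagation delay through fiber optic cable.
# Light in glass travels at ~2/3 the speed of light in a vacuum,
# about 200,000 km per second, i.e. ~200 km per millisecond.
FIBER_KM_PER_MS = 200.0

def one_way_delay_ms(distance_km: float) -> float:
    """Best-case propagation delay; real routes add detours and switching."""
    return distance_km / FIBER_KM_PER_MS

# Approximate great-circle distances (assumed for illustration).
print(one_way_delay_ms(10_900))  # Tokyo -> New York: ~55 ms one way
print(one_way_delay_ms(80))      # Tokyo -> nearby edge server: well under 1 ms
```

The one-way figure understates what users feel: a page load involves several round trips plus TCP and TLS handshakes, which is how ~55 ms of one-way propagation turns into hundreds of milliseconds of perceived delay.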

At scale, this doesn’t just create delay. It creates collapse. When a single server tries to serve millions of people simultaneously, bandwidth gets exhausted, servers crash, and websites time out.

CDNs solve this by distributing the load. Instead of one server doing all the work, thousands of servers around the world each handle a portion of the traffic.

How a CDN works: the basics

When a website owner signs up with a CDN, the CDN’s servers create copies of the website’s static content: images, CSS files, JavaScript, videos. These are the files that don’t change often and don’t need to be generated dynamically.

Here’s what happens when you visit a site using a CDN:

  1. Your browser requests an image from cdn.example.com (or a similar subdomain).
  2. The CDN’s system, using anycast routing, directs your request to the server closest to your physical location.
  3. That “edge server” has the image cached and serves it directly.
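Anycast itself operates at the routing layer (every PoP announces the same IP address, and BGP delivers each packet to the nearest announcement), but its effect, sending each user to the closest edge, can be sketched as a nearest-PoP lookup. The PoP list and coordinates below are illustrative assumptions:

```python
import math

# Hypothetical PoP locations: (name, latitude, longitude).
POPS = [
    ("new-york", 40.7, -74.0),
    ("london", 51.5, -0.1),
    ("tokyo", 35.7, 139.7),
    ("sao-paulo", -23.6, -46.6),
]

def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance between two points on Earth, in km."""
    r = 6371  # Earth radius in km
    p1, p2 = math.radians(lat1), math.radians(lat2)
    dp = math.radians(lat2 - lat1)
    dl = math.radians(lon2 - lon1)
    a = math.sin(dp / 2) ** 2 + math.cos(p1) * math.cos(p2) * math.sin(dl / 2) ** 2
    return 2 * r * math.asin(math.sqrt(a))

def nearest_pop(user_lat, user_lon):
    """Pick the PoP with the smallest great-circle distance to the user."""
    return min(POPS, key=lambda p: haversine_km(user_lat, user_lon, p[1], p[2]))[0]

print(nearest_pop(35.0, 135.8))  # a user in Osaka lands on "tokyo"
print(nearest_pop(48.9, 2.4))    # a user in Paris lands on "london"
```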

The nearest server might be 50 miles away instead of 5,000. The difference in speed is dramatic: industry studies have repeatedly found that even a one-second delay in page load time can cut conversion rates by around 7%.

Edge servers and points of presence

A CDN’s network consists of dozens or hundreds of “points of presence” (PoPs): data centers in cities around the world. Each PoP contains many servers dedicated to caching content.

The number of PoPs is one way to evaluate a CDN. A global CDN like Cloudflare has over 300 PoPs. Akamai has more than 4,000. More PoPs generally means better performance for more users.

These servers are typically placed in major internet exchange points: physical locations where different networks interconnect. This positioning ensures the CDN can quickly pull original content from the source server and quickly deliver it to end users.

Caching: the mechanism behind the speed

CDNs work because of caching. When content is first requested from a CDN, the edge server doesn’t have it. It fetches the original from the origin server, delivers it to the user, and keeps a copy.

On subsequent requests from users in that region, the edge server serves the cached copy directly. This is called a “cache hit.” The origin server never gets involved again until the cached version expires or gets evicted.

Caching works based on rules set by the website owner. Static assets like images might be cached for a week. JavaScript files might be cached for a year. Dynamic content might not be cached at all, since it changes per user.
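The hit/miss cycle and per-asset TTLs can be sketched with a tiny in-memory cache. This is an illustrative model, not any real CDN's API; the origin fetch is a stand-in function:

```python
import time

class EdgeCache:
    """Minimal pull-through cache: fetch from origin on a miss, honor a TTL."""

    def __init__(self, fetch_from_origin):
        self.fetch_from_origin = fetch_from_origin  # stand-in for the origin server
        self.store = {}   # path -> (content, expiry timestamp)
        self.hits = 0
        self.misses = 0

    def get(self, path, ttl_seconds):
        entry = self.store.get(path)
        if entry and entry[1] > time.monotonic():
            self.hits += 1   # cache hit: the origin is never contacted
            return entry[0]
        self.misses += 1     # miss (or expired): pull a fresh copy from origin
        content = self.fetch_from_origin(path)
        self.store[path] = (content, time.monotonic() + ttl_seconds)
        return content

origin_calls = []
cache = EdgeCache(lambda path: origin_calls.append(path) or f"<data for {path}>")

cache.get("/logo.png", ttl_seconds=604_800)  # first request: miss, pulled from origin
cache.get("/logo.png", ttl_seconds=604_800)  # second request: hit, served at the edge
print(cache.hits, cache.misses, len(origin_calls))  # 1 1 1
```

The TTL argument is the knob the website owner turns: a week (604,800 seconds) for images, a year for versioned JavaScript, zero for per-user dynamic content.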

This is also why clearing your browser cache sometimes “fixes” a website. Caching happens at multiple layers: the CDN can hold a stale copy, and so can your browser. Clearing the browser cache forces it to request fresh content instead of reusing an outdated local copy.

Load balancing and DDoS protection

CDNs do more than improve speed. They also distribute traffic load and absorb attacks.

When a news story goes viral and 100,000 people try to visit a site in an hour, a single server would crash. A CDN spreads that traffic across thousands of servers worldwide. No single machine bears the full load.
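A minimal sketch of that load spreading, assuming requests are hashed across a hypothetical fleet of 1,000 edge servers:

```python
from collections import Counter
from hashlib import sha256

SERVERS = 1000  # hypothetical fleet size

def pick_server(client_id: str) -> int:
    """Deterministically map a client to one server in the fleet."""
    return int(sha256(client_id.encode()).hexdigest(), 16) % SERVERS

# Simulate a viral spike: 100,000 clients arriving at once.
load = Counter(pick_server(f"client-{i}") for i in range(100_000))

# The traffic spreads evenly: the busiest server sees roughly 100
# requests, not 100,000, so no single machine bears the full load.
print(len(load), max(load.values()))
```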

This also makes sites harder to knock offline. Distributed Denial of Service (DDoS) attacks try to overwhelm a server with traffic. A CDN absorbs the attack traffic across its entire network, so the origin server never sees it. This is why CDN providers have become a first line of defense against cyberattacks.

Many CDNs also offer Web Application Firewalls (WAFs) that filter out malicious requests before they reach the origin server.

Types of content CDNs deliver

CDNs are commonly associated with video streaming: Netflix, YouTube, and Hulu rely heavily on them. But they handle far more:

  • Static web assets: Images, CSS, JavaScript, fonts. This is the most common use.
  • Software downloads: Operating system updates, mobile app updates, and installer files.
  • API responses: For high-traffic APIs, CDN caching reduces the load on origin servers.
  • Gaming assets: Game textures, levels, and updates delivered to consoles and PCs.

Almost every major website uses a CDN. If you’ve ever loaded a page and seen a domain like cdn.jsdelivr.net, cdnjs.cloudflare.com, or fastly.net in the network requests, that’s a CDN at work.

Pull zones vs. push zones

There are two main ways to populate a CDN’s cache:

Pull zones are the most common. The origin server hosts the original content. The first time a user requests something from the CDN, the edge server “pulls” it from the origin. This requires no extra work from the website owner. Content is cached on-demand.

Push zones require the website owner to proactively upload content to the CDN before users request it. This gives more control and works better for predictable, large-scale content like video libraries or major software releases.

Most websites use pull zones because they’re automatic.
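The contrast between the two models fits in a few lines. These classes are illustrative sketches, not any provider's actual API:

```python
class PushZone:
    """Owner uploads content ahead of time; requests never touch the origin."""
    def __init__(self):
        self.store = {}

    def upload(self, path, content):
        self.store[path] = content   # proactive: done before any user asks

    def get(self, path):
        return self.store[path]      # KeyError if the owner forgot to push it

class PullZone:
    """Nothing is uploaded up front; the first request lazily pulls from origin."""
    def __init__(self, origin):
        self.origin = origin
        self.store = {}

    def get(self, path):
        if path not in self.store:
            self.store[path] = self.origin(path)  # on-demand fetch, then cached
        return self.store[path]

origin_requests = []
def origin(path):
    origin_requests.append(path)
    return f"<data for {path}>"

push = PushZone()
push.upload("/movie.mp4", "<video bytes>")   # e.g. a video library release

pull = PullZone(origin)
pull.get("/app.js")
pull.get("/app.js")
print(len(origin_requests))  # 1: only the first request reached the origin
```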

From caching to computing: the rise of edge functions

CDNs started as file delivery systems. Today, the more interesting story is what happens when those edge servers run code, not just serve files.

The shift has a name: edge computing. Instead of running your application logic in a central data center and using the CDN only to cache the results, you can now run actual code on the CDN’s edge servers themselves: the same servers sitting in cities around the world. Your code executes close to the user, not thousands of miles away.

The platforms leading this are:

Cloudflare Workers: The most widely adopted edge compute platform. Workers let you write JavaScript (or Rust, Python, and other languages that compile to WebAssembly) that runs across Cloudflare’s 330+ global locations with sub-50ms latency. Cloudflare reached 3 million active developers on Workers by 2024, with 50% year-over-year growth. A practical example: instead of routing a user login request all the way to a server in Virginia, a Cloudflare Worker can check the authentication token at the edge, whether in Frankfurt, Sydney, or São Paulo, and respond in milliseconds.

Vercel Edge Functions: Vercel (the platform behind Next.js) routes certain application logic to edge locations automatically. When a Next.js app uses Edge Runtime, its middleware runs on servers close to each user, not on a single origin. This is why Next.js apps can feel snappy globally without complex multi-region infrastructure.

Fastly Compute: Fastly’s edge compute offering runs WebAssembly, which gives it near-instant startup and strong sandbox isolation. Major media companies and e-commerce platforms use it to run request processing, A/B testing logic, and personalization at the edge, shaving off the round-trip to origin servers entirely.

Why does this matter? Consider A/B testing. Traditionally, a server in one location decides which variant to show each user, then fetches and returns the content. With edge computing, that decision happens in a data center that might be 10 miles from the user, not 5,000. The page loads faster, and the user never notices the experiment happening.
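The edge-side decision can be as small as a deterministic hash of a user identifier, so every edge location worldwide assigns the same user to the same variant without contacting a central server. A hypothetical sketch (the experiment name and bucket split are assumptions):

```python
from hashlib import sha256

def ab_variant(user_id: str, experiment: str, treatment_share: float = 0.5) -> str:
    """Deterministically assign a user to 'control' or 'treatment'.

    Hashing user_id + experiment means every edge location makes the same
    call for the same user, with no round trip to a central server.
    """
    digest = sha256(f"{experiment}:{user_id}".encode()).digest()
    bucket = int.from_bytes(digest[:8], "big") / 2**64  # uniform in [0, 1)
    return "treatment" if bucket < treatment_share else "control"

# Same user always lands in the same bucket, on any edge server, any region.
assert ab_variant("user-42", "new-checkout") == ab_variant("user-42", "new-checkout")
```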

AI inference at the edge

The newest frontier is running AI models at the edge, directly on CDN infrastructure. This is significant because AI inference (getting an answer from a trained model) traditionally requires powerful centralized servers. The latency of sending a request across the ocean to get an AI response is noticeable.

Cloudflare Workers AI lets developers run models like Llama, Whisper (for speech-to-text), and image classifiers directly on edge servers, without provisioning a GPU cluster. Cloudflare reported that Workers AI inference requests grew nearly 4,000% year-over-year in 2025. The use cases include real-time content moderation, translation, and personalization that happens before content even reaches the user’s browser.

This is still early days. The models that run at the edge are smaller and less capable than the ones running on dedicated GPU clusters in cloud computing providers like AWS or Azure. But the pattern is clear: computation is moving outward, toward users, and CDNs are evolving into the infrastructure that makes that possible.

Multi-CDN strategies: why big platforms use more than one

If a single CDN is good, is two better? For large platforms, often yes.

A multi-CDN strategy means routing traffic through multiple CDN providers simultaneously, using different providers for different regions, content types, or as a failover backup. Netflix operates its own CDN (Open Connect, with appliances installed directly inside ISPs) alongside third-party CDNs for traffic spikes it can’t absorb internally. Disney+ and Apple use multiple CDN providers to ensure that a regional outage at one provider doesn’t take down their service globally.

The main reasons companies go multi-CDN:

  1. Resilience: If one CDN has an outage, traffic fails over to another automatically. A CDN outage is rare but not unheard of: a few large CDN incidents have briefly taken down dozens of major websites simultaneously.
  2. Performance optimization: Different CDNs perform better in different regions. A load balancer (or a DNS-based traffic steering service such as NS1) measures performance in real time and routes each user to whichever CDN is fastest for them right now.
  3. Cost leverage: Having multiple vendors gives companies negotiating power on rates and prevents over-dependence on any single provider.
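Real steering services measure latency continuously; a toy version just picks the fastest provider per region that is passing health checks. All provider names and numbers below are made up for illustration:

```python
# Recent median latencies in ms, per (region, provider). Hypothetical data.
measurements = {
    "europe": {"cdn-a": 28, "cdn-b": 19, "cdn-c": 44},
    "asia":   {"cdn-a": 61, "cdn-b": 35, "cdn-c": 22},
}
unhealthy = {"cdn-b"}  # providers currently failing health checks

def route(region: str) -> str:
    """Send traffic to the fastest provider that is passing health checks."""
    candidates = {p: ms for p, ms in measurements[region].items()
                  if p not in unhealthy}
    return min(candidates, key=candidates.get)

print(route("europe"))  # cdn-b is fastest but down, so fall back to cdn-a
print(route("asia"))    # cdn-c is fastest and healthy
```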

For most websites, a single CDN is more than enough. Multi-CDN is a strategy for platforms whose traffic is so large, or whose uptime requirements are so strict, that a single provider becomes a single point of failure.

Why it matters

If you run any kind of website, a CDN is one of the simplest performance upgrades you can make. Most CDN services are free for basic use, and the performance improvement is immediate.

For businesses, page speed directly affects revenue. Amazon found that every 100 milliseconds of latency cost them 1% in sales. Google uses page speed as a ranking signal. A slow website loses users, and a CDN is often the biggest single fix.

For users, CDNs make the web feel responsive. Without CDNs, loading a page with 50 images would require 50 round trips to a single server. With a CDN, those images load in parallel from nearby servers. The difference between a site that feels sluggish and one that feels instant is often a CDN.

Common misconceptions

“CDNs are only for big websites.” Not at all. A small blog can benefit enormously from a CDN, and most CDN providers offer free tiers generous enough for a typical small site. The performance boost is the same regardless of site size.

“CDNs are just for caching images.” They cache virtually any static file: HTML, CSS, JavaScript, fonts, PDFs, videos, and more. Modern CDNs can also accelerate API calls and even dynamic content.

“Moving to a CDN is complicated.” It’s usually a few minutes of setup. Services like Cloudflare can be enabled by changing your domain’s nameservers. The CDN handles everything else automatically.