The fastest, most organized and most efficient people, processes and devices all have a couple of things in common. The first is that they all have a system; some sort of method for making order out of chaos, fast. The second thing they all have in common is that they are assuredly better than the rest of us.
Don’t despair if you’re the type of person who stuffs 60 unpaired socks into a drawer directly after they come out of the dryer. Those of us who don’t have systems can purchase the services of the superior people, processes and devices that do and live our best lives with the systems they have worked so hard to hone. Take content delivery networks and cache headers, for example.
There are millions of different websites on the internet with billions of different people accessing them. And yet, for all of humanity’s differences, there is one thing we all want from every website: speed. Luckily there is one thing that can give any website more of it.
A content delivery network or CDN is a global network of cache servers that improves your website performance and reduces page load time in two main ways. Firstly, by directing each user to the cache server physically closest to them, a CDN cuts down on how far data has to travel between a browser and a server, greatly improving round-trip time and therefore page load time.
Secondly, a CDN also serves up intelligently cached content instead of necessitating a trip to the origin server every time a user’s browser requests a new page or file.
Cache only establishment
Your website, quite likely, is made up of an impressive amount of content. Static content, dynamic content, multimedia content, HTML files, the list goes on. When you have a website with a single server and no caching ability, every single time someone visits your website or clicks through to a new page, their browser has to request your content from the origin server. Every time. To put it mildly, this is less than optimal.
The cache servers in a CDN store your website’s content in order to serve it to users as quickly as possible from the server closest to them. Cache servers typically cache static content, but more advanced CDNs also have the ability to cache dynamic content for the period in which it remains static. As CDN provider Imperva Incapsula points out, CDN caching not only improves website performance, but it also decreases your bandwidth costs and helps ensure reliable content delivery.
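To make the difference concrete, here is a minimal Python sketch of one cache server sitting in front of an origin. The class names, the path and the 60-second lifetime are all invented for illustration, and a real CDN does a great deal more than this.

```python
import time


class Origin:
    """Stand-in for the origin server; counts how often it is asked for content."""

    def __init__(self):
        self.requests = 0

    def fetch(self, path):
        self.requests += 1
        return f"content of {path}"


class EdgeCache:
    """Hypothetical cache server that keeps each response for ttl seconds."""

    def __init__(self, origin, ttl=60, clock=time.monotonic):
        self.origin = origin
        self.ttl = ttl
        self.clock = clock  # injectable so the sketch can be tested without waiting
        self.store = {}  # path -> (body, time it was stored)

    def get(self, path):
        entry = self.store.get(path)
        if entry is not None and self.clock() - entry[1] < self.ttl:
            return entry[0]  # cache hit: no trip to the origin
        body = self.origin.fetch(path)  # miss or expired: go back to the origin
        self.store[path] = (body, self.clock())
        return body


origin = Origin()
edge = EdgeCache(origin)
for _ in range(1000):
    edge.get("/index.html")
print(origin.requests)
```

In this toy setup a thousand requests for the same page reach the origin only once; without the cache in front, the origin would have answered all thousand.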
How the magic happens
How CDN caching actually works involves one of those nifty systems mentioned in the opening. A CDN is able to identify your website’s cacheable content, and to understand the rules for caching it, thanks to the HTTP cache headers set by your web developer, which tell your CDN everything it needs to know. When it comes to your web content, the following are the most relevant headers.
Expires. About as straightforward as cache headers come, this one gives the CDN a date and time at which the content is going to ‘expire.’ Once that moment passes, the CDN has to fetch the content from the origin server again, ensuring that even your cached content is fresh. Efficient though this header may be, it has largely been replaced by the next one.
Cache-Control. Think of this one as Expires 2.0. Like the Expires header, Cache-Control tells your CDN when content has to be re-fetched, but it does so with a lifetime in seconds (max-age) rather than a fixed date. It also allows you to mark content as public or private (private content may be kept by the user’s browser but not by a shared cache such as a CDN), as no-cache, which forces the CDN to revalidate content with the origin before delivering it to a user, or as no-store, which prevents confidential data like banking information from being stored at all.
Pragma. Speaking of headers superseded by Cache-Control, Pragma is a holdover from HTTP/1.0, where “Pragma: no-cache” was used to set caching instructions. These days it survives mainly for backward compatibility.
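Surrogate. Surrogate headers such as Surrogate-Control let the origin server set caching policies aimed specifically at the cache servers acting on its behalf, separately from the instructions browsers receive. Where your CDN supports them, this gives you finer control over cache policies.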
ETag. This is where the sorting and labeling gets pretty sophisticated. ETag headers allow you to give each version of your content a unique identifier. They also eliminate unnecessary re-fetching: when your content has expired according to your Cache-Control or Expires headers, the cache server simply sends the stored ETag back to the origin server (in an If-None-Match request header) to check whether the relevant content has changed. If it has remained the same, the origin answers with a short 304 Not Modified response and the cache server does not bother re-fetching.
Vary. When your website has multiple versions of the same content, say compressed files stored alongside uncompressed versions, you would use Vary headers to tell caches which request headers decide which version gets served. For instance, some browsers aren’t equipped to handle compressed content, so “Vary: Accept-Encoding” tells the CDN to keep the compressed and uncompressed copies apart and to serve those browsers the uncompressed format, ensuring smooth loading of your website. Vary headers can be super useful, but not every browser or cache handles them well, and varying on too many request headers splits the cache into many rarely reused copies, so they need to be used judiciously.
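For the curious, the decisions these headers drive can be sketched in a few lines of Python. This is a simplified illustration, not how any particular CDN works: the function names are made up, header names are assumed to be spelled exactly as shown, and real caches follow many more rules (s-maxage, heuristic freshness, “Vary: *” and so on).

```python
def parse_cache_control(value):
    """Split a Cache-Control value such as 'public, max-age=300' into a dict."""
    directives = {}
    for part in value.split(","):
        part = part.strip().lower()
        if not part:
            continue
        name, _, arg = part.partition("=")
        directives[name] = arg.strip('"') if arg else True
    return directives


def may_store(headers):
    """A shared cache must not keep responses marked no-store or private."""
    cc = parse_cache_control(headers.get("Cache-Control", ""))
    return "no-store" not in cc and "private" not in cc


def is_fresh(headers, age_seconds):
    """True if the stored copy can be served without asking the origin."""
    cc = parse_cache_control(headers.get("Cache-Control", ""))
    if "no-cache" in cc or "max-age" not in cc:
        return False  # simplification: revalidate whenever unsure
    return age_seconds < int(cc["max-age"])


def revalidate(stored_headers, origin_etag):
    """Mimic an If-None-Match check: 304 if the ETag still matches, else 200."""
    return 304 if stored_headers.get("ETag") == origin_etag else 200


def cache_key(url, request_headers, response_headers):
    """Build a key from the URL plus every request header named in Vary."""
    vary = response_headers.get("Vary", "")
    names = sorted(n.strip().lower() for n in vary.split(",") if n.strip())
    lowered = {k.lower(): v for k, v in request_headers.items()}
    return (url,) + tuple((n, lowered.get(n, "")) for n in names)


# Hypothetical response headers from an origin server.
response = {
    "Cache-Control": "public, max-age=300",
    "ETag": '"v1"',
    "Vary": "Accept-Encoding",
}

print(may_store(response))
print(is_fresh(response, 120), is_fresh(response, 400))
print(revalidate(response, '"v1"'))
print(cache_key("/app.js", {"Accept-Encoding": "gzip"}, response))
```

With these example headers the response may be stored, it counts as fresh at 120 seconds but stale at 400, and a revalidation against an unchanged ETag comes back as 304. The cache key includes the Accept-Encoding value, which is how a compressed and an uncompressed copy of /app.js can sit side by side.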
Nothing to worry about
It’s possible that you created your own website and cannot recall ever using a cache header. It’s also possible that you have no reason to believe that the person you paid to develop your website ever used a cache header. That’s all okay.
Advanced CDNs actually have intelligent cache control that learns to predict what should be cached. They can monitor, categorize and cache a wider range of content while largely eliminating the hands-on part of the caching process, making their own processes more efficient while saving you time and probably some swearing. With your CDN taking care of caching, you can focus more energy on that sock drawer and never have to spend another meeting hoping no one will notice you’re wearing one polka dot sock and one in a shade of business grey.