Getting the Most Out of the Distil Networks CDN

November 7, 2013 Andrew Stein

Beyond our bot-blocking technology, another big draw of Distil Networks is our built-in Content Delivery Network. Just by being a Distil Networks customer, you can use our global network to improve your pages’ load times while still protecting those pages from bots.

When we tell people about the CDN service, many expect massive decreases in their page load times just from moving onto the network. While there is almost always at least a slight benefit, we often see websites that aren’t actually optimized to take advantage of a CDN. Even though their static content (images, CSS, and JavaScript) caches perfectly on our global network, many websites don’t have their actual rendered HTML content optimized for a proxy cache.

The biggest reason for this is the set of HTTP response headers being sent back to the Distil network. While we allow customers to set their caching options in the Distil portal, Distil respects any caching headers we get from the upstream website. This means that even if you tell Distil Networks to cache your webpage for 24 hours, if your website says “don’t cache this page ever” we won’t cache that page. Three headers in particular (Expires, Set-Cookie, and Cache-Control) often cause our customers’ websites to be retrieved purely from their origin, rather than taking advantage of the CDN.

Before we get into the headers themselves, you’ll need to be able to view them. There are a couple of ways to do so, but the easiest is Google Chrome’s built-in Developer Tools. Before loading your website (or after loading, though you’ll have to refresh), bring up the Developer Tools by clicking the button with 3 horizontal bars next to the URL, then going to Tools > Developer tools. Then navigate to the “Network” tab and load your page. Every asset on your page will appear in the Network panel as your website loads. Find the page you’re looking for in the list and select it. After doing so, you’ll be taken to the “Headers” tab. Skip past the “Request Headers”, as all of the values we’re looking for are found under “Response Headers”.

Not all of the headers listed below will necessarily appear (and if none of them do, Distil will use the default settings set in the portal). However, it’s important to note that if even one header flags a page as uncacheable, that is always respected, regardless of what other headers are set.

Expires
The easiest caching header to use has to be Expires. If none of the other headers are set, this header tells whatever receives the asset that it can hold onto it until a given point in time. The value is an HTTP date timestamp, so if you view the headers of your webpage and see just “Expires: Thu, 07 Nov 2013 12:06:00 GMT”, a customer’s PC, the Distil CDN, or any other CDN can cache this page until that moment. (A lifetime in seconds is expressed with the max-age directive of Cache-Control rather than with Expires.) On the flip side, if Expires is set to 0 or a date in the past, the page is uncacheable.
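To make the Expires rule concrete, here is a minimal Python sketch of how a cache might turn an Expires value into a remaining lifetime. It is an illustration only, not Distil's implementation: the function name `expires_lifetime_seconds` is hypothetical, and the sketch assumes the value is an HTTP date, treating anything else (including a bare number such as 0) as already expired, which is what the HTTP specification recommends.

```python
from datetime import datetime, timezone
from email.utils import parsedate_to_datetime


def expires_lifetime_seconds(expires_value, now=None):
    """Return the seconds remaining until the Expires timestamp.

    Returns 0 when the timestamp is in the past or the value is not a
    valid HTTP date (for example "0"), i.e. the response is already stale.
    """
    now = now or datetime.now(timezone.utc)
    try:
        expires_at = parsedate_to_datetime(expires_value)
    except (TypeError, ValueError):
        # Unparseable values are treated as "already expired".
        return 0
    if expires_at.tzinfo is None:
        # Assumption: a date without a usable zone is interpreted as UTC.
        expires_at = expires_at.replace(tzinfo=timezone.utc)
    return max(0, int((expires_at - now).total_seconds()))
```

Passing a fixed `now` makes the function easy to check by hand: with `now` set to 12:00:00 GMT on 7 November 2013, an Expires value six minutes later yields a lifetime of 360 seconds.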

Set-Cookie
If you view your website’s headers and you see “Set-Cookie”, you’re viewing a page that Distil won’t cache. The reason for this is simple: when most websites set a cookie, they’re looking to set it for one user and one user only. If we allowed this header to be cached, we would risk causing issues for any of our customers who have login systems or have in-page tracking cookies set by the HTML document itself.

Cache-Control
Cache-Control is easily the most powerful of all the caching headers. As the HTTP specification outlines (http://www.w3.org/Protocols/rfc2616/rfc2616-sec14.html#sec14.9), this header gives users full control of how and where their content can be cached. As a general rule, if you’re trying to use Distil as a CDN and your Cache-Control header has a value of “private”, “no-cache”, “no-store”, “max-age=0”, or “s-maxage=0”, then Distil will be unable to cache the page. To get to a point where Distil can cache the page, you’ll want the Cache-Control header to be set to “public”.
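The rules of thumb for Set-Cookie and Cache-Control can be sketched in a few lines of Python. This is a simplified illustration, not Distil's actual logic: the function names are hypothetical, the parser splits naively on commas (so it does not handle every quoted form the specification allows), and it follows the article's rule that a zero max-age or a zero s-maxage each blocks caching. Expires is left out to keep the sketch short.

```python
def parse_cache_control(value):
    """Split a Cache-Control value into a {directive: argument} dict."""
    directives = {}
    for part in value.split(","):
        part = part.strip()
        if not part:
            continue
        name, _, arg = part.partition("=")
        # Directive names are case-insensitive; bare directives map to None.
        directives[name.strip().lower()] = arg.strip().strip('"') or None
    return directives


def blocks_shared_cache(cache_control_value):
    """True if the header value would stop a CDN from caching the page."""
    directives = parse_cache_control(cache_control_value)
    if any(name in directives for name in ("private", "no-cache", "no-store")):
        return True
    # A zero lifetime in either directive is treated as uncacheable.
    return any(directives.get(name) == "0" for name in ("max-age", "s-maxage"))


def looks_cacheable(headers):
    """Apply the 'one uncacheable header wins' rule to a dict of headers."""
    normalized = {name.lower(): value for name, value in headers.items()}
    if "set-cookie" in normalized:
        return False
    return not blocks_shared_cache(normalized.get("cache-control", ""))
```

For example, a response with “Cache-Control: public” passes the check, but the same response with a Set-Cookie header alongside it does not, because a single uncacheable signal overrides everything else.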

Now, none of this is meant to say “you should absolutely do this for your website”. Caching is incredibly site-specific, and only you know which parts of your site can be cached and which pages need to be pulled directly from the origin. For example, a blog that updates once a day lends itself to caching much better than a web store that has live displays of what’s in stock. You wouldn’t want active threads on your message board cached, though a locked or old thread on that same message board is perfect fodder for a CDN.

If you’re curious whether you’re viewing a cached page from Distil, you can use the X-Distil-CS header in your “Response Headers”. If this value says “HIT” that means you’re viewing a page from the Distil cache. A “MISS” value could mean that we either cannot cache the page in question or that it just hasn’t been loaded into cache yet. If you’ve loaded this page several times and are still seeing only MISS, chances are that means the page is currently uncacheable. If you see “BYPASS” this means that your settings within the Distil portal tell us to not even attempt to cache this kind of file.
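If you would rather check this from a script than from the browser, the interpretation above can be sketched in Python. Everything here is an assumption for illustration: the function names are hypothetical, the threshold of three MISS responses is simply one reading of “several times”, and the only detail taken from the article is the X-Distil-CS header and its HIT, MISS, and BYPASS values.

```python
import urllib.request


def fetch_cache_status(url, header="X-Distil-CS"):
    """Request the URL once and return the cache status header, or None."""
    with urllib.request.urlopen(url) as response:
        return response.headers.get(header)


def diagnose(statuses, miss_threshold=3):
    """Summarize a list of X-Distil-CS values seen on repeated requests."""
    observed = [status.strip().upper() for status in statuses if status]
    if not observed:
        return "no cache status header seen"
    if "HIT" in observed:
        return "served from cache"
    if "BYPASS" in observed:
        return "portal settings skip caching for this file"
    if observed.count("MISS") >= miss_threshold:
        return "probably uncacheable; check the response headers"
    return "not cached yet; try again"
```

A typical use would be to call `fetch_cache_status` a few times against your own page, collect the results in a list, and pass that list to `diagnose`.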

By understanding what caching headers your website is sending and when, you can take full control of how much you’re able to benefit from the Distil Networks CDN.

About the Author

Andrew Stein

Andrew Stein is Distil's Co-Founder and Chief Scientist. Getting his start running a large online kids’ game, Andrew took his passion for web development to NC State where he became the Senior Web Developer for the Department of Electrical and Computer Engineering. Working on everything from identity management programs to digital signage systems, Andrew has run into a little bit of everything and is always eager for new challenges.
