Linux Format

Webservers: Varnish cache

Mihalis Tsoukalos teaches you how to install and configure the Varnish HTTP cache to speed up your website.

- Mihalis Tsoukalos is a Unix administrator, a programmer, a DBA and a mathematician who enjoys writing articles and learning new things.

Mihalis Tsoukalos explains how to install and set up the Varnish HTTP Cache, so your websites never fall over, probably.

Varnish is a caching HTTP reverse proxy that acts as a transparent layer between your web server and its web clients. It caches content so that it can deliver it without the web client having to wait for the web server. Varnish cannot operate on its own: it needs to sit in front of a properly configured web server. Its most significant advantage is that it shortens the response time of your sites.

This tutorial will showcase the basic Varnish functionality; Varnish can do many more things than those presented here. Although it can work with Nginx, this tutorial will use an Apache server. Using Nginx [see Tutorials, LXF222] instead of Apache is not difficult, provided that you know how to make changes to the Nginx configuration. You can find a lot more information about Varnish at the official site, https://varnish-cache.org. You can see a part of that site in the screen capture on this page (top right).

One warning in particular: it is highly recommended that you carry out all your Varnish experiments using a test machine or a virtual machine, because a wrongly configured Varnish process can limit the accessibility of your websites by the rest of the world.

Fully varnished

Installing Varnish on a Debian machine is as simple as executing the following command with root privileges:

# apt-get install varnish

You can find out the version of Varnish you are using by executing the next command:

# varnishd -V
varnishd (varnish-4.0.2 revision bfe7cd1)
Copyright (c) 2006 Verdens Gang AS
Copyright (c) 2006-2014 Varnish Software AS

The Varnish installation includes many executables: varnishadm, varnishhist, varnishncsa, varnishtest, varnishd, varnishlog, varnishstat and varnishtop.

The main Varnish configuration directory is /etc/varnish but there are also other Varnish-related files inside /etc/default, as can be seen:

$ ls -l /etc/default/varnish*
-rw-r--r-- 1 root root 3768 Oct 14 2014 /etc/default/varnish
-rw-r--r-- 1 root root 514 Oct 14 2014 /etc/default/varnishlog
-rw-r--r-- 1 root root 799 Oct 14 2014 /etc/default/varnishncsa

The most important configuration files among them are /etc/varnish/default.vcl and /etc/default/varnish because these are the two places where the more critical parameters of Varnish are defined. The single most critical parameter of a running Varnish process is the size of its cache, because the size of the cache affects performance. However, this tutorial will stick with the default cache size.

Basic varnishing

You can start Varnish as follows:

# service varnish start
# service varnishlog start
# service varnishncsa start

Similarly, you can stop all Varnish processes as follows:

# service varnish stop
# service varnishlog stop
# service varnishncsa stop

If you execute ps ax | grep varnish after starting Varnish, you will see at least three distinct processes running: the varnishd process, which does all the work of supporting the Varnish functionality; the varnishlog service, which runs the varnishlog utility as a service in order to take care of the Varnish log files; and the varnishncsa service, which is similar to the varnishlog service but writes its log files in the NCSA Common Log Format.

You can easily find out how the varnish service (varnishd) was executed:

$ ps ax | grep varnishd | head -1
825 ? Ss 0:00 /usr/sbin/varnishd -j unix,user=vcache -F -a :6081 -T localhost:6082 -f /etc/varnish/default.vcl -S /etc/varnish/secret -s malloc,256m

So, the previous Varnish instance uses port number 6081 (-a :6081), which is mainly used for testing purposes. In order for Varnish to transparently take control of the web traffic of your website, Varnish should listen on the TCP port previously owned by the web server and the web server must move to another available port. In other words, Varnish will be the process handling all incoming HTTP requests.

As a result, it is now time to change the current configuration on your Linux server and make Varnish listen on TCP port 80 and the Apache web server listen on TCP port 8080, which was chosen because it is easy to remember. On a Debian Linux system, you will first need to make two changes in two Varnish configuration files, as shown in the following output of the diff utility:

# cd /etc/default/
# diff varnish varnish.orig
48c48
< DAEMON_OPTS="-a :80 \
---
> DAEMON_OPTS="-a :6081 \

# cd /etc/varnish/
# diff default.vcl default.vcl.orig
18c18
< .port = "8080";
---
> .port = "8888";

The first change tells Varnish to use port number 80, whereas the second change tells Varnish that its backend, which in this case is Apache, listens on TCP port 8080. Additionally, you will need to tell all Apache virtual hosts that used to listen on port number 80 to listen on TCP port 8080 from now on. This also requires a change to the value of the Listen directive in the /etc/apache2/ports.conf file. If you have other Apache virtual hosts that do not use TCP port 80, you should not make any changes to their configuration!

Last, you will also need to perform the following actions:

# cp /lib/systemd/system/varnish.service /etc/systemd/system
# vi /etc/systemd/system/varnish.service
# diff /lib/systemd/system/varnish.service /etc/systemd/system/varnish.service
9c9
< ExecStart=/usr/sbin/varnishd -a :6081 -T localhost:6082 -f /etc/varnish/default.vcl -S /etc/varnish/secret -s malloc,256m
---
> ExecStart=/usr/sbin/varnishd -a :80 -T localhost:6082 -f /etc/varnish/default.vcl -S /etc/varnish/secret -s malloc,256m
# systemctl daemon-reload
# systemctl restart apache2.service
# systemctl restart varnish.service

The last two commands are used for restarting the two services. You should first restart Apache in order to free TCP port 80 and then Varnish, which will bind TCP port 80. So, the next question is “How can I be sure that I am using Varnish?” The answer is pretty simple: you can try accessing one of the Apache sites and see if it works!

Additionally, you can check the log files of Varnish, one of its various utilities or the output of the next command:

# ps ax | grep varnishd | grep -v grep | tail -1
7883 ? Sl 0:01 /usr/sbin/varnishd -a :80 -T localhost:6082 -f /etc/varnish/default.vcl -S /etc/varnish/secret -s malloc,256m

If the value of the -a parameter is :80, then Varnish is properly configured.
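A complementary check is to look at the response headers of a page served on port 80. The following Python sketch assumes the default Varnish behaviour of adding an X-Varnish header and a Via header that mentions varnish; neither header is discussed in this tutorial, and both can be removed in VCL, so an empty result is not proof that Varnish is absent. The function names are made up for illustration.

```python
from urllib.request import urlopen


def fetch_headers(url, timeout=5.0):
    """Fetch a URL and return its response headers as a plain dict."""
    with urlopen(url, timeout=timeout) as response:
        return dict(response.headers.items())


def varnish_markers(headers):
    """Return the headers that suggest Varnish handled the response.

    Assumption: Varnish adds X-Varnish and a Via header containing "varnish".
    """
    lowered = {name.lower(): value for name, value in headers.items()}
    found = {}
    if "x-varnish" in lowered:
        found["X-Varnish"] = lowered["x-varnish"]
    if "varnish" in lowered.get("via", "").lower():
        found["Via"] = lowered["via"]
    return found


def looks_like_cache_hit(headers):
    """True when X-Varnish carries two transaction IDs, which indicates a hit."""
    lowered = {name.lower(): value for name, value in headers.items()}
    return len(lowered.get("x-varnish", "").split()) == 2
```

On the test machine you could call varnish_markers(fetch_headers("http://localhost/")) twice in a row; if the page is cacheable, the second response would be expected to look like a cache hit.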

Please keep in mind that giving Varnish too much disk space or RAM might not result in better performance because it will make searching the cache slower. Depending on your data, around 4GB of disk space might be a good starting point. Finding the optimum cache size is an art that requires constant tracking of Varnish performance with the help of tools such as varnishstat and varnishtop. Last, you should understand that the entire process is transparent as far as the web clients are concerned, which is extremely important.

Please keep in mind that it is very handy to keep a backup copy of each configuration file you modify because sometimes things can go wrong. In this tutorial the original files have the .orig extension.

Varnish logs

On a Debian system, the log files of Varnish can be found inside the folder /var/log/varnish/. There are actually two log files, named varnish.log and varnishncsa.log. The varnish.log file keeps Varnish log data in a format defined by Varnish, which is more readable but takes more screen space, whereas varnishncsa.log uses the familiar log format used in Apache log files.
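Because varnishncsa.log follows the familiar Apache-style layout, it is easy to process with a short script. The Python sketch below assumes each line starts with the standard Common Log Format fields (host, ident, user, timestamp, request, status, size) and ignores anything after them; the function names are hypothetical and the exact fields on your system depend on how varnishncsa is configured.

```python
import re

# Common Log Format prefix; trailing fields (referer, user agent) are ignored.
NCSA_LINE = re.compile(
    r'(?P<host>\S+) (?P<ident>\S+) (?P<user>\S+) '
    r'\[(?P<time>[^\]]+)\] '
    r'"(?P<request>[^"]*)" '
    r'(?P<status>\d{3}) (?P<size>\d+|-)'
)


def parse_ncsa_line(line):
    """Return a dict of fields for one log line, or None if it does not match."""
    match = NCSA_LINE.match(line)
    if match is None:
        return None
    entry = match.groupdict()
    entry["status"] = int(entry["status"])
    entry["size"] = 0 if entry["size"] == "-" else int(entry["size"])
    return entry


def count_by_status(lines):
    """Count how many parsable log lines carry each HTTP status code."""
    counts = {}
    for line in lines:
        entry = parse_ncsa_line(line)
        if entry is not None:
            counts[entry["status"]] = counts.get(entry["status"], 0) + 1
    return counts
```

Passing an open file object for /var/log/varnish/varnishncsa.log to count_by_status() gives a quick overview of how many requests ended in each status code.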

The screen capture image on the previous page shows data from both varnish.log and varnishncsa.log in order to give you a better understanding of the kind of data stored in them. The varnishlog utility allows you to see the log files of Varnish in a better format.

It is very useful to know that Apache still writes the expected log entries in its own log files so you are not missing any traffic data.

The varnishncsa utility displays the log data that is written to varnishncsa.log whereas the varnishtest utility is for testing the Varnish cache. You can find more information about both of them by looking at their man pages.

The screen capture image to the left shows the man page of varnishtop, which works like the top command line utility but for Varnish log files. However, the image on the opposite page is much more interesting because it actually shows varnishtop in action.

The varnishadm tool allows you to control a running Varnish process, whereas the varnishhist tool shows a histogram of Varnish requests, which can be useful for checking the performance of Varnish.

As you already know, Varnish comes with many useful utilities that inform us about its operation and performance. This section will talk about the varnishstat utility, which displays statistics about a running Varnish process using the various metrics provided by Varnish.

varnishstat can produce its output in XML, using the -x switch, and in JSON, using the -j switch; you can see the format of both kinds of output in the image below. As structured output is not especially handy for inspecting real-time data, the usefulness of XML and JSON output is that it allows you to store the generated data in a database in order to query it afterwards. You can also get output in plain text format using the -1 option. If you use none of these three options, varnishstat will work similarly to the top utility and automatically update its output until you terminate it by pressing the Q key.
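As an illustration of what you can do with the JSON output, the following Python sketch computes the cache hit ratio from a varnishstat -j snapshot. It assumes the counters are named MAIN.cache_hit and MAIN.cache_miss and that each counter is an object with a "value" field, either at the top level of the document or under a "counters" key; check the output of your own Varnish version before relying on this. The function names are made up for this example.

```python
import json
import subprocess


def read_varnishstat():
    """Run `varnishstat -j` and return the parsed JSON document."""
    result = subprocess.run(
        ["varnishstat", "-j"], check=True, capture_output=True, text=True
    )
    return json.loads(result.stdout)


def counter_value(stats, name):
    """Return the value of one counter, or 0 if it is missing.

    Assumption: counters sit at the top level or under a "counters" key.
    """
    counters = stats.get("counters", stats)
    entry = counters.get(name)
    if isinstance(entry, dict):
        return entry.get("value", 0)
    return 0


def hit_ratio(stats):
    """Return hits / (hits + misses), or None when there were no lookups."""
    hits = counter_value(stats, "MAIN.cache_hit")
    misses = counter_value(stats, "MAIN.cache_miss")
    total = hits + misses
    return hits / total if total else None
```

Calling hit_ratio(read_varnishstat()) at regular intervals and storing the result is one simple way to track how well the cache is doing over time.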

Secret varnishing

HTTP/2 is the latest version of the HTTP protocol and is focused on performance. The key differences between HTTP/2 and HTTP/1.1 include binary instead of text transmission of data, security using TLS, multiplexing (which allows HTTP/2 to use a single TCP connection for parallel requests), header compression and support for 'push' responses. Despite the aforementioned differences and improvements, most of which make HTTP/2 significantly faster than HTTP/1.1, all previous HTTP methods, status codes and semantics remain the same in HTTP/2. You might ask how it would be possible to debug HTTP/2 connections if they are encrypted. The solution is to use NSS keylogging along with the Wireshark plugin; this currently works with the Chrome and Firefox web browsers. You can find more information about HTTP/2 at http://httpwg.org/specs/rfc7540.html.

TLS stands for Transport Layer Security and its main purpose is to provide security and privacy between two communicating applications. You can learn all the gory details of the TLS Protocol by looking at its RFC, which can be found at https://tools.ietf.org/html/rfc5246.

In order for a program such as Varnish to implement caching, you need two main things: a way to decide whether a resource is cacheable and an efficient way to access and maintain your cache. In other words, there is no point in caching a dynamic page that changes every second and there is no point in having cache data that is too slow to search or needs too much time to maintain.
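The two requirements can be sketched in a few lines of Python. This is only a toy illustration of the idea, not how Varnish itself is implemented: the cacheability rule, the class name, the default time-to-live and the least-recently-used eviction policy are all assumptions chosen for brevity.

```python
from collections import OrderedDict
import time


def is_cacheable(method, status, cache_control=""):
    """Rough cacheability test: safe method, 200 status, no opt-out directive."""
    directives = {part.strip().lower() for part in cache_control.split(",")}
    if method.upper() not in ("GET", "HEAD"):
        return False
    if status != 200:
        return False
    return not ({"no-store", "private", "no-cache"} & directives)


class TinyCache:
    """Bounded in-memory cache with per-entry expiry and LRU eviction."""

    def __init__(self, max_entries=1000, ttl=120.0, clock=time.monotonic):
        self._entries = OrderedDict()  # key -> (expiry time, body)
        self._max_entries = max_entries
        self._ttl = ttl
        self._clock = clock

    def get(self, key):
        item = self._entries.get(key)
        if item is None:
            return None
        expires_at, body = item
        if self._clock() >= expires_at:
            # Stale entry: drop it so the caller fetches a fresh copy.
            del self._entries[key]
            return None
        self._entries.move_to_end(key)  # mark as most recently used
        return body

    def put(self, key, body):
        self._entries[key] = (self._clock() + self._ttl, body)
        self._entries.move_to_end(key)
        while len(self._entries) > self._max_entries:
            self._entries.popitem(last=False)  # evict least recently used
```

Lookups and insertions here are dictionary operations, which keeps access cheap, while the size limit and the expiry time keep maintenance cheap; those are the same trade-offs, on a much larger scale, that a real cache has to manage.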

Varnish the future

Although this tutorial uses Varnish version 4, Varnish 5 is also available and offers some important new features. The single most significant one, currently in an experimental state, is support for HTTP/2, which allows you to test HTTP/2 traffic and see how it works with Varnish. However, keep in mind that HTTP/2 support is not enabled by default in Varnish 5. Another new characteristic of Varnish 5 is that it has changed from a feature release schedule to a time release schedule, which means that you can expect a new version of Varnish to appear every six months. As Varnish 5 was released in September 2016, its next version will be released in March 2017. Varnish 5 also supports the PROXY protocol and works with TLS terminators that terminate TLS connections for you. Should you wish to use the previous two features, you can use Hitch TLS (https://hitch-tls.org).

Varnish 5 also introduces a new director called Shard, which is responsible for load balancing, and has support for Negative Cache, which is a pretty advanced feature. Many more new features exist in Varnish 5 but talking about them is beyond the scope of this tutorial.

Keep in mind that you do not have to use every capability of Varnish, mainly because some of them are very specialised. The general idea is to use the things you really need before going into the more advanced stuff, so that you really understand what you are doing.

The man page of the varnishtop command line utility that displays a continuously updated list of the most frequent Varnish log entries.

Here's a screenshot showing the varnishtop utility in action. You can modify its output by using the appropriate command line options.

Here you can see the format of the data stored in varnish.log and varnishncsa.log as well as a small part of the output of the varnishlog utility.

This is the Varnish website, where you can find the latest Varnish news and documentation about its capabilities.

The varnishstat command line utility, which displays a detailed snapshot of the current performance of Varnish, can also generate output in XML and JSON.
