High Availability Web Server

The next single point of failure that I tackled was the web server. Prior to this, I had added two load-balancers with automatic fail-over, but these were both using a single backend web server, and so were not really balancing anything at all.

To add redundancy at the web server level, I needed a second web server. The main issue this causes is that the content on both web servers needs to be kept in sync. Fail-over is not really an issue, as the load-balancers already deal with this. HAProxy can monitor the health of the backend web servers, and will stop directing traffic to any that are down. This can be enabled by using the check configuration option.
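
As a rough sketch of how the check option is used, something like the following would do. The server labels, address placeholders, port, and timing values are illustrative assumptions rather than a copy of a real configuration.

    backend www-backend
        # check: probe each server periodically; inter/fall/rise are optional
        # tuning (probe interval, failures before "down", successes before "up")
        server primary <primary_address>:80 check inter 2s fall 3 rise 2
        server backup  <backup_address>:80 check inter 2s fall 3 rise 2

A server that fails its checks is taken out of rotation and returns once it passes them again.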

To keep the primary and backup web servers in sync, I used lsyncd, which monitors a directory and rsyncs over any files that change. This was fairly simple to configure.

    ----
    -- User configuration file for lsyncd.
    --
    sync{default.rsyncssh, source="/var/www/ghost", host="david", targetdir="/var/www/ghost", rsync={_extra={"--omit-dir-times"}}}

This configuration rsyncs any changed files from my blog content directory of the primary web server, call, to the same directory on the backup web server, david. I also modified the default lsyncd systemd unit file, both because I did not want it running as root, and to ensure that the daemon restarts automatically if it fails.

One issue I encountered was that rsync was unable to set the last modified times of the directories that it was writing on the backup server. I created an lsyncd user for lsyncd to run as, and gave it write permissions on the files that it would need to write in order to keep the blog content in sync, but the lsyncd user was not the owner of these files. This causes a problem for rsync, even though it appears to have all the permissions it needs to set the last modified date.

This arises because rsync uses the utime function of the GNU C Library to set the last modified time. When asked to set a specific timestamp, utime will return an error (EPERM) if the caller's effective uid does not match that of the file's owner and the caller is not privileged; write permission alone is not enough. To work around this, I pass the --omit-dir-times flag to rsync, which prevents it from attempting to modify the directory times.

Another problem is that the backup cannot be written to directly by the blogging application, as any changes would get overwritten by lsyncd. To solve this, I used HAProxy access lists.

    acl admin_request path_beg <path_to_admin_pages>
    use_backend www-backend-admin if admin_request
    default_backend www-backend

This means that www-backend-admin, containing only the primary server, is used for all pages with paths beginning with a particular pattern. For other requests, www-backend is used, which contains both primary and backup servers.
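
For illustration, the two backends might be defined along these lines. This is only a sketch: the backend names come from the rules above, while the server labels, address placeholders, and port are assumptions.

    backend www-backend-admin
        # admin pages write content, so only the primary is listed
        server primary <primary_address>:80 check

    backend www-backend
        # read-only traffic can be served by either machine
        server primary <primary_address>:80 check
        server backup  <backup_address>:80 check

With this layout, edits only ever land on the primary, and lsyncd then propagates them to the backup.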

Overall, this lsyncd solution has the advantage of being simple, but only provides high availability to readers of the blog. Ideally I would also provide high availability to the editor. That, however, will require a more complex solution, as the backup will need to be able to become the primary whenever the primary is down.