Forums.ATC.no

Technical => General technical => Topic started by: Floyd-ATC on 24 August 2015, 20:50

Title: High Availability Network Load Balancing with CentOS 7
Posted by: Floyd-ATC, 24 August 2015, 20:50
I have found thousands of articles on the net explaining how to set up a network load balancer using keepalived and haproxy on CentOS 7 in a few easy steps. Unfortunately, and not really surprisingly, most of those articles are incomplete; they are basically pieces of a bigger puzzle. Instead of promising a complete solution to all your problems, I will try to explain how I solved the various problems I ran across, hoping that the solutions might be useful to someone else and perhaps inspire someone to point out flaws in my own setup.

Step 1: Install "keepalived" and "haproxy"
Because both are in the standard repo, this is pretty straightforward:
Code:
yum install -y keepalived haproxy
Step 2: Punch the necessary holes in the firewall
You could open one port at a time, but one article pointed out that it's better to collect everything in a simple XML file so you remember exactly what you opened for haproxy-based services to work. Create a file called "/etc/firewalld/services/haproxy.xml" with the following for starters:
Code:
<?xml version="1.0" encoding="utf-8"?>
<service>
<short>HAProxy</short>
<description>HAProxy load-balancer</description>
<port protocol="tcp" port="80"/>
</service>
Since the file was created by hand, restore its default SELinux context in case it was created or copied with the wrong label, so that firewalld is allowed to read it. Note that this only labels the file; it does not by itself tell SELinux that haproxy may listen on a port. Port 80 is already permitted by the default policy, but other ports may need additional SELinux configuration:
Code:
restorecon /etc/firewalld/services/haproxy.xml
Then open the firewall:
Code:
firewall-cmd --add-service haproxy --permanent
While you're at it, don't forget that keepalived needs to send and receive VRRP messages. If it can't, each keepalived daemon will incorrectly assume that its partner is dead and they will all try to become MASTER:
Code:
firewall-cmd --direct --permanent --add-rule ipv4 filter INPUT 0 --destination 224.0.0.18 --protocol vrrp -j ACCEPT
firewall-cmd --direct --permanent --add-rule ipv4 filter OUTPUT 0 --destination 224.0.0.18 --protocol vrrp -j ACCEPT
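
To confirm that both VRRP rules were actually stored, the permanent direct rules can be listed and counted. This is only a minimal sketch: the helper name "count_vrrp_rules" is made up for illustration, and it assumes firewall-cmd echoes each rule back roughly the way it was entered.

```shell
#!/bin/sh
# Hypothetical helper: count the VRRP accept rules in a rule listing
# read from stdin (one rule per line).
count_vrrp_rules() {
    grep -c 'vrrp.*ACCEPT'
}

# Usage on a load balancer node, as root:
#   firewall-cmd --permanent --direct --get-all-rules | count_vrrp_rules
# With both rules above in place this should print 2.
#
# To watch the advertisements themselves (interface name taken from the
# keepalived example in Step 3):
#   tcpdump -n -i ens160 vrrp
```

If tcpdump shows advertisements from only one node while both are running, the firewall on one side is the first thing to check.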

Step 3: Configure keepalived
This daemon provides High Availability. You can use as many load balancers as you like, and each needs its own reachable IP address for management, but only one of them will be MASTER at any time and respond on the Virtual IP address where haproxy listens. Here is an example "/etc/keepalived/keepalived.conf" file:
Code:
! Configuration File for keepalived

global_defs {
   notification_email {
     you@your_domain
   }
   notification_email_from keepalived@your_host.your_domain
   smtp_server ###.###.###.###
   smtp_connect_timeout 30
   router_id your_host.your_domain
}

vrrp_instance VI_1 {
    state MASTER
    interface ens160
    virtual_router_id 51
    priority 100
    advert_int 1
    authentication {
        auth_type PASS
        auth_pass 1111
    }
    virtual_ipaddress {
        ###.###.###.### label ens160:10
    }
}

Notice the "label ens160:10". By default, the Virtual IP address will not be visible when using "ifconfig" or similar legacy tools, which can be very confusing ("ip addr show" lists it either way). By adding a "label", keepalived tags the address with an interface alias for no other purpose than to let "ifconfig" show the Virtual IP address. Just make sure the name does not conflict with other interfaces.
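
A quick way to see whether a node currently holds the Virtual IP is to look for it in the output of "ip addr". The sketch below is illustrative only: "has_address" is a made-up helper name and 192.0.2.10 is a documentation placeholder standing in for your own Virtual IP.

```shell
#!/bin/sh
# Hypothetical helper: succeed if the given IPv4 address appears in
# `ip -o -4 addr show` output supplied on stdin.
# $1 = address to look for, without prefix length
has_address() {
    awk -v vip="$1" '
        { split($4, a, "/"); if (a[1] == vip) found = 1 }
        END { exit (found ? 0 : 1) }'
}

# Usage (192.0.2.10 is a placeholder for the Virtual IP):
#   ip -o -4 addr show dev ens160 | has_address 192.0.2.10 && echo "this node holds the VIP"
```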

If you're concerned that multiple nodes might come up at the same time and all try to become MASTER, or you want a certain node to take precedence, adjust the "priority" (the higher value wins the election) and consider setting the initial state to BACKUP. As long as VRRP communication works, though, the nodes should be able to manage this perfectly well on their own.
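
As a rough illustration of the above, the example file can be turned into a BACKUP variant for a second node by changing only "state" and "priority". The helper name "to_backup_conf", the priority 90 and the output path are assumptions for the sake of the example, not part of the original setup.

```shell
#!/bin/sh
# Hypothetical helper: rewrite a keepalived.conf read from stdin so that
# the node starts as BACKUP with the given priority.
# $1 = priority for this node (lower than the preferred node's 100)
to_backup_conf() {
    sed -e 's/^\([[:space:]]*state[[:space:]][[:space:]]*\)MASTER/\1BACKUP/' \
        -e "s/^\([[:space:]]*priority[[:space:]][[:space:]]*\)[0-9][0-9]*/\1$1/"
}

# Usage: write the variant to a scratch file and review it before copying
# it to the second node.
#   to_backup_conf 90 < /etc/keepalived/keepalived.conf > /tmp/keepalived.conf.backup
```

Everything else, including "virtual_router_id" and the authentication block, has to stay identical on all nodes of the same VRRP instance.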

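One more thing: we need to adjust the startup parameters for keepalived. The options used below enable detailed logging (-D), run only the VRRP subsystem (-P) and log to syslog facility local1 (-S 1), which the rsyslog configuration in Step 6 relies on. Do this by editing "/etc/sysconfig/keepalived":
Code: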
# Options for keepalived. See `keepalived --help' output and keepalived(8) and
# keepalived.conf(5) man pages for a list of all options. Here are the most
# common ones :
#
# --vrrp               -P    Only run with VRRP subsystem.
# --check              -C    Only run with Health-checker subsystem.
# --dont-release-vrrp  -V    Dont remove VRRP VIPs & VROUTEs on daemon stop.
# --dont-release-ipvs  -I    Dont remove IPVS topology on daemon stop.
# --dump-conf          -d    Dump the configuration data.
# --log-detail         -D    Detailed log messages.
# --log-facility       -S    0-7 Set local syslog facility (default=LOG_DAEMON)
#
KEEPALIVED_OPTIONS="-D -P -S 1"

Step 4: Configure haproxy
This daemon is used to perform load balancing, spreading traffic across all of the backend servers.
Most of the articles I found presented a very short config for haproxy, which is pretty cool if you want to make it seem like haproxy takes very little effort to configure. Here's a more realistic "/etc/haproxy/haproxy.cfg" config file to start with:

Code:
#---------------------------------------------------------------------
# Global settings
#---------------------------------------------------------------------
global
    # Forward log messages to syslog (localhost is okay)
    log         ###.###.###.###:514 local2

    chroot      /var/lib/haproxy
    pidfile     /var/run/haproxy.pid
    maxconn     4000
    user        haproxy
    group       haproxy
    daemon

    # turn on stats unix socket
    stats socket /var/lib/haproxy/stats

#---------------------------------------------------------------------
# common defaults that all the 'listen' and 'backend' sections will
# use if not designated in their block
#---------------------------------------------------------------------
defaults
    mode                    http
    log                     global
    option                  httplog
    option                  dontlognull
    option http-server-close
    option forwardfor       except 127.0.0.0/8
    option                  redispatch
    retries                 3
    timeout http-request    10s
    timeout queue           1m
    timeout connect         10s
    timeout client          1m
    timeout server          1m
    timeout http-keep-alive 10s
    timeout check           10s
    maxconn                 3000

#---------------------------------------------------------------------
# main frontend which proxys to the backends
#---------------------------------------------------------------------
frontend http *:80
    #acl url_static       path_beg       -i /static /images /javascript /stylesheets
    #acl url_static       path_end       -i .jpg .gif .png .css .js
    #use_backend static          if url_static

    default_backend             web

#---------------------------------------------------------------------
# static backend for serving up images, stylesheets and such
#---------------------------------------------------------------------
#backend static
#    balance     roundrobin
#    server      static 127.0.0.1:4331 check

#---------------------------------------------------------------------
# least-connection balancing between the backend servers
#---------------------------------------------------------------------
backend web
    #balance     roundrobin
    balance leastconn
    server  web1 ###.###.###.###:80 check
    server  web2 ###.###.###.###:80 check
    server  web3 ###.###.###.###:80 check
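
Before starting haproxy it can help to check the file for syntax errors, and once it is running, to ask the stats socket which backend servers it considers up. The following is a sketch under assumptions: "server_status" is a made-up helper, and the socket query assumes the socat package is installed.

```shell
#!/bin/sh
# Hypothetical helper: print "proxy/server status" for each row of
# HAProxy "show stat" CSV read from stdin. The status column is looked
# up by name in the header line rather than by position.
server_status() {
    awk -F, '
        NR == 1 { for (i = 1; i <= NF; i++) if ($i == "status") col = i; next }
        col && NF >= col { print $1 "/" $2, $col }'
}

# Syntax check only; this does not start or reload anything:
#   haproxy -c -f /etc/haproxy/haproxy.cfg
#
# Query the stats socket configured in the global section:
#   echo "show stat" | socat stdio unix-connect:/var/lib/haproxy/stats | server_status
```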

Step 5: Make sure haproxy can bind to the Virtual IP address on BACKUP nodes
By default, the Linux kernel does not allow a process to listen on an IP address that is not configured on the host. This is a problem for us once haproxy is bound to the Virtual IP address, because that address only exists on the MASTER node. (The example frontend above binds to "*:80", which works either way.) Fortunately, there is a sysctl setting for this exact situation, and CentOS 7 makes it very easy to change those. The only side effect is that any service configured with an incorrect IP address will no longer generate an error, possibly making the mistake difficult to spot. Simply create a file "/etc/sysctl.d/haproxy.conf":
Code:
net.ipv4.ip_nonlocal_bind=1
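
To check that the setting is active after loading it, the live value can be read back from /proc. A minimal sketch; the helper name "nonlocal_bind_enabled" is an assumption for illustration.

```shell
#!/bin/sh
# Hypothetical helper: succeed if non-local bind is enabled.
# $1 = file to read; defaults to the kernel's live value under /proc.
nonlocal_bind_enabled() {
    [ "$(cat "${1:-/proc/sys/net/ipv4/ip_nonlocal_bind}" 2>/dev/null)" = "1" ]
}

# Usage, as root:
#   sysctl --system        # load every file under /etc/sysctl.d
#   nonlocal_bind_enabled && echo "non-local bind is enabled"
```
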
Step 6: Configure rsyslog
We will create two drop-in files, one for each daemon. Here is an example "/etc/rsyslog.d/keepalived.conf":
Code:
$ModLoad imudp
$UDPServerRun 514
$template Keepalive,"%timegenerated% your_host %syslogtag% %msg%\n"
local1.* -/var/log/keepalived.log;Keepalive
# discard, so these messages do not also end up in /var/log/messages
local1.* ~

And here is an example "/etc/rsyslog.d/haproxy.conf":
Code:
$ModLoad imudp
$UDPServerRun 514
$template Haproxy,"%timegenerated% your_host %syslogtag% %msg%\n"
local2.=info -/var/log/haproxy.log;Haproxy
local2.notice -/var/log/haproxy-status.log;Haproxy
# discard, so these messages do not also end up in /var/log/messages
local2.* ~

Notice that I have included the actual host name in the templates. This is useful if you decide to forward messages to a remote syslog server, where the default hostname "localhost" would be completely useless.
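
Once rsyslog has been restarted with these files in place, the two facilities can be tested without waiting for the daemons to log something. This is a sketch only: "send_test_messages" and the LOGGER override are made up for illustration.

```shell
#!/bin/sh
# Hypothetical helper: send one test message to each facility used above.
# LOGGER can be overridden for a dry run; it defaults to logger(1).
send_test_messages() {
    for facility in local1 local2; do
        "${LOGGER:-logger}" -p "$facility.info" "test message for $facility"
    done
}

# Usage:
#   send_test_messages
#   tail -n 1 /var/log/keepalived.log /var/log/haproxy.log
```

If the messages show up in /var/log/messages instead, the drop-in files were probably not loaded.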

Step 7: Configure logrotate
We've all forgotten this at some point, haven't we?
Here's "/etc/logrotate.d/haproxy":
Code:
/var/log/haproxy.log /var/log/haproxy-status.log {
    missingok
    notifempty
    sharedscripts
    rotate 7
    daily
    compress
    postrotate
        /bin/kill -HUP $(cat /var/run/syslogd.pid 2>/dev/null) 2>/dev/null || true
    endscript
}

And here's "/etc/logrotate.d/keepalived":
Code:
/var/log/keepalived.log {
    missingok
    notifempty
    sharedscripts
    rotate 7
    daily
    compress
    postrotate
        /bin/kill -HUP $(cat /var/run/syslogd.pid 2>/dev/null) 2>/dev/null || true
    endscript
}
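
Both files can be checked without rotating anything by running logrotate in debug mode. A minimal sketch; "logrotate_dry_run" and the LOGROTATE override are assumptions for illustration.

```shell
#!/bin/sh
# Hypothetical helper: run logrotate in debug mode (-d) on each given
# config file. Debug mode reports what would be done without touching
# the logs or the logrotate state file.
logrotate_dry_run() {
    for conf in "$@"; do
        "${LOGROTATE:-logrotate}" -d "$conf" || return 1
    done
}

# Usage, as root:
#   logrotate_dry_run /etc/logrotate.d/haproxy /etc/logrotate.d/keepalived
```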

Step 8: Restart all relevant services
You should now be able to start everything, either one service at a time or simply by rebooting. Enable keepalived and haproxy first so that they also start at boot:
Code:
sysctl --system
firewall-cmd --reload
systemctl restart rsyslog
systemctl enable keepalived haproxy
systemctl start keepalived
systemctl start haproxy
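
A quick check that all three services came up could look like this. It is a sketch, not a monitoring solution; the helper name "check_services" is made up for illustration.

```shell
#!/bin/sh
# Hypothetical helper: print the state of each service involved and
# return non-zero if any of them is not active.
check_services() {
    rc=0
    for unit in rsyslog keepalived haproxy; do
        state=$(systemctl is-active "$unit" 2>/dev/null) || true
        echo "$unit: ${state:-unknown}"
        [ "$state" = "active" ] || rc=1
    done
    return "$rc"
}

# Usage:
#   check_services || journalctl -xe
```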

Really? That's it?
Now, one last problem caused some head-scratching, and it wasn't actually a problem with the Linux setup. I'm using a Juniper SRX router/firewall, which by default ignores Gratuitous ARP: while GARP has legitimate uses (like, say, a VRRP host announcing that it has taken over a Virtual IP address), it can also be used for malicious things like man-in-the-middle attacks. Assuming your servers are on a network that you control, this shouldn't be an issue. (Or rather, if your servers are on a network where this might be an issue, you might want to forget about load balancing for now and solve that problem first.) But I digress. The end result of having a router ignore GARP is that the router will merrily keep forwarding packets to your (presumably dead) old load balancer instead of the new MASTER. The solution for Juniper SRX is quite simple: just set "gratuitous-arp-reply" on the interface(s) facing your server network(s). Check your router docs if you see the same symptom.
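
The same symptom can be observed from any Linux host in the server subnet by watching which MAC address its neighbour cache holds for the Virtual IP before and after a failover. This is a sketch under assumptions: "neigh_mac" is a made-up helper and 192.0.2.10 is a placeholder for the Virtual IP. The router's own ARP table still has to be checked with the router's tools.

```shell
#!/bin/sh
# Hypothetical helper: print the MAC address from `ip neigh show <address>`
# output read from stdin.
neigh_mac() {
    awk '{ for (i = 1; i < NF; i++) if ($i == "lladdr") { print $(i + 1); exit } }'
}

# Usage on a host in the same subnet (192.0.2.10 is a placeholder):
#   ping -c 1 192.0.2.10 >/dev/null; ip neigh show 192.0.2.10 | neigh_mac
#   ...stop keepalived on the MASTER and wait a few seconds...
#   ping -c 1 192.0.2.10 >/dev/null; ip neigh show 192.0.2.10 | neigh_mac
# The second MAC should belong to the new MASTER.
```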

Anything else?
Is anyone able to write an article that doesn't miss a single thing? I don't think I have ever read one, and I know for a fact I have never written one. Check your log files, test the service and start troubleshooting. Then test again. Reboot. Test more. Feel free to comment below :-)