Tuesday November 11, 2008 at 22:25
Subject: Other load-balancing options...
Keywords: Load-balancing, Technical
Posted by: Sean Reifschneider
Tonight at our LUG we had a great presentation about building a
large-scale LTSP network. As part of this, they needed to spread the load
across a number of different machines, but they didn't have the ability to
deploy a traditional stand-alone load-balancer.
I mentioned options of using CLUSTERIP or unifying the load-balancer
with the application machines, so that a dedicated load-balancer isn't
needed. I wanted to give some more information about these options because
they aren't as well known as the more traditional methods. Read on for
more information.
CLUSTERIP is an iptables module which allows the same IP address to be
set up on multiple machines. It uses a multicast MAC address, so all
machines receive the incoming requests. Each machine then applies the same
hashing algorithm to the request, and all but one of the machines ignore
it. You can specify hashing to be based on the source IP, the source IP
and port, or the source IP and port plus the destination port.
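To make the "all but one ignore it" idea concrete, here is a minimal
Python sketch of the decision each machine makes. It is only a model: the
function names are made up for illustration, and zlib.crc32 is a stand-in
for whatever hash the kernel module actually uses.

```python
import zlib


def owning_node(src_ip, total_nodes, src_port=None, dst_port=None):
    """Return the 1-based node number responsible for a connection.

    Passing only src_ip models hashing on the source IP; adding src_port,
    or src_port and dst_port, models the other two hashing modes.
    crc32 is an illustrative stand-in, not the kernel's hash.
    """
    key = src_ip
    if src_port is not None:
        key += f":{src_port}"
    if dst_port is not None:
        key += f":{dst_port}"
    return zlib.crc32(key.encode()) % total_nodes + 1


def accepts(local_node, src_ip, total_nodes, **ports):
    """True if the machine holding local_node should handle the packet."""
    return owning_node(src_ip, total_nodes, **ports) == local_node


if __name__ == "__main__":
    # Every machine sees the packet, but only one of them accepts it.
    for node in (1, 2, 3):
        print(node, accepts(node, "192.0.2.10", total_nodes=3))
```

Because every machine computes the same hash over the same fields, they
all agree on the owner without having to talk to each other.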
Up front you have to specify how many nodes are in the CLUSTERIP
cluster, and a node number for each node that it runs on. So adding or
removing nodes requires either a complete restart of the CLUSTERIP rules,
or setting up enough CLUSTERIP node numbers in advance to handle
expansion. In the latter case some machines will hold multiple node
numbers, and the numbers may be split unevenly among the machines.
Distributing the CLUSTERIP rules is probably best done by linux-ha.
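The over-provisioning approach can be sketched as follows. This is a
hypothetical helper, not part of CLUSTERIP or linux-ha: it simply deals a
fixed pool of node numbers out to the machines that currently exist, which
shows how the split ends up uneven when the counts don't divide cleanly.

```python
def assign_node_numbers(machines, total_nodes):
    """Deal node numbers 1..total_nodes round-robin across machines.

    total_nodes is fixed up front (larger than the current machine count
    to leave room for growth); each machine answers for every node number
    it is handed.
    """
    assignment = {machine: [] for machine in machines}
    for number in range(1, total_nodes + 1):
        machine = machines[(number - 1) % len(machines)]
        assignment[machine].append(number)
    return assignment


if __name__ == "__main__":
    # Eight node numbers over three machines: two machines hold three
    # numbers each and one holds two.
    print(assign_node_numbers(["a", "b", "c"], 8))
```

Adding a fourth machine later means re-dealing the same eight numbers
rather than changing the total, so the hash results stay valid.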
However, you can also run a traditional load-balancer to distribute
the load. There is a recipe at the Ultramonkey site for this topology,
which has the load-director running on the same node as the services.
The recipe describes two-node clusters in which both nodes act as
load-balancers and also handle requests. However, it should be possible to
have more than just the two service nodes. The benefit of this mechanism
is that you can use the normal load-balancing algorithms, and spread the
load unevenly, add and remove nodes, etc...
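As an example of spreading the load unevenly, here is a deliberately naive
weighted rotation in Python. The node names and weights are made up, and
this is not how any particular load-balancer schedules connections; it
only illustrates the idea of giving a bigger machine a bigger share.

```python
import itertools


def weighted_rotation(weights):
    """Cycle through node names in proportion to their integer weights.

    weights maps a node name to a positive integer; a node with weight 3
    appears three times per cycle for every one appearance of a node with
    weight 1.
    """
    expanded = [name for name, weight in weights.items() for _ in range(weight)]
    return itertools.cycle(expanded)


if __name__ == "__main__":
    rotation = weighted_rotation({"node1": 3, "node2": 1})
    # Show the targets chosen for the first eight connections.
    print([next(rotation) for _ in range(8)])
```

Adding or removing a node is then just a matter of rebuilding the rotation
with a different weights table.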
The method that they selected was to set up a DHCP server on each
service node, and give each one 1/Nth of the leases. Once one
server fills up, leases from the next will start being used. This is a
fine solution, and is quite simple (which is good), but it may also limit
some flexibility and performance that you might otherwise see if the load
were more evenly split based on actual usage.
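The 1/Nth split can be worked out mechanically. The sketch below is an
assumption about how one might compute the per-server ranges, not the
presenters' actual setup; the address pool and server names are invented,
and it assumes the pool has at least as many addresses as there are
servers.

```python
import ipaddress


def split_pool(first, last, servers):
    """Split the inclusive range first..last into one block per server.

    Returns a dict of server name -> (first_address, last_address). Any
    remainder is handed out one address at a time to the earliest servers.
    """
    start = int(ipaddress.IPv4Address(first))
    end = int(ipaddress.IPv4Address(last))
    base, extra = divmod(end - start + 1, len(servers))
    ranges = {}
    cursor = start
    for index, server in enumerate(servers):
        size = base + (1 if index < extra else 0)
        ranges[server] = (
            str(ipaddress.IPv4Address(cursor)),
            str(ipaddress.IPv4Address(cursor + size - 1)),
        )
        cursor += size
    return ranges


if __name__ == "__main__":
    pool = split_pool("192.0.2.10", "192.0.2.249", ["ltsp1", "ltsp2", "ltsp3"])
    for server, (low, high) in pool.items():
        print(server, low, high)
```

Each resulting range would go into that server's DHCP configuration, so a
client ends up on whichever server handed out its lease.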
Considering the tight timeline that they were under for the deployment
described in the presentation, the DHCP mechanism is probably the best
solution. However, I did want to mention that there are other alternatives.