I spent most of yesterday working with Scott to get a high-availability cluster deployed for a new hosting client. However, right as I was getting ready to walk out the door, a long-time client called needing help with their e-mail server. We do most of our work remotely, so this machine is around 2,000 miles away.
Some quick diagnostics showed that the system was unreachable from both the external network and their internal network, even though the machine was up. I had the caller run “mii-tool” on the interface, and it reported no link-beat. When that same cable was plugged into the other interface, though, mii-tool detected link-beat. So I helped move the network configuration over to the other interface, and everything was happy.
We spent most of the rest of the day finishing the high-availability cluster setup for the new hosting client. The “drbdlinks” program I wrote on Tuesday made things a fair bit easier. I had asked the folks on the Linux-HA IRC channel if there was any interest in including it in the heartbeat package, but the only response I got was that sym-linking was a bad idea. My attempts to get further information didn't go anywhere.
I imagine the potential problems are that the links end up not working, or that they get broken and cause confusion about which file-system you're on. However, I think the alternative, reconfiguring all of your services to use config files and data from non-standard locations, would lead to even more problems.
To combat the confusion, we made the non-shared copies of the files say that they are placeholders for the real files, which live on the multi-chassis RAID. That, combined with being able to maintain the boxes the same way you do all your other machines, is a big win for the sym-links.
The drbdlinks program includes “start” and “stop” options to make or revert the symlinks. I also decided to include an “auto” mode, which detects whether there's a file-system mounted on the destination partition, and if so it runs start, otherwise stop. I was first going to do this by checking “/proc/mounts” for the mount-point.
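That first idea might have looked something like the following. This is only a rough Python sketch, not code from drbdlinks: the function name `listed_in_mounts` and its `mounts_file` parameter are made up for illustration, and it ignores the escaping that “/proc/mounts” applies to unusual characters in paths (a space shows up as `\040`, for example).

```python
import os


def listed_in_mounts(mount_point, mounts_file="/proc/mounts"):
    """Return True if mount_point is listed as a mount point in mounts_file.

    Hypothetical helper for illustration; not taken from drbdlinks.
    """
    target = os.path.normpath(mount_point)
    with open(mounts_file) as fp:
        for line in fp:
            fields = line.split()
            # Each line reads: device mount-point fs-type options dump pass
            if len(fields) >= 2 and os.path.normpath(fields[1]) == target:
                return True
    return False
```

Taking the file name as a parameter is just a convenience so the function can be tried against a sample file instead of the live system.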
Then I realized I could do it much more easily by doing a stat() on the mount-point and on the directory above the mount-point, and checking whether the “st_dev” field is the same for both. “st_dev” is the device a file-system resides on, so if the two values are the same, no file-system is mounted there. Yes, you could also mount the root file-system under the mount-point, but that's not a use-case that's going to happen when using drbdlinks.
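Here is a minimal Python sketch of the stat() comparison and of how an “auto” mode could use it. It is an illustration under my own assumptions rather than the actual drbdlinks code: the names `is_mount_point` and `auto_action` are hypothetical. (Python's standard `os.path.ismount()` performs a similar comparison.)

```python
import os


def is_mount_point(path):
    """Return True if a file-system appears to be mounted at path.

    Hypothetical helper for illustration; not taken from drbdlinks.
    """
    path = os.path.abspath(path)
    parent = os.path.dirname(path)
    # st_dev identifies the device a file-system resides on, so a
    # different value on the parent means something is mounted here.
    return os.stat(path).st_dev != os.stat(parent).st_dev


def auto_action(mount_point):
    """Pick the action for "auto" mode: start if mounted, otherwise stop."""
    return "start" if is_mount_point(mount_point) else "stop"
```

Note that this simple version reports “/” itself as not mounted, because the root directory is its own parent, and it shares the caveat mentioned above about the same file-system being mounted a second time under the mount-point.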