The curse of the mad Puppet

I have been working with various tools designed to let me manage the mousetech.com domain assets in a more centralized way. One of them was Puppet, which I tried using to provision machines. Puppet is a fairly nice tool, but there are some unexpected pitfalls.

There are several ways to get Puppet onto a CentOS 5 server. If you're a glutton for punishment, you can always pull down the source and build from scratch, but I don't recommend that when just getting started. You can also pull it via YUM from the EPEL repository. Or, you can import the Puppet Labs repo and pull from there.
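For reference, the EPEL route is just a YUM install; something like this (package names are from memory and may vary with EPEL vintage):

yum install puppet           # the agent
yum install puppet-server   # the puppetmaster, if this box will serve manifests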

I already have EPEL in my stock set of YUM repositories, so that's what I went for first. In the beginning all was fun and games. Then I got more ambitious and started defining modules. This didn't work. Worse, the sample documentation used commands that didn't work. It was getting very frustrating.
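For context, a Puppet module is just a directory of manifests sitting in the modulepath; a minimal sketch, with an invented "ntpclient" module purely for illustration:

# /etc/puppet/modules/ntpclient/manifests/init.pp
class ntpclient {
  package { 'ntp':
    ensure => installed,
  }
}

Nothing exotic, which is exactly why it was so puzzling when it didn't work.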

It became obvious fairly early that some significant changes had been made to Puppet and that what I had wasn't the Latest and Greatest. That would have been OK, except that attempts to read the online documentation for the older stuff kept leading me into the docs for the newer stuff (a not uncommon problem, best handled I think by archiving the old docs as self-contained PDFs). On top of that, the version of Puppet that I was running was sufficiently antique that much of the documentation had fallen off the website (see previous).

To add to the confusion, I wasn't really sure WHICH version of Puppet I was running, since their enterprise product doesn't keep quite the same set of version numbers as their community version, plus I suspected that the version (2.6.18) in the EPEL RPM wasn't indicative, either. I finally came to the conclusion that 2.6.18 (the version that CentOS pulled) actually correlates to the community 2.5 version, which is something like 2.3 in Enterprise versioning.
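If you're in the same boat, a couple of quick sanity checks help (the first form works on 2.6 and later; older agents used the puppetd command instead):

puppet --version     # what the community code calls itself
rpm -q puppet        # what the installed RPM claims to be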

At this point, I went to the source, Puppet Labs, and found out about their repo. Unfortunately, a network-based RPM install failed for obscure reasons (I'm not sure whether I have lingering LAN issues or it's them). Fortunately, I was able to wget the RPM and install it locally. After that I was able to install a version 3 Puppet; now the documentation matches the commands, and module processing works the way they said it did.
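For anyone following along, the dance was roughly this (the release-RPM name and URL are from memory; check the Puppet Labs site for the current one):

wget http://yum.puppetlabs.com/puppetlabs-release-el-5.noarch.rpm
rpm -ivh puppetlabs-release-el-5.noarch.rpm
yum install puppet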

One last fly in the ointment, though. It seems that nodes and classes share the same namespace. I was using the same name for my guinea-pig machine's node and one of the classes it was trying out, and while both the node and the class parsed, the actual execution was only done against the node: the class was silently ignored. I fixed this by changing the node name to its fully-qualified domain name.
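To sketch the collision (all names here are invented for illustration):

# A class and a node sharing the name "wiki": the node wins,
# and the class of the same name is silently skipped.
class wiki {
  package { 'mediawiki': ensure => installed }
}
node wiki {
  include wiki
}

# The fix: key the node by its FQDN so the two names no longer clash.
node 'wiki.example.com' {
  include wiki
}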


[SOLVED] mail loops back to me (MX problem?) for virtual machine

Sometimes they just gang up on you.

I was migrating my sendmail server from a NAT address to a bridge address when it all started.

Xen has this really nasty habit of zapping your hardware MAC address if you don't get the NAT routing configured just right. There's obviously some way to get it to revert, because occasionally, for no obvious reason, the real MAC address will come back, but don't try searching the web for an answer: all you'll get is fruitless inquiries and flame responses (you shouldn't be changing your MAC address, idiot!). Please. There are very good reasons why it's useful to be able to set a custom MAC address. One place I worked coded their hardware asset IDs into the MAC to assist their DHCP server, for instance.
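(For what it's worth, the usual advice is to pin the MAC in the domU's vif line; a sketch, where 00:16:3e is Xen's reserved OUI and the rest of the address is a placeholder:

vif = [ 'mac=00:16:3e:00:10:08, bridge=xenbr0' ]

That pins the virtual NIC, though; it's the host's physical MAC getting zapped that has no well-documented cure.)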

On the mousetech domain, I’d be happier if it didn’t happen. As it is, the MAC addresses of the primary and secondary NICs got swapped and I didn’t find out until I’d gotten most of the way through fixing things. So the former eth0 became eth1 and vice versa.
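On CentOS 5, the way to nail the eth0/eth1 names down is the HWADDR line in each interface config; a sketch with a placeholder address:

# /etc/sysconfig/network-scripts/ifcfg-eth0
DEVICE=eth0
HWADDR=00:11:22:33:44:55   # must match the MAC you want called eth0
ONBOOT=yes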

Shortly thereafter, outbound mail started bouncing with the infamous “mail loops back to me” message. Since I'd just done a major relocation on the mail VM, I wasted a LOT of time messing around with sendmail options to no avail. Normally this message can be cured by putting a valid MX record in DNS and/or adding all the mailserver alias names to the sendmail local-host-names file (the Cw class). Not this time.
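For reference, those two usual fixes look like this (host names here are illustrative):

; DNS zone file: a valid MX record for the domain
example.com.    IN  MX  10  mail.example.com.

# /etc/mail/local-host-names: every name this server accepts mail for
example.com
mail.example.com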

I was fairly sure that the problem had something to do with the fact that the physical host had been set up to forward all port 25 (smtp) requests to the mail VM and that somehow the wrong IP address was getting mixed in, but I couldn't see the actual routing, since it was all internal and specific to port 25 to boot.
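In hindsight, the quickest way to spot that kind of misrouting is to dump the NAT table with hit counters and watch which rules actually fire:

iptables -t nat -L PREROUTING -n -v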

Turns out I’d been sloppy when I fixed up the iptables forwarding. The correct version (after the NIC mixup) looks like this:

-A PREROUTING -i eth0 -p tcp --dport 25 -j DNAT --to-destination 10.0.1.8:25

Where I went wrong was in being lazy and omitting the NIC ID (eth0) when I repaired the damage that Xen did. As a result, BOTH NICs were being re-routed: the actual internet-facing NIC (which should be routed) and the internal DMZ bridge-facing NIC (which should not). So traffic on port 25 for eth1 was being routed back on itself and sendmail complained.
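Side by side, assuming the rules land in the nat table, the -i qualifier is the whole difference:

# Too broad: matches port 25 arriving on ANY interface, DMZ side included
iptables -t nat -A PREROUTING -p tcp --dport 25 -j DNAT --to-destination 10.0.1.8:25

# Correct: only the internet-facing NIC gets rewritten
iptables -t nat -A PREROUTING -i eth0 -p tcp --dport 25 -j DNAT --to-destination 10.0.1.8:25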