Train Wreck: How nemo-desktop trashed both my local machine and the LAN

For some time now, I’ve been having problems where the power-save (suspend) feature of my desktop system has failed to put the machine to sleep. In some cases, in fact, the entire machine became black-screen unresponsive.

Examination of the system logs indicated that the suspend function was failing to freeze a number of tasks, thereby aborting the suspend. The indications pointed to the glusterfs FUSE client, and some tweaking of the gluster client and servers to upgrade protocol versions did appear to help – but only temporarily.

The other thing that I didn’t like was that the nemo-desktop task was eating one of my cores alive. I actually removed the entire cinnamon system and re-installed it, but that didn’t help. I considered moving back to gnome, but I need those monitoring widgets that gnome 3 in its arrogance dropped, and I keep a lot of icons on the desktop for quick access to hot project resources.

As it happened, I botched a cron definition on a long-running backup job, the server started launching multiple instances of it, and the gluster system took over the LAN. I fixed that, but noticed that gluster was still doing a ton of traffic to my desktop system.

And, incidentally, nemo-desktop response was painfully slow even just to pop up menus. But not regular file-explorer nemo. Only the desktop!

Digging into the toolkit (and Google), I found to my horror that for some reason, nemo-desktop was opening, reading, and closing files over and over and over again. And among the files it was chowing down on were a handful of shortcuts (softlinks) to resources out on the glusterfs filesystem.

I deleted all of those links and suddenly all was calm. My network activity dropped to a whisper, and, strangely, nemo-desktop seemed to stop iterating through desktop files. So I’ve lost some convenience, but now the system performs much better (although nemo-desktop still reacts somewhat sluggishly). And power-save features now work reliably.

Creating a Functional Custom CentOS Install DVD

There are several existing How-To’s out on the Internet on the subject of creating a custom CentOS installation DVD/USB storage medium. But unfortunately, actually trying to employ them can be frustrating. So here goes with Yet Another How-To that I hope will fill some of the holes.

Why Custom Media?

Why even bother with a custom install? Why indeed. The mousetech.com server farm is a fairly typical in-house setup. It has extensive provisioning capabilities, daily backups, filesystem mirroring and failovers for High Availability. If a server dies, it’s relatively easy to reconstruct it.

But what if a meteor hits the server complex or war breaks out and I have to flee to Argentina? How do I minimize the time and effort required to reconstruct the essential frameworks?

One way is to define a master bootstrap server that can be used to rebuild the main provisioning systems. The master bootstrap doesn’t run in normal operations. It’s independent of them. The normal servers distribute their functions among many machines, VM’s and containers, but the master bootstrap compacts their core functions down onto one temporary machine.

I could do the master bootstrap functions via a stock CentOS install DVD set and a kickstart file, but by creating a custom install with the essential packages and kickstart (and customization scripts/data), I can make this a completely unattended operation. And when things are in total disaster mode, the less I need to remember to do, the better.

One note: an unfortunate consequence of the current CentOS and related OS distros is that the old reliable convention of expecting an eth0 device to network through is pretty much shot. I don’t assume any particular physical machine to be the target of this install, and therefore cannot predict what names the installation will assign to the network ports. So rather than take false comfort, I leave actual NIC setup to manually configuring the /etc/sysconfig/network-scripts files after the installation has taken place. Similarly, since I install with no known network or gateway, everything is self-contained – no external servers.
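
When that time comes, an ifcfg file along these lines is all the master bootstrap needs (a sketch only – the device name and addresses are placeholders for whatever the installer actually assigned):

# /etc/sysconfig/network-scripts/ifcfg-enp2s0 – placeholder name and addresses
TYPE=Ethernet
NAME=enp2s0
DEVICE=enp2s0
BOOTPROTO=none
IPADDR=10.0.1.2
PREFIX=24
GATEWAY=10.0.1.1
ONBOOT=yes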

And with that, I begin.

Step 1 – Build a workspace

Making a DVD requires a lot of disk space. So find a place on your build system with lots of room and create a workspace directory. We’ll mostly work out of there. For convenience, I’m going to call this directory “buildiso” and so it’s going to have a pathname of something like /home/timh/buildiso.

cd to this directory. All relative paths given in this howto are relative to this directory.

Step 2 – Start with an existing image

Creating an installation CD from scratch is a monumental task and not worth the effort. So rather than do that, let’s do what everyone else does and modify an existing image. Because this is panic recovery and I want it all on a single medium, I’m going to use the CentOS minimal image and build on it.

So to begin, we make a file-by-file copy of the DVD image. If you have a physical DVD mounted you should be able to copy that. If you have an iso image, mount it like so:

cd /home/timh/buildiso
mkdir mnt     # this is where we'll mount the source ISO file (if we use one)

mkdir bootisoks # this is where we build our new ISO image
mount -t iso9660 -o loop /path/to/centos-disc.iso mnt

Copy the source files from your mounted DVD or loop mount into the bootisoks directory. You can use the Linux cp command, rsync, or whatever you like, as long as it copies all files and directories, including the hidden ones.
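
For example (a sketch, assuming the ISO is loop-mounted at mnt as above):

# rsync picks up the hidden files (.discinfo, .treeinfo) along with everything else
rsync -a mnt/ bootisoks/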

Once you’ve copied the files, you can unmount the ISO (or DVD). You don’t need it anymore unless you have to go back to the beginning.

Now that we have our model files, change them to be writeable so we can play with them:

chmod -R u+w bootisoks

Create a kickstart file and copy it to the iso image isolinux directory:

cp my_ks.cfg bootisoks/isolinux/ks.cfg

Because of the way we’ll assemble the image in Step 4 (the isolinux directory gets merged into the image root), this file will end up in the root of the actual DVD we’re creating.
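
If you don’t already have a kickstart handy, here’s a minimal unattended sketch – NOT my actual file; the root password and partitioning in particular are placeholders:

# ks.cfg – minimal unattended install, all answers supplied up front
install
cdrom
text
lang en_US.UTF-8
keyboard us
timezone UTC
rootpw --plaintext changeme
zerombr
clearpart --all --initlabel
bootloader --location=mbr
autopart
reboot

%packages
@core
%end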

Add any additional RPMs we want to the workspace iso package directory. These can be additional stock RPMs, third-party RPMs or your own custom-built RPMs. They all go directly into the bootisoks/Packages directory.

The CD install process for CentOS 7 uses yum, and the yum repository it uses is stored on the disc itself. Part of the repository infrastructure is the repodata directory, and as a precaution, you should make a backup copy of it.
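
For example (a sketch – park the backup outside the tree you’re about to rebuild):

cp -a bootisoks/repodata repodata-backup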

Most of the files in repodata have long twisty names designed to help mirrors keep in sync when distributed. We actually only care about one of them, so we’ll steal that one for later use:

cp bootisoks/repodata/*-comps.xml comps.xml

Gotcha #1

This file defines the installation groups, including the most critical one of all, which is base. So you’re going to need to inject that into the process of building the updated repodata. If you do not, the Linux installer will whine and fail.

Here’s how to properly reconstruct the repodata:

cd bootisoks
rm -rf repodata
createrepo -g ../comps.xml -dp .
cd ..    # return to our workspace directory

Gotcha #2

It is critical that the newly-created repodata be pointing to the correct location for the RPMs that will be installed, which is to say the Packages directory. To verify that this happened, you can use this command:

less bootisoks/repodata/*-primary.xml.gz

This is a compressed file, but “less” is helpful and will display it uncompressed.

What you need to see in the <package type="rpm"> elements are <location> sub-elements that look like this:

 <location href="Packages/NetworkManager-glib-1.12.0-6.el7.x86_64.rpm"/>

If you don’t see “Packages/” in the location, then yum won’t look in the DVD’s Packages directory, and it won’t find the RPMs. It will whine profusely and the install will fail.
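
A quick way to eyeball the locations without paging through the whole file (a sketch using zcat and grep):

zcat bootisoks/repodata/*-primary.xml.gz | grep -o '<location href="[^"]*"' | head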

This is why, contrary to some examples, you should run the createrepo program from the bootisoks directory and not the Packages directory. It will presumably scan (and add) any other RPMs it finds in the bootisoks tree, but since there shouldn’t be any, that’s OK. If there’s an option to get the proper location without scanning everything, the createrepo documentation is too vague about it and my experiments haven’t been productive. Although here’s something I don’t think I tried (Note: I tried. It failed.):

createrepo -g ../comps.xml -dp Packages

Check the primary.xml.gz file and if the location is correct, use that.

The other “gotcha” on package installation can come if you omit a prerequisite package needed by one of the packages you are installing. The Linux installer’s packaging log will name any missing packages; add them to the Packages directory, delete and rebuild the repodata, and try again.

Step 3 – Activate the Kickstart file

To use your custom kickstart file, you need to define it in the bootloader directives in the isolinux directory. These live in a boot menu file named isolinux/isolinux.cfg (isolinux/syslinux syntax, much like a grub menu), and you can use sed to update the different boot options in a single swoop, like this:

sed -i 's/append initrd=initrd.img/append initrd=initrd.img ks=cdrom:\/ks.cfg/' bootisoks/isolinux/isolinux.cfg

In other words, add “ks=cdrom:/ks.cfg” to the “append” statements in that file.
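
After the edit, each boot entry should look something like this (the inst.stage2 label shown here is the stock CentOS 7 minimal one – yours must match your image):

append initrd=initrd.img ks=cdrom:/ks.cfg inst.stage2=hd:LABEL=CentOS\x207\x20x86_64 quiet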

Special Gotcha: I ran into serious problems when I created my first image because I gave my DVD a custom volume label (using the -V option on the mkisofs command). This proved fatal, because the “append” statements refer to the install media by its label, and the installer could not find a volume with the label it expected. Which resulted in the following cryptically useless installation output:

Starting dracut initqueue hook….

At which point the whole thing would hang forever. Changing the volume label in isolinux.cfg to match the actual volume ID fixed that.
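
You can check what the volume ID of an image actually is before booting it (a sketch; isoinfo comes with the genisoimage package):

isoinfo -d -i boot.iso | grep -i 'volume id'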

Step 4 – Build the image

At this point, we’ve installed and activated our custom kickstart and set up the packages and repodata. If you have any custom scripts or data files to add to the image, do it now. And then build the ISO file, like so:

cd bootisoks
mkisofs -o ../boot.iso -b isolinux.bin -c boot.cat -no-emul-boot -boot-load-size 4 -boot-info-table -V "CentOS 7 x86_64" -R -J -v -T isolinux/. .
cd ..

You can make the ISO bootable from a thumb drive by doing this:

isohybrid boot.iso

And finally, add the checksum so that media testing will work properly.

implantisomd5 boot.iso

At this point, you should be able to burn the iso to DVD or “dd” it to a USB media device. Happy booting!
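
For USB media, the “dd” step looks something like this (a sketch – replace /dev/sdX with your actual thumb drive device; this wipes whatever is on it):

dd if=boot.iso of=/dev/sdX bs=4M
sync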

If you have problems:

This posting has attempted to correct and amplify what I have learned elsewhere. But of course, it’s likely to have introduced a few errors of its own. Because I don’t want to have to deal with spammers and abusers, this website doesn’t allow comments, but if you have questions or comments, I can be contacted through the Linux forum at http://coderanch.com

References:

https://serverfault.com/questions/517908/how-to-create-a-custom-iso-image-in-centos

https://www.frankreimer.de/?p=522

cloud-init “gotcha”

I was putting together a project using The Foreman to spin up and manage Amazon EC2 instances and ran into a problem. I could take an AMI and launch it, but I couldn’t ssh into it.

A major reason was that cloud-init was silently failing and, as a result, my ssh key wasn’t being installed.

The AMIs in question were built on top of Ubuntu 14.04LTS. The Foreman creates its own private access key to launch and control EC2 instances, and it’s not accessible for general use. You have to supply your own key if you want to log in via ssh.

The recommended way to do that is to inject it via cloud-init. However, cloud-init wasn’t working right.

After a lot of experimentation, I discovered that the issue was in the attempt to also use cloud-init to set the simple hostname and fqdn of the newly-created host. It turns out that including the “hostname:” line the way all the samples out on the Internet do caused the ENTIRE cloud-init file to be ignored. So in order to get my ssh key (AND hostname) injected, I just needed the “fqdn:” line.
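
For reference, here’s the shape of the user-data that worked (a sketch – the fqdn and key are placeholders):

#cloud-config
# note: no "hostname:" line – including one got the whole file ignored
fqdn: myhost.example.com
ssh_authorized_keys:
  - ssh-rsa AAAAB3NzaC1yc2EAAAADAQAB...placeholder user@example.com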

Baby Steps with OpenStack

The OpenStack cloud platform is hot these days. Anyone can set up and run their own private cloud without too much difficulty.

Relatively speaking. You do need a huge chunk of RAM and a respectable amount of disk space, even for a minimal cloud, plus a 64-bit CPU with hardware virtualization support. But considering what you get, it’s not a bad payoff.

An OpenStack cloud has 3 types of nodes: control, compute, and storage. You need at least one of each, but a single OS instance can host any combination of them, so the simplest cloud would be an all-in-one server.

First Step

There are quite a number of components that make up these nodes – including some that are plug-replaceable – so the easiest way to get started is to use a springboard. One popular route uses a Vagrant VM to launch the DevStack ready-to-run server. This is a good way to get familiar with OpenStack, since everything’s already pretty much set up and running, and you can launch it via VirtualBox on your desktop – assuming you have at least 8GB of RAM to spare, since the DevStack VM is going to eat up about half of that.

Second Step

Running a cloud on your desktop is pretty cool, but if you have aspirations of running a real cloud, you need real servers. Since I didn’t have that many spare servers with sufficient capabilities, my next step was again to launch OpenStack in a VM, but this time under KVM on a CentOS 5.11 host. Why not 6 or 7? Primarily because I have legacy Xen VMs on its siblings and I’m not yet ready to migrate them to an OS that can’t host Dom0.

If you do not allocate sufficient RAM or disk space for the OpenStack VM, it may not install properly and almost certainly won’t work properly, and for the most part you’re not going to get much in the way of helpful messages. OpenStack is comprised of a whole raft of component products and there’s not much in the way of centralized detection and reporting of broken components.

Here’s what I used to create the basic OpenStack VM:

#!/bin/sh
VM=icehouse
IMAGES=/var/lib/libvirt/images

virt-install --name $VM \
	--hvm --ram 4096 --vcpus 4 \
	--disk path=$IMAGES/$VM.img,size=6 \
	--network bridge:br0 \
	--os-type=linux --os-variant=rhel6 \
	--accelerate --vnc -v \
	--location=http://10.0.1.3/cobbler/ks_mirror/Centos6.5_x86-64/ \
	-x "ks=http://10.0.1.3/cobbler/pub/$VM.ks"

Note that this VM runs the IceHouse release of RDO under CentOS 6.5. I tried Juno and CentOS 7, but it kept whining about running out of memory, at least up to about 3GB or so. The network bridge is my VM host’s bridge to the VMs. The number of CPUs isn’t important, but I had a few to spare. The kickstart file is nothing special, but it does install and enable ntp and format the disk into a /boot partition (about 300M) and an LVM partition (everything else) containing a single Logical Volume for the OS.

Caution:

A production all-in-one node needs a LOT more disk. You’ll want storage for the client disk images as well as working permanent storage.

Using PackStack

The PackStack package makes the job of setting up an OpenStack node a lot easier. It fetches the various component packages and uses Puppet to install and configure them. It also creates an “answers” file so you can replay the installation, if needed.

Under CentOS, the easiest way to get things going is to run packstack. My Kickstart had a post-install command to install the Icehouse Yum repository:

rpm -ivh https://repos.fedorapeople.org/repos/openstack/openstack-icehouse/rdo-release-icehouse-4.noarch.rpm

So the sequence once the VM came up post-install went like this:

  1. Install YUM plugin to enforce precedence on repo search/fetch
    yum -y install yum-plugin-priorities
  2. Upgrade the OS
    yum -y upgrade
  3. Reboot to get the latest kernel
    reboot
  4. Install PackStack
    yum -y install openstack-packstack
  5. Run PackStack
    packstack --allinone --provision-demo=n

Once all that’s done, with luck, you can open up a web browser on the OpenStack console.
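
A quick sanity check before going console-hunting (a sketch; openstack-status comes from the openstack-utils package – yum install it if packstack didn’t pull it in):

openstack-status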

Things that can go horribly wrong

The single biggest headache I’ve found with OpenStack is networking. Networking a collection of VMs is a major pain even without clouds. OpenStack raises the ante considerably, since you have 2 options for network stacks (legacy nova-network or neutron) and all sorts of real and virtual device/network options. If you’re not already well-read on the subject, you’ll have no clue which ones you should be using or how to set them up.

More on this later.

Beyond that, the most critical functions for OpenStack are the security/identity manager (keystone) and the messaging agent (defaults to rabbitmq, but replaceable). Without the identity manager, nothing can be accessed; without the messaging system, components cannot notify each other about important events. Fortunately, these two are less likely to screw up and appear to be easier to diagnose/repair.

The curse of the mad Puppet

I have been working with various things designed to allow me to control the mousetech.com domain assets in a more centralized way. One of them was to try and use Puppet to provision machines. Puppet is a fairly nice tool, but there are some unexpected pitfalls.

There are several ways to get puppet onto a CentOS 5 server. If you’re a glutton for punishment, you can always pull down the source and build from scratch, but I don’t recommend that when just getting started. You can also pull it via YUM from the EPEL repository. Or you can import the Puppet Labs repo and pull from there.

I already have EPEL in my stock set of YUM repositories, so that’s what I went for first. In the beginning, all was fun and games. Then I got more ambitious and started defining modules. This didn’t work. Worse, the sample documentation used commands that didn’t work. It was getting very frustrating.

It became obvious fairly early that some significant changes had been made to puppet and that what I had wasn’t the Latest and Greatest. That would have been OK, except that attempts to read the online documentation for the older stuff kept leaking back into docs for the newer stuff (a not uncommon problem, best handled I think by archiving the old docs as self-contained PDFs). On top of that, the version of puppet that I was running was sufficiently antique that much of its documentation had fallen off the website (see previous).

To add to the confusion, I wasn’t really sure WHICH version of puppet I was running, since their enterprise product doesn’t keep quite the same set of version numbers as their community version, and I suspected the version (2.6.18) in the EPEL RPM wasn’t indicative, either. I finally came to the conclusion that 2.6.18 – the version that CentOS pulled – actually correlates to the community 2.5 version, which is something like 2.3 in Enterprise versioning.

At this point, I went to the source – Puppet Labs – and found out about their repo. Unfortunately, a network-based RPM install failed for obscure reasons (I’m not sure whether I have lingering LAN issues or it’s them). Fortunately, I was able to wget the repo RPM and install it locally. After that, I was able to install a version 3 Puppet; the documentation now matches the commands, and module processing works the way they said it did.

One last fly in the ointment, though. It seems that nodes and classes share the same namespace. I was using the same name for my guinea-pig machine node and one of the classes it was trying out, and while both the node and the class parsed, the actual execution was only done against the node – the class was silently ignored. I fixed this by changing the node name to its fully-qualified domain name.
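
A sketch of the kind of collision I hit, with hypothetical names (the exact failure mode may vary by Puppet version):

# site.pp
class testbox {
  package { 'ntp': ensure => installed }
}

# Bad: node name matches the class name; the class gets silently shadowed.
node 'testbox' {
  include testbox
}

# Fix: name the node by its FQDN so the two no longer collide.
node 'testbox.example.com' {
  include testbox
}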


[SOLVED] mail loops back to me (MX problem?) for virtual machine

Sometimes they just gang up on you.

I was migrating my sendmail server from a NAT address to a bridge address when it all started.

Xen has this really nasty habit of zapping your hardware MAC address if you don’t get the NAT routing configured just right. There’s obviously some way to get it to revert, because occasionally, for no obvious reason, the real MAC address will come back – but don’t try searching the web for an answer; all you’ll get is fruitless inquiries and flame responses (“you shouldn’t be changing your MAC address, idiot!”). Please. There are very good reasons why it’s useful to be able to set a custom MAC address. One place I worked coded their hardware asset IDs into the MAC to assist their DHCP server, for instance.

On the mousetech domain, I’d be happier if it didn’t happen. As it is, the MAC addresses of the primary and secondary NICs got swapped and I didn’t find out until I’d gotten most of the way through fixing things. So the former eth0 became eth1 and vice versa.

Shortly thereafter, outbound mail started bouncing with the infamous “mail loops back to me” message. Since I’d just done major relocation on the mail VM, I wasted a LOT of time messing around with sendmail options to no avail. Normally this message can be cured by putting in a valid MX record in DNS and/or adding all the mailserver alias names to the sendmail local-host-table (cw table). Not this time.

I was fairly sure that the problem had something to do with the fact that the physical host had been set up to forward all port 25 (smtp) requests to the mail VM, and that somehow the wrong IP address was getting mixed in – but I couldn’t see the actual routing, since it was all internal and specific to port 25 to boot.

Turns out I’d been sloppy when I fixed up the iptables forwarding. The correct version (after the NIC mixup) looks like this:

-A PREROUTING -i eth0 -p tcp --dport 25 -j DNAT --to-destination 10.0.1.8:25

Where I went wrong was in being lazy and omitting the NIC ID (eth0) when I repaired the damage that Xen did. As a result, BOTH NICs were being re-routed – the actual internet-facing NIC (which should be routed) and the internal DMZ bridge-facing NIC (which should not). Traffic on port 25 from eth1 was therefore being routed back on itself, and sendmail complained.
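
In other words (the mail VM address is the same as in the rule above):

# Wrong – my lazy repair; with no -i, it matches BOTH NICs:
-A PREROUTING -p tcp --dport 25 -j DNAT --to-destination 10.0.1.8:25

# Right – restricted to the internet-facing NIC:
-A PREROUTING -i eth0 -p tcp --dport 25 -j DNAT --to-destination 10.0.1.8:25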

Gourmet Recipe Manager problems under recent Fedora releases

The Gourmet Recipe Manager program is a very useful way to store and find recipes, but it has been essentially useless since about Fedora 11 (give or take).

The problem is that this app uses an SQLite database to keep its recipes and accesses the database using SQLAlchemy. SQLAlchemy version 0.7 is seriously broken relative to Gourmet.

The cure, once you know it, is relatively easy – at least as long as you don’t have any other apps depending on SQLAlchemy (and I didn’t). First, check to see if you do:

rpm -q --whatrequires python-sqlalchemy

If gourmet is your only dependent (or you don’t care/feel brave), remove the 0.7 sqlalchemy package by brute force:

rpm --erase --nodeps python-sqlalchemy

Download one of the 0.6 versions of sqlalchemy. If you have a 64-bit system, be sure to get the 64-bit RPM, which includes the 32-bit version.

Install the downloaded sqlalchemy RPM:

rpm -ivh python-sqlalchemy-0.6.xxxxxx.rpm

Fire up gourmet. One of the easiest ways to see if it works is just to type a search. If it returns search results, you’re OK!

Making Apache mod_rewrite and the ajp Tomcat connector work together

Apache’s URL rewriting facility can be used to shape URLs piped to Tomcat, but it’s not as simple as it seems.

The basic action is straightforward. First, incoming URLs are processed by mod_rewrite; then they are matched up against JkMount definitions to find the appropriate Tomcat connector to use.

The problem is, what Tomcat sees isn’t what you’d expect to have come out from the rewrite process.

What actually happens is that before being submitted to mod_rewrite, the incoming URL is converted to server-relative form. After rewriting, the resulting URL is then reassembled, including the parts that Tomcat doesn’t want or need – including the Apache DocumentRoot path.

To prevent this problem, just qualify the last rewrite rule in the process with the “PT” flag, which tells Apache to “Pass Through” the URL without reassembly. So, for example:

RewriteRule ^/$ /mywebapp/index.jsp [L,PT]
JkMount /mywebapp/* ajp13

Howto: Installing TracDiscussion

TracDiscussion allows adding discussion forums to the Trac (http://edgewall.org) system. Unfortunately, there’s not a lot of info on how to do it.

In a perfect world, you’d just issue the command “easy_install TracDiscussion” and bam-you’re-done. In this world, not yet, anyway.

Installing TracDiscussion requires installing some prerequisite packages. If you’re running Trac 0.11 or later, you don’t need to install the web administration package, since it’s now part of Trac. Yes, they say that, but rather ambiguously, so here it is as an unqualified statement. Secondly, you need a spam filter. You’re dealing with email, spam is now about 90% of all email, so you need a filter.

Installing the actual TracDiscussion plugin isn’t all that hard, even lacking a functional direct easy_install. Just go to the SVN archives, pick the project directory whose name matches your Trac version, and use the Subversion export command to download it. Then run easy_install against the downloaded directory, as sketched below.
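
Something like this (a sketch – the trac-hacks.org URL and the 0.11 branch are assumptions; pick the directory that matches your Trac version):

svn export http://trac-hacks.org/svn/discussionplugin/0.11 discussionplugin
easy_install discussionplugin/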

That’s only half the battle. At that point, restart your HTTP server to reload the Trac configuration. You should see the TracDiscussion plugin listed on the admin plugins page. Select all options. In particular, select the installation option. If you don’t select that one, the extra tables won’t get added to your database.

You’ll probably get an error page back instructing you to run the trac-admin upgrade command. Do that from a command shell. If you’re using PostgreSQL, it will probably tell you to disable database backup on the upgrade. Hopefully you’re doing regular PostgreSQL backups anyway, but you might want to kick off a manual PostgreSQL backup before doing the Trac upgrade, just for good luck’s sake.

You may want to edit the [notification] section of your trac.ini file in order to configure email. Then restart HTTP. And back up the newly-updated database.

With luck, “discussion” will now appear as one of the items on your Trac menubar and there will be a Discussions management section on the sidebar menu of the admin page.

Trac: Permission denied connecting to PostgreSQL Server

Another one of those obscure things that a web search didn’t turn up. Using the “psql” command-line utility, it was possible to connect to and access the PostgreSQL Trac database. However, the webapp failed, with “cannot connect to server: Permission denied”. No amount of tweaking the firewalls or the pg_hba.conf file would help.

It turns out that the permission problem wasn’t PostgreSQL, it was selinux, which was forbidding psycopg from opening a database connection while running inside Apache. It wants an “allow httpd_t postgresql_port_t:tcp_socket name_connect;”.

The corresponding denial message shows up in /var/log/audit/audit.log – if you think to look there.
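
Two ways to grant the permission, sketched below (the boolean name is the stock one on Fedora/RHEL-family systems – verify it on your release):

# Option 1: flip the stock boolean covering httpd-to-database connections
setsebool -P httpd_can_network_connect_db 1

# Option 2: build a custom policy module straight from the audit log
grep name_connect /var/log/audit/audit.log | audit2allow -M httpd_postgres
semodule -i httpd_postgres.pp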