Puppetserver in a can —Docker versus Podman

Things have been too exciting here lately. Attempting to set up Ceph, having a physical machine’s OS get broken, having a virtual machine get totally trashed and rebuilt from the ground up.

It was this last that gave me an unexpected challenge. When I built the new container VM, I moved it from CentOS 7 to AlamLinux 8. Should be no problem, right?

I sicced Ansible on it to provision it — creating support files, network share connections… and containers.

One thing that’s different between CentOS 7 and CentOS (IBM Linux) 8 systems is that the actual container manager changed from Docker to Podman.

For the most part, that change is transparent to the point that if you give a Docker command, it is passed more or less verbatim to Podman.

But one container proved difficult: the Puppetserver. It steadfastly refused to come up. It claimed that it couldn’t find the cert it was supposed to have constructed. Or that the cert didn’t match itself, if you prefer.

Tried it with Podman on my desktop according to instructions. No problem. Changed my Ansible provisioning from “docker_container” to “podman_container”. No luck. Did an extra evil trick that allowed be to freeze startup so I could dial into the container. Cert directories were incomplete or empty!

I hacked deeper and deeper into the initialization process, but no enlightenment came. So finally I tried manually running the container with minimal options from the VM’s command line.

It still failed. Just for giggles, I did one more thing. Since Docker requires a root environment and I was trying to keep changes to a minimal, I was running Podman as root. I tried launching the Puppetserver container from an ordinaty user account.

And it worked!

I’m not sure why, although I suspect a couple of things. I believe that Pupper moved loctions on some of its credential files and was possibly counting on references to the old locations to redirect. Maybe they didn’t because some of this stuff is apparently using internal network operations and networking works differently in userspace.

At any rate, simply running the container non-rooted was all it took. Happiness abounds and I’ve got my recalcitrant server back online!

Ceph — Moving up

OK. I moved from Ceph Octopus to Pacific.

The reason I originally settled on Octopus was I was thinking that it was the newest release that had direct CentOS 7 support for my older servers. I was wrong. Actually CentOS 7 maxed out at Nautilus.

Octopus has been a problematic release. A bug kept scheduled cephadm tasks from executing if even the most trivial warnings existed. And trivial warnings were endemic, since there were major issues in trying to get rgw running. Caused, in part, I think, from a mish-mash of setups via older and cephadm component deployments. Certain directories are not in the same place between the two.

I also couldn’t remove failed rbs pool setups. They kept coming back after I deleted them, no matter how brutally I did so.

And finally, a major lock-up crept into the NFS server subsystem. Although it turns out that the ceph mount utility for Nautilus works just fine for me with Octopus servers so I was able to re-define a lot of NFS stuff to use Ceph directly.

Before I proceed, I should note that despite all my whinging and whining, the Ceph filesystem share has been rock solid. Much more so than Gluster was. Despite all the mayhem and random blunderings, the worst I ever did to the filesystem was trash performance when I under-deployed OSDs. Never did I have to wipe it all out and restore from backups.

OK. So I finally bit the bullet. I’d fortuitiously run across the instructions to level-up all the Ceph depoyments to be cephadm based the day before. Now I followed the simple upgrade instructions.

I had concerns because I had a number of failed rgw daemons I couldn’t get rid of, but actually the worst thing that happened was it spend all night flailing because an OSD had gone offline. Restarted that and things began to happen. Lots of progress messages. Occasional stalls (despite what one source told me it’s NOT desirable to have my Ganesha pool have no replicas!),

But amazingly it was done! Now deleted rgw dæmos stay deleted! I might even get them hooked into my VM deployments! And since one of Pacific’s improvements had to do with NFS, I hope Ganesha will be happy again.


I migrated from Gluster to Ceph a while back, since Gluster appears to be headed towards oblivion. It’s a shame, since Ceph is massive overkill for a small server farm like mine, but that’s life.

On the whole, Ceph is well-organized and well-documented. It does have some warts however:

  • The processes for installing and maintaining Ceph are still in flux as of Ceph Octopus, which is the most recent my CentOS 7 servers can easily support. Ny my count, we’ve gone through the following options:
    • Brute force RPM install (at least on Red Hat™-style systems)
    • Ansible install(deprecated)
    • ceph-deploy command. Apparently not included in my installation???
    • cephadm command. The presumed wave of the future and definitely an easy way to get bootstrapped
  • Certain error messages don’t convey enough information to understand or correct serious problems. For example:
  • monclient(hunting): handle_auth_bad_method server allowed_methods [2] but i only support [2,1]. What does this even mean? How do I make it go away???
  • A number of systemd dæmons can fail with nothing more informative than “Start request repeated too quickly.” Often the best way to get usable (or sometimes useless) information on why the dæmon failed is to run the associated service manually and in the foreground. Especially since neither the system journal nor default log levels give any hints whatsoever.
  • Ceph handles the idea that a host can have multiple long and short names very poorly.
  • I have managed to severely damage the system multiple times, and I suspect a lot of it came from picking the wrong (obsolete) set of maintenance instructions even when I looked under a specific Ceph documentation release. Fortunately the core system is fairly bulletproof.

As a result, I’ve been posting information pages related to my journey. Hopefully they’ll be of help.

Waiting for a device: /dev/ttyUSB0

Quite a few people over the years have run into the problem, what to do when a service requires a device, but the service tries to go online before the device does?

One of the most common cases is if you have something like a display device driven by USB serial. That is, “lcddisplay.service” needs /dev/ttyUSB0.

Search the web for that and you’ll generally see uncomfortable hacks involving udev, but actually it can be much easier than that.

The systemd resource control subsystem for Linux can be configured to wait for filesystem resources using the resource pathname. So, for example, if I have a systemd mount for /export/usernet/, and I want to do things with files in that directory, I can set up the dependent service to be After=export-usernet.device”. Note that the “device” path is always absolute and that the slashes in the path are replaced with dashes, because otherwise the systemd resource pathname would appear to be a directory subtree. Incidentally, names with dashes in them double the dashes to avoid confusion with a slash-turned-dash.

However, things like /dev/ttyUSB0 are a little different, as they are managed not by the filesystem per se, but by udev – the hotswap manager. So does this trick still work?

Yes, it does! simply make your dependency on dev-ttyUSB0.device and you’re home free.

That doesn’t solve all your problems, since USB doesn’t always allow you to definitively determine which USB device will get which device ID. You can set up udev rules, but in the case of some devices like Arduinos, there is no unique serial number to map to, and in fact, sometimes unplugging and plugging can turn a USB0 into a USB1. But that’s a flaw in hardware and the OS can only do so much. Still, we can make it a little easier by doing this trick.

How to (really) Setup Bacula director on CentOS 7

I’ve seen quite a few examples of “how to setup bacula on CentOS7” and they all stink.

Mostly, they’re just re-warmed procedures from CentOS 6 or other distros, and they either don’t work or don’t work well.

I’m interested in setting up Bacula in a disaster recovery operation. I cannot afford wrong directions or directions that require elaborate procedures. So this is the quickest way to get the Bacula director set up and running using stock Centos 7 Resources.

The test platform used here was CentOS 7.6. Any earlier release of CentOS 7 should do as well. Later releases will probably work, but I make no guarantees for CentOS 8. After all, the CentOS 6 methods don’t work anymore!

In CentOS 6, there were 2 pre-configured bacula-director rpms. One was for using MySQL as the database backend, one was for using PostgreSQL as the backend. This changed in CentOS 7. In CentOS 7 there is only one director package and it’s defaulting to PostgreSQL.

So here’s the procedure:

Install Bacula Director

  1. Use RPM or Yum to install package bacula-dir. You’ll probably also want to install bacula-storage as well, since you cannot backup/restore without one!
  2. Configure the /etc/bacula/bacula-dir.conf file to set the passwords (hint, search for “@@”). Change the bacula-sd.conf and bconsole.conf files so that their passwords match up to the director passwords you chose.
  3. Also install and configure bacula-client package on whatever target machine(s) you are going to run bacula against.
  4. That should do for the moment. You’ll probably want to define bacula client profiles and jobs eventually.

Install PostgreSQL Server

  1. Use RPM or Yum to install package postgresql-server.
  2. Use this command to initialize the database:
postgresql-setup initdb

3. Start the postgresql server

systemctl enable postgresql
systemctl start postgresq

Initialize the Bacula database

This is the trickiest part. For one thing, postgresql doesn’t like you running a postgresql client as root. So the solution is to sudo in the opposite direction:

sudo -u postgres -i
# cd to avoid having server whine about root directory
# Now run the builder scripts IN THIS ORDER

^D (to exit sudo)

That’s it! (re)start bacula-dir and you should be able to connect with bconsole.

Special note on bacula scripts:

If you look at the /usr/lib/libexec/bacula/ directory, you’ll see a set of generic scripts (create_database, make_tables, grant_privileges). These are generic scripts and about all they actually do is determine which database type you’re set up for and invoke the specific setup scripts. However there is some extra lint in there and in my system it caused the generic scripts to complain. So I recommend invoking the postgresql scripts directly.

CMake – a short history

In the Beginning, if you wanted to create an executable application in C, you’d run a program that translated C code to assembly language, run an assembler on the translated code, then run a linker to bundle it all up into an executable file. And it was Good.

But it was also annoying after a while, so someone (probably Dennis Ritchie or one of his friends) wrote a wrapper compiler – the “cc” program. Now with a single command, you could compile, assemble and link. Incidentally, linking for most other systems (which may or may not have compiled to assembler) was generally a separate program. And it was Good.

But as projects got larger, with a dozen or more source modules, something more was needed. Generally, when you’re working on a large project, you’re only changing a small number of files at a time, and back when a Raspberry Pi would have been welcomed as a virtual super-computer, you didn’t want to continually re-compile stuff that hadn’t changed. So Stuart Feldman created the “make” utility. Make could not only intelligently choose what to compile, it provided a central control point with user-designable goals like “make clean”, “make install” and “make all”. And it was Good.

But Unixes came in many flavors. And eventually, that Included Linux. With multiple distros of its own. And building for each of these disparate platforms often required slightly different rules. And more and more often, there were external dependencies. As a result, David Mackenzie of the Free Software Project developed automake, which extended to become the Gnu Build Systems, or as it’s often known, Autotools. With this framework, you could build Unix/Linux apps portable using 3 magic commands: “./configure; make; (sudo) make install”. And it was Good.

But that still didn’t cover everything. Although the Microsoft Windows™ Operating system (among others) is very different platform, many open-source app authors wanted to be able to use a common source system to be able to build both Windows™ and Unix/Linux versions of their apps. So a team of people developed CMake, which takes over the capabilities of Autotools and adds OS portability to the mix. With CMake, you can not only compile on many platforms, you can cross-compile between platforms. Plus, you can build to external target directories, which allows both support for multiple targets at once and an easy and safe way to blank out a build completely and start over. With CMake, you can do a build as simply as ‘cmake -S source_dir -B build-dir; cd build_dir; make; (sudo) make install)”. And if you script it right, even the “make install” can be automated. And it is Very Good.

It doesn’t stop there. A shop that does nightly builds might use something like Jenkins to further manage and track builds, but CMake is the place where you go if you want to build manually. If, for example, you’re the developer doing the code editing.

Train Wreck. How nemo-desktop trashed both my local machine and the LAN

For some time now, I’ve been having problems where the power-save (suspend) feature of my desktop system has failed to put the machine to sleep. In some cases, in fact, the entire machine became black-screen unresponsive.

Examination of the system logs indicated that the suspend function was failing to freeze a number of tasks, thereby aborting the suspend. Indications were that it related to the glusterfs fuse client, and some tweaking of the gluster client and servers to upgrade protocol versions did help, it appeared, but only temporarily.

The other thing that I didn’t like was that the nemo-desktop task was eating one of my cores alive. I actually removed the entire cinnamon system and re-installed it, but that didn’t help. I considered moving back to gnome, but I need those monitoring widgets that gnome 3 in its arrogance dropped, and I keep a lot of icons on the desktop for quick access to hot project resources.

As it happened, I botched a cron definition on a long-running backup job, the server started launching multiple instances of it, and the gluster system took over the LAN. I fixed that, but noticed that gluster was still doing a ton of traffic to my desktop system.

And, incidentally, nemo-desktop response was painfully slow even just to pop up menus. But not regular file-explorer nemo. Only the desktop!

Digging into the toolkit (and google), I found to my horror that for some reason, nemo-desktop was opening, reading, and closing files over and over and over again. And among the files it was chowing down on were a handful of shortcuts (softlinks) to resource out on the glusterfs filesystem.

I deleted all of those links and suddenly all was calm. My network activity dropped to a whisper, and, strangely, nemo-desktop seemed to stop iterating through desktop files. So I’ve lost some convenience, but now the system performs much better (although nemo-desktop still reacts somewhat sluggishly). And power-save features now work reliably.

Creating a Functional Custom CentOS Install DVD

There are several existing How-To’s out on the Internet on the subject of creating a custom CentOS installation DVD/USB storage medium. But unfortunately, actually trying to employ them can be frustrating. So here goes with Yet Another How-To that I hope will fill some of the holes.

Why Custom media?

Why even bother with a custom install? Why indeed. The mousetech.com server farm is a fairly typical in-house setup. It has extensive provisioning capabilities, daily backups, filesystem mirroring and failovers for High Availability. If a server dies, it’s relatively easy to reconstruct it.

But what if a meteor hits the server complex or war breaks out and I have to flee to Argentina? How do I minimize the time and effort required to reconstruct the essential frameworks?

One way is to define a master bootstrap server that can be used to rebuild the main provisioning systems. The master bootstrap doesn’t run in normal operations. It’s independent of them. The normal servers distribute their functions among many machines, VM’s and containers, but the master bootstrap compacts their core functions down onto one temporary machine.

I could do the master bootstrap functions via a stock Centos install DVD set and a kickstart file, but by creating a custom install with the essential packages and kickstart (and customization scripts/data), I can make this a completely unattended operation. And when things are in total disaster mode, the less I need to remember to do, the better.

One note. An unfortunate consequence of the curent CentOS and related OS distros is that the old reliable convention of expecting there to be an eth0 device to network through is pretty much shot. I don’t assume any particular physical machine to be the target of this install, and therefore cannot predict what names the installation will assign to the network ports. So rather than find false comfort, I leave actual NIC setup to manually configuring the /etc/sysconfig/network-scripts after the installation has taken place. Similarly, since I install with no known network or gateway, everything is self-contained – no external servers.

And with that, I begin.

Step 1 – Build a workspace

Making a DVD requires a lot of disk space. So find a place on your build system with lots of room and create a workspace directory. We’ll mostly work out of there. For convenience, I’m going to call this directory “buildiso” and so it’s going to have a pathname of something like /home/timh/buildiso.

cd to this directory. All relative paths given in this howto are relative to this directory.

Step 2 – Start with an existing image

Creating an installation CD from scratch is a monumental task and not worth the effort. So rather than do that, let’s do what everyone else does and modify an existing image. Because this is panic recovery and I want it all on a single medium, I’m going to use the CentOS minimal image and build on it.

So to begin, we make a file-by-file copy of the DVD image. If you have a physical DVD mounted you should be able to copy that. If you have an iso image, mount it like so:

cd /home/timh/buildiso
mkdir mnt     # this is where we'll mount the source ISO file (if we use one)

mkdir bootisoks # this is where we build our new ISO image
mount -t iso9660 -o loop /path/to/centos-disc.iso mnt

Copy the source files from your mounted DVD or loop mount into the bootiso directory. You can use the Linux cp command, rysnc, or whatever you like as long as it copies all files and directories, including the hidden ones.

Once you’ve copied the files, you can unmount the ISO (or DVD). You don’t need it anymore unless you have to go back to the beginning.

Now that we have our model files, change them to be writeable so we can play with them:

chmod -R u+w bootisoks

Create a kickstart file and copy it to the iso image isolinux directory:

cp my_ks.cfg bootisoks/isolinux/ks.cfg

This will end up in the root of the actual DVD we’re creating.

Add any additional RPMs we want to the workspace iso package directory. These can be additional stock RPMs, third-party RPMs or your own custom-built RPMs. They all go directly into the bootisoks/Packages directory.

The CD install process for CentoOS 7 uses yum and the yum repository it uses is stored on the disc itself. Part of the repository infrastructure is the repodata directory and as a precaution, you should make a backup copy of it.

Most of the files in repodata have long twisty names designed to help mirrors keep in sync when distributed. We actually only care about one of them, so we’ll steal that one for later use:

cp bootisoks/repodata/*-comps.xml comps.xml

Gotcha #1

This file defines the installation groups, including the most critical one of all, which is base. So you’re going to need to inject that into the process of building the updated repodata. If you do not, the Linux installer will whine and fail.

Here’s how to properly reconstruct the repodata:

cd bootisoks

rm -rf repodata

createrepo -g ../comps.xml -dp .

cd .. # return to our workspace directory

Gotcha #2

It is critical that the newly-created repodata be pointing to the correct location for the RPMs that will be installed, which is to say the Packages directory. To verify that this happened, you can use this command:

less bootisoks/repodata/*-primary.xml.gz

This is a compressed file, but “less” is helpful and will display it uncompressed.

What you need to see in the ‘package type=”rpm”‘ elements are location sub-elements that look like this:

 <location href="Packages/NetworkManager-glib-1.12.0-6.el7.x86_64.rpm"/>

If you don’t see “Packages/” in the location, then yum won’t look in the DVD’s Packages directory, and it won’t find the RPMs. It will whine profusely and the install will fail.

This is why, contrary to some examples, you should run the createrepo program from the bootisoks directory and not the Packages directory. It will presumably scan (and add) any other rpms it finds in the bootisoks tree, but since there shouldn’t be any, that’s OK. If there’s an option to get the proper location without scanning everything, the createrepo documentation is too vague and my experiments haven’t been productive. Although here’s something I don’t think I tried (Note: I tried. It failed.):

createrepo -g ../comps.xml -dp Packages

Check the primary.xml.gz file and if the location is correct, use that.

The other “gotcha” on package installation can come if you omitted a pre-requisite package needed by one of the packages you are installing. The Linux installer packaging log will name any missing packages, in which case add them to the Packages directory, delete and rebuild the repodata and try again.

Step 3 – Activate the Kickstart file

To use your custom kickstart file, you need to define it to the bootloader directives in the isolinux directory. This is basically a grub menu file named isolinux/isolinux.cfg and you can use sed to update the different boot options in a single swoop like this:

sed -i 's/append\ initrd\=initrd.img/append initrd=initrd.img\
  ks\=cdrom:\/ks.cfg/' bootisoks/isolinux/isolinux.cfg

In other works, add “ks=cdrom:/ks.cfg” to the “append” statements in that file.

Special Gotcha: I ran into serious problems when I created my first image because I gave my DVD a custom volume label (using the -V on the mkisofs command). This proved fatal, because the “append” statements refer to the install media by its label and when the installer could not find a volume with that label. Which resulted in the following cryptically useless installation output:

Starting dracut initqueue hook….

At which point the whole thing would hang forever. Changing the volume ID in isolinux.cfg fixed that.

Step 4 Build the image

At this point, we’ve installed and activated our custom kickstart and setup the packages and repodata. If you have any custom scripts or data files to add the the image, do it now. And then build the ISO file, like so:

cd bootisoks
mkisofs -o ../boot.iso -b isolinux.bin -c boot.cat -no-emul-boot -boot-load-size 4 -boot-info-table -V "CentOS 7 x86_64" -R -J -v -T isolinux/. .
cd ..

You can make the ISO bootable from a thumb drive by doing this:

isohybrid boot.iso

And finally, add the checksum so that media testing will work properly.

implantisomd5 boot.iso

At this point, you should be able to burn the iso to DVD or “dd” it to a USB media device. Happy booting!

If you have problems:

This posting has attempted to correct and amplify what I have learned elsewhere. But of course, it’s likely to have introduced a few errors of its own. Because I don’t want to have to deal with spammers and abusers, this website doesn’t allow comments, but if you have questions or comments, I can be contacted through the Linux forum at http://coderanch.com




cloud-init “gotcha”

I was putting together a project using The Foreman to spin up and manage Amazon EC2 instances and ran into a problem. I could take an AMI and launch it, but I couldn’t ssh into it.

One of the major reasons why this was so was that cloud-init was silently failing and as a result, my ssh key wasn’t being installed.

The AMIs in question were built on top of Ubuntu 14.04LTS. The Foreman creates its own private access key to launch and control EC2 instances and it’s not accessible for general use. You have to supply your own private key if you want to login via ssh.

The recommended way to do that is to inject it via cloud-init. However, cloud-init wasn’t working right.

After a lot of experimentation, I discovered that the issue was in the attempt to also use cloud-init to set the simple hostname and fqdn of the newly-created host. Turns out that including the “host:” line like all the samples out on the Internet do causes the ENTIRE cloud-init file to be ignored. So in order to get my ssh key (AND hostname) injected, I just needed the “fqdn:” line.

Baby Steps with OpenStack

The OpenStack cloud platform is hot these days. Anyone can set up and run their own private cloud without too much difficulty.

Relatively speaking. You do need a huge chunk of RAM and a respectable amount of disk space, even for a minimal cloud. Also a x64-bit hardware VM capable CPU. But considering what you get, it’s not a bad payoff.

An openstack cloud has 3 types of nodes: control, compute, and storage. You need at least one of each, but a single OS instance can host any combination of them, so the simplest cloud would be an all-in-one server.

First Step

There are quite a number of components that make up these nodes – including some that are plug-replaceable, so the easiest way to get started is to use a springboard. One popular route uses a Vagrant VM to launch the DevStack ready-to-run server. This is a good way to get familiar with OpenStack, since everything’s already pretty much set up and running and you can launch it via VirtualBox on your desktop. Assuming you have at least 8GB RAM to spare, since the DevStack VM is going to eat up about half of that.

Second Step

Running a cloud on your desktop is pretty cool, but if you have aspirations on running a real cloud, you need real servers. Since I didn’t have that many spare servers with sufficient capabilities, the next step I did was again launch OpenStack in a VM, but using a KVM under CentOS 5.11. Why not 6 or 7? Primarily because I have legacy Xen VMs on its siblings and I’m not yet ready to migrate them to an OS that can’t host Dom0.

If you do not allocate sufficient RAM or disk space for the OpenStack VM, it may not install properly and almost certainly won’t work properly, and for the most part you’re not going to get much in the way of helpful messages. OpenStack is comprised of a whole raft of component products and there’s not much in the way of centralized detection and reporting of broken components.

Here’s what I used to create the basic OpenStack VM:


virt-install --name $VM \
	--hvm --ram 4096 --cpus 4 \
	--disk path=$IMAGES/$VM.img,size=6 \
	--network bridge:br0 \
	--os-type=linux --os-variant=rhel6 \
	--accelerate --vnc -v \
	--location= \
	-x "ks=$VM.ks"

Note that this VM runs the IceHouse release of RDO under Centos 6.5. I tried Juno and CentOS 7, but it kept whining about running out of memory, at least up to about 3GB or so. The network bridge is my VM host’s bridge to the VMs the number of CPUs isn’t important, but I had a few to spare. The kickstart file is nothing special, but it does install and enable ntp and format the disks into a /boot (about 300M) and an LVM partition (everything else), containing a single Logical Volume for the OS.


A production all-in-one node needs a LOT more disk. You’ll want storage for the client disk images and working permanent storage,

Using PackStack

The PackStack package makes the job of setting up an OpenStack node a lot easier. It fetches the various component packages and uses Puppet to install and configure them. It also creates an “answers” file so you can replay the installation, if needed.

Under CentOS, the easiest way to get things going is to run packstack. My Kickstart had a post-install command to install the Icehouse Yum repository:

rpm -ivh https://repos.fedorapeople.org/repos/openstack/openstack-icehouse/rdo-release-icehouse-4.noarch.rpm

So the sequence once the VM came up post-install went like this:

  1. Install YUM plugin to enforce precedence on repo search/fetch
    yum -y install yum-plugin-priorities
  2. Upgrade the OS
    yum -y upgrade
  3. Reboot to get the latest kernel
  4. Install PackStack
    yum -y install openstack-packstack
  5. Run PackStack
    packstack --allinone --provision-demo=n

Once all that’s done, with luck, you can open up a web browser on the Openstack console.

Things that can go horribly wrong

The single biggest headache I’ve found with OpenStack is networking. Networking a collection of VMs is a major pain even without clouds. OpenStack raises the ante considerably, since you have 2 options for network stacks (legacy nova network or neutron), and all sorts of real and virtual device/network options. Which, if you’re not already well-read on the subject, you’ll have no clue which ones you should be using or how to set them up.

More on this later.

Beyond that, the most critical functions for OpenStack are the security/identity manager (keystone) and the messaging agent (defaults to rabbitmq, but replaceable). Without the identity manager, nothing can be accessed, without the messaging system, components cannot notify each other about important events. Fortunately, these two are less likely to screw up and appear to be easier to diagnose/repair.