How to run a container that crashes on startup

One of the most frustrating things about running containers is when a container fails immediately on startup.

This can happen for a number of reasons, and not all of them record errors in logfiles to help diagnose them.

However, there’s a fairly easy way to get around this.

Start the container with the “interactive” option (-i) and override the “entrypoint” option (--entrypoint) to execute “/bin/sh”. This will do two things.

  1. Instead of running the normal container startup, it will start the container executing the command shell.
  2. The “interactive” option holds the container open. Without it, the command shell sees an immediate end-of-file and shuts down the container.
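As a rough sketch, the startup described above might look like the following. The image and container names are placeholders, the CLI variable is only a convenience for switching between docker and podman, and the “-d” (detached) flag is my assumption so that the terminal is free for the next step.

```shell
# Which client to use; set CLI=podman to use Podman instead.
CLI="${CLI:-docker}"

# Start an image with its normal entrypoint replaced by /bin/sh.
#   -d            run in the background (an assumption, not required)
#   -i            keep stdin open so the shell does not see end-of-file
#   --entrypoint  run /bin/sh instead of the image's own startup
# Usage: start_debug_container IMAGE NAME
start_debug_container() {
    image="$1"
    name="$2"
    "$CLI" run -d -i --name "$name" --entrypoint /bin/sh "$image"
}

# Example with placeholder names:
#   start_debug_container my_image my_container
```

The container should then show up as running in “ps” output, even though the application inside it never started.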

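At this point, you can use the docker/podman “exec” command to dial in to the container. Pass “-it” so that the new shell gets an interactive terminal; without it, the shell sees end-of-file and exits immediately:

docker exec -it my_container /bin/sh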

At that point, you can inspect files, run utilities, and do whatever is necessary to diagnose/repair the image.

Additional help is available once you have a tame container running. You can use the docker/podman “cp” command to copy files into and out of the container. Many containers have minimal OS images, with neither a text editor installed nor a package manager to install one. So you can pull a faulty file out of a container, fix it on the host, and push it back. The changes will persist as long as you restart that same container rather than starting a new instance from the original image.
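Here is a minimal sketch of that pull, fix, push cycle. The container name and file paths are made up for illustration, and the CLI variable is again just a convenience for choosing docker or podman.

```shell
# Which client to use; set CLI=podman to use Podman instead.
CLI="${CLI:-docker}"

# Copy a file out of the container onto the host.
# Usage: pull_file CONTAINER PATH_IN_CONTAINER HOST_PATH
pull_file() {
    "$CLI" cp "$1:$2" "$3"
}

# Copy a (fixed) file from the host back into the container.
# Usage: push_file CONTAINER HOST_PATH PATH_IN_CONTAINER
push_file() {
    "$CLI" cp "$2" "$1:$3"
}

# Example with hypothetical names:
#   pull_file my_container /etc/app/app.conf ./app.conf
#   ... edit ./app.conf with your favorite editor on the host ...
#   push_file my_container ./app.conf /etc/app/app.conf
```

The “container:path” form is what tells “cp” which side of the copy lives inside the container.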

Puppetserver in a can: Docker versus Podman

Things have been too exciting here lately. Attempting to set up Ceph, having a physical machine’s OS get broken, having a virtual machine get totally trashed and rebuilt from the ground up.

It was this last that gave me an unexpected challenge. When I built the new container VM, I moved it from CentOS 7 to AlmaLinux 8. Should be no problem, right?

I sicced Ansible on it to provision it — creating support files, network share connections… and containers.

One thing that’s different between CentOS 7 and CentOS (IBM Linux) 8 systems is that the actual container manager changed from Docker to Podman.

For the most part, that change is transparent to the point that if you give a Docker command, it is passed more or less verbatim to Podman.

But one container proved difficult: the Puppetserver. It steadfastly refused to come up. It claimed that it couldn’t find the cert it was supposed to have constructed. Or that the cert didn’t match itself, if you prefer.

Tried it with Podman on my desktop according to instructions. No problem. Changed my Ansible provisioning from “docker_container” to “podman_container”. No luck. Did an extra evil trick that allowed me to freeze startup so I could dial into the container. Cert directories were incomplete or empty!

I hacked deeper and deeper into the initialization process, but no enlightenment came. So finally I tried manually running the container with minimal options from the VM’s command line.

It still failed. Just for giggles, I did one more thing. Since Docker requires a root environment and I was trying to keep changes to a minimum, I was running Podman as root. I tried launching the Puppetserver container from an ordinary user account.

And it worked!

I’m not sure why, although I suspect a couple of things. I believe that Puppet moved the locations of some of its credential files and was possibly counting on references to the old locations to redirect. Maybe they didn’t because some of this stuff is apparently using internal network operations, and networking works differently in userspace.

At any rate, simply running the container non-rooted was all it took. Happiness abounds and I’ve got my recalcitrant server back online!