Frustrated by Podman and chmod?

In theory, Podman is “just like” Docker. In practice, of course, there are a couple of big differences. Some have to do with networking, and those are relatively easy to solve. A bigger one has to do with Podman’s ability to run rootless.

Rootless operation means that you don’t need root privileges to run a container. It also gives you an extra level of security, since running under a non-root account limits what an intruder can reach if the container is compromised.

Where it gets really frustrating is when you try to run a container that manipulates file ownership and permissions on a mounted volume.

It’s not uncommon, especially with a container built for Docker, for the container to create and/or chown directories as part of its initial setup. That doesn’t work well when running rootless. It can, and probably will, run afoul of the Linux file protection mechanisms in one of two ways:

SELinux violation. Oddly, I’ve had containers fail due to SELinux violations even though the host OS (AlmaLinux 9) had SELinux running in Permissive mode. I haven’t found an explanation, but that’s how it is. You can add custom SELinux rules to the host environment to permit the operation, but that will likely drop you into the other failure mode:

Operation not permitted. Even though the active user inside the container is root, it cannot chown files or directories in mounted volumes.

Not allowed? But I’m root!

Well, yes, but only within your tiny little kingdom.

Now think of what happens when you mount an external volume. Maybe it’s an NFS fileshare with files for many different users on it, each with their own user directories. Maybe you can read other users’ files, maybe not, depending on the rights you’ve been granted.

That’s how it looks to the host OS user account, logged in as the host user.

But now let’s start a container which runs as its own root. If the usual root rules applied, that container could run wild over the external filesystem tree mounted to it. That would completely negate the protections of the host’s user account!

So, instead, the container’s root is prohibited from doing things that it couldn’t do as a user outside the container.

But what about subuids?

At first glance, it seems like you might be able to work around this problem using subuids. But nope. The subuid facility aliases user/group IDs inside the container to alternative user/group IDs outside the container, based on the subuid maps. That’s because a container acts like a mini-VM in this respect and can have its own private set of userids, independent of the container host’s userids.

The full details of the subuid mapping can be found in the podman documentation, but in its basic form, userid 0 (root) inside the container maps to the rootless user’s userid, and all other internal userids are mapped to a range of external userids beginning at an offset defined in the subuid map (for example, 100000, which makes internal userid 999 map to external userid 100998; remember, 0 is root, so the range starts at internal userid 1).
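To make the mapping concrete, here’s a minimal sketch. The username, the rootless user’s uid (1000), and the subuid range start (100000) are assumptions for illustration:

# Host-side configuration (one line per user):
grep myuser /etc/subuid /etc/subgid
/etc/subuid:myuser:100000:65536
/etc/subgid:myuser:100000:65536

# The kernel's view from inside a rootless container:
podman run --rm docker.io/library/alpine cat /proc/self/uid_map
         0       1000          1
         1     100000      65536
# Columns: internal uid, external uid, range length. So internal uid 0
# is the rootless user (1000), and internal uid 999 lands on 100998.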

Thus, through magic not documented in the current man pages or (as far as I know) in the podman documentation, the “chown” command can chown to internal userids, but not to container host userids. The same goes for other attribute-changing operations.
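In practice, and assuming the example mapping above (the file paths are made up for illustration), that looks something like this:

# Inside the rootless container, running as container root:
chown 999:999 /data/appfile     # works: 999 is an internal userid

# On the host, the same file shows the remapped (subuid) owner:
ls -ln ~/container-data/appfile
-rw-r--r-- 1 100998 100998 ... appfile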

Note that since the subuids are themselves uids (though remapped) on the container host, they also adhere to the standard outside-the-container restrictions on chown and its friends. In fact, you can’t even do a directory listing on a subuid’ed directory unless you’ve been assigned rights to do so.

But assigning rights is normally controlled by the root user, and it would be unfair to restrict access to what are essentially your own files and directories just because they have subuids! So that gives us:

Unshare

The podman unshare command effectively undoes the uid remapping. It can be used to execute a single command, or invoked without arguments to start a podman unshare shell.

Inside unshare, “whoami” reports root instead of your userid, and file listings show the internal userids without the remapping. Thus, you can do all the necessary file operations without actually becoming root. Aside from hiding the remapping, unshare is also a more limited root than sudo. For example, you cannot list the contents of the OS /etc/shadow file, nor should you be able to look into or alter the files and directories of other users.
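Here’s a minimal sketch of both forms; the directory path is made up for illustration:

# Run a single command inside the user namespace, e.g. fix ownership
# so that internal uid 999 owns a host directory:
podman unshare chown -R 999:999 ~/container-data

# Or open a shell in the namespace and poke around:
podman unshare
whoami                      # reports root
ls -ln ~/container-data     # shows internal (unmapped) userids
exit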

Volumes

Last, but not least, there’s another way to get directories that your container can chown freely. The Podman (Docker) volume allows creation of non-volatile storage that’s integrated with the container system, meaning that the userids of assets within the volume should not be affected, since the volume is sealed off from the rest of the filesystem.

Volumes have always been convenient, but they’re especially so when using quadlets to manage containers. When I started containers manually, I often kept data stores within the container itself, and thus the information was “permanent” as long as I kept using that container. But quadlets destroy and re-create containers, so that’s no longer possible. Instead, put the non-volatile data in a volume (which can itself be quadlet-managed) and attach the volume to your container. That removes the potential for data loss and makes it easier to make containers elastic.
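As a sketch of how that can look with quadlets (the unit names, image, and mount point are assumptions, and the syntax assumes a reasonably recent Podman with quadlet support):

# ~/.config/containers/systemd/appdata.volume
[Volume]

# ~/.config/containers/systemd/myapp.container
[Container]
Image=docker.io/library/myapp:latest
Volume=appdata.volume:/var/lib/myapp

[Install]
WantedBy=default.target

After a systemctl --user daemon-reload, starting the generated myapp.service creates the volume if needed, and the data under /var/lib/myapp survives the container being destroyed and re-created.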

How to run a container that crashes on startup

One of the most frustrating things about working with containers is when a container fails immediately on startup.

This can happen for a number of reasons, and not all of them record errors in logfiles to help diagnose them.

However, there’s a fairly easy way to get around this.

Start the container with the “interactive” option and override the “entrypoint” option to execute “/bin/sh” (see the example after the list). This will do two things:

  1. Instead of running the normal container startup, it starts the container executing the command shell.
  2. The “interactive” option holds the container open. Without it, the command shell sees an immediate end-of-file and shuts down the container.
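A hedged sketch of what that invocation can look like (the image and container names are made up); the -d flag is my addition so the container runs in the background and you can exec into it afterward:

podman run -d -i --name my_container --entrypoint /bin/sh my_image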

At this point, you can use the docker/podman “exec” command to dial in to the container like so:

docker exec -it my_container /bin/sh

At that point, you can inspect files, run utilities, and do whatever is necessary to diagnose/repair the image.

Additional help is also available once you have a tame container running. You can use the docker/podman “cp” command to copy files into and out of the container. Many containers have minimal OS images with neither an installed text editor nor a package installer to add one. So you can pull a faulty file out of the container, fix it, and push it back. The changes will persist as long as you restart the same container and don’t start a new instance from the original image.
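For example (the config path is made up for illustration):

# Pull the suspect file out of the container, edit it on the host,
# then push the fixed copy back in:
podman cp my_container:/etc/myapp/app.conf ./app.conf
podman cp ./app.conf my_container:/etc/myapp/app.conf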

Puppetserver in a can: Docker versus Podman

Things have been too exciting here lately: attempting to set up Ceph, having a physical machine’s OS get broken, having a virtual machine get totally trashed and rebuilt from the ground up.

It was this last that gave me an unexpected challenge. When I built the new container VM, I moved it from CentOS 7 to AlmaLinux 8. Should be no problem, right?

I sicced Ansible on it to provision it — creating support files, network share connections… and containers.

One thing that’s different between CentOS 7 and CentOS (IBM Linux) 8 systems is that the actual container manager changed from Docker to Podman.

For the most part, that change is transparent to the point that if you give a Docker command, it is passed more or less verbatim to Podman.

But one container proved difficult: the Puppetserver. It steadfastly refused to come up. It claimed that it couldn’t find the cert it was supposed to have constructed. Or that the cert didn’t match itself, if you prefer.

I tried it with Podman on my desktop, following the instructions. No problem. Changed my Ansible provisioning from “docker_container” to “podman_container”. No luck. Did an extra evil trick that allowed me to freeze startup so I could dial into the container. The cert directories were incomplete or empty!

I hacked deeper and deeper into the initialization process, but no enlightenment came. So finally I tried manually running the container with minimal options from the VM’s command line.

It still failed. Just for giggles, I did one more thing. Since Docker requires a root environment and I was trying to keep changes to a minimum, I had been running Podman as root. So I tried launching the Puppetserver container from an ordinary user account.

And it worked!

I’m not sure why, although I suspect a couple of things. I believe that Puppet moved the locations of some of its credential files and was possibly counting on references to the old locations to redirect. Maybe they didn’t because some of this stuff apparently uses internal network operations, and networking works differently in userspace.

At any rate, simply running the container non-rooted was all it took. Happiness abounds and I’ve got my recalcitrant server back online!
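For completeness, a rough sketch of the difference; the image name and options are assumptions for illustration, not an exact record of the working command:

# As root: the invocation that kept failing for me.
sudo podman run -d --name puppetserver -p 8140:8140 docker.io/puppet/puppetserver

# As an ordinary user (rootless): the same command, minus sudo, worked.
podman run -d --name puppetserver -p 8140:8140 docker.io/puppet/puppetserver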