Frustrated by Podman and chmod?

In theory, Podman is “just like” Docker. In practice, of course, there are a couple of big differences. Some have to do with networking, and those are relatively easy to solve. A bigger one has to do with Podman’s ability to run rootless.

Rootless operation means that you don’t need root privileges to run a container. It also gives you an extra level of security, since running under a non-root account limits what an intruder can get at if they break in.

Where it gets really frustrating is when you try to run a container that manipulates file ownership and permissions on a mounted volume.

It’s not uncommon, especially with containers built for Docker, for the container to create and/or chown directories as part of its initial setup. That doesn’t work too well when running rootless. It can, and probably will, run afoul of the Linux OS file-protection systems in one of two ways.

SELinux violation. Oddly, I’ve had containers fail due to SELinux violations even though the host OS had SELinux running in Permissive mode (AlmaLinux 9). I’ve found no explanation, but that’s how it is. You can add custom SELinux rules to the host environment to permit the operation, but that will likely drop you into the other failure mode:

Operation not permitted. Even though the active user inside the container is root, it cannot chown files/directories in mounted volumes.
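For example, a container that tries to take ownership of its mounted data directory at startup (the path here is hypothetical) will typically die with the standard chown error:

chown: changing ownership of '/var/lib/appdata': Operation not permitted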

Not allowed? But I’m root!

Well, yes, but only within your tiny little kingdom.

Now think of what happens when you mount an external volume. Maybe it’s an NFS fileshare with files for many different users on it, each with their own user directories. Maybe you can read other users’ files, maybe not, depending on the rights you’ve been granted.

That’s how it looks to the host OS user account you’re logged in as.

But now let’s start a container which runs as its own root. If the usual root rules applied, that container could run wild over the external filesystem tree mounted to it. That would completely negate the protections of the host’s user account!

So, instead, the container’s root is prohibited from doing things that it couldn’t do as a user outside the container.

But what about subuids?

At first glance, it seems like you might be able to work around this problem using subuids. But nope. The subuid facility aliases user/group IDs inside the container to alternative user/group IDs outside the container, based on the subuid maps. That works because a container behaves like a mini-VM and can have its own private set of userids, independent of the container host’s userids.

The full details of the subuid mapping can be found in the podman documentation, but in its basic form, userid 0 (root) inside the container maps to the rootless user’s userid, and every other internal userid maps into the subuid range defined in the map. For example, with a subuid range starting at 100000, internal userid 1 maps to external userid 100000, so internal userid 999 maps to external userid 100998 (remember, 0 is root and is mapped separately!).
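You can inspect the mapping for yourself. Assuming an /etc/subuid entry like the one below (the username and the host userid 1000 are examples; check your own system), the in-container view of the map looks like this:

$ grep myuser /etc/subuid
myuser:100000:65536
$ podman run --rm docker.io/library/alpine cat /proc/self/uid_map
         0       1000          1
         1     100000      65536

The columns are: userid inside the container, the userid it maps to outside, and the length of the range.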

Thus, through magic not documented in the current man pages or (as far as I know) in podman, the “chown” command can chown to internal userids, but not to container host userids. The same goes for other attribute-changing operations.

Note that since the subuids are themselves uids (though remapped) on the container host, they also adhere to the standard outside-the-container restrictions on chown and its friends. In fact, you can’t even do a directory listing on a subuid’ed directory unless you’ve been assigned the rights to do so.

But assigning rights is normally controlled by the root user, and it would be unfair to restrict access to what are essentially your own files and directories just because they have subuids! So that gives us:

Unshare

The podman unshare command effectively undoes the uid remapping. It can be used to execute a single command, or invoked with no arguments to start a podman unshare shell.

Inside unshare, “whoami” changes from your userid to root, and you see your internal userids without the remapping. Thus, you can do all the necessary file operations without actually becoming root. Aside from hiding the remapping, unshare is also a more limited root than sudo. For example, you cannot list the contents of the OS /etc/shadow file, nor should you be able to look into or alter the files and directories of other users.
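As a sketch, fixing up ownership on a directory a container complained about (the path and userid here are hypothetical) is a one-liner, or you can open a shell:

$ podman unshare chown -R 999:999 ./pgdata
$ podman unshare
# whoami
root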

Volumes

Last, but not least, there’s another way to get chown-able directories. The Podman (Docker) volume allows the creation of non-volatile storage that’s integrated with the container system, meaning that the userids of assets within the volume should not be a problem, since they are sealed off from the rest of the filesystem.

Volumes have always been convenient, but especially so when using quadlets to manage containers. When I manually started containers, I often kept data stores within the container and thus the information was “permanent” as long as I kept using that container. But quadlets destroy and re-create containers, so that’s no longer possible. Instead, put the non-volatile data in a volume (which can itself be quadlet-managed) and attach the volume to your container. That removes the potential for data loss and makes it easier to make containers elastic.
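As a sketch of how that can look (the unit names, image, and mount path are hypothetical; referencing a .volume unit from Volume= is quadlet syntax from recent Podman releases):

# ~/.config/containers/systemd/appdata.volume
[Volume]

# ~/.config/containers/systemd/myapp.container
[Container]
Image=docker.io/library/myapp:latest
Volume=appdata.volume:/var/lib/appdata

Quadlet creates and owns the volume, and the container unit simply mounts it, so rebuilding the container leaves the data untouched.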

How to run a container that crashes on startup

One of the most frustrating things about working with containers is when a container fails immediately on startup.

This can happen for a number of reasons, and not all of them record errors in logfiles to help diagnose them.

However, there’s a fairly easy way to get around this.

Start the container with the “interactive” option and override the “entrypoint” option to execute “/bin/sh” (see the sketch after the list below). This will do two things:

  1. Instead of running the normal container startup, it will start the container executing the command shell.
  2. The “interactive” option holds the container open. Without it, the command shell sees an immediate end-of-file and shuts down the container.
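A minimal sketch (the container and image names are placeholders; the same flags work for docker):

$ podman run -d -i --name my_container --entrypoint /bin/sh my_image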

At this point, you can use the docker/podman “exec” command to dial in to the container, like so:

docker exec -it my_container /bin/sh

At that point, you can inspect files, run utilities, and do whatever is necessary to diagnose/repair the image.

Additional help is also available once you have a tame container running. You can use the docker/podman “cp” command to copy files into and out of the container. Many containers have minimal OS images, with neither an installed text editor nor a package installer to install one. So you can pull a faulty file out of the container, fix it, and push it back. The changes will persist as long as you restart that container and don’t start a new instance from the original image.
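For example, assuming a hypothetical config file path:

$ podman cp my_container:/etc/myapp/app.conf ./app.conf
$ vi app.conf
$ podman cp ./app.conf my_container:/etc/myapp/app.conf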

HOWTO: get Docker containers under CentOS 5 with Xen

CentOS 5 is getting long in the tooth, but then again, many of my servers are antiques that would find native CentOS 6 to be problematic.

A recent adventure in disaster recovery led me to upgrade several of my Xen DomUs from CentOS 5 to CentOS 6, but I was distressed to discover that the minimum RAM you can get by with on CentOS 6 is nearly 400MB. I wanted to host several CentOS 6 VMs, but the thought of getting dinged to the tune of half a gigabyte of RAM plus several gigs of disk image per VM didn’t sit well for lightweight systems.

The “in” thing for this kind of stuff is containers, which neatly fit in the space between a full VM and something less capable, such as a chroot jail. The question was: could I get CentOS 6 containers to work in a CentOS 5 Dom0?

As a matter of fact, yes, and it was considerably less painful than expected!

I cheated and did the real dirty work using my desktop machine, which is running Fedora 20, hence is better supported for all the bleeding-edge tools. Actually, Ubuntu would probably be even better, but I’m at home with what I’ve got and besides, the idea is to make it as little work as possible given my particular working environment.

Step 1: Vagrant.

Vagrant is one of those products that everyone (including its own developers) says is wonderful, but it was hard to tell what it’s good for. As it turns out, what it’s good for is disposable VMs.

Specifically, Vagrant allows the creation of VM “boxes” and the management of repositories of boxes. A “box” is a VM image plus the meta-data needed for Vagrant to deploy and run the VM.

So I yum-installed Vagrant on my Fedora x86_64 system.

My selected victim was a basic CentOS 6 box built for the VirtualBox VM environment:

vagrant box add centos65-x86_64-20131205 https://github.com/2creatives/vagrant-centos/releases/download/v6.5.1/centos65-x86_64-20131205.box

Step 2. Docker

It would have been more convenient to get a ready-made CentOS 6 Docker box, but most Docker-ready boxes in the public repo are for Ubuntu. So I did a “vagrant up” to download and launch the box image, connected to the CentOS 6 guest, and Docker-ized it using this handy resource: http://docs.docker.io/installation/rhel/
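For the record, the basic sequence is roughly this, run from a working directory of your choice:

$ vagrant init centos65-x86_64-20131205
$ vagrant up
$ vagrant ssh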

An alternative set of instructions:

http://cloudcounselor.com/2013/12/05/docker-0-7-redhat-centos-6-5/

The process is rather simple as long as you’re using the latest CentOS 6.5 release. Older kernels don’t have the necessary extensions, requiring a kernel upgrade first.
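The gist of the install inside the guest is along these lines (a sketch: on CentOS 6, docker-io came from the EPEL repository, and exact package names varied in that era):

$ sudo yum install epel-release
$ sudo yum install docker-io
$ sudo service docker start
$ sudo docker run -i -t centos /bin/bash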

Step 3. Porting to Xen

Once Docker was working, the challenge of getting the VM from VirtualBox to Xen presented itself. I was expecting a nightmare of fiddling with the disk image and generating a new initrd, but I got a pleasant surprise. All I had to do was convert the VM image from the “vmdk” format to a raw disk image, transfer the raw image to the Xen box, hack up a Xen config, and it booted right up!

The details:

On the Fedora desktop:

$ qemu-img convert -f vmdk virtualbox-centos6-containers.vmdk -O raw centos6-containers.img
$ rsync -avz --progress centos6-containers.img root@vmserver:/srv/xen/images/centos6-container/

File and directory names vary depending on your needs and preferences.

Now it’s time to connect to “vmserver”, which is my CentOS 5 physical host.

I stole an existing Xen DomU pygrub-booting definition from another VM and changed the network and virtual disk definitions to provide a suitable environment. The disk definition itself looks like this:

disk = [ "tap:aio:/srv/xen/images/centos6-container/centos6-containers.img,xvda,w"]

xvda, incidentally, is a standard CentOS VM disk image, with a swap and an LVM partition inside.
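For context, the rest of the DomU definition is along these lines (a sketch based on my setup; the name, memory size, and bridge are particular to my environment, so adjust to taste):

name = "centos6-container"
memory = 512
vcpus = 1
bootloader = "/usr/bin/pygrub"
disk = [ "tap:aio:/srv/xen/images/centos6-container/centos6-containers.img,xvda,w" ]
vif = [ "bridge=xenbr0" ]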

I launched the VM and behold! A CentOS 6 Docker-container DomU on a CentOS 5 Dom0 base.

Everything should be this easy.