Docker has supported user namespaces since version 1.10, and I consider this one of Docker's most important security enhancements.
What are user namespaces?
User namespaces are a Linux feature, fully implemented from kernel 3.12 onward, that isolates user IDs and group IDs between the host and containers. This provides better isolation and security: for example, the privileged root user in a container can be mapped to a normal, unprivileged user on the host.
With user namespaces enabled in the Docker daemon, a process running as root inside a container does not run as root on the Docker host operating system, which makes it harder for a rogue process to break out of the container.
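To make the idea of a mapping concrete, here is a minimal sketch of the lookup, assuming a single line in the kernel's uid_map format (first UID inside the namespace, first UID on the host, number of UIDs). The helper name map_uid and the sample values are assumptions made for this illustration; on a real system the active mapping of a process can be read from /proc/<pid>/uid_map.

```shell
# Look up the host UID for a UID inside a user namespace, using lines in the
# kernel's uid_map format: "<first-inside-uid> <first-host-uid> <count>".
# map_uid is a hypothetical helper written for this illustration.
map_uid() {
    map_file=$1    # file in uid_map format, e.g. /proc/<pid>/uid_map
    inside_uid=$2  # UID as seen inside the namespace
    awk -v uid="$inside_uid" '
        uid >= $1 && uid < $1 + $3 { print $2 + (uid - $1); found = 1; exit }
        END { if (!found) print "unmapped" }
    ' "$map_file"
}

# Sample mapping (assumed values): namespace UIDs 0..65535 -> host UIDs 500000..565535
sample_map=$(mktemp)
echo "0 500000 65536" > "$sample_map"

map_uid "$sample_map" 0      # root inside the namespace
map_uid "$sample_map" 1000   # an ordinary user inside the namespace
rm -f "$sample_map"
```

With this sample mapping, root inside the namespace resolves to an unprivileged host UID, which is the property the rest of this post relies on.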
Let's look closely at the problem that arises when user namespaces are not enabled in the Docker daemon.
# Run a container and mount host’s /etc onto /root/etc
$ docker run --rm -v /etc:/root/etc -it ubuntu
# Make some changes to /root/etc/hosts
root@34ef23438542:/# vi /root/etc/hosts
# Exit from the container
# Check /etc/hosts on host.
$ cat /etc/hosts
You can easily see that the changes made inside the container are reflected on the host.
This is because the command
$ docker run --rm -v /etc:/root/etc -it ubuntu
runs the bash shell inside the container as root. If you run ps -ef on the host while the container is running, you will see
root 17889 17873 0 17:47 pts/0 00:00:00 /bin/bash
So in this scenario, rogue processes inside the container can do whatever they want on the host because they have root privileges on the host (the bash shell in the container is in fact launched by the root user on the host).
To avoid this kind of security hole in your system, let's activate user namespaces.
# Create a user called “dockremap”
$ sudo adduser dockremap
The username "dockremap" must exist in the /etc/passwd file. You also need subordinate UID and GID ranges specified in the /etc/subuid and /etc/subgid files respectively.
# Set up subuid and subgid (note: '>' overwrites any existing content of these files)
$ sudo sh -c 'echo dockremap:500000:65536 > /etc/subuid'
$ sudo sh -c 'echo dockremap:500000:65536 > /etc/subgid'
In the above example I reserved a range of 65536 UIDs (the numbers in the subuid file are the starting UID and the number of UIDs available to that user). Root inside a container (UID 0) is mapped to the first UID in the range, 500000, and for now the Engine applies this single mapping to every container it runs.
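As a quick sanity check of the entry format, the sketch below prints the first and last ID that an entry reserves. The helper name subid_range is an assumption made for this illustration, and the demo runs against a temporary file rather than the real /etc/subuid.

```shell
# Print the first and last subordinate ID reserved for a user in a file that
# uses the /etc/subuid format: "<user>:<first-id>:<count>".
# subid_range is a hypothetical helper; only the first matching entry is read.
subid_range() {
    subid_file=$1
    user=$2
    awk -F: -v u="$user" '
        $1 == u { print $2 "-" ($2 + $3 - 1); found = 1; exit }
        END { if (!found) { print "no entry for " u; exit 1 } }
    ' "$subid_file"
}

# Demonstrate on a temporary copy of the entry used in this post, so that
# nothing under /etc is touched.
demo=$(mktemp)
echo "dockremap:500000:65536" > "$demo"
subid_range "$demo" dockremap
rm -f "$demo"
```

On a real host the same function could be pointed at /etc/subuid or /etc/subgid, for example subid_range /etc/subuid dockremap.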
# Copy the base systemd unit file to /etc/systemd/system
$ sudo cp /lib/systemd/system/docker.service /etc/systemd/system/
In the /etc/systemd/system/docker.service file, change the line
ExecStart=/usr/bin/docker daemon -H fd://
to
ExecStart=/usr/bin/docker daemon --userns-remap=dockremap -H fd://
and then restart Docker with
$ sudo systemctl daemon-reload
$ sudo systemctl restart docker
The first thing you will notice at this point is that any images you had originally pulled appear to be gone.
$ docker images
REPOSITORY TAG IMAGE ID CREATED SIZE
Let's investigate the /var/lib/docker folder.
$ sudo ls -lF /var/lib/docker/
drwx------ 10 165536 165536 4096 okt 24 13:32 165536.165536/
drwx------ 10 500000 500000 4096 okt 24 14:02 500000.500000/
drwxrwxrwx 5 root root 4096 okt 21 22:40 aufs/
drwxrwxrwx 8 root root 4096 okt 24 13:46 containers/
drwxrwxrwx 3 root root 4096 okt 21 22:40 image/
drwxrwxrwx 3 root root 4096 okt 21 22:40 network/
drwxrwxrwx 2 root root 4096 okt 21 22:40 swarm/
drwxrwxrwx 2 root root 4096 okt 24 13:41 tmp/
drwxrwxrwx 2 root root 4096 okt 21 22:40 trust/
drwxrwxrwx 5 root root 4096 okt 23 14:48 volumes/
The images seem to be gone because the remapped Docker engine operates in the 500000.500000 directory (the format is XXX.YYY, where XXX is the subordinate UID and YYY is the subordinate GID), not in the default directory.
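The directory name can be derived from the subuid and subgid entries. The following is a minimal sketch under the naming format described above; remap_dir is a hypothetical helper, /var/lib/docker is assumed to be the daemon's root directory, and the demo uses temporary files instead of the real files under /etc.

```shell
# Build the path of the remapped daemon's working directory from the first ID
# of a user's subuid and subgid entries ("<user>:<first-id>:<count>").
# remap_dir is a hypothetical helper; /var/lib/docker is assumed as the root.
remap_dir() {
    user=$1
    subuid_file=$2
    subgid_file=$3
    uid=$(awk -F: -v u="$user" '$1 == u { print $2; exit }' "$subuid_file")
    gid=$(awk -F: -v u="$user" '$1 == u { print $2; exit }' "$subgid_file")
    if [ -z "$uid" ] || [ -z "$gid" ]; then
        echo "no subordinate ID entry for $user" >&2
        return 1
    fi
    echo "/var/lib/docker/${uid}.${gid}"
}

# Demo with temporary files that hold the entries used in this post.
tmp_uid=$(mktemp)
tmp_gid=$(mktemp)
echo "dockremap:500000:65536" > "$tmp_uid"
echo "dockremap:500000:65536" > "$tmp_gid"
remap_dir dockremap "$tmp_uid" "$tmp_gid"
rm -f "$tmp_uid" "$tmp_gid"
```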
$ sudo ls -F /var/lib/docker/500000.500000/
aufs/ containers/ image/ network/ swarm/ tmp/ trust/ volumes/
$ docker pull ubuntu
Using default tag: latest
latest: Pulling from library/ubuntu
Status: Image is up to date for ubuntu:latest
# Run a container
$ docker run -it ubuntu top
$ ps -ef | grep top
nobody 1289 800 0 12:00 ? 00:00:00 /usr/sbin/dnsmasq --no-resolv --keep-in-foreground --no-hosts --bind-interfaces --pid-file=/var/run/NetworkManager/dnsmasq.pid --listen-address=127.0.1.1 --cache-size=0 --conf-file=/dev/null --proxy-dnssec --enable-dbus=org.freedesktop.NetworkManager.dnsmasq --conf-dir=/etc/NetworkManager/dnsmasq.d
dilip 26915 30985 0 18:41 pts/3 00:00:00 docker run -it ubuntu top
500000 26955 26937 0 18:41 pts/0 00:00:00 top
dilip 31649 27127 0 18:44 pts/1 00:00:00 grep --color=auto top
Once the daemon is running with user namespaces enabled, container processes run on the host as the remapped subordinate UID (in this case, 500000) instead of root.
But what do those processes look like inside the container?
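# Start a shell in a container and run ps inside it
$ docker run -it ubuntu /bin/sh
# ps
PID TTY TIME CMD
1 ? 00:00:00 sh
7 ? 00:00:00 ps
# In another terminal on the host, look for the same shell
$ ps au | grep '[b]in/sh'
500000 452 0.0 0.0 4508 748 pts/0 Ss+ 18:48 0:00 /bin/sh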
In the above example, the /bin/sh process is owned by root inside the container, but it is owned by the subordinate UID outside of the container.
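When reading ps output on the host, it can help to go the other way and work out which container UID a host UID corresponds to. This is a minimal sketch of that arithmetic, assuming the single 500000:65536 range configured earlier; host_to_container_uid is a hypothetical helper, not a Docker command.

```shell
# Translate a host UID back to the UID seen inside the container, assuming one
# contiguous range that starts at $first and holds $count IDs.
# host_to_container_uid is a hypothetical helper for illustration.
host_to_container_uid() {
    first=$1      # first host UID of the remapped range
    count=$2      # number of UIDs in the range
    host_uid=$3   # UID as shown by ps on the host
    if [ "$host_uid" -ge "$first" ] && [ "$host_uid" -lt $((first + count)) ]; then
        echo $((host_uid - first))
    else
        echo "outside range"
    fi
}

host_to_container_uid 500000 65536 500000   # owner of the /bin/sh process on the host
```

For the /bin/sh process shown above the result is 0, that is, root inside the container.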
This is all. Feel free to comment.
Source: docker documentation