Message-ID: <8d7b6d47-9001-1f47-bce8-e7fae28fafcf@linux.ibm.com>
Date:   Wed, 1 Dec 2021 16:34:35 -0500
From:   Stefan Berger <stefanb@...ux.ibm.com>
To:     jejb@...ux.ibm.com, linux-integrity@...r.kernel.org
Cc:     zohar@...ux.ibm.com, serge@...lyn.com,
        christian.brauner@...ntu.com, containers@...ts.linux.dev,
        dmitry.kasatkin@...il.com, ebiederm@...ssion.com,
        krzysztof.struczynski@...wei.com, roberto.sassu@...wei.com,
        mpeters@...hat.com, lhinds@...hat.com, lsturman@...hat.com,
        puiterwi@...hat.com, jamjoom@...ibm.com,
        linux-kernel@...r.kernel.org, paul@...l-moore.com, rgb@...hat.com,
        linux-security-module@...r.kernel.org, jmorris@...ei.org
Subject: Re: [RFC 20/20] ima: Setup securityfs_ns for IMA namespace


On 12/1/21 16:11, James Bottomley wrote:
> On Wed, 2021-12-01 at 15:25 -0500, Stefan Berger wrote:
>> On 12/1/21 14:21, James Bottomley wrote:
>>> On Wed, 2021-12-01 at 13:11 -0500, Stefan Berger wrote:
>>>> On 12/1/21 12:56, James Bottomley wrote:
>>> [...]
>>>> I tried this with runc and a user namespace active mapping uid
>>>> 1000 on the host to uid 0 in the container. There I run into the
>>>> problem that  all of the files and directories without the above
>>>> work-around are mapped to 'nobody', just like all the files in
>>>> sysfs in this case are also mapped to nobody. This code resolved
>>>> the issue.
>>> So I applied your patches with the permission shift commented out
>>> and instrumented inode_alloc() to see where it might be failing and
>>> I actually find it all works as expected for me:
>>>
>>> ejb@...tdeb:~> unshare -r --user --mount --ima
>>> root@...tdeb:~# mount -t securityfs_ns none /sys/kernel/security
>>> root@...tdeb:~# ls -l /sys/kernel/security/ima/
>>> total 0
>>> -r--r----- 1 root root 0 Dec  1 19:11 ascii_runtime_measurements
>>> -r--r----- 1 root root 0 Dec  1 19:11 binary_runtime_measurements
>>> -rw------- 1 root root 0 Dec  1 19:11 policy
>>> -r--r----- 1 root root 0 Dec  1 19:11 runtime_measurements_count
>>> -r--r----- 1 root root 0 Dec  1 19:11 violations
>>>
>>> I think your problem is something to do with how runc is installing
>>> the uid/gid mappings.  If it's installing them after the
>>> security_ns inodes are created then they get the -1 value (because
>>> no mappings exist in s_user_ns).  I can even demonstrate this by
>>> forcing unshare to enter the IMA namespace before writing the
>>> mapping values and I'll see "nobody nogroup" above like you do.
>> I am surprised you get this mapping even after commenting the
>> permission adjustments... it doesn't work for me when I comment them
>> out:
>>
>> [stefanb@...-ns-dev rootfs]$ unshare -r --user --mount
>> [root@...-ns-dev rootfs]# mount -t securityfs_ns none /sys/kernel/security/
>> [root@...-ns-dev rootfs]# cd /sys/kernel/security/ima/
>> [root@...-ns-dev ima]# ls -l
>> total 0
>> -r--r-----. 1 nobody nobody 0 Dec  1 15:20 ascii_runtime_measurements
>> -r--r-----. 1 nobody nobody 0 Dec  1 15:20 binary_runtime_measurements
>> -rw-------. 1 nobody nobody 0 Dec  1 15:20 policy
>> -r--r-----. 1 nobody nobody 0 Dec  1 15:20 runtime_measurements_count
>> -r--r-----. 1 nobody nobody 0 Dec  1 15:20 violations
>> [root@...-ns-dev ima]# cat /proc/self/uid_map
>>            0       1000          1
>> [root@...-ns-dev ima]# cat /proc/self/gid_map
>>            0       1000          1
>>
>> The initialization of securityfs and setup of files and directories
>> happens at the same time as the IMA namespace is created. At this
>> time there are no user mappings available, so that's why I need to
>> make the adjustments 'late'.
> There is one other possible difference:  To get the correct s_user_ns

I am currently wondering why I cannot re-create your setup while 
disabling the remapping...
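
To make the two outcomes in this thread easier to compare, here is a minimal sketch of the ID translation involved. It is a toy model, not kernel code: the Python function names are hypothetical and only loosely mirror the kernel helpers make_kuid() and from_kuid_munged(), and it assumes the securityfs_ns inodes are assigned "root of s_user_ns" as owner at creation time. 65534 is the default overflow ID that ls shows as 'nobody'.

```python
INVALID_ID = -1       # what an unmappable ID turns into inside the kernel
OVERFLOW_ID = 65534   # default /proc/sys/kernel/overflowuid, shown as 'nobody'


def parse_id_map(text):
    """Parse uid_map/gid_map text into (ns_start, parent_start, count) extents."""
    extents = []
    for line in text.splitlines():
        fields = line.split()
        if fields:
            ns_start, parent_start, count = (int(f) for f in fields)
            extents.append((ns_start, parent_start, count))
    return extents


def to_kernel_id(ns_id, extents):
    """Namespace ID -> parent ID; INVALID_ID if no extent covers it."""
    for ns_start, parent_start, count in extents:
        if ns_start <= ns_id < ns_start + count:
            return parent_start + (ns_id - ns_start)
    return INVALID_ID


def to_ns_id_munged(kernel_id, extents):
    """Parent ID -> namespace ID; OVERFLOW_ID if no extent covers it."""
    for ns_start, parent_start, count in extents:
        if parent_start <= kernel_id < parent_start + count:
            return ns_start + (kernel_id - parent_start)
    return OVERFLOW_ID


# The mapping from the transcript above: uid 1000 on the host is uid 0 inside.
uid_map = parse_id_map("         0       1000          1\n")

# Case 1: inode created after the mapping was written.
owner_late = to_kernel_id(0, uid_map)          # 1000
print(to_ns_id_munged(owner_late, uid_map))    # 0 (root)

# Case 2: inode created while the map was still empty.
owner_early = to_kernel_id(0, [])              # INVALID_ID
print(to_ns_id_munged(owner_early, uid_map))   # 65534 (nobody)
```

In this model the owner is fixed when the inode is created, so writing the mapping afterwards does not repair it; that is the situation the 'late' adjustment is meant to address.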




> on the securityfs_ns mount, the mount namespace itself has to be owned
> by the user namespace ... is runc doing that correctly?  I always

Following an strace of 'runc create' I see an unshare(CLONE_NEWUSER) by 
a process before it does an 
unshare(CLONE_NEWNS|CLONE_NEWUTS|CLONE_NEWIPC|CLONE_NEWPID|CLONE_NEWNET), 
so this seems to be doing it in the order you suggest.
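
The ordering rule being checked here can be written down as a small sketch. It only models the rule as stated in this thread (a mount namespace is owned by the new user namespace if CLONE_NEWUSER came in an earlier unshare() call or in the same one); it does not call unshare() itself. The function name is hypothetical; the flag values are the ones from linux/sched.h.

```python
# Namespace flags as defined in linux/sched.h.
CLONE_NEWNS = 0x00020000
CLONE_NEWUTS = 0x04000000
CLONE_NEWIPC = 0x08000000
CLONE_NEWUSER = 0x10000000
CLONE_NEWPID = 0x20000000
CLONE_NEWNET = 0x40000000


def mount_ns_owner(unshare_calls):
    """Given the flag words of successive unshare() calls made by one process,
    report which user namespace would own the first new mount namespace.

    Returns "new userns", "parent userns", or None if no mount namespace
    is created at all.
    """
    in_new_userns = False
    for flags in unshare_calls:
        # Within a single call the user namespace is set up first,
        # so check CLONE_NEWUSER before CLONE_NEWNS.
        if flags & CLONE_NEWUSER:
            in_new_userns = True
        if flags & CLONE_NEWNS:
            return "new userns" if in_new_userns else "parent userns"
    return None


# The sequence seen in the strace of 'runc create':
runc_like = [
    CLONE_NEWUSER,
    CLONE_NEWNS | CLONE_NEWUTS | CLONE_NEWIPC | CLONE_NEWPID | CLONE_NEWNET,
]
print(mount_ns_owner(runc_like))                      # new userns

# Both in one call, as 'unshare --user --mount' would do:
print(mount_ns_owner([CLONE_NEWUSER | CLONE_NEWNS]))  # new userns

# The problematic order: mount namespace first, user namespace later.
print(mount_ns_owner([CLONE_NEWNS, CLONE_NEWUSER]))   # parent userns
```

By this rule the sequence from the strace is fine, so the ordering of the unshare() calls alone would not explain the difference.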

Also, runc seems to have its own set of struggles. I am not sure we 
could ask them to accommodate us and do it 'correctly'; getting 
everything right under the hood doesn't sound 'easy' for them either:

https://github.com/opencontainers/runc/blob/master/libcontainer/nsenter/nsexec.c#L919

      * In order for this unsharing code to be more extensible we need to split
      * up unshare(CLONE_NEWUSER) and clone() in various ways. The ideal case
      * would be if we did clone(CLONE_NEWUSER) and the other namespaces
      * separately, but because of SELinux issues we cannot really do that. But

[...]

      * However, if we unshare(2) the user namespace *before* we clone(2), then
      * all hell breaks loose.

sounds like fun

So I am not quite sure whether I am working around a runc issue. To find 
out, I would first like to re-create your successful setup and see 
what's different.

    Stefan


> forget this detail because unshare does it correctly automatically but
> it means you must unshare the user namespace first and then unshare the
> mount namespace (or do it in the same sys call because the kernel will
> get the correct order).
>
> James
>
>
