lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Date:   Sat, 23 Dec 2017 21:12:14 -0600
From:   ebiederm@...ssion.com (Eric W. Biederman)
To:     Alexey Dobriyan <adobriyan@...il.com>
Cc:     Dave Jones <davej@...emonkey.org.uk>,
        Linus Torvalds <torvalds@...ux-foundation.org>,
        Al Viro <viro@...iv.linux.org.uk>,
        Linux Kernel <linux-kernel@...r.kernel.org>,
        syzkaller-bugs@...glegroups.com, Gargi Sharma <gs051095@...il.com>,
        Oleg Nesterov <oleg@...hat.com>,
        Rik van Riel <riel@...hat.com>,
        Andrew Morton <akpm@...ux-foundation.org>
Subject: Re: [TEST PATCH] pid: fix allocating pid 2 for init (was Re: proc_flush_task oops)

Alexey Dobriyan <adobriyan@...il.com> writes:

> On Fri, Dec 22, 2017 at 08:41:54AM -0600, Eric W. Biederman wrote:
>> Alexey Dobriyan <adobriyan@...il.com> writes:
>
>> > unshare
>> > fork
>> >     alloc_pid in level 1 succeeds
>> >     alloc_pid in level 0 fails, ->idr_next is 2
>> > fork
>> >     alloc pid 2
>> > exit
>> >
>> > Reliable reproducer and fail injection patch attached
>> >
>> > I'd say proper fix is allocating pids in the opposite order
>> > so that failure in the last layer doesn't move IDR cursor
>> > in baby child pidns.
>> 
>> I agree with you about changing the order.  That will make
>> the code simpler and in the second loop actually conforming C code,
>> and fix the immediate problem.
>
> Something like that (barely tested)

I have thought about this some more and I think we can do better.

I don't like the on stack pid_ns array.

The only reason the code calls disable_pid_allocation is that we
don't handle this error.

The semantics of least surprise, are that if we run out of resources
while allocating something, trying again when more resources are
available will make it work.

So it looks like handling the error will improve the quality of the
implemenation, and be a simpler, less dangerous patch.

diff --git a/kernel/pid.c b/kernel/pid.c
--- a/kernel/pid.c
+++ b/kernel/pid.c
@@ -226,6 +224,10 @@ struct pid *alloc_pid(struct pid_namespace *ns)
        while (++i <= ns->level)
                idr_remove(&ns->idr, (pid->numbers + i)->nr);
 
+       /* On failure to allocate the first pid, reset the state */
+       if (ns->pid_allocated == PIDNS_ADDING)
+               idr_set_cursor(&ns->idr, 0);
+
        spin_unlock_irq(&pidmap_lock);
 
        kmem_cache_free(ns->pid_cachep, pid);

Eric

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ